Over the years, I've wanted to tweak Minecraft Mods slightly for balance adjustments and custom behaviors. At first, I only wanted to change a few static constants, but my requirements evolved over time to include new behaviors and even rewriting some functions. In the process of making python scripts and Jupyter Notebooks to patch these files, I slowly built a bytecode editing library to make the task simpler. I finally reached the point where it made more sense to make a fully-featured library than repeated ad-hoc solutions.
There are no other python libraries for automatically manipulating JVM files, so I combined all the knowledge I had accumulated into one library and command-line utility. The basics were mostly done - parsing, simple modifications, and repackaging the edited class files into a jar. I created a patching system that used regex to decide which files to pull out of a jar and which patches to apply to those files. I made a backup system to save copies of the clean jar file before patching, and a command to restore the backup if the patches went wrong.
The difficult part was stack frame analysis. Each Java method contains a table of stack frames, which are used to typecheck the bytecode. Unlike the rest of the class file format, the stack map table and stack frames are very poorly documented. Many details are only specified in terms of a Prolog verification program, which I had to reverse engineer to reveal the specified behavior. Adding to the difficulty, stack frames are encoded in an extremely strange, highly compressed format, with several special-cased behaviors.
Even once the stack frames were parsed and re-serialized correctly, it was very difficult to expose them to patch code in a convenient way. The placement of the stack frames roughly follows a Basic Block breakdown of the assembly code: each instruction that is a target of a branch must have a stack frame entry. Frames are also needed whenever a new local variable is defined, and local variables are stored in a stack format, so defining a new local variable is not a simple process. The new local variable needs to be put into a scope as defined by its placement in the local variable stack.
One extra challenge imposed by my testing suite was that the class file had to be generated in the same way as the original compiler did. This is because parsing the file into a convenient representation still needed to be serialized exactly the same as it was originally. That meant I couldn't take shortcuts with bytecode instruction generation; there are many short instructions for encoding common constants, common local variables, and common stack manipulations. So for example, parsing the bytecode turned "fconst1" into a python float 1.0; but serializing the bytecode, instead of the more general "ldc" opcode for loading constants, the bytecode had to be serialized back into "fconst1". In this instance that is a simple check, however there are 20 special constant opcodes, 40 special local variable loading and storing opcodes, 5 special branches for comparing with zero, 4 variants of each arithmetic opcode, 5 variants of each array manipulation opcode, 3 different ways to create an array, etc. The JVM's bytecode has a lot of redundancy that needs to be removed for patches, and re-applied for serializing.
The library is currently being used by a patching harness based around Minecraft's folder structure and the particular goal of mod adjustment. However, the library was designed to stand on its own merits and is not specific to Minecraft in any way.