Bytecode engineering with BCEL

The Apache Byte Code Engineering Library (BCEL) lets you dig into the bytecode of Java classes. You can use it to transform existing class representations or construct new ones, and because BCEL works at the level of individual JVM instructions, it gives you the utmost power over your code. That power comes with a cost in complexity, though. In this article, Java consultant Dennis Sosnoski gives you the BCEL basics and guides you through an example BCEL application so you can decide for yourself if the power justifies the complexity.

In the last three articles of this series, I've shown you how to use the Javassist framework for classworking. This time I'm going to cover a very different approach to bytecode manipulation, using the Apache Byte Code Engineering Library (BCEL). BCEL operates at the level of actual JVM instructions, unlike the source code interface supported by Javassist. The low-level approach makes BCEL a good fit when you really need to control every step of program execution, but it also makes BCEL considerably more complex to work with than Javassist in cases where either would do.

I'm going to start out by covering the basic BCEL architecture, then devote most of this article to rebuilding my first Javassist classworking example with BCEL. I'll finish up with a quick look at some of the tools included in the BCEL package and some of the applications developers have built on top of BCEL.

BCEL class access

BCEL gives you all the same basic capabilities as Javassist to inspect, edit, and create Java binary classes. The obvious difference with BCEL is that everything is designed to work at the level of JVM assembler language, rather than the source code interface provided by Javassist. There are some deeper differences under the covers, including the use of two separate hierarchies of components within BCEL -- one for inspecting existing code and the other for creating new code. I'm going to assume you're familiar with Javassist from the previous articles in this series, so I'll concentrate on the differences that are likely to confuse you when you start working with BCEL.

As with Javassist, the class inspection aspect of BCEL basically duplicates what's available directly in the Java platform through the Reflection API. This duplication is necessary in a classworking toolkit because you generally don't want to load the classes you're working with until after they've been modified. BCEL provides some basic constant definitions in the org.apache.bcel package, but aside from these definitions all the inspection-related code is in the org.apache.bcel.classfile package. The starting point within this package is the JavaClass class. This class plays about the same role in accessing class information using BCEL as java.lang.Class does when using regular Java reflection. JavaClass defines methods to get the field and method information for the class, as well as structural information about superclass and interfaces. Unlike java.lang.Class, JavaClass also provides access to the internal information for the class, including the constant pool and attributes, and the complete binary class representation as a byte stream.
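To make the comparison with reflection concrete, here is a minimal sketch of reflection-style inspection through JavaClass. It is an illustration rather than code from this article: the class name InspectClass and the method describe are hypothetical, and it assumes a BCEL 6.x jar on the classpath.

```java
import org.apache.bcel.Repository;
import org.apache.bcel.classfile.Field;
import org.apache.bcel.classfile.JavaClass;
import org.apache.bcel.classfile.Method;

// Hypothetical helper: summarizes a class from its parsed binary representation.
public class InspectClass {

    public static String describe(String className) throws ClassNotFoundException {
        // lookupClass parses the class file; it does not go through Class.forName
        JavaClass jc = Repository.lookupClass(className);
        StringBuilder sb = new StringBuilder();
        sb.append(jc.getClassName()).append(" extends ")
          .append(jc.getSuperclassName()).append('\n');
        for (String itf : jc.getInterfaceNames()) {
            sb.append("  implements ").append(itf).append('\n');
        }
        for (Field f : jc.getFields()) {
            sb.append("  field ").append(f.getName()).append(' ')
              .append(f.getSignature()).append('\n');
        }
        for (Method m : jc.getMethods()) {
            sb.append("  method ").append(m.getName()).append(' ')
              .append(m.getSignature()).append('\n');
        }
        // Internal details that java.lang.Class does not expose
        sb.append("  constant pool slots: ")
          .append(jc.getConstantPool().getLength()).append('\n');
        sb.append("  class file bytes: ").append(jc.getBytes().length).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println(describe(args.length > 0 ? args[0] : "java.lang.Runnable"));
    }
}
```

Field and method signatures come back in the JVM's internal descriptor form (for example ()V for a no-argument void method), which is a first taste of how close to the class file format BCEL stays.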

JavaClass instances are usually created by parsing the actual binary class. BCEL provides the org.apache.bcel.Repository class to handle the parsing for you. By default, BCEL parses and caches the representations of classes found in the JVM classpath, getting the actual binary class representations from an org.apache.bcel.util.Repository instance (note the difference in the package name). org.apache.bcel.util.Repository is actually an interface for a source of binary class representations. You can substitute other paths for looking up class files, or other ways of accessing class information, in place of the default source that uses the classpath.
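The two Repository types are easy to mix up, so here is a rough sketch of substituting a different lookup path. It assumes BCEL 6.x and uses SyntheticRepository and ClassPath from org.apache.bcel.util, which this article has not introduced; the class CustomRepository and its method names are hypothetical.

```java
import org.apache.bcel.Repository;
import org.apache.bcel.classfile.JavaClass;
import org.apache.bcel.util.ClassPath;
import org.apache.bcel.util.SyntheticRepository;

// Hypothetical helper showing both Repository types side by side.
public class CustomRepository {

    // Looks up a class through a repository that searches the given path,
    // without changing the source used by the static Repository class.
    public static JavaClass lookupFrom(String searchPath, String className)
            throws ClassNotFoundException {
        // Fully qualified to avoid a clash with the imported org.apache.bcel.Repository
        org.apache.bcel.util.Repository source =
            SyntheticRepository.getInstance(new ClassPath(searchPath));
        return source.loadClass(className);
    }

    // Makes the static org.apache.bcel.Repository use the given search path
    // for all later lookupClass calls.
    public static void install(String searchPath) {
        Repository.setRepository(
            SyntheticRepository.getInstance(new ClassPath(searchPath)));
    }

    public static void main(String[] args) throws ClassNotFoundException {
        String path = args.length > 0 ? args[0] : ".";
        System.out.println(lookupFrom(path, "java.lang.Runnable").getClassName());
    }
}
```

The first method keeps the substitution local, while the second changes global state for every caller of the static Repository class, so it is worth deciding up front which of the two your tool actually needs.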

Changing classes

Besides reflection-style access to class components, org.apache.bcel.classfile.JavaClass also provides methods for altering the class. You can use these methods to set any of the class components to new values. They're not generally of much direct use, though, because the other classes in the package don't provide support for constructing new versions of the components in any reasonable manner. Instead, there's an entirely separate set of classes in the org.apache.bcel.generic package that provides editable versions of the same components represented by org.apache.bcel.classfile classes.

Just as org.apache.bcel.classfile.JavaClass is the starting point for using BCEL to inspect existing classes, org.apache.bcel.generic.ClassGen is your starting point for creating new classes. It also works for modifying existing classes -- to handle that case, there's a constructor that takes a JavaClass instance and uses it to initialize the ClassGen class information. Once you're done with your class modifications, you can get a usable class representation from the ClassGen instance by calling a method that returns a JavaClass, which can in turn be converted to a binary class representation. Sound confusing? I think it is. In fact, going back and forth between the two packages is one of the most awkward aspects of working with BCEL. The duplicate class structures tend to get in the way, so if you're doing much with BCEL, it may be worthwhile to write wrapper classes that can hide some of these differences. For this article, I'll work mainly with the org.apache.bcel.generic package classes and avoid the use of wrappers, but it's something for you to keep in mind for your own work.
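The round trip between the two packages can be sketched as follows. This is an illustrative example under stated assumptions, not the example this article builds: it assumes BCEL 6.x (where the access flag constants live in org.apache.bcel.Const; older releases use org.apache.bcel.Constants), and the class RoundTrip, the method addIntField, and the field name marker are hypothetical.

```java
import org.apache.bcel.Const;
import org.apache.bcel.Repository;
import org.apache.bcel.classfile.JavaClass;
import org.apache.bcel.generic.ClassGen;
import org.apache.bcel.generic.FieldGen;
import org.apache.bcel.generic.Type;

// Hypothetical helper: classfile -> generic -> classfile -> bytes.
public class RoundTrip {

    // Returns a new JavaClass that has one extra private int field.
    public static JavaClass addIntField(JavaClass original, String fieldName) {
        // Step 1: wrap the read-only JavaClass in an editable ClassGen
        ClassGen cg = new ClassGen(original);
        // Step 2: build the new component with the generic-package classes;
        // FieldGen registers the name and type in the constant pool
        FieldGen fg = new FieldGen(Const.ACC_PRIVATE, Type.INT, fieldName,
                                   cg.getConstantPool());
        cg.addField(fg.getField());
        // Step 3: convert back to a JavaClass
        return cg.getJavaClass();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // The modified class is only inspected here, never loaded into the JVM
        JavaClass original = Repository.lookupClass("java.lang.Object");
        JavaClass modified = addIntField(original, "marker");
        // Step 4: the JavaClass yields the binary class representation
        byte[] bytes = modified.getBytes();
        System.out.println(original.getFields().length + " -> "
            + modified.getFields().length + " fields, " + bytes.length + " bytes");
    }
}
```

Even for a one-field change, the code touches ClassGen, FieldGen, and ConstantPoolGen on one side and JavaClass and Field on the other, which is the back-and-forth a wrapper class would hide.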

Besides ClassGen, the org.apache.bcel.generic package defines classes to manage the construction of various class components. These construction classes include ConstantPoolGen for handling the constant pool, FieldGen and MethodGen for fields and methods, and InstructionList for working with sequences of JVM instructions. Finally, the org.apache.bcel.generic package also defines classes to represent every type of JVM instruction. You can create instances of these classes directly, or in some cases by using the org.apache.bcel.generic.InstructionFactory helper class. The advantage of using InstructionFactory is that it handles many of the bookkeeping details of instruction building for you (including adding items to the constant pool as needed for the instructions). You'll see how to ...
