Have a Bite of Bytecode!!

This topic is very different from other Java topics. I am sure a lot of developers wonder why they should learn about byte code at all. I can give two reasons:

(1)    You are a byte code engineer, for example someone who uses a library like BCEL to change or enhance the byte code generated by the Java compiler.

(2)    You are passionate about learning how things work under the hood. ("Under the hood" is a phrase used often by Bill Venners, the author of Inside the Java Virtual Machine, a great book that is well worth reading.)

What is byte code after all??

Just as a C or C++ compiler generates machine code for a particular processor, the Java compiler generates byte code. The JVM executes this byte code to carry out the instructions the programmer wrote in the Java language.

What does byte code consist of?

Byte code is basically a sequence of instructions to be executed at run time by the JVM. The main things it contains are:

  • Opcodes: An opcode is an instruction that the JVM executes at run time.
  • Information about references to objects
  • Information about the local variables

The JDK ships with a command line tool, javap, that reads the byte code generated by the Java compiler. For example, take the following Java code:

[sourcecode language="java"]public class BiteOfByteCode {

    public void sayHello(String name) {
        System.out.println(name);
    }

}[/sourcecode]

After compiling the class, run the following command to print its disassembled byte code:

javap -c BiteOfByteCode
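For reference, here is the same class with the instructions that javap -c typically prints for it, written as comments next to the source they come from. This is a sketch rather than captured tool output: the class name AnnotatedBiteOfByteCode is chosen here for illustration, and constant pool indices depend on the compiler version, so only #1 (the one the article discusses) is spelled out and the others are shown as #n and #m.

```java
// Sketch: the article's class under a hypothetical name, annotated with the
// instructions javap -c typically shows. Offsets follow from instruction
// sizes; constant pool indices (#1, #n, #m) vary with the javac version.
class AnnotatedBiteOfByteCode {

    // No constructor is written, so javac generates one:
    //   0: aload_0             // push 'this' (local variable 0)
    //   1: invokespecial #1    // call java/lang/Object."<init>":()V
    //   4: return

    public void sayHello(String name) {
        //   0: getstatic     #n   // push the value of the field System.out
        //   3: aload_1            // push 'name' (local variable 1)
        //   4: invokevirtual #m   // call PrintStream.println(String)
        //   7: return
        System.out.println(name);
    }

    public static void main(String[] args) {
        new AnnotatedBiteOfByteCode().sayHello("bytecode");
    }
}
```

Running javap -c on your own machine is still the authoritative way to see the listing, since the exact comments and indices differ between JDK releases.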


As you can see, the output does not look familiar at all. Everything under *Code:* is byte code generated by javac, the Java compiler. Some of the instructions start in an odd way. For example, anyone can guess that load_0 loads something, but what is the a at the beginning of aload_0?

That character tells you the type of value the instruction operates on.

Here is a table that explains a few of them:

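Character   Description
a           Object reference
i           int (also used for boolean, byte, char and short local variables)
f           float
c           char (array and conversion instructions only, e.g. caload, i2c)
b           byte (array and conversion instructions only, e.g. baload, i2b)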
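A few small methods make the prefixes concrete. The comments give the instructions javac typically emits for each method body. Treat this as a sketch: the class and method names are made up for illustration, and the long example and the array access go slightly beyond the table as additions of mine.

```java
// Sketch: each method notes the load, arithmetic and return instructions
// javac typically emits for it. All names here are illustrative only.
class TypePrefixDemo {

    // Static methods have no 'this', so parameters start at index 0.
    static int addInts(int a, int b) {
        return a + b;      // iload_0, iload_1, iadd, ireturn
    }

    static float addFloats(float a, float b) {
        return a + b;      // fload_0, fload_1, fadd, freturn
    }

    // long values occupy two local variable slots each, hence lload_2.
    static long addLongs(long a, long b) {
        return a + b;      // lload_0, lload_2, ladd, lreturn
    }

    static String echo(String s) {
        return s;          // aload_0, areturn
    }

    // The 'c' prefix shows up for array access; the result is returned as an int.
    static char first(char[] cs) {
        return cs[0];      // aload_0, iconst_0, caload, ireturn
    }

    public static void main(String[] args) {
        System.out.println(addInts(2, 3));
        System.out.println(addFloats(1.5f, 2.0f));
        System.out.println(addLongs(40L, 2L));
        System.out.println(echo("hello"));
        System.out.println(first(new char[] {'j', 'v', 'm'}));
    }
}
```

Compiling this class and running javap -c TypePrefixDemo lets you compare the comments with what your own compiler produces.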

Before reading the byte code, let's understand how the JVM organises memory for a method. The JVM is stack based. For each method invocation it creates a separate area called a frame. Below we have a sample display of a frame:

I like to draw such pictures on the board and use them for articles. :)

Each frame has three main components:

  • Local variable array: This array holds the method's parameters and local variables (primitive values or object references). For an instance method, index 0 holds a reference to this (the object on which the method is invoked) and the parameters follow from index 1. A static method has no this, so its first parameter is stored at index 0.
  • Operand stack: Instructions use this as their working area. For example, to add two variables, their values are first copied from the local variable array and pushed onto the operand stack; the add instruction then pops both values and pushes the result back.
  • Reference to the constant pool: Every class has a constant pool that holds its literals and the symbolic references to the classes, fields and methods it uses. Each frame holds a reference to the runtime constant pool of the class that declares the current method.
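The interplay between the local variable array and the operand stack can be sketched with a toy model. This is only an illustration under my own assumptions: FrameSketch and its methods are hypothetical names that mirror the opcodes iload, iadd and istore, it handles int values only, and a real JVM does not expose frames to Java code like this.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a JVM frame: a local variable array plus an operand stack.
// All names are hypothetical; they only mirror the opcodes they imitate.
class FrameSketch {
    private final int[] locals;
    private final Deque<Integer> operandStack = new ArrayDeque<>();

    FrameSketch(int... locals) {
        this.locals = locals.clone();
    }

    // iload: copy a local variable onto the operand stack.
    // The local variable keeps its value; nothing is removed from the array.
    void iload(int index) {
        operandStack.push(locals[index]);
    }

    // iadd: pop two ints from the operand stack and push their sum.
    void iadd() {
        int right = operandStack.pop();
        int left = operandStack.pop();
        operandStack.push(left + right);
    }

    // istore: pop the top of the operand stack into a local variable.
    void istore(int index) {
        locals[index] = operandStack.pop();
    }

    int local(int index) {
        return locals[index];
    }

    int stackDepth() {
        return operandStack.size();
    }

    public static void main(String[] args) {
        // Models "c = a + b" with a, b and c in slots 0, 1 and 2.
        FrameSketch frame = new FrameSketch(2, 3, 0);
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        frame.istore(2);
        System.out.println(frame.local(2)); // prints 5
    }
}
```

Note that the operand stack is empty again once the result has been stored, which is how a well-formed sequence of instructions leaves it.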
Let’s go through the byte codes now.



The first part is the constructor of the class. BiteOfByteCode does not explicitly extend any class, so it implicitly extends java.lang.Object.

*aload_0:* This instruction fetches the reference to this from the local variable array and pushes it onto the operand stack. As mentioned earlier, the reference to this is stored at index 0, which is why the instruction ends in _0: the 0 is the index into the local variable array, and the “a” at the beginning means the instruction operates on an object reference.

*invokespecial #1:* This invokes the constructor of the superclass, in our case java.lang.Object, on the this reference that aload_0 pushed. #1 is an index into the runtime constant pool of the class, where the symbolic reference to the method being invoked is stored.

From JVM specification:

Runtime Constant Pool

A runtime constant pool is a per-class or per-interface runtime representation of the constant_pool table in a class file (§4.4). It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. The runtime constant pool serves a function similar to that of a symbol table for a conventional programming language, although it contains a wider range of data than a typical symbol table.

Each runtime constant pool is allocated from the Java virtual machine's method area (§3.5.4). The runtime constant pool for a class or interface is constructed when the class or interface is created (§5.3) by the Java virtual machine.

*return:* This simply returns from the constructor.
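Put together, these three instructions are what the compiler produces for a constructor that does nothing except call the superclass constructor. A minimal sketch of the equivalent source, using a hypothetical class name so it does not clash with the article's class:

```java
// Sketch: spelling out what javac generates when no constructor is written.
// ExplicitConstructor is a hypothetical stand-in for BiteOfByteCode.
class ExplicitConstructor extends Object {

    ExplicitConstructor() {
        super();   // aload_0, then invokespecial on Object's constructor
    }              // the closing brace corresponds to the return instruction

    public static void main(String[] args) {
        ExplicitConstructor instance = new ExplicitConstructor();
        System.out.println(instance.getClass().getSuperclass().getName());
    }
}
```

Writing extends Object and super() by hand changes nothing in the generated byte code; it only makes visible what the compiler already adds.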

Let’s look into the method’s byte code now:

*getstatic:* This pushes the value of the static field out of the System class onto the operand stack, because our example uses System.out to print the value passed to the method.

*aload_1:* This is similar to the very first instruction in the constructor. It copies the value at index 1 of the local variable array and pushes it onto the operand stack. In this case the value at index 1 is the String reference passed as the parameter to the method. If the parameter were an int instead of a String, the instruction would be iload_1, where the i means that an int is loaded.

The next instruction, invokevirtual, calls the println method on the System.out reference with the String as its argument, and a final return ends the method.

That covers the structure of the byte code above.


One more thing worth knowing is the meaning of the numbers in the left-hand column. In the constructor, the first opcode is at 0, the next at 1, and the one after that at 4. Why 4 after 1? These numbers are byte offsets into the method's code. aload_0 occupies a single byte, while invokespecial occupies three: one for the opcode and two for the constant pool index #1. The next instruction therefore starts at offset 4.
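The offsets can be reproduced with a little arithmetic. The sketch below adds up instruction sizes to get each starting offset; OffsetSketch and offsets are hypothetical names, and the sizes are the ones the JVM instruction set defines for these opcodes (one byte for aload_0, aload_1 and return; an opcode byte plus a two-byte constant pool index for invokespecial, getstatic and invokevirtual).

```java
// Sketch: derive instruction offsets from instruction sizes in bytes.
// Names are illustrative; sizes come from the JVM instruction set.
class OffsetSketch {

    // Each instruction starts where the previous one ended.
    static int[] offsets(int[] sizes) {
        int[] result = new int[sizes.length];
        int next = 0;
        for (int i = 0; i < sizes.length; i++) {
            result[i] = next;
            next += sizes[i];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"aload_0", "invokespecial", "return"};
        int[] sizes = {1, 3, 1};
        int[] at = offsets(sizes);
        for (int i = 0; i < names.length; i++) {
            System.out.println(at[i] + ": " + names[i]);
        }
        // prints:
        // 0: aload_0
        // 1: invokespecial
        // 4: return
    }
}
```

The same calculation for sayHello, with sizes 3, 1, 3 and 1 for getstatic, aload_1, invokevirtual and return, gives the offsets 0, 3, 4 and 7.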

All of this is only a very basic introduction, and there is much more to learn about byte code. I hope you enjoyed this tutorial, and I look forward to your comments and feedback. Keep visiting my blog; I will do my best to make it worth reading.
