The intermediate form could be “machine code” for an abstract or virtual machine. Usually, the abstract machine selected closely models the constructs and operations of the High Level language being processed. The P-code used with many PASCAL compilers and the Bytecode used by Java compilers are good examples.
The abstract machine models primitive operations and data types in a High Level language. Usually, the instruction set of the abstract machine consists of single-address instructions.
Although defining an abstract machine for a particular High Level language is relatively easy, defining a single abstract machine for a number of diverse languages, e.g. C, Java, Lisp and Prolog, is a difficult task.
The abstract machine for PASCAL is called P-machine and the code which it executes is called P-code. It is a stack machine, i.e. the instruction set is defined for data operations with respect to an operand stack. It consists of a stack, five registers and a memory (see Fig. 8.8). The five registers are explained below:
PC – Program Counter: Keeps track of the next code position.
NP – New pointer: Top of the Heap, i.e. the location from which the next explicitly (dynamically) allocated block of memory will be issued.
SP – Stack pointer: Top of the execution stack.
MP – Mark pointer: Base of the stack-frame of current function.
EP – Extreme stack pointer: Marks the furthest position the current stack-frame can reach. The maximum possible size of each stack-frame is known at compile-time, so EP can be set when the frame is created; SP cannot go beyond EP.
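The roles of the five registers can be pictured with a small Java sketch. The class PMachineState, its field layout and the push and pop helpers are illustrative assumptions, not part of any actual P-code implementation; the sketch only shows how SP is kept within the limit set by EP.

```java
// Hypothetical model of the P-machine registers; names and sizes are illustrative.
final class PMachineState {
    int pc;        // PC: next code position
    int np;        // NP: top of the heap (placed at the far end of the store in this sketch)
    int sp = -1;   // SP: top of the execution stack (-1 means the stack is empty)
    int mp;        // MP: base of the current stack-frame
    int ep;        // EP: highest stack position the current frame may use
    final int[] store;

    PMachineState(int storeSize, int frameLimit) {
        if (frameLimit >= storeSize) {
            throw new IllegalArgumentException("frame limit must lie inside the store");
        }
        store = new int[storeSize];
        np = storeSize;
        ep = frameLimit;   // known at compile time for each frame
    }

    // Push one word, refusing to let SP go beyond EP.
    void push(int value) {
        if (sp + 1 > ep) {
            throw new IllegalStateException("stack overflow: SP would pass EP");
        }
        store[++sp] = value;
    }

    // Pop one word from the top of the execution stack.
    int pop() {
        if (sp < 0) {
            throw new IllegalStateException("stack underflow");
        }
        return store[sp--];
    }
}
```

With a frame limit of 1, two pushes succeed and a third is rejected, which is the behaviour described for EP above.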
The memory is viewed as a linear array of words and has two major parts – code and store. The stack contains a series of stack-frames (Activation Records, see Sections 7.1.3 and 9.6.3). In P-code, the stack-frame contains:
The “Mark stack” part contains implicit parameters:
The P-machine has several classes of instructions, like integer and real arithmetic, logical, relational, conditional and unconditional branches, subroutine call, etc. Some of them are:
ABR: Absolute value of real, pops the real at the TOS and pushes back its absolute value.
ADI: Adds two integers popped from the top of the stack and leaves an integer result.
DVR: Real division.
CUP: Call user procedure.
INN: Test set membership.
UJP: Unconditional jump.
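To make the stack discipline concrete, here is a minimal sketch of an interpreter for four of the instructions listed above. It is not the actual P-machine: the class MiniPMachine, the Instr representation and the helper opcodes PUSHI, PUSHR and HALT are assumptions introduced only so that the sketch can load operands and stop.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class MiniPMachine {
    // ADI, ABR, DVR and UJP follow the text; PUSHI, PUSHR and HALT are
    // hypothetical helpers that exist only in this sketch.
    enum Op { ADI, ABR, DVR, UJP, PUSHI, PUSHR, HALT }

    // One instruction: an opcode plus an optional numeric argument.
    static final class Instr {
        final Op op;
        final double arg;
        Instr(Op op, double arg) { this.op = op; this.arg = arg; }
        Instr(Op op) { this(op, 0); }
    }

    // Runs the program and returns the value on top of the stack at HALT.
    static Number run(Instr[] code) {
        Deque<Number> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            Instr in = code[pc++];
            switch (in.op) {
                case PUSHI: stack.push((int) in.arg); break;
                case PUSHR: stack.push(in.arg); break;
                case ADI: {   // pop two integers, push their sum
                    int b = stack.pop().intValue();
                    int a = stack.pop().intValue();
                    stack.push(a + b);
                    break;
                }
                case ABR:     // replace the real at the TOS by its absolute value
                    stack.push(Math.abs(stack.pop().doubleValue()));
                    break;
                case DVR: {   // real division: second-from-top divided by top
                    double divisor = stack.pop().doubleValue();
                    double dividend = stack.pop().doubleValue();
                    stack.push(dividend / divisor);
                    break;
                }
                case UJP: pc = (int) in.arg; break;   // unconditional jump
                case HALT: return stack.peek();
                default: throw new IllegalStateException("unknown opcode");
            }
        }
    }
}
```

Note that the arithmetic instructions carry no operand addresses at all; they find their operands on the stack, and only UJP and the helper load instructions carry an argument.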
Java Bytecode is the input language for the Java Virtual Machine (JVM). Remember that Java is a strongly typed language, so many of the automatic upgrades and downgrades of value types found in other languages are not available in it.
The JVM is expected to correctly perform the operations given in a Java class file. Details such as memory layout, garbage collection and internal optimization are not part of the JVM specification and are left to the implementer.
There are several differences between the Java language and the JVM language; for example, a boolean in Java is represented by an integer in the JVM.
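Two of these points can be seen from the Java side with a short sketch. The class TypingDemo and its method names are hypothetical, chosen only for illustration.

```java
final class TypingDemo {
    // Widening (int to long) is implicit in the source; in the Bytecode it
    // still appears as an explicit conversion instruction (i2l).
    static long widen(int i) { return i; }

    // Narrowing (double to int) must be requested with a cast (d2i in the
    // Bytecode); "return d;" without the cast would not compile.
    static int narrow(double d) { return (int) d; }

    // A boolean cannot be cast to int in Java source, although the JVM
    // itself computes with the int values 1 (true) and 0 (false).
    static int asInt(boolean b) { return b ? 1 : 0; }
}
```

The cast in narrow truncates toward zero, so 3.9 becomes 3 and -3.9 becomes -3.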
The following data areas are considered:
At start-up:
Heap: Shared between threads, the Garbage Collector operates on this area (need not be contiguous).
Methods area: Similar to text area in Unix/Linux systems.
Per thread: Created when a thread is created.
PC: Points to the Bytecode instruction currently being executed. Its value is undefined while a native method is running; the register is wide enough to hold a return address or a native pointer.
VM stack: Stores locals, temporaries, invocation return information, etc. It may be allocated on the Heap and need not be contiguous.
Run-time constant pool: Per class or interface.
Native method stacks: Used, for example, when the VM is implemented in C.
Frames (AR): Activation Records (Chapter 7) created on the JVM stack. Each frame contains an operand-stack, local variables, etc.
The instruction set of the JVM consists of instructions with a one-byte op-code plus zero or more bytes for operands (multi-byte operands are stored in Big-endian order).
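As an illustration of this encoding, the following sketch decodes the two-byte branch operand that follows a conditional-branch op-code such as if_icmple (op-code value 0xA4 in the JVM specification). The class BranchDecoder and the method branchTarget are hypothetical names used only here.

```java
final class BranchDecoder {
    static final int IF_ICMPLE = 0xA4;   // op-code value from the JVM specification

    // Reads the signed 16-bit Big-endian operand that follows the op-code at
    // offset 'pc' and returns the absolute offset of the branch target.
    static int branchTarget(byte[] code, int pc) {
        int high = code[pc + 1];          // high byte first, kept signed
        int low = code[pc + 2] & 0xFF;    // low byte, treated as unsigned
        int offset = (high << 8) | low;
        return pc + offset;               // branch offsets are relative to the op-code
    }
}
```

For the hand-assembled byte sequence 1A 04 A4 00 11 (iload_0, iconst_1, if_icmple with operand 0x0011), branchTarget(code, 2) gives 2 + 17 = 19.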
We illustrate the nature of a Java Bytecode program (i.e. the contents of a class file) with a simple example: a Java program to calculate Fibonacci numbers.
public class fibonacci{
    static int fibonacci(int n){
        if(n > 1) return fibonacci(n - 1) + fibonacci(n - 2);
        else return 1;
    }
    public static void main(String[] args){
        for(int i = 0; i < 10; i++)
            System.out.println("fibonacci(" + i + ") = " + fibonacci(i));
    }
}
We obtained a human-readable version of the corresponding Bytecode program by a Java system utility javap:
static int fibonacci(int);
Code:
0: iload_0
1: iconst_1
2: if_icmple 19
5: iload_0
6: iconst_1
7: isub
8: invokestatic #2; //Method fibonacci:(I)I
11: iload_0
12: iconst_2
13: isub
14: invokestatic #2; //Method fibonacci:(I)I
17: iadd
18: ireturn
19: iconst_1
20: ireturn
}
We have not shown the Bytecode of the main() method. A careful scrutiny of the Bytecode will tell you that:
Thus, we see that Java Bytecode is in Reverse Polish notation and the JVM is a stack-oriented machine.
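The stack-oriented behaviour can be checked by stepping through the listing by hand. The sketch below does this in Java: each case label is an offset from the javap output above, and the operand stack is an ordinary Deque. The class FibBytecodeSketch is a hypothetical illustration and not how a real JVM is implemented; a recursive call to run stands in for the new frame, with its own operand stack, that invokestatic creates.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class FibBytecodeSketch {
    // Pops two ints and pushes (second-from-top minus top), as isub does.
    private static void isub(Deque<Integer> stack) {
        int b = stack.pop();
        int a = stack.pop();
        stack.push(a - b);
    }

    // Interprets the listing; 'n' plays the role of local variable 0.
    static int run(int n) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (true) {
            switch (pc) {
                case 0:  stack.push(n); pc = 1; break;                  // iload_0
                case 1:  stack.push(1); pc = 2; break;                  // iconst_1
                case 2: {                                               // if_icmple 19
                    int right = stack.pop();
                    int left = stack.pop();
                    pc = (left <= right) ? 19 : 5;
                    break;
                }
                case 5:  stack.push(n); pc = 6; break;                  // iload_0
                case 6:  stack.push(1); pc = 7; break;                  // iconst_1
                case 7:  isub(stack); pc = 8; break;                    // isub
                case 8:  stack.push(run(stack.pop())); pc = 11; break;  // invokestatic
                case 11: stack.push(n); pc = 12; break;                 // iload_0
                case 12: stack.push(2); pc = 13; break;                 // iconst_2
                case 13: isub(stack); pc = 14; break;                   // isub
                case 14: stack.push(run(stack.pop())); pc = 17; break;  // invokestatic
                case 17: stack.push(stack.pop() + stack.pop()); pc = 18; break; // iadd
                case 18: return stack.pop();                            // ireturn
                case 19: stack.push(1); pc = 20; break;                 // iconst_1
                case 20: return stack.pop();                            // ireturn
                default: throw new IllegalStateException("no instruction at offset " + pc);
            }
        }
    }
}
```

Read in order, offsets 5 to 17 spell out the postfix form of the source expression: n 1 - fibonacci n 2 - fibonacci +.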