Java virtual machine architecture

I. Data types

Like the Java programming language, the Java virtual machine operates on two kinds of data types: primitive types (sometimes also called native or basic types) and reference types. Correspondingly, there are two kinds of values, primitive values and reference values, which can be assigned to variables, passed as parameters, returned from methods, and operated on.
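The difference between the two kinds of values shows up most clearly in parameter passing. The following is a minimal sketch at the Java language level; the class and method names are made up for illustration.

```java
final class ValueKindsDemo {
    // A primitive value is copied into the callee's parameter;
    // changing the copy does not affect the caller's variable.
    static void bumpPrimitive(int x) {
        x = x + 1;
    }

    // A reference value is copied too, but both copies refer to the
    // same array object, so the caller sees the element change.
    static void bumpElement(int[] cell) {
        cell[0] = cell[0] + 1;
    }

    static int primitiveAfterCall() {
        int n = 41;
        bumpPrimitive(n);
        return n;
    }

    static int elementAfterCall() {
        int[] cell = {41};
        bumpElement(cell);
        return cell[0];
    }

    public static void main(String[] args) {
        System.out.println(primitiveAfterCall()); // 41
        System.out.println(elementAfterCall());   // 42
    }
}
```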

II. Primitive types and values

The primitive data types supported by the Java virtual machine are the numeric types, the boolean type, and the returnAddress type. The numeric types are divided into integral types and floating-point types.

The integral types are byte, short, int, long, and char.

The floating-point types are float and double.

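2.1 Integral types and values

The value ranges of the integral types in the Java virtual machine are as follows:

byte: 8-bit signed two's-complement integer, from -128 to 127
short: 16-bit signed two's-complement integer, from -32768 to 32767
int: 32-bit signed two's-complement integer, from -2147483648 to 2147483647
long: 64-bit signed two's-complement integer, from -9223372036854775808 to 9223372036854775807
char: 16-bit unsigned integer representing a UTF-16 code unit, from 0 to 65535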

2.2 Floating-point types, value sets, and values

The floating-point types are float and double, which are conceptually associated with the 32-bit single-precision and 64-bit double-precision IEEE 754 format values and operations specified in IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std. 754-1985 (IEEE, New York).

The IEEE 754 standard covers not only positive and negative sign-magnitude numbers but also positive and negative zeros, positive and negative infinities, and a special not-a-number value (hereafter NaN). NaN represents the result of certain invalid operations, such as dividing zero by zero.
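These special values can be observed directly from Java. The sketch below (class and method names are illustrative) shows that zero divided by zero yields NaN, whereas a nonzero value divided by zero yields an infinity, and that positive and negative zero compare as equal but behave differently as divisors.

```java
final class FloatSpecialValues {
    static double zeroOverZero()     { return 0.0 / 0.0; }  // NaN
    static double oneOverZero()      { return 1.0 / 0.0; }  // positive infinity
    static double minusOneOverZero() { return -1.0 / 0.0; } // negative infinity

    public static void main(String[] args) {
        double nan = zeroOverZero();
        System.out.println(Double.isNaN(nan));  // true
        System.out.println(nan == nan);         // false: NaN is unequal to everything
        System.out.println(oneOverZero());      // Infinity
        System.out.println(minusOneOverZero()); // -Infinity
        System.out.println(0.0 == -0.0);        // true
        System.out.println(1.0 / -0.0);         // -Infinity
    }
}
```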

Every Java virtual machine implementation must support two standard sets of floating-point values: the float value set and the double value set. In addition, an implementation may choose to support either or both of two extended-exponent value sets: the float-extended-exponent value set and the double-extended-exponent value set. In certain circumstances, these extended-exponent value sets may be used instead of the standard value sets to represent values of type float or double.

2.3 The returnAddress type and values

The returnAddress type is used by the jsr, ret, and jsr_w instructions of the Java virtual machine. A value of type returnAddress is a pointer to the opcode of a virtual machine instruction. Unlike the numeric primitive types described above, the returnAddress type has no counterpart in the Java language, and a returnAddress value cannot be modified by the running program.

2.4 The boolean type

Although the Java virtual machine defines a boolean type, it provides only very limited support for it. No bytecode instructions are dedicated to operations on boolean values. Instead, expressions in the Java language that operate on boolean values are compiled to use values of the virtual machine's int data type. The Java virtual machine does directly support boolean arrays, which are created by the newarray instruction. Boolean arrays are accessed and modified with the byte array instructions baload and bastore.
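As an illustration, the comments in the sketch below show the kind of bytecode javac typically produces for a boolean-returning method. The exact listing can vary by compiler version, so treat it as an assumption and check it with `javap -c` on your own build; the class name is made up.

```java
final class BooleanAsInt {
    // Typical javac output uses only int instructions:
    //   iload_0
    //   ifle    <else>
    //   iconst_1          true is the int value 1
    //   goto    <end>
    //   iconst_0          false is the int value 0
    //   ireturn           the boolean result is returned as an int
    static boolean isPositive(int x) {
        return x > 0;
    }

    // The array is created with newarray; the element store and load
    // compile to bastore and baload, the same instructions used for byte[].
    static boolean firstFlag() {
        boolean[] flags = new boolean[2];
        flags[0] = true;
        return flags[0];
    }

    public static void main(String[] args) {
        System.out.println(isPositive(5));  // true
        System.out.println(isPositive(-5)); // false
        System.out.println(firstFlag());    // true
    }
}
```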

III. Reference types and values

There are three kinds of reference types in the virtual machine: class types, array types, and interface types. Their values are references to dynamically created class instances, arrays, and class instances or arrays that implement interfaces, respectively.

An array type consists of a component type with a single dimension (whose length is not given by the type). The component type of an array type may itself be an array type. Starting from any array type, if its component type is also an array type, take the component type of that type, and repeat. Eventually a component type is reached that is not an array type; this is called the element type of the array type. The element type must be a primitive type, a class type, or an interface type.
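The walk from an array type down to its element type can be sketched with the reflection method `Class.getComponentType()`. The helper names `elementType` and `dimensions` are invented for this example.

```java
final class ArrayTypes {
    // Repeatedly take the component type until it is no longer an array type.
    static Class<?> elementType(Class<?> arrayType) {
        Class<?> t = arrayType;
        while (t.isArray()) {
            t = t.getComponentType();
        }
        return t;
    }

    // Counts how many times a component type can be taken.
    static int dimensions(Class<?> arrayType) {
        int n = 0;
        for (Class<?> t = arrayType; t.isArray(); t = t.getComponentType()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Class<?> type = int[][][].class;
        System.out.println(type.getComponentType().getSimpleName()); // int[][]
        System.out.println(elementType(type));                       // int
        System.out.println(dimensions(type));                        // 3
    }
}
```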

A reference value may also be the special null reference, which refers to no object and is denoted here by null. The null reference initially has no run-time type, but it can be cast to any reference type. The default value of a reference type is null.

The Java virtual machine specification does not specify how null should be encoded in the virtual machine implementation.
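However null is encoded internally, its behavior at the Java language level is fixed. A small sketch (illustrative names) of the default value and of casting null to arbitrary reference types:

```java
final class NullReferenceDemo {
    static String name; // a reference field with no initializer defaults to null

    static boolean defaultIsNull() {
        return name == null;
    }

    static boolean castsToAnyReferenceType() {
        Object nothing = null;
        String s = (String) nothing;     // no ClassCastException for null
        Runnable r = (Runnable) nothing; // interface type
        int[] a = (int[]) nothing;       // array type
        return s == null && r == null && a == null;
    }

    // null is not an instance of any type, not even Object.
    static boolean nullIsInstanceOfNothing() {
        Object nothing = null;
        return !(nothing instanceof Object);
    }

    public static void main(String[] args) {
        System.out.println(defaultIsNull());           // true
        System.out.println(castsToAnyReferenceType()); // true
        System.out.println(nullIsInstanceOfNothing()); // true
    }
}
```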

IV. Runtime data areas

The Java virtual machine defines several runtime data areas that are used during program execution. Some are created when the virtual machine starts and destroyed when it exits. Others are per-thread: they are created when a thread starts and destroyed when that thread ends.

4.1 PC register

The Java virtual machine can support many threads of execution at once (see Chapter 17 of the Java Language Specification). Each Java virtual machine thread has its own pc (program counter) register. At any point, each thread is executing the code of a single method, called the current method for that thread. If that method is not native, the pc register contains the address of the bytecode instruction currently being executed. If the method is native, the value of the pc register is undefined. The pc register must be wide enough to hold a returnAddress value or a native pointer on the specific platform.

4.2 Java virtual machine stack

Each Java virtual machine thread has a private Java virtual machine stack, created at the same time as the thread, which stores stack frames. A Java virtual machine stack is analogous to the stack of a conventional language such as C: it holds local variables and partial results, and it plays a part in method invocation and return. Because the stack is never manipulated directly except to push and pop frames, frames may be heap allocated, and the memory for a Java virtual machine stack does not need to be contiguous.

The Java virtual machine specification permits Java virtual machine stacks either to be of a fixed size or to expand and contract dynamically as the computation requires. If the stacks are of a fixed size, the size of each stack may be chosen independently when that stack is created. A Java virtual machine implementation should give programmers or end users a way to adjust the initial size of the stacks and, for stacks that expand and contract dynamically, a way to adjust their maximum and minimum sizes. The following exceptions may occur in the Java virtual machine stack:

StackOverflowError is thrown if the computation in a thread requires a larger Java virtual machine stack than is permitted.
OutOfMemoryError is thrown if the stack can be expanded dynamically but not enough memory is available for the expansion, or not enough memory is available to create the initial stack for a new thread.
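A stack that is too small for the computation surfaces as a StackOverflowError, which can be provoked with unbounded recursion. The sketch below (names are illustrative) counts how many nested calls succeed before the error. The number depends on the implementation and on the configured stack size; on HotSpot, for instance, the per-thread stack size is commonly adjusted with the `-Xss` option.

```java
final class StackDepthProbe {
    private static int depth;

    // Each call pushes one more frame onto this thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many nested calls succeeded before the stack ran out.
    static int probe() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time control reaches here the frames have been popped,
            // so the thread can continue normally.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("nested calls before StackOverflowError: " + probe());
    }
}
```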

4.3 Java heap

In the Java virtual machine, the heap is a runtime data area shared among all threads, and it is the area from which memory for all class instances and arrays is allocated.

The Java heap is created when the virtual machine starts. It stores objects managed by an automatic storage management system (known as a garbage collector); these objects need not, and cannot, be explicitly deallocated. The Java virtual machine described in the specification assumes no particular type of automatic storage management system, and implementers may choose a technique according to the needs of their system. The heap may be of a fixed size, or it may expand as the computation requires and contract when a larger heap becomes unnecessary. The memory for the heap does not need to be contiguous.

A Java virtual machine implementation should give programmers or end users a way to adjust the initial size of the heap. If the heap can expand and contract dynamically, it should also provide a way to adjust the maximum and minimum heap size.

The following exception may occur in the Java heap:

OutOfMemoryError is thrown if a computation requires more heap than the automatic storage management system can make available.
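The current heap limits can be read at run time through `java.lang.Runtime`. This is a small sketch with an invented class name. How the limits are set is implementation specific; on HotSpot the usual options are `-Xms` for the initial size and `-Xmx` for the maximum.

```java
final class HeapInfo {
    // Upper bound the JVM will attempt to use for the heap, in bytes.
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM, in bytes.
    static long totalBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Approximate amount currently occupied by objects, in bytes.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("max:   " + maxBytes());
        System.out.println("total: " + totalBytes());
        System.out.println("used:  " + usedBytes());
    }
}
```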

4.4 Method area

In the Java virtual machine, the method area is a runtime data area shared among all threads. It is analogous to the storage area for compiled code in a conventional language, or to the text segment of an operating system process. It stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class, instance, and interface initialization.

The method area is created when the virtual machine starts. Although the method area is logically part of the heap, a simple implementation may choose not to garbage collect it. The Java virtual machine specification does not mandate the location of the method area or the policies used to manage compiled code. The method area may be of a fixed size, or it may expand as the computation requires and contract when a larger method area becomes unnecessary. Its memory does not need to be contiguous. A Java virtual machine implementation should give programmers or end users a way to adjust the initial size of the method area and, if it can expand and contract dynamically, a way to adjust its maximum and minimum sizes. The following exception may occur in the method area:

OutOfMemoryError is thrown if memory in the method area cannot be made available to satisfy an allocation request.

4.5 Runtime constant pool

The runtime constant pool is the per-class or per-interface runtime representation of the constant_pool table in a class file. It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. The runtime constant pool serves a function similar to that of a symbol table in a conventional language, although it contains a wider range of data than a typical symbol table.

Each runtime constant pool is allocated from the method area of the Java virtual machine. The runtime constant pool for a class or interface is constructed when that class or interface is loaded into the virtual machine.

The following exception may occur when the runtime constant pool of a class or interface is created:

OutOfMemoryError is thrown if constructing the runtime constant pool requires more memory than the method area can make available.

4.6 Native method stack

A Java virtual machine implementation may use conventional stacks (colloquially called "C stacks") to support native methods, that is, methods written in a language other than Java. Such a stack is called a native method stack. Native method stacks may also be used when the interpreter for the Java virtual machine's instruction set is itself implemented in another language such as C. An implementation that does not support native methods and does not itself rely on conventional stacks need not supply native method stacks. If supplied, native method stacks are typically allocated per thread when each thread is created.

The Java virtual machine specification permits native method stacks either to be of a fixed size or to expand and contract dynamically as the computation requires. If the native method stacks are of a fixed size, the size of each may be chosen independently when that stack is created. A Java virtual machine implementation should generally give programmers or end users a way to adjust the initial size of the native method stacks and, for stacks of varying size, a way to adjust their maximum and minimum sizes. The following exceptions may occur in the native method stack:

StackOverflowError is thrown if the computation in a thread requires a larger native method stack than is permitted.
OutOfMemoryError is thrown if the stack can be expanded dynamically but not enough memory is available for the expansion, or not enough memory is available to create the initial native method stack for a new thread.

V. Stack frames

A stack frame is a data structure used to store data and partial results, as well as to perform dynamic linking, return values from methods, and dispatch exceptions.

A new stack frame is created each time a method is invoked and destroyed when that invocation completes, whether the completion is normal or abrupt (the method throws an exception that it does not catch). Stack frames are allocated from the Java virtual machine stack of the thread that creates them. Each frame has its own local variable table (§2.6.1), its own operand stack, and a reference to the runtime constant pool of the class of the current method.

The sizes of the local variable table and the operand stack are determined at compile time and supplied to the frame through the Code attribute of the method. The size of the frame data structure therefore depends only on the implementation of the Java virtual machine, and the memory for it can be allocated when the method is invoked.

In a thread, only the stack frame of the method currently executing is active. This stack frame is called the current stack frame, the method corresponding to this stack frame is called the current method, and the class defining this method is called the current class. Various operations on the local variable table and operand stack usually refer to the operations on the local variable table and operand stack of the current stack frame.

If the current method invokes another method, or the current method completes, its frame ceases to be the current frame. When a method is invoked, a new frame is created and becomes current as control transfers to the new method. On method return, the current frame passes the result of its method invocation, if any, back to the previous frame. The current frame is then discarded, and the previous frame becomes the current frame again.
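The push and pop of frames can be observed indirectly through a stack trace, which lists one element per active frame. The sketch below uses invented names and assumes the virtual machine reports every frame; the Javadoc for `Throwable.getStackTrace()` allows implementations to omit frames in some circumstances, so the differences shown are what a typical JVM prints rather than a guarantee.

```java
final class FrameDepthDemo {
    // Number of frames on the current thread's stack, including this method's own frame.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // One extra frame sits between the caller and depth().
    static int viaOneCall() {
        return depth();
    }

    // Two extra frames sit between the caller and depth().
    static int viaTwoCalls() {
        return viaOneCall();
    }

    public static void main(String[] args) {
        int base = depth();
        System.out.println(viaOneCall() - base);  // 1
        System.out.println(viaTwoCalls() - base); // 2
        System.out.println(depth() - base);       // 0: the extra frames were discarded on return
    }
}
```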

Note that a stack frame is private to the thread that created it; a frame cannot reference the stack frame of another thread.

5.1 Local variable table

Each stack frame contains a table of variables known as its local variables. The length of the local variable table is determined at compile time, stored in the binary representation of the class or interface, and supplied to the frame through the Code attribute of the method. A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress; a pair of local variables can hold a value of type long or double. Local variables are addressed by index. The index of the first local variable is zero, and an integer is a valid index if it is at least zero and less than the size of the table. A value of type long or double occupies two consecutive local variables and is addressed by the smaller of the two indices. For example, a double stored at index n actually occupies the local variables at indices n and n + 1. The local variable at index n + 1 cannot be loaded from directly. It can be stored into, but doing so invalidates the contents of local variable n.

The index n above need not be even, and the Java virtual machine does not require double and long values to be 64-bit aligned in the local variable table. Implementers are free to decide how to represent such a value using the two local variables reserved for it.

The Java virtual machine uses local variables to pass parameters on method invocation. When a class (static) method is invoked, its parameters are passed in consecutive local variables starting from local variable 0. When an instance method is invoked, local variable 0 always holds a reference to the object on which the method is being invoked (this in the Java language), and the remaining parameters are passed in consecutive local variables starting from local variable 1.
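The index assignment described above can be worked through in a few lines. The helper `parameterSlots` below is hypothetical; it takes parameter types as Java type names and returns the local variable index at which each parameter starts.

```java
final class LocalSlots {
    // Computes the starting local variable index of each parameter.
    static int[] parameterSlots(boolean isStatic, String... paramTypes) {
        int[] slots = new int[paramTypes.length];
        int next = isStatic ? 0 : 1; // index 0 holds "this" for instance methods
        for (int i = 0; i < paramTypes.length; i++) {
            slots[i] = next;
            boolean wide = paramTypes[i].equals("long") || paramTypes[i].equals("double");
            next += wide ? 2 : 1;    // long and double occupy two consecutive variables
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method: void m(int a, long b, double c, Object d)
        System.out.println(java.util.Arrays.toString(
                parameterSlots(false, "int", "long", "double", "Object"))); // [1, 2, 4, 6]
        // The same signature declared static
        System.out.println(java.util.Arrays.toString(
                parameterSlots(true, "int", "long", "double", "Object")));  // [0, 1, 3, 5]
    }
}
```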

5.2 Operand stack

Each stack frame (§2.6) contains a last-in-first-out (LIFO) stack known as its operand stack. The maximum depth of the operand stack is determined at compile time, stored in the binary representation of the class or interface, and supplied to the frame through the Code attribute of the method (§4.7.3). Where the context makes the meaning clear, the operand stack of the current frame is simply called the operand stack. The operand stack is empty when the frame that contains it is created. The Java virtual machine provides instructions that load constants, or values from local variables or object fields, onto the operand stack. Other instructions take operands from the operand stack, operate on them, and push the result back. The operand stack is also used to prepare parameters to be passed to methods and to receive method results.

For example, the iadd instruction adds two int values. It requires that the two int values, pushed by earlier instructions, be on top of the operand stack. When iadd executes, both values are popped and added, and their sum is pushed back onto the operand stack. Subcomputations can be nested on the operand stack, producing values that the enclosing computation then uses.
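To make the interplay of pc, local variables, and operand stack concrete, here is a toy interpreter. It is only a model: the opcode numbers and the two-element instruction format are invented for this sketch and do not match the real class file encoding.

```java
final class ToyStackMachine {
    // Hypothetical opcodes for this sketch; real JVM opcodes are single bytes.
    static final int ILOAD = 0, ISTORE = 1, ICONST = 2, IADD = 3, IRETURN = 4;

    // Runs a toy program in which each instruction is {opcode, operand}.
    static int run(int[][] code, int maxLocals, int maxStack, int... args) {
        int[] locals = new int[maxLocals];
        System.arraycopy(args, 0, locals, 0, args.length); // parameters arrive in locals 0..n-1
        int[] stack = new int[maxStack];
        int sp = 0; // number of entries currently on the operand stack
        int pc = 0; // index of the instruction being executed
        while (true) {
            int[] insn = code[pc];
            switch (insn[0]) {
                case ILOAD:  stack[sp++] = locals[insn[1]]; break;
                case ISTORE: locals[insn[1]] = stack[--sp]; break;
                case ICONST: stack[sp++] = insn[1]; break;
                case IADD: {
                    int right = stack[--sp]; // pop both operands
                    int left = stack[--sp];
                    stack[sp++] = left + right; // push the sum
                    break;
                }
                case IRETURN: return stack[--sp];
                default: throw new IllegalStateException("unknown opcode " + insn[0]);
            }
            pc++;
        }
    }

    // Models: static int f(int a, int b) { int c = a + b; return c + 10; }
    static int[][] sampleProgram() {
        return new int[][] {
            {ILOAD, 0}, {ILOAD, 1}, {IADD, 0}, {ISTORE, 2},
            {ILOAD, 2}, {ICONST, 10}, {IADD, 0}, {IRETURN, 0},
        };
    }

    public static void main(String[] args) {
        System.out.println(run(sampleProgram(), 3, 2, 3, 4)); // 17
    }
}
```

Note that maxLocals and maxStack are passed in up front, mirroring the way the real sizes are fixed at compile time rather than discovered during execution.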

Each entry on the operand stack can hold a value of any Java virtual machine type, including long and double.

Values on the operand stack must be operated on in ways appropriate to their types. For example, it is not possible to push two int values and then treat them as a long, or to push two float values and then add them with an iadd instruction. A small number of Java virtual machine instructions (such as dup and swap) operate on runtime data areas as raw values, without regard to their specific types. These instructions must not be used to modify values or to break up values that are otherwise indivisible. These restrictions are enforced through class file verification.

At any point in time, an operand stack has an associated depth, where a value of type long or double contributes two units to the depth and a value of any other type contributes one unit.

VI. Dynamic linking

Each stack frame contains a reference to the runtime constant pool for the type of the current method, to support dynamic linking of the method code. The class file code for a method refers to the methods it invokes and the variables it accesses through symbolic references. Dynamic linking translates these symbolic method references into direct references to the actual methods, loading classes as necessary to resolve symbols that are not yet defined, and translates variable accesses into appropriate offsets in the storage structures associated with the runtime location of those variables.

Because methods and variables are bound late in this way, changes in other classes that a method uses are less likely to break the code that calls them.
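The idea of replacing a symbolic reference with a direct one on first use can be imitated in Java with method handles. This is an analogy only: real resolution happens inside the virtual machine against constant pool entries, and the class `SymbolicRef` and its members are invented for this sketch.

```java
final class SymbolicRef {
    final String owner;      // class name, as text
    final String name;       // method name, as text
    final String descriptor; // method descriptor, e.g. (II)I
    private java.lang.invoke.MethodHandle resolved; // direct reference, filled in on first use

    SymbolicRef(String owner, String name, String descriptor) {
        this.owner = owner;
        this.name = name;
        this.descriptor = descriptor;
    }

    // Resolves once: loads the owner class by name, then looks up the static method.
    java.lang.invoke.MethodHandle resolve() throws ReflectiveOperationException {
        if (resolved == null) {
            Class<?> cls = Class.forName(owner);
            java.lang.invoke.MethodType type = java.lang.invoke.MethodType
                    .fromMethodDescriptorString(descriptor, SymbolicRef.class.getClassLoader());
            resolved = java.lang.invoke.MethodHandles.lookup().findStatic(cls, name, type);
        }
        return resolved;
    }

    // Calls java.lang.Math.max(int, int) through a symbolic reference.
    static int callMax(int a, int b) {
        try {
            SymbolicRef ref = new SymbolicRef("java.lang.Math", "max", "(II)I");
            return (int) ref.resolve().invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // The second resolve() returns the cached direct reference.
    static boolean resolvesOnce() {
        try {
            SymbolicRef ref = new SymbolicRef("java.lang.Math", "max", "(II)I");
            return ref.resolve() == ref.resolve();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callMax(3, 7));  // 7
        System.out.println(resolvesOnce()); // true
    }
}
```

Because the caller holds only names and a descriptor, the target class can be recompiled or replaced without recompiling the caller, as long as a method with the same name and descriptor still exists.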

VII. Special naming of initialization methods

At the level of the Java virtual machine, every constructor written in the Java language (Java Language Specification, Third Edition, hereafter JLS3) appears as an instance initialization method with the special name <init>. This name is supplied by the compiler. Because <init> is not a valid Java identifier, it cannot be used directly in a program written in the Java language. Instance initialization methods may be invoked only within the Java virtual machine by the invokespecial instruction, and only on class instances that are still being constructed (JLS3).

A class or interface has at most one class or interface initialization method and is initialized by invoking that method. The method is static, takes no arguments, and has the special name <clinit>. This name is also supplied by the compiler. Because <clinit> is not a valid Java identifier, it cannot be used directly in a program written in the Java language. Class and interface initialization methods are invoked implicitly by the Java virtual machine itself. No bytecode instruction can call them directly; they run only as part of the class initialization process.
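The effect of the two special methods can be seen from ordinary Java code, even though their names cannot be written in source. In this sketch (illustrative names) the static initializer block ends up in <clinit> and the constructor body in <init>.

```java
final class InitOrderDemo {
    static int classInitRuns;
    static int instancesCreated;

    // Static initializers are compiled into <clinit>, which the JVM
    // runs once when the class is initialized.
    static {
        classInitRuns++;
    }

    // A constructor is compiled into <init>; an object creation expression
    // compiles to new, dup, and invokespecial <init>.
    InitOrderDemo() {
        instancesCreated++;
    }

    // Neither special method shows up as an ordinary method through reflection.
    static boolean specialMethodsHidden() {
        for (java.lang.reflect.Method m : InitOrderDemo.class.getDeclaredMethods()) {
            if (m.getName().startsWith("<")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        new InitOrderDemo();
        System.out.println(classInitRuns);          // 1
        System.out.println(instancesCreated);       // 2
        System.out.println(specialMethodsHidden()); // true
    }
}
```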
