JVM ABCs Explained

“Write once, run anywhere” was a 1995 slogan created by Sun Microsystems to illustrate the Java language’s cross-platform benefits. Java was introduced as a way of developing operating system-independent “applets” for the desktop. The focus then moved to the server side, and Java later became a key technology for the Web. The Java Virtual Machine (JVM) is the engine that provides the runtime environment in which Java applications run. This article presents the JVM, its components, and how it works.

Before that, let’s look at the concepts of a runtime and a virtual machine (VM).

What Is a Runtime?

A runtime provides an environment that translates code written in a high-level language such as Java into machine code that the CPU can understand.

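We can distinguish two common types of translators: compilers, which translate a whole program before it runs, and interpreters, which translate and execute a program one statement at a time.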

What Is a Virtual Machine?

A VM is a virtual representation of a physical computer. The physical computer is the host machine, and the VM it runs is the guest machine.

A physical machine can run multiple VMs, each with its own OS and applications. These VMs are isolated from each other.

What Is the Java Virtual Machine?

Similar to VMs, the JVM creates an isolated space on a host machine. It is used to execute Java programs irrespective of the machine’s platform or operating system.

The JVM is used to run Java desktop, server, and web applications.

Java source code (.java files) is compiled by the Java compiler (javac) into an intermediate format called Java bytecode (.class files). The JVM then loads the bytecode and translates it into native machine instructions.

At runtime, the JVM uses the Just-In-Time (JIT) compiler to translate frequently executed bytecode into native machine code.

The JVM is a stack-based VM: arithmetic and logic operations pop their operands from an operand stack and push their results back onto it. A stack is also used to track method calls, with one frame per invocation. Stack-based bytecode is very compact because the location of the operands is implicit: they are always on the operand stack. Register-based bytecode must name its operands explicitly in each instruction, so register-based code is usually larger than stack-based bytecode. On the other hand, a register-based VM can express the same computation in fewer VM instructions than a stack-based VM. Dispatching a VM instruction is costly, so executing fewer instructions is likely to improve a register-based VM’s speed significantly.
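To make the operand stack idea concrete, here is a minimal sketch of a toy stack machine. The opcode names (PUSH, ADD, MUL) and the int[] program encoding are assumptions made for illustration; they are loosely inspired by JVM instructions such as iadd and imul but are not the real class-file format.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stack machine. Opcodes and encoding are illustrative assumptions,
// not actual JVM bytecode.
class TinyStackVm {
    static final int PUSH = 0; // the next int in the program is the value to push
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product

    static int run(int[] program) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (pc < program.length) {
            int opcode = program[pc++];
            switch (opcode) {
                case PUSH:
                    operandStack.push(program[pc++]);
                    break;
                case ADD: {
                    int right = operandStack.pop();
                    int left = operandStack.pop();
                    operandStack.push(left + right);
                    break;
                }
                case MUL: {
                    int right = operandStack.pop();
                    int left = operandStack.pop();
                    operandStack.push(left * right);
                    break;
                }
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
        return operandStack.pop(); // the result is what remains on top of the stack
    }

    public static void main(String[] args) {
        // 1 + 2 * 3: ADD and MUL carry no operands, they take them from the stack.
        int[] program = {PUSH, 1, PUSH, 2, PUSH, 3, MUL, ADD};
        System.out.println(run(program)); // prints 7
    }
}
```

Note that ADD and MUL occupy a single slot each, which is the compactness described above. A register-based encoding would have to spell out source and destination registers in every instruction.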

Java Virtual Machine Components

The JVM consists of five distinct components:

  1. Class Loader
  2. Runtime Data Area
  3. Execution Engine
  4. Java Native Interface (JNI)
  5. Native Method Libraries

Class Loader

When we compile a .java file, it is converted into bytecode stored in a .class file. When a program uses the class, the class loader loads its bytecode into main memory. The first class loaded into memory is usually the one that contains the main() method.

There are three steps in the class loading phase: loading, linking, and initialization.

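Loading: involves finding the binary representation (bytecode) of a class with a particular name and creating the class from it. There are three built-in class loaders in Java: the bootstrap class loader, the extension class loader (called the platform class loader since Java 9), and the application (system) class loader.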

To load a class into memory, the JVM calls the ClassLoader.loadClass() method with the class’s fully qualified name. Each class loader normally delegates to its parent first and tries to load the class itself only if the parent cannot. If the last child class loader cannot load the class either, a ClassNotFoundException or NoClassDefFoundError is thrown.
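The loader chain and the failure case can be observed from ordinary Java code. The sketch below assumes a standard JDK; "com.example.DoesNotExist" is a hypothetical class name chosen so that the lookup fails, and the exact loader names printed vary between Java versions.

```java
// Inspects class loaders and shows what happens when a class cannot be found.
class ClassLoaderDemo {
    // Returns true if the named class can be loaded, false if it is missing.
    static boolean canLoad(String fullyQualifiedName) {
        try {
            Class.forName(fullyQualifiedName);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Core classes come from the bootstrap loader, which Java reports as null.
        System.out.println("String loaded by: " + String.class.getClassLoader());

        // Walk from this class's loader up through its parents.
        ClassLoader loader = ClassLoaderDemo.class.getClassLoader();
        while (loader != null) {
            System.out.println("Loader: " + loader);
            loader = loader.getParent();
        }

        System.out.println(canLoad("java.util.ArrayList"));      // prints true
        System.out.println(canLoad("com.example.DoesNotExist")); // prints false
    }
}
```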

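Linking: involves combining the class with the other classes and dependencies of the program. Linking includes the following steps:

Verification: the JVM checks that the bytecode is structurally correct and safe to execute.

Preparation: the JVM allocates memory for a class’s static fields and initializes them with default values.

Resolution: the JVM replaces the symbolic references in the class with direct references.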

Initialization: involves executing the class’s initialization method (known as <clinit>), which runs the static initializer blocks and assigns the declared values to the static variables.
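The following sketch illustrates when initialization happens. The class names (InitOrderDemo, Config) and the EVENTS list are made up for this example. It records events so that the order can be inspected: the static block runs once, on first use of the class, before any constructor.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    // Records events so the order can be inspected afterwards.
    static final List<String> EVENTS = new ArrayList<>();

    static class Config {
        static int retries = 3; // assigned during class initialization

        static {
            // Runs once, as part of class initialization.
            EVENTS.add("Config static block, retries=" + retries);
        }

        Config() {
            // Runs for every new instance, after the class is initialized.
            EVENTS.add("Config constructor");
        }
    }

    public static void main(String[] args) {
        EVENTS.add("main started");
        new Config(); // first use triggers class initialization, then the constructor
        new Config(); // class already initialized, only the constructor runs
        EVENTS.forEach(System.out::println);
    }
}
```

Running main lists "main started" first, then the static block entry once, then two constructor entries.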

Runtime Data Area

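The JVM contains five principal components inside the runtime data area: the method area, the heap, the JVM stacks, the program counter (PC) registers, and the native method stacks.

The method area and the heap are created at JVM start-up and are shared by all threads, so there is only one of each per JVM. The stacks and PC registers are created per thread.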
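Two of these areas, the shared heap and the per-thread stacks, can be contrasted with a small sketch. The class and method names are hypothetical. The AtomicInteger object lives on the heap and is visible to every thread, while each thread's local counter lives in its own stack frame.

```java
import java.util.concurrent.atomic.AtomicInteger;

class DataAreasDemo {
    static int runThreads(int threadCount, int incrementsPerThread) {
        // One object on the heap, shared by every thread that holds a reference to it.
        AtomicInteger shared = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                // A local variable lives in the current thread's stack frame,
                // so each thread counts independently here.
                int local = 0;
                for (int i = 0; i < incrementsPerThread; i++) {
                    local++;
                    shared.incrementAndGet();
                }
                if (local != incrementsPerThread) {
                    throw new IllegalStateException("local counter was disturbed");
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(2, 1_000)); // prints 2000
        // Upper bound on the memory the JVM will try to use; the value depends on settings.
        System.out.println("Max memory (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```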

Execution Engine

Once the bytecode has been loaded into main memory and the data about the code is available in the runtime data area, the execution engine runs it. The execution engine also performs functions such as compilation to machine code and garbage collection.

However, before the program can execute, the bytecode must be converted into machine language instructions. The execution engine first uses the interpreter to execute the bytecode, but when it finds code that is executed repeatedly, it uses the JIT compiler to compile that code to native machine code.

The JIT Compiler has the following components:

  1. Intermediate Code Generator - generates intermediate code
  2. Code Optimizer - optimizes the intermediate code for better performance
  3. Target Code Generator - converts intermediate code to native machine code
  4. Profiler - finds the hotspots (code that is executed repeatedly)
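A rough way to observe a hotspot is to time the same method over several rounds. The sketch below is only an illustration: the method name and loop counts are arbitrary, and the timings depend on the JVM, the hardware, and the settings, so no particular speed-up is guaranteed. On HotSpot-based JVMs, running with -XX:+PrintCompilation logs the methods the JIT compiles, and -Xint disables the JIT for comparison.

```java
class HotLoopDemo {
    // A small method that is called often enough to become a candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            long result = 0;
            for (int call = 0; call < 20_000; call++) {
                result = sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            // Later rounds are often faster once the method has been compiled, but this varies.
            System.out.println("round " + round + ": " + micros + " us, result=" + result);
        }
    }
}
```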

Garbage collection makes Java memory-efficient: it removes unreferenced objects from heap memory and frees space for new objects. It involves two phases:

  1. Mark - in this step, the GC identifies which objects in memory are still in use and which are unused
  2. Sweep - in this step, the GC removes the objects identified during the previous phase
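The two phases can be sketched on a toy object graph. This is a simplified model, not how a production JVM collector is implemented (real collectors are considerably more sophisticated). The Node class and the explicit "heap" list are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class MarkSweepDemo {
    static class Node {
        final String name;
        final List<Node> references = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    // Mark: flag every object reachable from the given node.
    static void mark(Node node) {
        if (node.marked) {
            return; // already visited, which also stops cycles
        }
        node.marked = true;
        for (Node child : node.references) {
            mark(child);
        }
    }

    // Sweep: remove unmarked objects from the heap and reset marks for the next cycle.
    static List<String> collect(List<Node> heap, List<Node> roots) {
        for (Node root : roots) {
            mark(root);
        }
        List<String> freed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node node = it.next();
            if (node.marked) {
                node.marked = false;
            } else {
                freed.add(node.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.references.add(b); // b is reachable through the root a
        c.references.add(d); // c and d reference each other,
        d.references.add(c); // but nothing reachable points to them
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return collect(heap, Arrays.asList(a));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, d]
    }
}
```

The cycle between c and d is still collected because marking starts from the roots; objects that only reference each other are never reached.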

The JVM performs garbage collection automatically whenever it decides a collection is needed, so the application does not have to handle it. Calling System.gc() requests a collection, but the JVM is not required to run one immediately.
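The sketch below uses a WeakReference to observe the effect of a collection request. It assumes nothing beyond the standard library; the class and method names are hypothetical. An object that is still strongly referenced always survives, while whether the weakly referenced object has been cleared depends on the JVM.

```java
import java.lang.ref.WeakReference;

class GcRequestDemo {
    // An object that is still strongly referenced must survive any collection.
    static boolean survivesGc(Object stronglyHeld) {
        WeakReference<Object> ref = new WeakReference<>(stronglyHeld);
        System.gc(); // a request; the JVM decides whether and when to collect
        return ref.get() == stronglyHeld;
    }

    public static void main(String[] args) {
        Object kept = new Object();
        // Nothing else references this second object, so it is eligible for collection.
        WeakReference<Object> weakOnly = new WeakReference<>(new Object());

        System.out.println("kept survives: " + survivesGc(kept)); // prints true
        // Often true after a collection, but not guaranteed.
        System.out.println("weak-only cleared: " + (weakOnly.get() == null));
    }
}
```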

Java Native Interface (JNI)

Java supports the execution of native code through the Java Native Interface (JNI), which is used to interact with hardware or to work around memory management and performance constraints in Java.

JNI acts as a bridge between Java and code written in other programming languages, such as C and C++.

Native Method Libraries

The Native Method Libraries component is a set of native binaries compiled from other programming languages, such as C, C++, and assembly. These libraries are loaded through JNI.
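On the Java side, using a native library comes down to declaring a native method and loading the library. In this minimal sketch, the library name "greeter_demo" and the greet method are hypothetical: no such library ships with the JDK, so the example falls back gracefully when loading fails.

```java
class NativeGreeter {
    // Declared in Java, implemented in native code (for example C) and bound through JNI.
    // Hypothetical method: it only works if a matching native library is provided.
    static native String greet(String name);

    // Tries to load a native library by name and reports whether it succeeded.
    static boolean tryLoad(String libraryName) {
        try {
            // Resolves a platform-specific file name, e.g. libgreeter_demo.so on Linux
            // or greeter_demo.dll on Windows, using java.library.path.
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // the library could not be found or loaded
        }
    }

    public static void main(String[] args) {
        if (tryLoad("greeter_demo")) {
            System.out.println(greet("JVM"));
        } else {
            System.out.println("Native library not available; skipping the native call.");
        }
    }
}
```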

Conclusion

In this article, we discussed the Java Virtual Machine and its components. We often overlook how the JVM works as long as our code is running. Understanding its internal mechanics is useful when tuning the JVM or fixing a memory leak.