==JVM specification==
The Java virtual machine is an abstract (virtual) computer defined by a specification. It is a part of the Java runtime environment. The [[Garbage collection (computer science)|garbage collection]] algorithm used and any internal optimization of the Java virtual machine instructions (their translation into [[machine code]]) are not specified. The main reason for this omission is to avoid unnecessarily constraining implementers. Any Java application can be run only inside some concrete implementation of the abstract specification of the Java virtual machine.<ref>Bill Venners, ''[http://www.artima.com/insidejvm/ed2/index.html Inside the Java Virtual Machine] {{Webarchive|url=https://web.archive.org/web/20210125092727/http://www.artima.com/insidejvm/ed2/index.html |date=2021-01-25 }}'' Chapter 5</ref>

Starting with [[Java Platform, Standard Edition]] (J2SE) 5.0, changes to the JVM specification have been developed under the [[Java Community Process]] as JSR 924.<ref>{{cite web |url=http://www.jcp.org/en/jsr/detail?id=924 |title=The Java Community Process(SM) Program - JSRs: Java Specification Requests - detail JSR# 924 |publisher=Jcp.org |access-date=2015-06-26 |archive-date=2020-12-24 |archive-url=https://web.archive.org/web/20201224125612/https://jcp.org/en/jsr/detail?id=924 |url-status=live }}</ref> {{As of|2006}}, changes to the specification to support changes proposed to the [[class (file format)|class file format]] (JSR 202)<ref>{{cite web |url=http://www.jcp.org/en/jsr/detail?id=202 |title=The Java Community Process(SM) Program - JSRs: Java Specification Requests - detail JSR# 202 |publisher=Jcp.org |access-date=2015-06-26 |archive-date=2012-02-26 |archive-url=https://web.archive.org/web/20120226185155/http://www.jcp.org/en/jsr/detail?id=202 |url-status=live }}</ref> are being done as a maintenance release of JSR 924.
The specification for the JVM was published as the ''blue book'',<ref>''[http://java.sun.com/docs/books/vmspec/ The Java Virtual Machine Specification] {{Webarchive|url=https://web.archive.org/web/20080709010412/http://java.sun.com/docs/books/vmspec/. |date=2008-07-09 }}'' (the [http://java.sun.com/docs/books/vmspec/html/VMSpecTOC.doc.html first] {{Webarchive|url=https://web.archive.org/web/20081012060813/http://java.sun.com/docs/books/vmspec/html/VMSpecTOC.doc.html |date=2008-10-12 }} and [http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html second] {{Webarchive|url=https://web.archive.org/web/20110925050249/http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html |date=2011-09-25 }} editions are also available online).</ref> whose preface states:
{{Blockquote|[[Sun Microsystems|We]] intend that this specification should sufficiently document the Java Virtual Machine to make possible compatible clean-room implementations. Oracle provides tests that verify the proper operation of implementations of the Java Virtual Machine.}}

The most commonly used Java virtual machine is Oracle's [[HotSpot (virtual machine)|HotSpot]]. Oracle owns the Java trademark and may allow its use to certify implementation suites as fully compatible with Oracle's specification.

===Garbage collectors===
{{main|Garbage collection (computer science)#Java}}
{| class="wikitable"
|+ Java versions and their Garbage Collectors
|-
! Version !! Default GC !! Available GCs
|-
| 6u14 || rowspan="2" | Serial /<br />Parallel ([[Multiprocessor system architecture|MP]]) || Serial, Parallel, [[Concurrent mark sweep collector|CMS]], ''[[Garbage-first collector|G1]] (E)''
|-
| 7u4 - 8 || rowspan="2" | Serial, Parallel, CMS, G1
|-
| 9 - 10 || rowspan="8" | G1
|-
| 11 || Serial, Parallel, CMS, G1, ''Epsilon (E)'', ''ZGC (E)''
|-
| 12 - 13 || Serial, Parallel, CMS, G1, ''Epsilon (E)'', ''ZGC (E)'', ''Shenandoah (E)''
|-
| 14 || Serial, Parallel, G1, ''Epsilon (E)'', ''ZGC (E)'', ''Shenandoah (E)''
|-
| 15 - 20 || Serial, Parallel, G1, ''Epsilon (E)'', ZGC, Shenandoah
|-
| 21 - 22 || Serial, Parallel, G1, ''Epsilon (E)'', ZGC, Shenandoah, ''GenZGC (E)''
|-
| 23 || Serial, Parallel, G1, ''Epsilon (E)'', ZGC, Shenandoah, GenZGC (default ZGC)
|-
| 24 || Serial, Parallel, G1, ''Epsilon (E)'', Shenandoah, GenZGC, ''GenShen (E)''
|-
| colspan="3" | <small> ''(E)'' = ''experimental''</small>
|}

===Class loader===
{{Main|Java class file}}
One of the organizational units of JVM byte code is a [[Class (computer programming)|class]]. A class loader implementation must be able to recognize and load anything that conforms to the Java class [[file format]]. Any implementation is free to recognize other binary forms besides ''class'' files, but it must recognize ''class'' files.

The class loader performs three basic activities in this strict order:
#Loading: finds and imports the binary data for a type
#Linking: performs verification, preparation, and (optionally) resolution
#*Verification: ensures the correctness of the imported type
#*Preparation: allocates memory for class variables and initializes the memory to default values
#*Resolution: transforms symbolic references from the type into direct references.
#Initialization: invokes Java code that initializes class variables to their proper starting values.

In general, there are three types of class loader: the bootstrap class loader, the extension class loader, and the system (application) class loader.
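
The separation between loading and initialization, and the delegation chain between class loaders, can be observed from ordinary Java code. The following is a minimal sketch, not part of the specification: the names <code>ClassLoaderDemo</code>, <code>Lazy</code>, <code>loaderChain</code> and <code>load</code> are hypothetical, and the sketch assumes an implementation such as HotSpot in which classes loaded by the bootstrap loader report a <code>null</code> class loader.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassLoaderDemo {
    /** Set by Lazy's static initializer, i.e. during the initialization step. */
    static boolean lazyInitialized = false;

    static final class Lazy {
        static { lazyInitialized = true; }
    }

    /** Walks the parent chain of the loader that defined the given type. */
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = type.getClassLoader(); cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        chain.add("bootstrap"); // assumption: the bootstrap loader is represented by null
        return chain;
    }

    /** Obtains Lazy by name, running its static initializer only if requested. */
    static Class<?> load(boolean initialize) {
        try {
            // A class literal such as Lazy.class does not itself trigger initialization.
            return Class.forName(Lazy.class.getName(), initialize,
                    ClassLoaderDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loaderChain(String.class));
        System.out.println(loaderChain(ClassLoaderDemo.class));
        load(false);
        System.out.println("initialized after loading only: " + lazyInitialized);
        load(true);
        System.out.println("initialized after requesting initialization: " + lazyInitialized);
    }
}
```

The names printed for the application's loader chain vary between JVM versions and vendors, which is consistent with the specification leaving class lookup to the implementation.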
Every Java virtual machine implementation must have a bootstrap class loader that is capable of loading trusted classes, as well as an extension class loader or application class loader. The Java virtual machine specification does not specify how a class loader should locate classes.

===Virtual machine architecture===
The JVM operates on specific types of data as specified in the Java Virtual Machine specification. The data types can be divided<ref>{{Cite web|url=https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.2|title=Chapter 2. The Structure of the Java Virtual Machine|access-date=2021-09-15|archive-date=2021-09-15|archive-url=https://web.archive.org/web/20210915050448/https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.2|url-status=live}}</ref> into primitive types (such as [[integer]] and floating-point types) and reference types. Early JVMs were only [[32-bit computing|32-bit]] machines. The <code>long</code> and <code>double</code> types, which are [[64-bit computing|64 bits]] wide, are supported natively, but consume two units of storage in a frame's local variables or operand stack, since each unit is 32 bits. The <code>boolean</code>, <code>byte</code>, <code>short</code>, and <code>char</code> types are all [[sign-extended]] (except <code>char</code>, which is [[Sign extension#zero-extended|zero-extended]]) and operated on as 32-bit integers, the same as <code>int</code> types. The smaller types only have a few type-specific instructions for loading, storing, and type conversion. <code>boolean</code> values are operated on as 8-bit <code>byte</code> values, with 0 representing <code>false</code> and 1 representing <code>true</code>.
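
The widening behaviour described above is visible from Java source. The sketch below is illustrative only (the class and method names are hypothetical): it widens small values to <code>int</code> and prints the single-letter descriptors <code>Z</code> and <code>B</code> that the JVM uses for <code>boolean</code> and <code>byte</code>, obtained through the standard <code>java.lang.invoke.MethodType</code> class.

```java
import java.lang.invoke.MethodType;

final class PrimitiveWidening {
    // Each method relies on the implicit widening conversion to int.
    static int widenByte(byte b) { return b; }     // sign-extended
    static int widenShort(short s) { return s; }   // sign-extended
    static int widenChar(char c) { return c; }     // zero-extended

    /** Returns the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(widenByte((byte) 0xFF));                 // -1
        System.out.println(widenShort((short) 0x8000));             // -32768
        System.out.println(widenChar((char) 0xFFFF));               // 65535
        System.out.println(descriptor(boolean.class, byte.class));  // (B)Z
        System.out.println(boolean[].class.getName());              // [Z
        System.out.println(byte[].class.getName());                 // [B
    }
}
```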
(Although <code>boolean</code> has been treated as a type since ''The Java Virtual Machine Specification, Second Edition'' clarified this issue, in compiled and executed code there is little difference between a <code>boolean</code> and a <code>byte</code> except for [[Name mangling#Java|name mangling]] in [[method signature]]s and the type of boolean arrays. <code>boolean</code>s in method signatures are mangled as <code>Z</code> while <code>byte</code>s are mangled as <code>B</code>. Boolean arrays carry the type <code>boolean[]</code> but use 8 bits per element, and the JVM has no built-in capability to pack booleans into a [[bit array]], so except for the type they perform and behave the same as <code>byte</code> arrays. In all other uses, the <code>boolean</code> type is effectively unknown to the JVM as all instructions to operate on booleans are also used to operate on <code>byte</code>s.) However, newer JVM releases, such as the OpenJDK HotSpot JVM, support 64-bit architecture. Consequently, you can install a 32-bit or 64-bit JVM on a 64-bit operating system. The primary advantage of running Java in a 64-bit environment is the larger address space. This allows for a much larger Java heap size and an increased maximum number of Java Threads, which is needed for certain kinds of large applications; however there is a performance hit in using 64-bit JVM compared to 32-bit JVM. The JVM has a garbage-collected heap for storing objects and arrays. Code, constants, and other class data are stored in the "method area". The method area is logically part of the heap, but implementations may treat the method area separately from the heap, and for example might not garbage collect it. Each JVM thread also has its own [[call stack]] (called a "Java Virtual Machine stack" for clarity), which stores [[Call stack#STACK-FRAME|frames]]. A new frame is created each time a method is called, and the frame is destroyed when that method exits. 
Each frame provides an "operand stack" and an array of "local variables". The operand stack is used for operands to run computations and for receiving the return value of a called method, while local variables serve the same purpose as [[Processor register|registers]] and are also used to pass method arguments. Thus, the JVM is both a [[stack machine]] and a [[register machine]]. In practice, HotSpot eliminates every stack besides the native thread/call stack even when running in interpreted mode, as its Template Interpreter technically functions as a compiler.

===Bytecode instructions===
{{Main|Java bytecode}}
The JVM has [[instruction (computer science)|instructions]] for the following groups of tasks:
{{flatlist|
* [[Load/store architecture|Load and store]]
* [[Arithmetic]]
* [[Type conversion]]
* [[dynamic memory allocation|Object creation and manipulation]]
* [[stack (abstract data type)|Operand stack management (push / pop)]]
* [[branch (computer science)|Control transfer (branching)]]
* [[subroutine|Method invocation and return]]
* [[exception handling|Throwing exceptions]]
* [[monitor (synchronization)|Monitor-based concurrency]]
}}

The aim is binary compatibility. Each particular host [[operating system]] needs its own implementation of the JVM and runtime. These JVMs interpret the bytecode semantically the same way, but the actual implementation may be different. More complex than just emulating bytecode is compatibly and efficiently implementing the [[Java Class Library|Java core API]] that must be mapped to each host operating system.

These instructions operate on a set of common {{vanchor|abstracted [[data type]]s|DATA_TYPE}} rather than the [[native data type]]s of any specific [[instruction set architecture]].

===JVM languages===
{{Main|List of JVM languages}}
A JVM language is any language with functionality that can be expressed in terms of a valid class file which can be hosted by the Java Virtual Machine.
A class file contains Java Virtual Machine instructions ([[Java byte code]]) and a symbol table, as well as other ancillary information. The class file format is the hardware- and operating system-independent binary format used to represent compiled classes and interfaces.<ref>{{cite web |url=http://docs.oracle.com/javase/specs/jvms/se7/jvms7.pdf |title=The Java Virtual Machine Specification : Java SE 7 Edition |publisher=Docs.oracle.com |access-date=2015-06-26 |archive-date=2021-02-04 |archive-url=https://web.archive.org/web/20210204093304/https://docs.oracle.com/javase/specs/jvms/se7/jvms7.pdf |url-status=live }}</ref> There are several JVM languages, both old languages ported to JVM and completely new languages. [[JRuby]] and [[Jython]] are perhaps the most well-known ports of existing languages, i.e. [[Ruby (programming language)|Ruby]] and [[Python (programming language)|Python]] respectively. Of the new languages that have been created from scratch to compile to Java bytecode, [[Clojure]], [[Apache Groovy|Groovy]], [[Scala (programming language)|Scala]] and [[Kotlin (programming language)|Kotlin]] may be the most popular ones. 
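
Whatever the source language, the result is the same binary class file format. As a rough illustration, the sketch below reads the first eight bytes of a class file, which hold the magic number <code>0xCAFEBABE</code> followed by the minor and major format versions. The class name <code>ClassFileHeader</code> and its <code>read</code> method are hypothetical, and the sketch assumes the class's bytes are reachable as a resource through the system class loader, which is usually but not necessarily the case.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeader {
    final int magic;
    final int minor;
    final int major;

    ClassFileHeader(int magic, int minor, int major) {
        this.magic = magic;
        this.minor = minor;
        this.major = major;
    }

    /** Reads the header of the class file that defines the given type. */
    static ClassFileHeader read(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            // Class files are big-endian, which matches DataInputStream.
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // u4 magic
            int minor = data.readUnsignedShort();  // u2 minor_version
            int major = data.readUnsignedShort();  // u2 major_version
            return new ClassFileHeader(magic, minor, major);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = read(Object.class);
        System.out.printf("magic=0x%08X major=%d minor=%d%n", h.magic, h.major, h.minor);
    }
}
```

The major version printed depends on the JDK that compiled the class, so it differs between installations.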
A notable feature with the JVM languages is that they are [[Language interoperability|compatible with each other]], so that, for example, Scala libraries can be used with Java programs and vice versa.<ref>{{cite web |url=http://www.scala-lang.org/old/faq/4 |title=Frequently Asked Questions - Java Interoperability |author=<!--Staff writer(s); no by-line.--> |website=scala-lang.org |access-date=2015-11-18 |archive-date=2020-08-09 |archive-url=https://web.archive.org/web/20200809214018/https://www.scala-lang.org/old/faq/4 |url-status=live }}</ref> Java 7 JVM implements ''JSR 292: Supporting Dynamically Typed Languages''<ref>{{cite web |url=https://jcp.org/en/jsr/detail?id=292 |title=The Java Community Process(SM) Program - JSRs: Java Specification Requests - detail JSR# 292 |publisher=Jcp.org |access-date=2015-06-26 |archive-date=2020-12-20 |archive-url=https://web.archive.org/web/20201220200733/https://jcp.org/en/jsr/detail?id=292 |url-status=live }}</ref> on the Java Platform, a new feature which supports dynamically typed languages in the JVM. 
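
JSR 292 also introduced the <code>java.lang.invoke</code> package, whose method handles give library code a way to look up and call methods whose names are only known at run time. The following is a small sketch of that API rather than of the <code>invokedynamic</code> instruction itself; the class <code>MethodHandleDemo</code> and the helper <code>callNoArg</code> are hypothetical names introduced for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class MethodHandleDemo {
    /** Looks up a public no-argument method by name at run time and invokes it. */
    static Object callNoArg(Object receiver, String name, Class<?> returnType) {
        try {
            MethodHandle handle = MethodHandles.publicLookup()
                    .findVirtual(receiver.getClass(), name, MethodType.methodType(returnType));
            // invoke (unlike invokeExact) adapts argument and return types as needed.
            return handle.invoke(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callNoArg("hello", "length", int.class));        // 5
        System.out.println(callNoArg("hello", "toUpperCase", String.class)); // HELLO
    }
}
```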
This feature was developed within the [[Da Vinci Machine]] project, whose mission is to extend the JVM so that it supports languages other than Java.<ref>{{cite web |url=http://openjdk.java.net/projects/mlvm/ |title=Da Vinci Machine project |publisher=Openjdk.java.net |access-date=2015-06-26 |archive-date=2020-11-11 |archive-url=https://web.archive.org/web/20201111162302/https://openjdk.java.net/projects/mlvm/ |url-status=live }}</ref><ref>{{cite web |url=http://www.oracle.com/technetwork/articles/javase/dyntypelang-142348.html |title=New JDK 7 Feature: Support for Dynamically Typed Languages in the Java Virtual Machine |publisher=Oracle.com |access-date=2015-06-26 |archive-date=2018-09-13 |archive-url=https://web.archive.org/web/20180913101203/http://www.oracle.com/technetwork/articles/javase/dyntypelang-142348.html |url-status=live }}</ref>

===Bytecode verifier===
A basic philosophy of Java is that it is inherently safe from the standpoint that no user program can crash the host machine or otherwise interfere inappropriately with other operations on the host machine, and that it is possible to protect certain methods and data structures belonging to trusted code from access or corruption by untrusted code executing within the same JVM. Furthermore, common programmer errors that often lead to data corruption or unpredictable behavior, such as accessing off the end of an array or using an uninitialized pointer, are not allowed to occur. Several features of Java combine to provide this safety, including the class model, the garbage-collected [[#Heap|heap]], and the verifier.

The JVM verifies all bytecode before it is executed.
This verification consists primarily of three types of checks:
* Branches are always to valid locations
* Data is always initialized and references are always type-safe
* Access to private or package private data and methods is rigidly controlled

The first two of these checks take place primarily during the verification step that occurs when a class is loaded and made eligible for use. The third is primarily performed dynamically, when data items or methods of a class are first accessed by another class.

The verifier permits only some bytecode sequences in valid programs, e.g. a [[branch (computer science)|jump (branch) instruction]] can only target an instruction within the same [[method (computer programming)|method]]. Furthermore, the verifier ensures that any given instruction operates on a fixed stack location,<ref>{{cite web |title=The Verification process |url=http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#9766 |work=The Java Virtual Machine Specification |publisher=Sun Microsystems |year=1999 |access-date=2009-05-31 |archive-date=2011-03-21 |archive-url=https://web.archive.org/web/20110321165204/http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#9766 |url-status=live }}</ref> allowing the JIT compiler to transform stack accesses into fixed register accesses. Because of this, the fact that the JVM is a stack architecture does not imply a speed penalty for emulation on [[register machine|register-based architectures]] when using a JIT compiler. Given the code-verified JVM architecture, it makes no difference to a JIT compiler whether it gets named imaginary registers or imaginary stack positions that must be allocated to the target architecture's registers. In fact, code verification makes the JVM different from a classic stack architecture, for which efficient emulation with a JIT compiler is more complicated and which is typically emulated by a slower interpreter.
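
The "fixed stack location" property can be illustrated with a toy data-flow check. The sketch below is a deliberate simplification and not the JVM's verification algorithm: it uses a hypothetical six-instruction set, tracks only the operand stack depth (the real verifier also tracks types), and rejects code whose branches leave the method, whose stack underflows, or whose stack depth at an instruction depends on the path taken to reach it.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class ToyVerifier {
    enum Op { PUSH, ADD, POP, GOTO, IFZERO, RETURN }

    static final class Insn {
        final Op op;
        final int arg; // constant for PUSH, target index for GOTO/IFZERO, otherwise unused
        Insn(Op op, int arg) { this.op = op; this.arg = arg; }
    }

    /** Returns the stack depth before each instruction (-1 if unreachable), or throws. */
    static int[] verify(Insn[] code) {
        int[] depthAt = new int[code.length];
        Arrays.fill(depthAt, -1);
        Deque<Integer> work = new ArrayDeque<>();
        flow(code, depthAt, work, 0, 0);
        while (!work.isEmpty()) {
            int pc = work.pop();
            int depth = depthAt[pc];
            Insn insn = code[pc];
            switch (insn.op) {
                case PUSH:
                    flow(code, depthAt, work, pc + 1, depth + 1);
                    break;
                case ADD:
                    require(depth >= 2, "stack underflow at " + pc);
                    flow(code, depthAt, work, pc + 1, depth - 1);
                    break;
                case POP:
                    require(depth >= 1, "stack underflow at " + pc);
                    flow(code, depthAt, work, pc + 1, depth - 1);
                    break;
                case GOTO:
                    flow(code, depthAt, work, insn.arg, depth);
                    break;
                case IFZERO:
                    require(depth >= 1, "stack underflow at " + pc);
                    flow(code, depthAt, work, insn.arg, depth - 1); // branch taken
                    flow(code, depthAt, work, pc + 1, depth - 1);   // fall through
                    break;
                case RETURN:
                    require(depth == 1, "return needs exactly one value at " + pc);
                    break;
            }
        }
        return depthAt;
    }

    /** Records the depth flowing into target; every path must agree on it. */
    private static void flow(Insn[] code, int[] depthAt, Deque<Integer> work,
                             int target, int depth) {
        require(target >= 0 && target < code.length, "control leaves the method: " + target);
        if (depthAt[target] == -1) {
            depthAt[target] = depth;
            work.push(target);
        } else {
            require(depthAt[target] == depth, "inconsistent stack depth at " + target);
        }
    }

    private static void require(boolean ok, String message) {
        if (!ok) throw new IllegalStateException(message);
    }

    public static void main(String[] args) {
        Insn[] ok = { new Insn(Op.PUSH, 1), new Insn(Op.PUSH, 2),
                      new Insn(Op.ADD, 0), new Insn(Op.RETURN, 0) };
        System.out.println(Arrays.toString(verify(ok))); // [0, 1, 2, 1]
    }
}
```

Because each instruction ends up with exactly one possible stack depth, a compiler could in principle map every stack slot to a fixed register, which is the point made above.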
Additionally, the Interpreter used by the default JVM is a special type known as a Template Interpreter, which translates bytecode directly to native, register based machine language rather than emulate a stack like a typical interpreter.<ref>{{Cite web |url=https://openjdk.java.net/groups/hotspot/docs/RuntimeOverview.html#Interpreter |title=HotSpot Runtime Overview - Interpreter|website=OpenJDK|access-date=2021-05-24 |archive-date=2022-05-21 |archive-url=https://web.archive.org/web/20220521024017/https://openjdk.java.net/groups/hotspot/docs/RuntimeOverview.html#Interpreter |url-status=live }}</ref> In many aspects the HotSpot Interpreter can be considered a JIT compiler rather than a true interpreter, meaning the stack architecture that the bytecode targets is not actually used in the implementation, but merely a specification for the intermediate representation that can well be implemented in a register based architecture. Another instance of a stack architecture being merely a specification and implemented in a register based virtual machine is the [[Common Language Runtime]].<ref>{{Cite web|url=https://github.com/dotnet/runtime/issues/4775|title=Why not make CLR register-based? · Issue #4775 · dotnet/runtime|website=GitHub|access-date=2021-05-24|archive-date=2023-04-20|archive-url=https://web.archive.org/web/20230420122729/https://github.com/dotnet/runtime/issues/4775|url-status=live}}</ref> The original specification for the bytecode verifier used natural language that was incomplete or incorrect in some respects. A number of attempts have been made to specify the JVM as a formal system. By doing this, the security of current JVM implementations can more thoroughly be analyzed, and potential security exploits prevented. 
It will also be possible to optimize the JVM by skipping unnecessary safety checks, if the application being run is proven to be safe.<ref>{{Cite book |doi=10.1145/320384.320397|chapter=A formal framework for the Java bytecode language and verifier|title=Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications - OOPSLA '99|pages=147–166|year=1999|last1=Freund|first1=Stephen N.|last2=Mitchell|first2=John C.|isbn=978-1581132380|citeseerx=10.1.1.2.4663|s2cid=14302964}}</ref>

====Secure execution of remote code====
A virtual machine architecture allows very fine-grained control over the actions that code within the machine is permitted to take. It assumes the code is "semantically" correct, that is, it successfully passed the (formal) bytecode verifier process, materialized by a tool, possibly off-board the virtual machine. This is designed to allow safe execution of untrusted code from remote sources, a model used by [[Java applet]]s, and other secure code downloads. Once bytecode-verified, the downloaded code runs in a restricted "[[sandbox (computer security)|sandbox]]", which is designed to protect the user from misbehaving or malicious code. As an addition to the bytecode verification process, publishers can purchase a certificate with which to [[digital signature|digitally sign]] applets as safe, giving them permission to ask the user to break out of the sandbox and access the local file system, the [[clipboard (software)|clipboard]], or the network, or execute external pieces of software.
Formal proofs of bytecode verifiers have been carried out by the Java Card industry (''Formal Development of an Embedded Verifier for Java Card Byte Code''<ref>{{cite web|last1=Casset|first1=Ludovic|last2=Burdy|first2=Lilian|last3=Requet|first3=Antoine|date=10 April 2002|title=Formal Development of an Embedded Verifier for Java Card Byte Code|website=Inria - National Institute for Research in Digital Science and Technology at [[Côte d'Azur University#National_research_organizations|Côte d'Azur University]]|url-status=live|archive-date=3 October 2022|url=http://www-sop.inria.fr/everest/Lilian.Burdy/CBR02dsn.pdf|archive-url=https://web.archive.org/web/20221003184410/http://www-sop.inria.fr/everest/Lilian.Burdy/CBR02dsn.pdf}}</ref>).

===Bytecode interpreter and just-in-time compiler===
For each [[hardware architecture]] a different Java bytecode [[Interpreter (computing)|interpreter]] is needed. When a computer has a Java bytecode interpreter, it can run any Java bytecode program, and the same program can be run on any computer that has such an interpreter.

When Java bytecode is executed by an interpreter, the execution is generally slower than the execution of the same program compiled into native machine language. This problem is mitigated by [[Just-in-time compilation|just-in-time (JIT) compilers]] for executing Java bytecode. A JIT compiler may translate Java bytecode into native machine language while executing the program. The translated parts of the program can then be executed much more quickly than they could be interpreted. This technique is applied to the parts of a program that are executed frequently. In this way a JIT compiler can significantly reduce the overall execution time.

There is no necessary connection between the Java programming language and Java bytecode. A program written in Java can be compiled directly into the machine language of a real computer, and programs written in languages other than Java can be compiled into Java bytecode.
Java bytecode is intended to be platform-independent and secure.<ref>David J. Eck, ''[http://math.hws.edu/javanotes/c1/s3.html Introduction to Programming Using Java] {{Webarchive|url=https://web.archive.org/web/20141011192544/http://math.hws.edu/javanotes/c1/s3.html |date=2014-10-11 }}'', Seventh Edition, Version 7.0, August 2014 at Section 1.3 "The Java Virtual Machine"</ref> Some JVM implementations do not include an interpreter, but consist only of a just-in-time compiler.<ref>''[http://docs.oracle.com/cd/E15289_01/doc.40/e15058/underst_jit.htm Oracle JRockit Introduction] {{Webarchive|url=https://web.archive.org/web/20150906145705/http://docs.oracle.com/cd/E15289_01/doc.40/e15058/underst_jit.htm |date=2015-09-06 }}'' Release R28 at 2. "Understanding Just-In-Time Compilation and Optimization"</ref>
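
To make the interpreter side concrete, the following is a minimal sketch of a stack-based interpreter loop with an operand stack and a local variable array, in the spirit of a JVM frame. It is a toy and not how HotSpot or any production JVM is implemented: the instruction names echo real JVM mnemonics, but the opcode numbers, the <code>int[]</code> encoding and the <code>ToyInterpreter</code> class are assumptions made for illustration, and no verification or bounds checking is performed.

```java
final class ToyInterpreter {
    // Hypothetical opcode numbering; real JVM opcodes use different values.
    static final int ICONST = 0;  // push the next code word as a constant
    static final int ILOAD = 1;   // push local variable [next code word]
    static final int ISTORE = 2;  // pop into local variable [next code word]
    static final int IADD = 3;    // pop two values, push their sum
    static final int IMUL = 4;    // pop two values, push their product
    static final int IRETURN = 5; // pop the result and return it

    /** Runs one "method": arguments are passed in the first local variables. */
    static int run(int[] code, int maxStack, int maxLocals, int... args) {
        int[] stack = new int[maxStack];
        int[] locals = new int[maxLocals];
        System.arraycopy(args, 0, locals, 0, args.length);
        int sp = 0; // index of the next free operand stack slot
        int pc = 0; // index of the next code word
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case ICONST:
                    stack[sp++] = code[pc++];
                    break;
                case ILOAD:
                    stack[sp++] = locals[code[pc++]];
                    break;
                case ISTORE:
                    locals[code[pc++]] = stack[--sp];
                    break;
                case IADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case IMUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case IRETURN:
                    return stack[--sp];
                default:
                    throw new IllegalArgumentException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (a + b) * 2 for a = 3, b = 4.
        int[] code = { ILOAD, 0, ILOAD, 1, IADD, ICONST, 2, IMUL, IRETURN };
        System.out.println(run(code, 2, 2, 3, 4)); // 14
    }
}
```

A JIT compiler would replace the dispatch loop for frequently executed code with native instructions that perform the same arithmetic directly.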