Just-in-time compilation
Encyclopedia
In computing
Computing
Computing is usually defined as the activity of using and improving computer hardware and software. It is the computer-specific part of information technology...

, just-in-time compilation (JIT), also known as dynamic translation, is a method to improve the runtime performance of computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

s. Historically, computer programs had two modes of runtime operation, either interpreted
Interpreter (computing)
In computer science, an interpreter normally means a computer program that executes, i.e. performs, instructions written in a programming language...

 or static (ahead-of-time
AOT compiler
An ahead-of-time compiler is a compiler that implements ahead-of-time compilation. This refers to the act of compiling an intermediate language, such as Java bytecode, .NET Common Intermediate Language , or IBM System/38 or IBM System i "Technology Independent Machine Interface" code, into a...

) compilation. Interpreted code is translated from a high-level language to a machine code continuously during every execution, whereas statically compiled code is translated into machine code before execution, and only requires this translation once.

JIT compilers represent a hybrid approach, with translation occurring continuously, as with interpreters, but with caching of translated code to minimize performance degradation. It also offers other advantages over statically compiled code at development time, such as handling of late-bound data types and the ability to enforce security guarantees.

JIT builds upon two earlier ideas in run-time environments: bytecode compilation and dynamic compilation
Dynamic compilation
Dynamic compilation is a process used by some programming language implementations to gain performance during program execution. Although the technique originated in the Self programming language, the best-known language that uses this technique is Java...

. It converts code at runtime prior to executing it natively, for example bytecode
Bytecode
Bytecode, also known as p-code , is a term which has been used to denote various forms of instruction sets designed for efficient execution by a software interpreter as well as being suitable for further compilation into machine code...

 into native machine code
Machine code
Machine code or machine language is a system of impartible instructions executed directly by a computer's central processing unit. Each instruction performs a very specific task, typically either an operation on a unit of data Machine code or machine language is a system of impartible instructions...

.

Several modern runtime environments, such as Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

's .NET Framework
.NET Framework
The .NET Framework is a software framework that runs primarily on Microsoft Windows. It includes a large library and supports several programming languages which allows language interoperability...

 and most implementations of Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

, rely on JIT compilation for high-speed code execution.

Overview

In a bytecode-compiled system, source code
Source code
In computer science, source code is text written using the format and syntax of the programming language that it is being written in. Such a language is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source...

 is translated to an intermediate representation known as bytecode
Bytecode
Bytecode, also known as p-code , is a term which has been used to denote various forms of instruction sets designed for efficient execution by a software interpreter as well as being suitable for further compilation into machine code...

. Bytecode is not the machine code for any particular computer, and may be portable among computer architectures. The bytecode may then be interpreted by, or run on, a virtual machine
Virtual machine
A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

. The JIT compiler reads the bytecodes in many sections (or in full rarely) and compiles them interactively into machine language so the program can run faster. Java performs runtime checks on various sections of the code and this is the reason the entire code is not compiled at once. This can be done per-file, per-function or even on any arbitrary code fragment; the code can be compiled when it is about to be executed (hence the name "just-in-time"), and then cached and reused later without needing to be recompiled.

In contrast, a traditional interpreted virtual machine will simply interpret the bytecode, generally with much lower performance. Some interpreters even interpret source code, without the step of first compiling to bytecode, with even worse performance. Statically compiled code or native code is compiled prior to deployment. A dynamic compilation environment is one in which the compiler can be used during execution. For instance, most Common Lisp
Common Lisp
Common Lisp, commonly abbreviated CL, is a dialect of the Lisp programming language, published in ANSI standard document ANSI INCITS 226-1994 , . From the ANSI Common Lisp standard the Common Lisp HyperSpec has been derived for use with web browsers...

 systems have a compile function which can compile new functions created during the run. This provides many of the advantages of JIT, but the programmer, rather than the runtime, is in control of what parts of the code are compiled. This can also compile dynamically generated code, which can, in many scenarios, provide substantial performance advantages over statically compiled code, as well as over most JIT systems.

A common goal of using JIT techniques is to reach or surpass the performance of static compilation, while maintaining the advantages of bytecode interpretation: Much of the "heavy lifting" of parsing the original source code and performing basic optimization is often handled at compile time, prior to deployment: compilation from bytecode to machine code is much faster than compiling from source. The deployed bytecode is portable, unlike native code. Since the runtime has control over the compilation, like interpreted bytecode, it can run in a secure sandbox. Compilers from bytecode to machine code are easier to write, because the portable bytecode compiler has already done much of the work.

JIT code generally offers far better performance than interpreters. In addition, it can in some cases offer better performance than static compilation, as many optimizations are only feasible at run-time:
  1. The compilation can be optimized to the targeted CPU and the operating system model where the application runs. For example JIT can choose SSE2
    SSE2
    SSE2, Streaming SIMD Extensions 2, is one of the Intel SIMD processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2001. It extends the earlier SSE instruction set, and is intended to fully supplant MMX. Intel extended SSE2 to create SSE3...

     CPU instructions when it detects that the CPU supports them. To obtain this level of optimization specificity with a static compiler, one must either compile a binary for each intended platform/architecture, or else include multiple versions of portions of the code within a single binary.
  2. The system is able to collect statistics about how the program is actually running in the environment it is in, and it can rearrange and recompile for optimum performance. However, some static compilers can also take profile information as input.
  3. The system can do global code optimizations (e.g. inlining of library functions) without losing the advantages of dynamic linking and without the overheads inherent to static compilers and linkers. Specifically, when doing global inline substitutions, a static compilation process may need run-time checks and ensure that a virtual call would occur if the actual class of the object overrides the inlined method, and boundary condition checks on array accesses may need to be processed within loops. With just-in-time compilation in many cases this processing can be moved out of loops, often giving large increases of speed.
  4. Although this is possible with statically compiled garbage collected languages, a bytecode system can more easily rearrange executed code for better cache utilization.

Startup delay and optimizations

JIT typically causes a slight delay in initial execution of an application, due to the time taken to load and compile the bytecode. Sometimes this delay is called "startup time delay". In general, the more optimization JIT performs, the better the code it will generate, but the initial delay will also increase. A JIT compiler therefore has to make a trade-off between the compilation time and the quality of the code it hopes to generate. However, it seems that much of the startup time is sometimes due to IO-bound operations rather than JIT compilation (for example, the rt.jar class data file for the Java Virtual Machine
Java Virtual Machine
A Java virtual machine is a virtual machine capable of executing Java bytecode. It is the code execution component of the Java software platform. Sun Microsystems stated that there are over 4.5 billion JVM-enabled devices.-Overview:...

 is 40 MB and the JVM must seek a lot of data in this contextually huge file).

One possible optimization, used by Sun's HotSpot
HotSpot
HotSpot is a Java virtual machine for desktops and servers, maintained and distributed by Oracle Corporation. It features techniques such as just-in-time compilation and adaptive optimization designed to improve performance.-History:...

 Java Virtual Machine, is to combine interpretation and JIT compilation. The application code is initially interpreted, but the JVM monitors which sequences of bytecode
Bytecode
Bytecode, also known as p-code , is a term which has been used to denote various forms of instruction sets designed for efficient execution by a software interpreter as well as being suitable for further compilation into machine code...

 are frequently executed and translates them to machine code for direct execution on the hardware. For bytecode which is executed only a few times, this saves the compilation time and reduces the initial latency; for frequently executed bytecode, JIT compilation is used to run at high speed, after an initial phase of slow interpretation. Additionally, since a program spends most time executing a minority of its code, the reduced compilation time is significant. Finally, during the initial code interpretation, execution statistics can be collected before compilation, which helps to perform better optimization.

The correct tradeoff can vary due to circumstances. For example, Sun's Java Virtual Machine has two major modes—client and server. In client mode, minimal compilation and optimization is performed, to reduce startup time. In server mode, extensive compilation and optimization is performed, to maximize performance once the application is running by sacrificing startup time. Other Java just-in-time compilers have used a runtime measurement of the number of times a method has executed combined with the bytecode size of a method as a heuristic to decide when to compile. Still another uses the number of times executed combined with the detection of loops. In general, it is much harder to accurately predict which methods to optimize in short-running applications than in long-running ones.

Native Image Generator
Native Image Generator
The Native Image Generator, or simply NGEN is the Ahead-of-time compilation service of the .NET Framework. It allows a .NET assembly to be pre-compiled instead of letting the Common Language Runtime do a Just-in-time compilation at runtime...

 (Ngen) by Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

 is another approach at reducing the initial delay. Ngen pre-compiles (or "pre-jits") bytecode in a Common Intermediate Language
Common Intermediate Language
Common Intermediate Language is the lowest-level human-readable programming language defined by the Common Language Infrastructure specification and is used by the .NET Framework and Mono...

 image into machine native code. As a result, no runtime compilation is needed. .NET framework
.NET Framework
The .NET Framework is a software framework that runs primarily on Microsoft Windows. It includes a large library and supports several programming languages which allows language interoperability...

 2.0 shipped with Visual Studio 2005 runs Ngen on all of the Microsoft library DLLs right after the installation. Pre-jitting provides a way to improve the startup time. However, the quality of code it generates might not be as good as the one that is jitted, for the same reasons why code compiled statically, without profile-guided optimization, cannot be as good as JIT compiled code in the extreme case: the lack of profiling data to drive, for instance, inline caching.

There also exist Java implementations that combine an AOT (ahead-of-time) compiler
AOT compiler
An ahead-of-time compiler is a compiler that implements ahead-of-time compilation. This refers to the act of compiling an intermediate language, such as Java bytecode, .NET Common Intermediate Language , or IBM System/38 or IBM System i "Technology Independent Machine Interface" code, into a...

 with either a JIT compiler (Excelsior JET) or interpreter (GNU Compiler for Java.)

History

The earliest published JIT compiler is generally attributed to work on LISP by McCarthy in 1960. In his seminal paper Recursive functions of symbolic expressions and their computation by machine, Part I, he mentions functions that are translated during runtime, thereby sparing the need to save the compiler output to punch card
Punch card
A punched card, punch card, IBM card, or Hollerith card is a piece of stiff paper that contains digital information represented by the presence or absence of holes in predefined positions...

s. In 1968, Thompson presented a method to automatically compile regular expression
Regular expression
In computing, a regular expression provides a concise and flexible means for "matching" strings of text, such as particular characters, words, or patterns of characters. Abbreviations for "regular expression" include "regex" and "regexp"...

s to machine code, which is then executed in order to perform the matching on an input text. An influential technique for deriving compiled code from interpretation was pioneered by Mitchell in 1970, which he implemented for the experimental language LC².

Smalltalk
Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

 pioneered new aspects of JIT compilations. For example, translation to machine code was done on demand, and the result was cached for later use. When memory became scarce, the system would delete some of this code and regenerate it when it was needed again. Sun's Self language improved these techniques extensively and was at one point the fastest Smalltalk system in the world; achieving up to half the speed of optimized C but with a fully object-oriented language.

Self was abandoned by Sun, but the research went into the Java language, and currently it is used by most implementations of the Java Virtual Machine
Java Virtual Machine
A Java virtual machine is a virtual machine capable of executing Java bytecode. It is the code execution component of the Java software platform. Sun Microsystems stated that there are over 4.5 billion JVM-enabled devices.-Overview:...

, as HotSpot
HotSpot
HotSpot is a Java virtual machine for desktops and servers, maintained and distributed by Oracle Corporation. It features techniques such as just-in-time compilation and adaptive optimization designed to improve performance.-History:...

 builds on, and extensively uses, this research base.

The HP project Dynamo was an experimental JIT compiler where the 'bytecode' format and the machine code format were the same; the system turned HPA-8000 machine code into HPA-8000 machine code. Counterintuitively, this resulted in speed ups, in some cases of 30% since doing this permitted optimizations at the machine code level, for example, inlining code for better cache usage and optimizations of calls to dynamic libraries and many other run-time optimizations which conventional compilers are not able to attempt.

See also

  • Binary translation
    Binary translation
    In computing, binary translation is the emulation of one instruction set by another through translation of code. Sequences of instructions are translated from the source to the target instruction set...

  • HotSpot
    HotSpot
    HotSpot is a Java virtual machine for desktops and servers, maintained and distributed by Oracle Corporation. It features techniques such as just-in-time compilation and adaptive optimization designed to improve performance.-History:...

  • Common Language Runtime
    Common Language Runtime
    The Common Language Runtime is the virtual machine component of Microsoft's .NET framework and is responsible for managing the execution of .NET programs. In a process known as just-in-time compilation, the CLR compiles the intermediate language code known as CIL into the machine instructions...

  • Crusoe, a microprocessor that essentially performs just-in-time compilation from x86 code to microcode within the microprocessor
  • GNU lightning
    GNU lightning
    GNU lightning is a free software library that generates assembly language code at run-time. Supported backends are SPARC , x86 and PowerPC . An ARM port is under way.- Advantages Over Other Libraries :...

     — A library that generates assembly language code at run-time
  • LLVM
    Low Level Virtual Machine
    The Low Level Virtual Machine is a compiler infrastructure written in C++ that is designed for compile-time, link-time, run-time, and "idle-time" optimization of programs written in arbitrary programming languages...

  • Self-modifying code
    Self-modifying code
    In computer science, self-modifying code is code that alters its own instructions while it is executing - usually to reduce the instruction path length and improve performance or simply to reduce otherwise repetitively similar code, thus simplifying maintenance...


External links

— A library by Rhys Weatherley, Klaus Treichel, Aleksey Demakov, and Kirill Kononenko for development of Just-In-Time compilers in Virtual Machine implementations, Dynamic programming languages and Scripting languages.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK