Binary translation
Encyclopedia
In computing
Computing
Computing is usually defined as the activity of using and improving computer hardware and software. It is the computer-specific part of information technology...

, binary translation is the emulation
Emulator
In computing, an emulator is hardware or software or both that duplicates the functions of a first computer system in a different second computer system, so that the behavior of the second system closely resembles the behavior of the first system...

 of one instruction set
Instruction set
An instruction set, or instruction set architecture , is the part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O...

 by another through translation of code. Sequences of instructions are translated from the source to the target instruction set. In some cases such as instruction set simulation
Instruction Set Simulator
An instruction set simulator is a simulation model, usually coded in a high-level programming language, which mimics the behavior of a mainframe or microprocessor by "reading" instructions and maintaining internal variables which represent the processor's registers.Instruction simulation is a...

, the target instruction set may be the same as the source instruction set, providing testing and debugging features such as instruction trace, conditional breakpoints and hot spot
Hot spot (computer science)
A hot spot in computer science is most usually defined as a region of a computer program where a high proportion of executed instructions occur or where most time is spent during the program's execution .If a program is stopped randomly, the program counter...

 detection.

The two main types are static and dynamic binary translation.

Static binary translation

A translator using static binary translation aims to convert all of the code of an executable file
Executable
In computing, an executable file causes a computer "to perform indicated tasks according to encoded instructions," as opposed to a data file that must be parsed by a program to be meaningful. These instructions are traditionally machine code instructions for a physical CPU...

 into code that runs on the target architecture without having to run the code first, as is done in dynamic binary translation. This is very difficult to do correctly, since not all the code can be discovered by the translator. For example, some parts of the executable may be reachable only through indirect branch
Indirect branch
An indirect branch is a type of program control instruction present in some machine language instruction sets. Rather than specifying the address of the next instruction to execute, as in a direct branch, the argument specifies where the address is located...

es, whose value is known only at run-time.

One such static binary translator uses universal superoptimizer
Superoptimization
Superoptimization is the task of finding the optimal code sequence for a single, loop-free sequence of instructions. While garden-variety compiler optimizations really just improve code , a superoptimizer's goal is to find the optimal sequence at the outset.The term superoptimization was first...

 peephole
Peephole optimization
In compiler theory, peephole optimization is a kind of optimization performed over a very small set of instructions in a segment of generated code. The set is called a "peephole" or a "window"...

 technology (developed by Sorav Bansal, and Alex Aiken from Stanford University
Stanford University
The Leland Stanford Junior University, commonly referred to as Stanford University or Stanford, is a private research university on an campus located near Palo Alto, California. It is situated in the northwestern Santa Clara Valley on the San Francisco Peninsula, approximately northwest of San...

) to perform efficient translation between possibly many source and target pairs, with considerably low development costs and high performance of the target binary. In experiments of PowerPC-to-x86 translations, some binaries even outperformed native versions, but on average they ran at two-thirds of native speed.

Dynamic binary translation

Dynamic binary translation looks at a short sequence of code—typically on the order of a single basic block
Basic block
In computing, a basic block is a portion of the code within a program with certain desirable properties that make it highly amenable to analysis. Compilers usually decompose programs into their basic blocks as a first step in the analysis process...

—then translates it and caches the resulting sequence. Code is only translated as it is discovered and when possible, and branch instructions are made to point to already translated and saved code (memoization
Memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...

).

Dynamic binary translation differs from simple emulation (eliminating the emulator's main read-decode-execute loop—a major performance bottleneck), paying for this by large overhead during translation time. This overhead is hopefully amortized as translated code sequences are executed multiple times.

More advanced dynamic translators employ dynamic recompilation
Dynamic recompilation
In computer science, dynamic recompilation is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution...

 where the translated code is instrumented to find out what portions are executed a large number of times, and these portions are optimized
Compiler optimization
Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attributes of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied...

 aggressively. This technique is reminiscent of a JIT compiler, and in fact such compilers (e.g. Sun
Sun Microsystems
Sun Microsystems, Inc. was a company that sold :computers, computer components, :computer software, and :information technology services. Sun was founded on February 24, 1982...

's HotSpot technology) can be viewed as dynamic translators from a virtual instruction set (the bytecode
Bytecode
Bytecode, also known as p-code , is a term which has been used to denote various forms of instruction sets designed for efficient execution by a software interpreter as well as being suitable for further compilation into machine code...

) to a real one.
  • Apple Computer
    Apple Computer
    Apple Inc. is an American multinational corporation that designs and markets consumer electronics, computer software, and personal computers. The company's best-known hardware products include the Macintosh line of computers, the iPod, the iPhone and the iPad...

     implemented a dynamic translating emulator
    Emulator
    In computing, an emulator is hardware or software or both that duplicates the functions of a first computer system in a different second computer system, so that the behavior of the second system closely resembles the behavior of the first system...

     for M68K
    Motorola 68000
    The Motorola 68000 is a 16/32-bit CISC microprocessor core designed and marketed by Freescale Semiconductor...

     code in their PowerPC
    PowerPC
    PowerPC is a RISC architecture created by the 1991 Apple–IBM–Motorola alliance, known as AIM...

     line of Macintoshes, which achieved a very high level of reliability, performance and compatibility (see Mac 68K emulator
    Mac 68K emulator
    The Mac 68K emulator was a software emulator built into all versions of the Mac OS for PowerPC. This emulator permitted the running of applications and system code that were originally written for the 680x0 based Macintosh models. The emulator was completely seamless for users, and reasonably...

    ). This allowed Apple to bring the machines to market with only a partially native operating system
    Operating system
    An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

    , and end users could adopt the new, faster architecture without risking their investment in software. Partly because the emulator was so successful, many parts of the operating system remained emulated. A full transition to a PowerPC native operating system
    Operating system
    An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

     (OS) was not made until the release of Mac OS X
    Mac OS X
    Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

     (10.0) in 2001. (The OS X "Classic
    Classic (Mac OS X)
    Classic, or Classic Environment, was a hardware and software abstraction layer in Mac OS X that allowed applications compatible with Mac OS 9 to run on the Mac OS X operating system...

    " runtime environment continued to offer this emulation capability on PowerPC Macs until OS X 10.5.)

  • Mac OS X 10.4.4 for Intel-based Macs introduced the Rosetta
    Rosetta (software)
    Rosetta was a lightweight and dynamic binary translator for Mac OS X which Apple released in 2006 when it transitioned the Macintosh from PowerPC to Intel processors. It allowed pre-existing software to run on the new systems without modification....

     dynamic translation layer to ease Apple's transition from PPC-based hardware to x86. Developed for Apple by Transitive Corporation, the Rosetta software is an implementation of Transitive's QuickTransit
    QuickTransit
    QuickTransit was a cross-platform virtualization program developed by Transitive Corporation. It allowed software compiled for one specific processor and operating system combination to be executed on a different processor and/or operating system architecture without source code or binary...

     solution.

  • QuickTransit during its product lifespan also provided SPARC
    SPARC
    SPARC is a RISC instruction set architecture developed by Sun Microsystems and introduced in mid-1987....

    →x86, x86→Power Architecture
    Power Architecture
    Power Architecture is a broad term to describe similar RISC instruction sets for microprocessors developed and manufactured by such companies as IBM, Freescale, AMCC, Tundra and P.A. Semi...

     and MIPS
    MIPS architecture
    MIPS is a reduced instruction set computer instruction set architecture developed by MIPS Technologies . The early MIPS architectures were 32-bit, and later versions were 64-bit...

    →Itanium 2 translation support.

  • DEC
    Digital Equipment Corporation
    Digital Equipment Corporation was a major American company in the computer industry and a leading vendor of computer systems, software and peripherals from the 1960s to the 1990s...

     achieved similar success with its translation tools to help users migrate from the CISC
    Complex instruction set computer
    A complex instruction set computer , is a computer where single instructions can execute several low-level operations and/or are capable of multi-step operations or addressing modes within single instructions...

     VAX
    VAX
    VAX was an instruction set architecture developed by Digital Equipment Corporation in the mid-1970s. A 32-bit complex instruction set computer ISA, it was designed to extend or replace DEC's various Programmed Data Processor ISAs...

     architecture to the Alpha
    DEC Alpha
    Alpha, originally known as Alpha AXP, is a 64-bit reduced instruction set computer instruction set architecture developed by Digital Equipment Corporation , designed to replace the 32-bit VAX complex instruction set computer ISA and its implementations. Alpha was implemented in microprocessors...

     RISC architecture.

  • DEC created the FX!32
    FX!32
    FX!32 is a software emulator program that allows x86 Win32 programs to execute on Alpha-based systems running Windows NT. Released in 1996, FX!32 was developed by Digital Equipment Corporation to support their Alpha microprocessors...

     binary translator for converting x86 applications to Alpha applications.

  • Sun Microsystems
    Sun Microsystems
    Sun Microsystems, Inc. was a company that sold :computers, computer components, :computer software, and :information technology services. Sun was founded on February 24, 1982...

    ' Wabi software included dynamic translation from x86 to SPARC instructions.

  • In January 2000, Transmeta
    Transmeta
    Transmeta Corporation was a US-based corporation that licensed low power semiconductor intellectual property. Transmeta originally produced very long instruction word code morphing microprocessors, with a focus on reducing power consumption in electronic devices. It was founded in 1995 by Bob...

     Corporation announced a novel processor design named Crusoe
    Transmeta Crusoe
    The Crusoe is a family of x86-compatible microprocessors developed by Transmeta. Crusoe was notable for its method of achieving x86 compatibility. Instead of the instruction set architecture being implemented in hardware, or translated by specialized hardware, the Crusoe runs a software abstraction...

    . From the FAQ on their web site, The smart microprocessor consists of a hardware VLIW core as its engine and a software layer called Code Morphing software. The Code Morphing software acts as a shell ... morphing or translating x86 instructions to native Crusoe instructions. In addition, the Code Morphing software contains a dynamic compiler and code optimizer ... The result is increased performance at the least amount of power. ... [This] allows Transmeta to evolve the VLIW hardware and Code Morphing software separately without affecting the huge base of software applications. More info at arstechnica, geek.com.

  • HP
    Hewlett-Packard
    Hewlett-Packard Company or HP is an American multinational information technology corporation headquartered in Palo Alto, California, USA that provides products, technologies, softwares, solutions and services to consumers, small- and medium-sized businesses and large enterprises, including...

     ARIES (Automatic Re-translation and Integrated Environment Simulation) is a dynamic binary translation system that combines fast code interpretation with two phase dynamic translation to transparently and accurately execute HP 9000
    HP 9000
    HP 9000 is the name for a line of workstation and server computer systems produced by the Hewlett-Packard Company . The native operating system for almost all HP 9000 systems is HP-UX, a derivative of Unix. The HP 9000 brand was introduced in 1984 to encompass several existing technical...

     HP-UX
    HP-UX
    HP-UX is Hewlett-Packard's proprietary implementation of the Unix operating system, based on UNIX System V and first released in 1984...

     applications on HP-UX
    HP-UX
    HP-UX is Hewlett-Packard's proprietary implementation of the Unix operating system, based on UNIX System V and first released in 1984...

     11i for HP Integrity
    HP Integrity
    HP Integrity is series of Hewlett-Packard server computers produced by Hewlett-Packard since 2003, based on the Itanium processor architecture...

     servers. The ARIES fast interpreter emulates a complete set of non-privileged PA-RISC
    PA-RISC
    PA-RISC is an instruction set architecture developed by Hewlett-Packard. As the name implies, it is a reduced instruction set computer architecture, where the PA stands for Precision Architecture...

     instructions with no user intervention. During interpretation, it monitors the application's execution pattern and translates only the frequently executed code into native Itanium
    Itanium
    Itanium is a family of 64-bit Intel microprocessors that implement the Intel Itanium architecture . Intel markets the processors for enterprise servers and high-performance computing systems...

     code at runtime. ARIES implements two phase dynamic translation, a technique in which translated code in first phase collects runtime profile information which is used during second phase translation to further optimize the translated code. ARIES stores the dynamically translated code in memory buffer called code cache. Further references to translated basic blocks execute directly in the code cache and do not require additional interpretation or translation. The targets of translated code blocks are back-patched to ensure execution takes place in code cache most of the time. At the end of the emulation, ARIES discards all the translated code without modifying the original application. The ARIES emulation engine also implements Environment Emulation which emulates an HP 9000
    HP 9000
    HP 9000 is the name for a line of workstation and server computer systems produced by the Hewlett-Packard Company . The native operating system for almost all HP 9000 systems is HP-UX, a derivative of Unix. The HP 9000 brand was introduced in 1984 to encompass several existing technical...

     HP-UX
    HP-UX
    HP-UX is Hewlett-Packard's proprietary implementation of the Unix operating system, based on UNIX System V and first released in 1984...

     application's system calls, signal delivery, exception management, threads management, emulation of HP
    Hewlett-Packard
    Hewlett-Packard Company or HP is an American multinational information technology corporation headquartered in Palo Alto, California, USA that provides products, technologies, softwares, solutions and services to consumers, small- and medium-sized businesses and large enterprises, including...

     GDB for debugging, and core file creation for the application.

  • Intel Corporation has developed and implemented an IA-32 Execution Layer - a dynamic binary translator designed to support IA-32 applications on Itanium
    Itanium
    Itanium is a family of 64-bit Intel microprocessors that implement the Intel Itanium architecture . Intel markets the processors for enterprise servers and high-performance computing systems...

    -based systems, which was included in Microsoft
    Microsoft
    Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

     Windows server OS for Itanium
    Itanium
    Itanium is a family of 64-bit Intel microprocessors that implement the Intel Itanium architecture . Intel markets the processors for enterprise servers and high-performance computing systems...

     architecture, as well as in several flavors of Linux
    Linux
    Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

    , including Red Hat
    Red Hat
    Red Hat, Inc. is an S&P 500 company in the free and open source software sector, and a major Linux distribution vendor. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina with satellite offices worldwide....

     and Suse. It used a two-phase (later three-phase) approach: initially it quickly translated every piece of code at a basic block level, adding certain instrumentation for detecting hot code; then hot code was dynamically optimized at a super-block level, and the optimized translated code replaced cold code on the fly. Later interpretation engine was added that allowed to avoid altogether translation of code executed just a few times - cold non-optimized translation became thus the second phase, and hot optimized translation became the third phase. IA-32 Execution Layer supported self-modified code, and could even optimize it quite well.

  • Some test & debugging
    Debugger
    A debugger or debugging tool is a computer program that is used to test and debug other programs . The code to be examined might alternatively be running on an instruction set simulator , a technique that allows great power in its ability to halt when specific conditions are encountered but which...

     systems dating back to the 1970s
    1970s
    File:1970s decade montage.png|From left, clockwise: US President Richard Nixon doing the V for Victory sign after his resignation from office after the Watergate scandal in 1974; Refugees aboard a US naval boat after the Fall of Saigon, leading to the end of the Vietnam War in 1975; The 1973 oil...

     such as "Oliver", utilized dynamic binary translation to provide breakpoint
    Breakpoint
    In software development, a breakpoint is an intentional stopping or pausing place in a program, put in place for debugging purposes. It is also sometimes simply referred to as a pause....

    , storage protection, trace, program animation
    Program animation
    Program animation or Stepping refers to the very common debugging method of executing code one "line" at a time. The programmer may examine the state of the program, machine, and related data before and after execution of a particular line of code...

     and other features for IBM/360/370/390/ES9000 platforms.

See also

  • Comparison of platform virtual machines
  • Emulator
    Emulator
    In computing, an emulator is hardware or software or both that duplicates the functions of a first computer system in a different second computer system, so that the behavior of the second system closely resembles the behavior of the first system...

  • Instruction set simulator
    Instruction Set Simulator
    An instruction set simulator is a simulation model, usually coded in a high-level programming language, which mimics the behavior of a mainframe or microprocessor by "reading" instructions and maintaining internal variables which represent the processor's registers.Instruction simulation is a...

  • Just-in-time compilation
    Just-in-time compilation
    In computing, just-in-time compilation , also known as dynamic translation, is a method to improve the runtime performance of computer programs. Historically, computer programs had two modes of runtime operation, either interpreted or static compilation...

  • Shadow memory
    Shadow memory
    Shadow memory describes a computer science technique in which potentially every byte used by a program during its execution has a shadow byte or bytes. These shadow bytes are typically invisible to the original program and are used to record information about the original piece of data...

  • Virtual machine
    Virtual machine
    A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...


External links

  • http://bellard.org/qemu/
  • http://www.itee.uq.edu.au/~csmweb/decompilation/bintrans.html (somewhat dated)
  • http://www.gtoal.com/sbt/ Static Binary Translation HOWTO
  • http://www.itee.uq.edu.au/~cristina/uqbt.html University of Queensland Binary Translator
  • http://www.experimentalstuff.com/Technologies/Walkabout/ Walkabout - Binary Translation research by Sun and University collaborators
  • http://csdl.computer.org/comp/mags/mi/1998/02/m2056abs.htm - FX!32, an x86 to Alpha binary translator developed by DEC
  • http://www.hp.com/go/aries
  • http://www.cse.iitd.ernet.in/~sbansal/publications.html - S. Bansal, A. Aiken, "Binary Translation Using Peephole Superoptimizers," In Proceedings of Symposium on Operating Systems Design and Implementation (OSDI), December 2008.
  • http://dl.acm.org/citation.cfm?id=956550 - Leonid Baraz, Tevi Devor, Orna Etzion, Shalom Goldenberg, Alex Skaletsky, Yun Wang, Yigal Zemach, IA-32 Execution Layer: a two-phase dynamic translator designed to support IA-32 applications on Itanium®-based systems
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK