Home      Discussion      Topics      Dictionary      Almanac
Signup       Login
Interpreter (computing)

Interpreter (computing)

Overview
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, an interpreter normally means a computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

 that executes
Execution (computers)
Execution in computer and software engineering is the process by which a computer or a virtual machine carries out the instructions of a computer program. The instructions in the program trigger sequences of simple actions on the executing machine...

, i.e. performs, instructions written in a programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

. An interpreter may be a program that either
Discussion
Ask a question about 'Interpreter (computing)'
Start a new discussion about 'Interpreter (computing)'
Answer questions from other users
Full Discussion Forum
 
Unanswered Questions
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, an interpreter normally means a computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

 that executes
Execution (computers)
Execution in computer and software engineering is the process by which a computer or a virtual machine carries out the instructions of a computer program. The instructions in the program trigger sequences of simple actions on the executing machine...

, i.e. performs, instructions written in a programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

. An interpreter may be a program that either
  1. executes the source code
    Source code
    In computer science, source code is text written using the format and syntax of the programming language that it is being written in. Such a language is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source...

     directly
  2. translates source code into some efficient intermediate representation (code) and immediately executes this
  3. explicitly executes stored precompiled code made by a compiler
    Compiler
    A compiler is a computer program that transforms source code written in a programming language into another computer language...

     which is part of the interpreter system


Early versions of the Lisp programming language
Lisp programming language
Lisp is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older...

 and Dartmouth BASIC
Dartmouth BASIC
Dartmouth BASIC is the original version of the BASIC programming language. It is so named because it was designed and implemented at Dartmouth College...

 would be examples of type 1.
Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...

, Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

, MATLAB
MATLAB
MATLAB is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages,...

, and Ruby
Ruby (programming language)
Ruby is a dynamic, reflective, general-purpose object-oriented programming language that combines syntax inspired by Perl with Smalltalk-like features. Ruby originated in Japan during the mid-1990s and was first developed and designed by Yukihiro "Matz" Matsumoto...

 are examples of type 2, while UCSD Pascal
UCSD Pascal
UCSD Pascal was a Pascal programming language system that ran on the UCSD p-System, a portable, highly machine-independent operating system. UCSD Pascal was first released in 1978...

 is type 3: Source programs are compiled ahead of time and stored as machine independent code, which is then linked at run-time and executed by an interpreter and/or compiler (for JIT
Just-in-time compilation
In computing, just-in-time compilation , also known as dynamic translation, is a method to improve the runtime performance of computer programs. Historically, computer programs had two modes of runtime operation, either interpreted or static compilation...

 systems). Some systems, such as Smalltalk
Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

, contemporary versions of BASIC
BASIC
BASIC is a family of general-purpose, high-level programming languages whose design philosophy emphasizes ease of use - the name is an acronym from Beginner's All-purpose Symbolic Instruction Code....

, Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

 and others, may also combine 2 and 3.

While interpreting and compiling
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

 are the two main means by which programming languages are implemented, these are not fully mutually exclusive categories, one of the reasons being that most interpreting systems also perform some translation work, just like compilers. The terms "interpreted language
Interpreted language
Interpreted language is a programming language in which programs are 'indirectly' executed by an interpreter program. This can be contrasted with a compiled language which is converted into machine code and then 'directly' executed by the host CPU...

" or "compiled language
Compiled language
A compiled language is a programming language whose implementations are typically compilers , and not interpreters ....

" merely mean that the canonical implementation of that language is an interpreter or a compiler; a high level language is basically an abstraction which is (ideally) independent of particular implementations.

History


The first interpreted high-level language was probably Lisp.
Lisp was first implemented by Steve Russell
Steve Russell
Steve "Slug" Russell is a programmer and computer scientist most famous for creating Spacewar!, one of the earliest videogames, in 1961 with the fellow members of the Tech Model Railroad Club at MIT working on a DEC Digital PDP-1...

 on an IBM 704
IBM 704
The IBM 704, the first mass-produced computer with floating point arithmetic hardware, was introduced by IBM in 1954. The 704 was significantly improved over the IBM 701 in terms of architecture as well as implementations which were not compatible with its predecessor.Changes from the 701 included...

 computer. Russell had read McCarthy's paper, and realized (to McCarthy's surprise) that the Lisp eval function could be implemented in machine code. The result was a working Lisp interpreter which could be used to run Lisp programs, or more properly, 'evaluate Lisp expressions.'

Advantages and disadvantages of using interpreters


Programmers usually write programs in high level code, which the CPU cannot execute; so this source code has to be converted into machine code. This conversion is done by a compiler or an interpreter. A compiler makes the conversion just once, while an interpreter typically converts it every time a program is executed (or in some languages like early versions of BASIC
BASIC
BASIC is a family of general-purpose, high-level programming languages whose design philosophy emphasizes ease of use - the name is an acronym from Beginner's All-purpose Symbolic Instruction Code....

, every time a single instruction is executed).

Development cycle


During program development, the programmer makes frequent changes to source code. With a compiler, each time the programmer makes a change to the source code, s/he must wait for the compiler to make a compilation of the altered source files, and link all of the binary code files together before the program can be executed. The larger the program, the longer the programmer must wait to see the results of the change. By contrast, a programmer using an interpreter does a lot less waiting, as the interpreter usually just needs to translate the code being worked on to an intermediate representation (or not translate it at all), thus requiring much less time before the changes can be tested.

Distribution


A compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

 converts source code into binary instruction for a specific processor's architecture, thus making it less portable. This conversion is made just once, on the developer's environment, and after that the same binary can be distributed to the user's machines where it can be executed without further translation. A cross compiler
Cross compiler
A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is run. Cross compiler tools are used to generate executables for embedded system or multiple platforms. It is used to compile for a platform upon which it is not feasible to...

 can generate binary code for the user machine even if it has a different processor than the machine where the code is compiled.

An interpreted program can be distributed as source code. It needs to be translated in each final machine, which takes more time but makes the program distribution independent of the machine's architecture. However, the portability of interpreted source code is dependent on the target machine actually having a suitable interpreter. If the interpreter needs to be supplied along with the source, the overall installation process is more complex than delivery of a monolithic executable.

The fact that interpreted code can easily be read and copied by humans can be of concern from the point of view of copyright
Copyright
Copyright is a legal concept, enacted by most governments, giving the creator of an original work exclusive rights to it, usually for a limited time...

. However, various systems of encryption
Encryption
In cryptography, encryption is the process of transforming information using an algorithm to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information...


and obfuscation
Obfuscation
Obfuscation is the hiding of intended meaning in communication, making communication confusing, wilfully ambiguous, and harder to interpret.- Background :Obfuscation may be used for many purposes...

 exist. Delivery of intermediate code, such as bytecode, has a similar effect to obfuscation.

Efficiency


The main disadvantage of interpreters is that when a program is interpreted, it typically runs more slowly than if it had been compiled. The difference in speeds could be tiny or great; often an order of magnitude and sometimes more. It generally takes longer to run a program under an interpreter than to run the compiled code but it can take less time to interpret it than the total time required to compile and run it. This is especially important when prototyping and testing code when an edit-interpret-debug cycle can often be much shorter than an edit-compile-run-debug cycle.

Interpreting code is slower than running the compiled code because the interpreter must analyze each statement in the program each time it is executed and then perform the desired action, whereas the compiled code just performs the action within a fixed context determined by the compilation. This run-time analysis is known as "interpretive overhead". Access to variables is also slower in an interpreter because the mapping of identifiers to storage locations must be done repeatedly at run-time rather than at compile time.

There are various compromises between the development speed when using an interpreter and the execution speed when using a compiler. Some systems (such as some Lisps) allow interpreted and compiled code to call each other and to share variables. This means that once a routine has been tested and debugged under the interpreter it can be compiled and thus benefit from faster execution while other routines are being developed. Many interpreters do not execute the source code as it stands but convert it into some more compact internal form. Many BASIC
BASIC
BASIC is a family of general-purpose, high-level programming languages whose design philosophy emphasizes ease of use - the name is an acronym from Beginner's All-purpose Symbolic Instruction Code....

 interpreters replace keyword
Keyword (computer programming)
In computer programming, a keyword is a word or identifier that has a particular meaning to the programming language. The meaning of keywords — and, indeed, the meaning of the notion of keyword — differs widely from language to language....

s with single byte tokens which can be used to find the instruction in a jump table. Others achieve even higher levels of program compaction by using a bit-orientated rather than a byte-orientated program memory structure, where commands tokens may occupy perhaps 5 bits of a 16 bit word, leaving 11 bits for their label or address operands.

An interpreter might well use the same lexical analyzer
Lexical analysis
In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function which performs lexical analysis is called a lexical analyzer, lexer or scanner...

 and parser as the compiler and then interpret the resulting abstract syntax tree
Abstract syntax tree
In computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. The syntax is 'abstract' in the sense that it...

.

Regress


Interpretation cannot be used as the sole method of execution: even though an interpreter can itself be interpreted, a directly executed programme is needed somewhere at the bottom of the stack.

Bytecode interpreters



There is a spectrum of possibilities between interpreting and compiling, depending on the amount of analysis performed before the program is executed. For example, Emacs Lisp
Emacs Lisp
Emacs Lisp is a dialect of the Lisp programming language used by the GNU Emacs and XEmacs text editors . It is used for implementing most of the editing functionality built into Emacs, the remainder being written in C...

 is compiled to bytecode
Bytecode
Bytecode, also known as p-code , is a term which has been used to denote various forms of instruction sets designed for efficient execution by a software interpreter as well as being suitable for further compilation into machine code...

, which is a highly compressed and optimized representation of the Lisp source, but is not machine code (and therefore not tied to any particular hardware). This "compiled" code is then interpreted by a bytecode interpreter (itself written in C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

). The compiled code in this case is machine code for a virtual machine
Virtual machine
A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

, which is implemented not in hardware, but in the bytecode interpreter. The same approach is used with the Forth code used in Open Firmware
Open Firmware
Open Firmware, or OpenBoot in Sun Microsystems parlance, is a standard defining the interfaces of a computer firmware system, formerly endorsed by the Institute of Electrical and Electronics Engineers . It originated at Sun, and has been used by Sun, Apple, IBM, and most other non-x86 PCI chipset...

 systems: the source language is compiled into "F code" (a bytecode), which is then interpreted by a virtual machine
Virtual machine
A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

.

Control table
Control table
Control tables are tables that control the program flow or play a major part in program control. There are no rigid rules concerning the structure or content of a control table - its only qualifying attribute is its ability to direct program flow in some way through its 'execution' by an associated...

s - that do not necessarily ever need to pass through a compiling phase - dictate appropriate algorithmic control flow
Control flow
In computer science, control flow refers to the order in which the individual statements, instructions, or function calls of an imperative or a declarative program are executed or evaluated....

 via customized interpreters in similar fashion to bytecode interpreters.

Abstract Syntax Tree interpreters


In the spectrum between interpreting and compiling, another approach is transforming the source code into an optimized Abstract Syntax Tree
Abstract syntax tree
In computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. The syntax is 'abstract' in the sense that it...

 (AST) then executing the program following this tree structure, or using it to generate native code Just-In-Time. In this approach, each sentence needs to be parsed just once. As an advantage over bytecode, the AST keeps the global program structure and relations between statements (which is lost in a bytecode representation), and when compressed provides a more compact representation. Thus, using AST has been proposed as a better intermediate format for Just-in-time compilers than bytecode. Also, it allows to perform better analysis during runtime.

However, for interpreters, an AST causes more overhead than a bytecode interpreter, because of nodes related to syntax performing no useful work, of a less sequential representation (requiring traversal of more pointers) and of overhead visiting the tree.

Just-in-time compilation


Further blurring the distinction between interpreters, byte-code interpreters and compilation is just-in-time compilation
Just-in-time compilation
In computing, just-in-time compilation , also known as dynamic translation, is a method to improve the runtime performance of computer programs. Historically, computer programs had two modes of runtime operation, either interpreted or static compilation...

 (or JIT), a technique in which the intermediate representation is compiled to native machine code
Machine code
Machine code or machine language is a system of impartible instructions executed directly by a computer's central processing unit. Each instruction performs a very specific task, typically either an operation on a unit of data Machine code or machine language is a system of impartible instructions...

 at runtime. This confers the efficiency of running native code, at the cost of startup time and increased memory use when the bytecode or AST is first compiled. Adaptive optimization
Adaptive optimization
Adaptive optimization is a technique in computer science that performs dynamic recompilation of portions of a program based on the current execution profile. With a simple implementation, an adaptive optimizer may simply make a trade-off between Just-in-time compilation and interpreting instructions...

 is a complementary technique in which the interpreter profiles the running program and compiles its most frequently-executed parts into native code. Both techniques are a few decades old, appearing in languages such as Smalltalk
Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

 in the 1980s.

Just-in-time compilation has gained mainstream attention amongst language implementers in recent years, with Java, the .NET Framework
.NET Framework
The .NET Framework is a software framework that runs primarily on Microsoft Windows. It includes a large library and supports several programming languages which allows language interoperability...

 and most modern JavaScript
JavaScript
JavaScript is a prototype-based scripting language that is dynamic, weakly typed and has first-class functions. It is a multi-paradigm language, supporting object-oriented, imperative, and functional programming styles....

 implementations now including JITs.

Applications

  • Interpreters are frequently used to execute command language
    Command language
    A command language is a domain-specific interpreted language; a common example of a command language are shell or batch programming languages. These languages can be used directly at the command line, but can also automate tasks that would normally be performed manually at the command line...

    s, and glue language
    Glue language
    A glue language is a programming language used for connecting software components together.Examples of glue languages:* Unix Shell scripts * Windows NT type Shell scripts...

    s since each operator executed in command language is usually an invocation of a complex routine such as an editor or compiler.
  • Self-modifying code
    Self-modifying code
    In computer science, self-modifying code is code that alters its own instructions while it is executing - usually to reduce the instruction path length and improve performance or simply to reduce otherwise repetitively similar code, thus simplifying maintenance...

     can easily be implemented in an interpreted language. This relates to the origins of interpretation in Lisp and artificial intelligence
    Artificial intelligence
    Artificial intelligence is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as "the study and design of intelligent agents" where an intelligent agent is a system that perceives its environment and takes actions that maximize its...

     research.
  • Virtualisation. Machine code intended for one hardware architecture can be run on another using a virtual machine
    Virtual machine
    A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

    , which is essentially an interpreter.
  • Sandboxing
    Sandbox (computer security)
    In computer security, a sandbox is a security mechanism for separating running programs. It is often used to execute untested code, or untrusted programs from unverified third-parties, suppliers, untrusted users and untrusted websites....

    : An interpreter or virtual machine is not compelled to actually execute all the instructions the source code it is processing. In particular, it can refuse to execute code that violates any security
    Computer security
    Computer security is a branch of computer technology known as information security as applied to computers and networks. The objective of computer security includes protection of information and property from theft, corruption, or natural disaster, while allowing the information and property to...

     constraints it is operating under.

Punched card interpreter


The term "interpreter" often referred to a piece of unit record equipment
Unit record equipment
Before the advent of electronic computers, data processing was performed using electromechanical devices called unit record equipment, electric accounting machines or tabulating machines. Unit record machines were as ubiquitous in industry and government in the first half of the twentieth century...

 that could read punched card
Punched card
A punched card, punch card, IBM card, or Hollerith card is a piece of stiff paper that contains digital information represented by the presence or absence of holes in predefined positions...

s and print the characters in human-readable form on the card. The IBM 550
IBM 550
The IBM 550 numerical interpreter was the first commercial machine made by IBM that read numerical data punched on cards and printed it across the top of each card. The 550 was introduced in 1930....

 Numeric Interpreter and IBM 557
IBM 557
The IBM 557 Alphabetic Interpreter allowed holes in punched cards to be interpreted and the Hollerith punched card characters printed on any row or column, selected by a plugboard control panel. The machine was a synchronous system where brushes would glide over a hole in a punched card and...

 Alphabetic Interpreter are typical examples from 1930 and 1954, respectively.

See also

  • Command-line interpreter
  • Interpreted language
    Interpreted language
    Interpreted language is a programming language in which programs are 'indirectly' executed by an interpreter program. This can be contrasted with a compiled language which is converted into machine code and then 'directly' executed by the host CPU...

  • Compiled language
    Compiled language
    A compiled language is a programming language whose implementations are typically compilers , and not interpreters ....

  • Dynamic compilation
    Dynamic compilation
    Dynamic compilation is a process used by some programming language implementations to gain performance during program execution. Although the technique originated in the Self programming language, the best-known language that uses this technique is Java...

  • Partial evaluation
    Partial evaluation
    In computing, partial evaluation is a technique for several different types of program optimization by specialization. The most straightforward application is to produce new programs which run faster than the originals while being guaranteed to behave in the same way...

  • Meta-circular evaluator
    Meta-circular evaluator
    A meta-circular evaluator is a special case of a self-interpreter in which the existing facilities of the parent interpreter are directly applied to the source code being interpreted, without any need for additional implementation...


External links