Home      Discussion      Topics      Dictionary      Almanac
Signup       Login


Ask a question about 'Self-hosting'
Start a new discussion about 'Self-hosting'
Answer questions from other users
Full Discussion Forum
The term self-hosting was coined to refer to the use of a computer program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

 as part of the toolchain
In software, a toolchain is the set of programming tools that are used to create a product...

 or operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

 that produces new versions of that same program—for example, a compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

 that can compile its own source code
Source code
In computer science, source code is text written using the format and syntax of the programming language that it is being written in. Such a language is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source...

. Self-hosting software is commonplace on personal computer
Personal computer
A personal computer is any general-purpose computer whose size, capabilities, and original sales price make it useful for individuals, and which is intended to be operated directly by an end-user with no intervening computer operator...

s and larger systems. Other programs that are typically self-hosting include kernels, assemblers, shell
Shell (computing)
A shell is a piece of software that provides an interface for users of an operating system which provides access to the services of a kernel. However, the term is also applied very loosely to applications and may include any software that is "built around" a particular component, such as web...

s and revision control software
Revision control
Revision control, also known as version control and source control , is the management of changes to documents, programs, and other information stored as computer files. It is most commonly used in software development, where a team of people may change the same files...


If a system is so new that no software has been written for it, then software is developed on another self-hosting system and placed on a storage
Computer storage
Computer data storage, often called storage or memory, refers to computer components and recording media that retain digital data. Data storage is one of the core functions and fundamental components of computers....

 device that the new system can read. Development continues this way until the new system can reliably host its own development. Development of the Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

 operating system, for example, was initially hosted on a Minix
MINIX is a Unix-like computer operating system based on a microkernel architecture created by Andrew S. Tanenbaum for educational purposes; MINIX also inspired the creation of the Linux kernel....

 system. Writing new software development tools "from the metal" (that is, without using another host system) is rare and in many cases impractical.


The first self-hosting compiler (excluding assemblers) was written for Lisp
Lisp programming language
Lisp is a family of computer programming languages with a long history and a distinctive, fully parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older...

 by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.
The compiler as it exists on the standard compiler tape is a machine language program that was obtained by having the S-expression
S-expressions or sexps are list-based data structures that represent semi-structured data. An S-expression may be a nested list of smaller S-expressions. S-expressions are probably best known for their use in the Lisp family of programming languages...

 definition of the compiler work on itself through the interpreter.
(AI Memo 39)

This technique is only possible when an interpreter already exists for the very same language that is to be compiled. It borrows directly from the notion of running a program on itself as input, which is also used in various proofs in theoretical computer science
Theoretical computer science
Theoretical computer science is a division or subset of general computer science and mathematics which focuses on more abstract or mathematical aspects of computing....

, such as the proof that the halting problem
Halting problem
In computability theory, the halting problem can be stated as follows: Given a description of a computer program, decide whether the program finishes running or continues to run forever...

 is undecidable.


Several programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

s are self-hosting, in the sense that a compiler for the language can be written in that same language. Self-hosting is an attribute of a particular language implementation and not of the programming language in general. The first compiler for a new programming language can be written in another language (in rare cases, machine language
Machine code
Machine code or machine language is a system of impartible instructions executed directly by a computer's central processing unit. Each instruction performs a very specific task, typically either an operation on a unit of data Machine code or machine language is a system of impartible instructions...

) or can be produced using bootstrapping
Bootstrapping (compilers)
In computer science, bootstrapping is the process of writing a compiler in the target programming language which it is intended to compile...

. Programming languages which have been self-hosted include Ada
Ada (programming language)
Ada is a structured, statically typed, imperative, wide-spectrum, and object-oriented high-level computer programming language, extended from Pascal and other languages...

BASIC is a family of general-purpose, high-level programming languages whose design philosophy emphasizes ease of use - the name is an acronym from Beginner's All-purpose Symbolic Instruction Code....

, C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

, CoffeeScript
CoffeeScript is a programming language that transcompiles to JavaScript. The language adds syntactic sugar inspired by Ruby, Python and Haskell to enhance JavaScript's brevity and readability, as well as adding more sophisticated features like array comprehension and pattern matching...

, F#, FASM
FASM in computing is an assembler. It supports programming in Intel-style assembly language on the IA-32 and x86-64 computer architectures. It claims high speed, size optimizations, operating system portability, and macro abilities. It is a low-level assembler and intentionally uses very few...

, Forth, Haskell
Haskell (programming language)
Haskell is a standardized, general-purpose purely functional programming language, with non-strict semantics and strong static typing. It is named after logician Haskell Curry. In Haskell, "a function is a first-class citizen" of the programming language. As a functional programming language, the...

, Java
Java (programming language)
Java is a programming language originally developed by James Gosling at Sun Microsystems and released in 1995 as a core component of Sun Microsystems' Java platform. The language derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities...

, Lisp, Modula-2
Modula-2 is a computer programming language designed and developed between 1977 and 1980 by Niklaus Wirth at ETH Zurich as a revision of Pascal to serve as the sole programming language for the operating system and application software for the personal workstation Lilith...

, OCaml, Oberon
Oberon (programming language)
Oberon is a programming language created in 1986 by Professor Niklaus Wirth and his associates at ETH Zurich in Switzerland. It was developed as part of the implementation of the Oberon operating system...

, Pascal
Pascal (programming language)
Pascal is an influential imperative and procedural programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small and efficient language intended to encourage good programming practices using structured programming and data structuring.A derivative known as Object Pascal...

, Python
PyPy is a Python interpreter and JIT compiler. PyPy focuses on speed, efficiency and 100% compatibility with the original CPython interpreter.- Details and motivation :...

, Scala, Smalltalk
Smalltalk is an object-oriented, dynamically typed, reflective programming language. Smalltalk was created as the language to underpin the "new world" of computing exemplified by "human–computer symbiosis." It was designed and created in part for educational use, more so for constructionist...

, and Vala
Vala (programming language)
Vala is a programming language created with the goal of bringing modern language features to C, with no added runtime needs and with little overhead, by targeting the GObject object system. It is being developed by Jürg Billeter and Raffaele Sandrini. The syntax borrows heavily from C#...


See also

  • Bootstrapping (compilers)
    Bootstrapping (compilers)
    In computer science, bootstrapping is the process of writing a compiler in the target programming language which it is intended to compile...

  • Cross compiler
    Cross compiler
    A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is run. Cross compiler tools are used to generate executables for embedded system or multiple platforms. It is used to compile for a platform upon which it is not feasible to...

  • Dogfooding
  • Futamura projection
  • Self-interpreter
    A self-interpreter, or metainterpreter, is a programming language interpreter written in the language it interprets. An example would be a BASIC interpreter written in BASIC...

  • Self reference