Forth
Encyclopedia
Forth is a structured
Structured programming
Structured programming is a programming paradigm aimed on improving the clarity, quality, and development time of a computer program by making extensive use of subroutines, block structures and for and while loops - in contrast to using simple tests and jumps such as the goto statement which could...

, imperative
Imperative programming
In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state...

, reflective
Reflection (computer science)
In computer science, reflection is the process by which a computer program can observe and modify its own structure and behavior at runtime....

, concatenative
Concatenative programming language
A concatenative programming language is a point-free programming language in which all expressions denote functions and the juxtaposition of expressions denotes function composition...

, extensible
Extensible programming
Extensible programming is a term used in computer science to describe a style of computer programming that focuses on mechanisms to extend the programming language, compiler and runtime environment. Extensible programming languages, supporting this style of programming, were an active area of work...

, stack-based
Stack-oriented programming language
A stack-oriented programming language is one that relies on a stack machine model for passing parameters. Several programming languages fit this description, notably Forth, RPL, PostScript, and also many Assembly languages ....

 computer
Computer programming
Computer programming is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages. The purpose of programming is to create a program that performs specific operations or exhibits a...

 programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

 and programming environment. Although not an acronym, the language's name is sometimes spelled with all capital letters as FORTH, following the customary usage during its earlier years.

A procedural programming
Procedural programming
Procedural programming can sometimes be used as a synonym for imperative programming , but can also refer to a programming paradigm, derived from structured programming, based upon the concept of the procedure call...

 language without type checking
Type system
A type system associates a type with each computed value. By examining the flow of these values, a type system attempts to ensure or prove that no type errors can occur...

, Forth features both interactive execution of commands (making it suitable as a shell for systems that lack a more formal operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

) and the ability to compile
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

 sequences of commands for later execution. Some Forth implementations (usually early versions or those written to be extremely portable) compile threaded code
Threaded code
In computer science, the term threaded code refers to a compiler implementation technique where the generated code has a form that essentially consists entirely of calls to subroutines...

, but many implementations today generate optimized
Compiler optimization
Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attributes of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied...

 machine code like other language compilers.

Although not as popular as other programming systems, Forth has enough support to keep several language vendors and contractors in business. Forth is currently used in boot loaders such as Open Firmware
Open Firmware
Open Firmware, or OpenBoot in Sun Microsystems parlance, is a standard defining the interfaces of a computer firmware system, formerly endorsed by the Institute of Electrical and Electronics Engineers . It originated at Sun, and has been used by Sun, Apple, IBM, and most other non-x86 PCI chipset...

, space applications, and other embedded systems. Gforth
Gforth
Gforth is a project to develop an implementation for ANS Forth. It is part of the GNU Project.-Goals:Gforth's goals can be split into several subgoals:* Gforth should conform to the ANS Forth standard....

, an implementation of Forth by the GNU Project
GNU Project
The GNU Project is a free software, mass collaboration project, announced on September 27, 1983, by Richard Stallman at MIT. It initiated GNU operating system development in January, 1984...

, is actively maintained, with its most recent release in December 2008. The 1994 standard is currently undergoing revision, provisionally titled Forth 200x.

Overview

A Forth environment combines the compiler with an interactive shell. The user interactively defines and runs subroutine
Subroutine
In computer science, a subroutine is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code....

s, or "words," in a virtual machine
Virtual machine
A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

 similar to the runtime
Run-time system
A run-time system is a software component designed to support the execution of computer programs written in some computer language...

 environment. Words can be tested, redefined, and debugged as the source is entered without recompiling or restarting the whole program. All syntactic elements, including variables and basic operators, appear as such procedures. Even if a particular word is optimized so as not to require a subroutine call, it is also still available as a subroutine. On the other hand, the shell may compile interactively typed commands into machine code before running them. (This behavior is common, but not required.) Forth environments vary in how the resulting program is stored, but ideally running the program has the same effect as manually re-entering the source. This contrasts with the combination of C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 with Unix shell
Unix shell
A Unix shell is a command-line interpreter or shell that provides a traditional user interface for the Unix operating system and for Unix-like systems...

s, wherein compiled functions are a special class of program objects and interactive commands are strictly interpreted
Interpreter (computing)
In computer science, an interpreter normally means a computer program that executes, i.e. performs, instructions written in a programming language...

. Most of Forth's unique properties result from this principle. By including interaction, scripting, and compilation, Forth was popular on computers with limited resources, such as the BBC Micro
BBC Micro
The BBC Microcomputer System, or BBC Micro, was a series of microcomputers and associated peripherals designed and built by Acorn Computers for the BBC Computer Literacy Project, operated by the British Broadcasting Corporation...

 and Apple II
Apple II
The Apple II is an 8-bit home computer, one of the first highly successful mass-produced microcomputer products, designed primarily by Steve Wozniak, manufactured by Apple Computer and introduced in 1977...

 series, and remains so in applications such as firmware
Firmware
In electronic systems and computing, firmware is a term often used to denote the fixed, usually rather small, programs and/or data structures that internally control various electronic devices...

 and small microcontroller
Microcontroller
A microcontroller is a small computer on a single integrated circuit containing a processor core, memory, and programmable input/output peripherals. Program memory in the form of NOR flash or OTP ROM is also often included on chip, as well as a typically small amount of RAM...

s. Where C compilers may now generate code with more compactness and performance, Forth retains the advantage of interactivity.

Stacks

Most programming environments with recursive subroutine
Subroutine
In computer science, a subroutine is a portion of code within a larger program that performs a specific task and is relatively independent of the remaining code....

s use a stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...

 for control flow
Control flow
In computer science, control flow refers to the order in which the individual statements, instructions, or function calls of an imperative or a declarative program are executed or evaluated....

. This structure typically also stores local variable
Local variable
In computer science, a local variable is a variable that is given local scope. Such a variable is accessible only from the function or block in which it is declared. In programming languages with only two levels of visibility, local variables are contrasted with global variables...

s, including subroutine parameter
Parameter (computer science)
In computer programming, a parameter is a special kind of variable, used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are called arguments...

s (in call by value system such as C). Forth often does not have local variables, however, nor is it call-by-value. Instead, intermediate values are kept in a second stack
Stack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...

. Words operate directly on the topmost values in the first stack. It may therefore be called the "parameter" or "data" stack, but most often simply "the" stack. The second, function-call stack is then called the "linkage" or "return" stack, abbreviated rstack. Special rstack manipulation functions provided by the kernel allow it to be used for temporary storage within a word, but otherwise it cannot be used to pass parameters or manipulate data.

Most words are specified in terms of their effect on the stack. Typically, parameters are placed on the top of the stack before the word executes. After execution, the parameters have been erased and replaced with any return values. For arithmetic operators, this follows the rule of reverse Polish notation
Reverse Polish notation
Reverse Polish notation is a mathematical notation wherein every operator follows all of its operands, in contrast to Polish notation, which puts the operator in the prefix position. It is also known as Postfix notation and is parenthesis-free as long as operator arities are fixed...

. See below for examples illustrating stack usage.

Maintenance

Forth is a simple yet extensible language; its modularity and extensibility permit the writing of high-level programs such as CAD
Computer-aided design
Computer-aided design , also known as computer-aided design and drafting , is the use of computer technology for the process of design and design-documentation. Computer Aided Drafting describes the process of drafting with a computer...

 systems. However, extensibility also helps poor programmers to write incomprehensible code, which has given Forth a reputation as a "write-only language
Write-only language
A write-only language is a programming language with syntax sufficiently dense and bizarre that any routine of significant size is automatically write-only code...

". Forth has been used successfully in large, complex projects, while applications developed by competent, disciplined professionals have proven to be easily maintained on evolving hardware platforms over decades of use.
Forth has a niche both in astronomical and space applications.
Forth is still used today in many embedded system
Embedded system
An embedded system is a computer system designed for specific control functions within a larger system. often with real-time computing constraints. It is embedded as part of a complete device often including hardware and mechanical parts. By contrast, a general-purpose computer, such as a personal...

s (small computerized devices) because of its portability
Porting
In computer science, porting is the process of adapting software so that an executable program can be created for a computing environment that is different from the one for which it was originally designed...

, efficient memory use, short development time, and fast execution speed. It has been implemented efficiently on modern RISC processors, and processors that use Forth as machine language
Stack machine
A stack machine may be* A real or emulated computer that evaluates each sub-expression of a program statement via a pushdown data stack and uses a reverse Polish notation instruction set....

 have been produced. Other uses of Forth include the Open Firmware
Open Firmware
Open Firmware, or OpenBoot in Sun Microsystems parlance, is a standard defining the interfaces of a computer firmware system, formerly endorsed by the Institute of Electrical and Electronics Engineers . It originated at Sun, and has been used by Sun, Apple, IBM, and most other non-x86 PCI chipset...

 boot ROMs
Booting
In computing, booting is a process that begins when a user turns on a computer system and prepares the computer to perform its normal operations. On modern computers, this typically involves loading and starting an operating system. The boot sequence is the initial set of operations that the...

 used by Apple, IBM
IBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...

, Sun
Sun Microsystems
Sun Microsystems, Inc. was a company that sold :computers, computer components, :computer software, and :information technology services. Sun was founded on February 24, 1982...

, and OLPC XO-1
OLPC XO-1
The XO-1, previously known as the $100 Laptop, Children's Machine, and 2B1, is an inexpensive subnotebook computer intended to be distributed to children in developing countries around the world, to provide them with access to knowledge, and opportunities to "explore, experiment and express...

; and the FICL-based first stage boot controller of the FreeBSD
FreeBSD
FreeBSD is a free Unix-like operating system descended from AT&T UNIX via BSD UNIX. Although for legal reasons FreeBSD cannot be called “UNIX”, as the direct descendant of BSD UNIX , FreeBSD’s internals and system APIs are UNIX-compliant...

 operating system.

History

Forth evolved from Charles H. Moore
Charles H. Moore
Charles H. Moore is the inventor of the Forth programming language.- Biography :In 1968, while employed at the United States National Radio Astronomy Observatory , Moore invented the initial version of the Forth language to help control radio telescopes...

's personal programming system, which had been in continuous development since 1958. Forth was first exposed to other programmers in the early 1970s, starting with Elizabeth Rather
Elizabeth Rather
Elizabeth Rather is the co-founder of FORTH, Inc. and is a leading expert in the Forth programming language.She became involved with Forth while she was at the University of Arizona, but working part-time for NRAO...

 at the US National Radio Astronomy Observatory
National Radio Astronomy Observatory
The National Radio Astronomy Observatory is a Federally Funded Research and Development Center of the United States National Science Foundation operated under cooperative agreement by Associated Universities, Inc for the purpose of radio astronomy...

. After their work at NRAO, Charles Moore and Elizabeth Rather formed FORTH, Inc. in 1973, refining and porting Forth systems to dozens of other platforms in the next decade.

Forth is so named because in 1968 "the file holding the interpreter was labeled FOURTH, for 4th (next) generation software—but the IBM 1130
IBM 1130
The IBM 1130 Computing System was introduced in 1965. It was IBM's least-expensive computer to date, and was aimed at price-sensitive, computing-intensive technical markets like education and engineering. It succeeded the IBM 1620 in that market segment. The IBM 1800 was a process control variant...

 operating system restricted file names to 5 characters." Moore saw Forth as a successor to compile-link-go third-generation programming language
Third-generation programming language
A third-generation programming language is a refinement of a second-generation programming language. The second generation of programming languages brought logical structure to software. The third generation brought refinements to make the languages more programmer-friendly...

s, or software for "fourth generation" hardware, not a fourth-generation programming language
Fourth-generation programming language
A fourth-generation programming language is a programming language or programming environment designed with a specific purpose in mind, such as the development of commercial business software. In the history of computer science, the 4GL followed the 3GL in an upward trend toward higher...

 as the term has come to be used.

Because Charles Moore had frequently moved from job to job over his career, an early pressure on the developing language was ease of porting
Porting
In computer science, porting is the process of adapting software so that an executable program can be created for a computing environment that is different from the one for which it was originally designed...

 to different computer architectures. A Forth system has often been used to bring up new hardware. For example, Forth was the first resident software on the new Intel 8086
Intel 8086
The 8086 is a 16-bit microprocessor chip designed by Intel between early 1976 and mid-1978, when it was released. The 8086 gave rise to the x86 architecture of Intel's future processors...

 chip in 1978 and MacFORTH was the first resident development system for the first Apple Macintosh
Macintosh
The Macintosh , or Mac, is a series of several lines of personal computers designed, developed, and marketed by Apple Inc. The first Macintosh was introduced by Apple's then-chairman Steve Jobs on January 24, 1984; it was the first commercially successful personal computer to feature a mouse and a...

 in 1984.

FORTH, Inc.'s microFORTH was developed for the Intel 8080
Intel 8080
The Intel 8080 was the second 8-bit microprocessor designed and manufactured by Intel and was released in April 1974. It was an extended and enhanced variant of the earlier 8008 design, although without binary compatibility...

, Motorola 6800
Motorola 6800
The 6800 was an 8-bit microprocessor designed and first manufactured by Motorola in 1974. The MC6800 microprocessor was part of the M6800 Microcomputer System that also included serial and parallel interface ICs, RAM, ROM and other support chips...

, and Zilog Z80
Zilog Z80
The Zilog Z80 is an 8-bit microprocessor designed by Zilog and sold from July 1976 onwards. It was widely used both in desktop and embedded computer designs as well as for military purposes...

 microprocessors starting in 1976. MicroFORTH was later used by hobbyists to generate Forth systems for other architectures, such as the 6502
MOS Technology 6502
The MOS Technology 6502 is an 8-bit microprocessor that was designed by Chuck Peddle and Bill Mensch for MOS Technology in 1975. When it was introduced, it was the least expensive full-featured microprocessor on the market by a considerable margin, costing less than one-sixth the price of...

 in 1978. Wide dissemination finally led to standardization of the language. Common practice was codified in the de facto standards FORTH-79 and FORTH-83 in the years 1979 and 1983, respectively. These standards were unified by ANSI
American National Standards Institute
The American National Standards Institute is a private non-profit organization that oversees the development of voluntary consensus standards for products, services, processes, systems, and personnel in the United States. The organization also coordinates U.S. standards with international...

 in 1994, commonly referred to as ANS Forth.

Forth became very popular in the 1980s because it was well suited to the small microcomputer
Microcomputer
A microcomputer is a computer with a microprocessor as its central processing unit. They are physically small compared to mainframe and minicomputers...

s of that time, as it is compact and portable. At least one home computer
Home computer
Home computers were a class of microcomputers entering the market in 1977, and becoming increasingly common during the 1980s. They were marketed to consumers as affordable and accessible computers that, for the first time, were intended for the use of a single nontechnical user...

, the British Jupiter ACE
Jupiter ACE
The Jupiter Ace was a British home computer of the early 1980s, produced by a company, set up for the purpose, named Jupiter Cantab. The Ace differed from other microcomputers of the time in that it used FORTH instead of the more common BASIC.- Introduction :...

, had Forth in its ROM
Read-only memory
Read-only memory is a class of storage medium used in computers and other electronic devices. Data stored in ROM cannot be modified, or can be modified only slowly or with difficulty, so it is mainly used to distribute firmware .In its strictest sense, ROM refers only...

-resident operating system. The Canon Cat
Canon Cat
The Canon Cat was a task-dedicated, desktop computer released by Canon Inc. in 1987 at a price of $1495 USD. On the surface it was not unlike the dedicated word processors popular in the late 1970s to early 1980s, but it was far more powerful and incorporated many unique ideas for data...

 also used Forth for its system programming. Rockwell
Rockwell
- People :* Dick Rockwell, an American comic strip and comic book artist, nephew of Norman Rockwell* Francis W. Rockwell, a United States Congressman from Massachusetts* Francis W...

 also produced single-chip microcomputers with resident Forth kernels, the R65F11 and R65F12. A complete family tree is at TU-Wien.

Programmer's perspective

Forth relies heavily on explicit use of a data stack and reverse Polish notation
Reverse Polish notation
Reverse Polish notation is a mathematical notation wherein every operator follows all of its operands, in contrast to Polish notation, which puts the operator in the prefix position. It is also known as Postfix notation and is parenthesis-free as long as operator arities are fixed...

 (RPN or postfix notation), commonly used in calculators from Hewlett-Packard
Hewlett-Packard
Hewlett-Packard Company or HP is an American multinational information technology corporation headquartered in Palo Alto, California, USA that provides products, technologies, softwares, solutions and services to consumers, small- and medium-sized businesses and large enterprises, including...

. In RPN, the operator is placed after its operands, as opposed to the more common infix notation
Infix notation
Infix notation is the common arithmetic and logical formula notation, in which operators are written infix-style between the operands they act on . It is not as simple to parse by computers as prefix notation or postfix notation Infix notation is the common arithmetic and logical formula notation,...

 where the operator is placed between its operands. Postfix notation makes the language easier to parse and extend; Forth's flexibility makes a static BNF grammar inappropriate, and it does not have a monolithic compiler. Extending the compiler only requires writing a new word, instead of modifying a grammar and changing the underlying implementation.

Using RPN, one could get the result of the mathematical expression (25 * 10 + 50) this way:

25 10 * 50 + . 300 ok

This command line first puts the numbers 25 and 10 on the implied stack.

The word * multiplies the two numbers on the top of the stack and replaces them with their product.
Then the number 50 is placed on the stack.

The word + adds it to the previous product. Finally, the . command prints the result to the user's terminal.

Even Forth's structural features are stack-based. For example:

: FLOOR5 ( n -- n' ) DUP 6 < IF DROP 5 ELSE 1 - THEN ;

This code defines a new word (again, word is the term used for a subroutine) called FLOOR5 using the following commands: DUP duplicates the number on the stack; 6 places a 6 on top of the stack; < compares the top two numbers on the stack (6 and the DUPed input), and replaces them with a true-or-false value; IF takes a true-or-false value and chooses to execute commands immediately after it or to skip to the ELSE; DROP discards the value on the stack; and THEN ends the conditional. The text in parentheses is a comment, advising that this word expects a number on the stack and will return a possibly changed number. The FLOOR5 word is equivalent to this function written in the C programming language
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 using the ternary operator
Ternary operation
In mathematics, a ternary operation is an n-ary operation with n = 3. A ternary operation on a set A takes any given three elements of A and combines them to form a single element of A. An example of a ternary operation is the product in a heap....

:

int floor5(int v) {
return (v < 6) ? 5 : (v - 1);
}

This function is written more succinctly as:

: FLOOR5 ( n -- n' ) 1- 5 MAX ;

You would run this word as follows:

1 FLOOR5 . 5 ok
8 FLOOR5 . 7 ok

First the interpreter pushes a number (1 or 8) onto the stack, then it calls FLOOR5, which pops off this number again and pushes the result. Finally, a call to "." pops the result and prints it to the user's terminal.

Facilities

Forth parsing
Parsing
In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar...

 is simple, despite having no explicit grammar
Formal grammar
A formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...

. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as a delimiter
Delimiter
A delimiter is a sequence of one or more characters used to specify the boundary between separate, independent regions in plain text or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values.Delimiters represent...

; some systems recognise additional whitespace
Whitespace (computer science)
In computer science, whitespace is any single character or series of characters that represents horizontal or vertical space in typography. When rendered, a whitespace character does not correspond to a visual mark, but typically does occupy an area on a page...

 characters. When the interpreter finds a word, it tries to look the word up in the dictionary. If the word is found, the interpreter executes the code associated with the word, and then returns to parse the rest of the input stream. If the word isn't found, the word is assumed to be a number, and an attempt is made to convert it into a number and push it on the stack; if successful, the interpreter continues parsing the input stream. Otherwise, if both the lookup and number conversion fails, the interpreter prints the word followed by an error message indicating the word is not recognised, flushes the input stream, and waits for new user input.

The definition of a new word is started with the word : (colon) and ends with the word ; (semi-colon). For example

: X DUP 1+ . . ;

will compile the word X, and makes the name findable in the dictionary. When executed by typing 10 X at the console this will print 11 10.

Most Forth systems include a specialized assembler that produces executable words. The assembler is a special dialect of the compiler. Forth assemblers often use a reverse-polish syntax in which the parameters of an instruction precede the instruction. The usual design of a Forth assembler is to construct the instruction on the stack, then copy it into memory as the last step. Registers may be referenced by the name used by the manufacturer, numbered (0..n, as used in the actual operation code) or named for their purpose in the Forth system: e.g. "S" for the register used as a stack pointer.

Operating system, files, and multitasking

Classic Forth systems traditionally use neither operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

 nor file system
File system
A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...

. Instead of storing code in files, source-code is stored in disk blocks written to physical disk addresses. The word BLOCK is employed to translate the number of a 1K-sized block of disk space into the address of a buffer containing the data, which is managed automatically by the Forth system. Some implement contiguous disk files using the system's disk access, where the files are located at fixed disk block ranges. Usually these are implemented as fixed-length binary records, with an integer number of records per disk block. Quick searching is achieved by hashed access on key data.

Multitasking
Computer multitasking
In computing, multitasking is a method where multiple tasks, also known as processes, share common processing resources such as a CPU. In the case of a computer with a single CPU, only one task is said to be running at any point in time, meaning that the CPU is actively executing instructions for...

, most commonly cooperative round-robin scheduling
Round-robin scheduling
Round-robin is one of the simplest scheduling algorithms for processes in an operating system. As the term is generally used, time slices are assigned to each process in equal portions and in circular order, handling all processes without priority . Round-robin scheduling is simple, easy to...

, is normally available (although multitasking words and support are not covered by the ANSI Forth Standard). The word PAUSE is used to save the current task's execution context, to locate the next task, and restore its execution context. Each task has its own stacks, private copies of some control variables and a scratch area. Swapping tasks is simple and efficient; as a result, Forth multitaskers are available even on very simple microcontroller
Microcontroller
A microcontroller is a small computer on a single integrated circuit containing a processor core, memory, and programmable input/output peripherals. Program memory in the form of NOR flash or OTP ROM is also often included on chip, as well as a typically small amount of RAM...

s such as the Intel 8051
Intel 8051
The Intel MCS-51 is a Harvard architecture, single chip microcontroller series which was developed by Intel in 1980 for use in embedded systems. Intel's original versions were popular in the 1980s and early 1990s. While Intel no longer manufactures the MCS-51, binary compatible derivatives remain...

, Atmel AVR
Atmel AVR
The AVR is a modified Harvard architecture 8-bit RISC single chip microcontroller which was developed by Atmel in 1996. The AVR was one of the first microcontroller families to use on-chip flash memory for program storage, as opposed to one-time programmable ROM, EPROM, or EEPROM used by other...

, and TI MSP430
TI MSP430
The MSP430 is a mixed-signal microcontroller family from Texas Instruments. Built around a 16-bit CPU, the MSP430 is designed for low cost, and specifically, low power consumption embedded applications. The architecture dates from the 1990s and is reminiscent of the DEC PDP-11.-Applications:The...

.

By contrast, some Forth systems run under a host operating system such as Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

, Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

 or a version of Unix
Unix
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and Joe Ossanna...

 and use the host operating system's file system for source and data files; the ANSI Forth Standard describes the words used for I/O. Other non-standard facilities include a mechanism for issuing call
System call
In computing, a system call is how a program requests a service from an operating system's kernel. This may include hardware related services , creating and executing new processes, and communicating with integral kernel services...

s to the host OS or windowing system
Windowing system
A windowing system is a component of a graphical user interface , and more specifically of a desktop environment, which supports the implementation of window managers, and provides basic support for graphics hardware, pointing devices such as mice, and keyboards...

s, and many provide extensions that employ the scheduling provided by the operating system. Typically they have a larger and different set of words from the stand-alone Forth's PAUSE word for task creation, suspension, destruction and modification of priority.

Self-compilation and cross compilation

A fully featured Forth system with all source code will compile itself, a technique commonly called meta-compilation by Forth programmers (although the term doesn't exactly match meta-compilation
Meta-Compilation
Metacompilation is a computation which involves metasystem transitions from a computing machine M to a metamachine M' which controls, analyzes and imitates the work of M. Semantics-based program transformation, such as partial evaluation and supercompilation , is metacomputation...

 as it is normally defined). The usual method is to redefine the handful of words that place compiled bits into memory. The compiler's words use specially named versions of fetch and store that can be redirected to a buffer area in memory. The buffer area simulates or accesses a memory area beginning at a different address than the code buffer. Such compilers define words to access both the target computer's memory, and the host (compiling) computer's memory.

After the fetch and store operations are redefined for the code space, the compiler, assembler, etc. are recompiled using the new definitions of fetch and store. This effectively reuses all the code of the compiler and interpreter. Then, the Forth system's code is compiled, but this version is stored in the buffer. The buffer in memory is written to disk, and ways are provided to load it temporarily into memory for testing. When the new version appears to work, it is written over the previous version.

There are numerous variations of such compilers for different environments. For embedded system
Embedded system
An embedded system is a computer system designed for specific control functions within a larger system. often with real-time computing constraints. It is embedded as part of a complete device often including hardware and mechanical parts. By contrast, a general-purpose computer, such as a personal...

s, the code may instead be written to another computer, a technique known as cross compilation, over a serial port or even a single TTL
Transistor-transistor logic
Transistor–transistor logic is a class of digital circuits built from bipolar junction transistors and resistors. It is called transistor–transistor logic because both the logic gating function and the amplifying function are performed by transistors .TTL is notable for being a widespread...

 bit, while keeping the word names and other non-executing parts of the dictionary in the original compiling computer. The minimum definitions for such a forth compiler are the words that fetch and store a byte, and the word that commands a Forth word to be executed. Often the most time-consuming part of writing a remote port is constructing the initial program to implement fetch, store and execute, but many modern microprocessors have integrated debugging features (such as the Motorola CPU32) that eliminate this task.

Structure of the language

The basic data structure of Forth is the "dictionary" which maps "words" to executable code or named data structures. The dictionary is laid out in memory as a tree of linked list
Linked list
In computer science, a linked list is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of a datum and a reference to the next node in the sequence; more complex variants add additional links...

s with the links proceeding from the latest (most recently) defined word to the oldest, until a sentinel value
Sentinel value
In computer programming, a sentinel value is a special value whose presence guarantees termination of a loop that processes structured data...

, usually a NULL pointer, is found. A context switch causes a list search to start at a different leaf. A linked list search continues as the branch merges into the main trunk leading eventually back to the sentinel, the root.
There can be several dictionaries. In rare cases such as meta-compilation a dictionary might be isolated and stand-alone.
The effect resembles that of nesting namespaces and can overload keywords depending on the context.

A defined word generally consists of head and body with the head consisting of the name field (NF) and the link field (LF) and body consisting of the code field (CF) and the parameter field (PF).

Head and body of a dictionary entry are treated separately because they may not be contiguous. For example, when a Forth program is recompiled for a new platform, the head may remain on the compiling computer, while the body goes to the new platform. In some environments (such as embedded system
Embedded system
An embedded system is a computer system designed for specific control functions within a larger system. often with real-time computing constraints. It is embedded as part of a complete device often including hardware and mechanical parts. By contrast, a general-purpose computer, such as a personal...

s) the heads occupy memory unnecessarily. However, some cross-compilers may put heads in the target if the target itself is expected to support an interactive Forth.

Dictionary entry

The exact format of a dictionary entry is not prescribed, and implementations vary. However, certain components are almost always present, though the exact size and order may vary. Described as a structure, a dictionary entry might look this way:

structure
byte: flag \ 3bit flags + length of word's name
char-array: name \ name's runtime length isn't known at compile time
address: previous \ link field, backward ptr to previous word
address: codeword \ ptr to the code to execute this word
any-array: parameterfield \ unknown length of data, words, or opcodes
end-structure forthword

The name field starts with a prefix giving the length of the word's name (typically up to 32 bytes), and several bits for flags. The character representation of the word's name then follows the prefix. Depending on the particular implementation of Forth, there may be one or more NUL ('\0') bytes for alignment.

The link field contains a pointer to the previously defined word. The pointer may be a relative displacement or an absolute address that points to the next oldest sibling.

The code field pointer will be either the address of the word which will execute the code or data in the parameter field or the beginning of machine code that the processor will execute directly. For colon defined words, the code field pointer points to the word that will save the current Forth instruction pointer (IP) on the return stack, and load the IP with the new address from which to continue execution of words. This is the same as what a processor's call/return instructions does.

Structure of the compiler

The compiler itself is not a monolithic program. It consists of Forth words visible to the system, and usable
by a programmer. This allows a programmer to change the compiler's words for special purposes.

The "compile time" flag in the name field is set for words with "compile time" behavior. Most simple words execute the same code whether they are typed on a command line, or embedded in code. When compiling these, the compiler simply places code or a threaded pointer to the word.

The classic examples of compile-time words are the control structures such as IF and WHILE. All of Forth's control structures, and almost all of its compiler are implemented as compile-time words. All of Forth's control flow
Control flow
In computer science, control flow refers to the order in which the individual statements, instructions, or function calls of an imperative or a declarative program are executed or evaluated....

 words are executed during compilation to compile various combinations of the primitive words BRANCH (unconditional branch) and ?BRANCH (pop a value off the stack, and branch if it is false). During compilation, the data stack is used to support control structure balancing, nesting, and backpatching of branch addresses. The snippet:
... DUP 6 < IF DROP 5 ELSE 1 - THEN ...
would be compiled to the following sequence inside of a definition:
... DUP LIT 6 < ?BRANCH 5 DROP LIT 5 BRANCH 3 LIT 1 - ...
The numbers after BRANCH represent relative jump addresses. LIT is the primitive word for pushing a "literal" number onto the data stack.

Compilation state and interpretation state

The word : (colon) parses a name as a parameter, creates a dictionary entry (a colon definition) and enters compilation state. The interpreter continues to read space-delimited words from the user input device. If a word is found, the interpreter executes the compilation semantics associated with the word, instead of the interpretation semantics. The default compilation semantics of a word are to append its interpretation semantics to the current definition.

The word ; (semi-colon) finishes the current definition and returns to interpretation state. It is an example of a word whose compilation semantics differ from the default. The interpretation semantics of ; (semi-colon), most control flow words, and several other words are undefined in ANS Forth, meaning that they must only be used inside of definitions and not on the interactive command line.

The interpreter state can be changed manually with the words [ (left-bracket) and ] (right-bracket) which enter interpretation state or compilation state, respectively. These words can be used with the word LITERAL to calculate a value during a compilation and to insert the calculated value into the current colon definition. LITERAL has the compilation semantics to take an object from the data stack and to append semantics to the current colon definition to place that object on the data stack.

In ANS Forth, the current state of the interpreter can be read from the flag
Flag (computing)
In computer programming, flag can refer to one or more bits that are used to store a binary value or code that has an assigned meaning, but can refer to uses of other data types...

 STATE which contains the value true when in compilation state and false otherwise. This allows the implementation of so-called state-smart words with behavior that changes according to the current state of the interpreter.

Immediate words

The word IMMEDIATE marks the most recent colon definition as an immediate word, effectively replacing its compilation semantics with its interpretation semantics. Immediate words are normally executed during compilation, not compiled but this can be overridden by the programmer, in either state. ; is an example of an immediate word. In ANS Forth, the word POSTPONE takes a name as a parameter and appends the compilation semantics of the named word to the current definition even if the word was marked immediate. Forth-83 defined separate words COMPILE and [COMPILE] to force the compilation of non-immediate and immediate words, respectively.

Unnamed words and execution tokens

In ANS Forth, unnamed words can be defined with the word :NONAME which compiles the following words up to the next ; (semi-colon) and leaves an execution token on the data stack. The execution token provides an opaque handle for the compiled semantics, similar to the function pointer
Function pointer
A function pointer is a type of pointer in C, C++, D, and other C-like programming languages, and Fortran 2003. When dereferenced, a function pointer can be used to invoke a function and pass it arguments just like a normal function...

s of the C programming language
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

.

Execution tokens can be stored in variables. The word EXECUTE takes an execution token from the data stack and performs the associated semantics. The word COMPILE, (compile-comma) takes an execution token from the data stack and appends the associated semantics to the current definition.

The word ' (tick) takes the name of a word as a parameter and returns the execution token associated with that word on the data stack. In interpretation state, ' RANDOM-WORD EXECUTE is equivalent to RANDOM-WORD.

Parsing words and comments

The words : (colon), POSTPONE, ' (tick) and :NONAME are examples of parsing words that take their arguments from the user input device instead of the data stack. Another example is the word ( (paren) which reads and ignores the following words up to and including the next right parenthesis and is used to place comments in a colon definition. Similarly, the word \ (backslash) is used for comments that continue to the end of the current line. To be parsed correctly, ( (paren) and \ (backslash) must be separated by whitespace from the following comment text.

Structure of code

In most Forth systems, the body of a code definition consists of either machine language, or some form of threaded code
Threaded code
In computer science, the term threaded code refers to a compiler implementation technique where the generated code has a form that essentially consists entirely of calls to subroutines...

. The original Forth which follows the informal FIG standard (Forth Interest Group), is a TIL (Threaded Interpretive Language). This is also called indirect-threaded code, but direct-threaded and subroutine threaded Forths have also become popular in modern times. The fastest modern Forths use subroutine threading, insert simple words as macros, and perform peephole optimization
Peephole optimization
In compiler theory, peephole optimization is a kind of optimization performed over a very small set of instructions in a segment of generated code. The set is called a "peephole" or a "window"...

 or other optimizing strategies to make the code smaller and faster.

Data objects

When a word is a variable or other data object, the CF points to the runtime code associated with the defining word that created it. A defining word has a characteristic "defining behavior" (creating a dictionary entry plus possibly allocating and initializing data space) and also specifies the behavior of an instance of the class of words constructed by this defining word. Examples include:

VARIABLE
Names an uninitialized, one-cell memory location. Instance behavior of a VARIABLE returns its address on the stack.

CONSTANT
Names a value (specified as an argument to CONSTANT). Instance behavior returns the value.

CREATE
Names a location; space may be allocated at this location, or it can be set to contain a string or other initialized value. Instance behavior returns the address of the beginning of this space.


Forth also provides a facility by which a programmer can define new application-specific defining words, specifying both a custom defining behavior and instance behavior. Some examples include circular buffers, named bits on an I/O port, and automatically indexed arrays.

Data objects defined by these and similar words are global in scope. The function provided by local variables in other languages is provided by the data stack in Forth (although Forth also has real local variables). Forth programming style uses very few named data objects compared with other languages; typically such data objects are used to contain data which is used by a number of words or tasks (in a multitasked implementation).

Forth does not enforce consistency of data type usage; it is the programmer's responsibility to use appropriate operators to fetch and store values or perform other operations on data.

Programming

Words written in Forth are compiled into an executable form. The classical "indirect threaded" implementations compile lists of addresses of words to be executed in turn; many modern systems generate actual machine code (including calls to some external words and code for others expanded in place). Some systems have optimizing compilers. Generally speaking, a Forth program is saved as the memory image of the compiled program with a single command (e.g., RUN) that is executed when the compiled version is loaded.

During development, the programmer uses the interpreter to execute and test each little piece as it is developed. Most Forth programmers therefore advocate a loose top-down design, and bottom-up development with continuous testing and integration.

The top-down design is usually separation of the program into "vocabularies" that are then used as high-level sets of tools to write the final program. A well-designed Forth program reads like natural language, and implements not just a single solution, but also sets of tools to attack related problems.

Hello world

For an explanation of the tradition of programming "Hello World", see Hello world program
Hello world program
A "Hello world" program is a computer program that outputs "Hello world" on a display device. Because it is typically one of the simplest programs possible in most programming languages, it is by tradition often used to illustrate to beginners the most basic syntax of a programming language, or to...

.


One possible implementation:

: HELLO ( -- ) CR ." Hello, world!" ; HELLO
Hello, world!

The word CR (Carriage Return) causes the following output to be displayed on a new line. The parsing word ." (dot-quote) reads a double-quote delimited string and appends code to the current definition so that the parsed string will be displayed on execution. The space character separating the word ." from the string Hello, world! is not included as part of the string. It is needed so that the parser recognizes ." as a Forth word.

A standard Forth system is also an interpreter, and the same output can be obtained by typing the following code fragment into the Forth console:

CR .( Hello, world!)

.( (dot-paren) is an immediate word that parses a parenthesis-delimited string and displays it. As with the word ." the space character separating .( from Hello, world! is not part of the string.

The word CR comes before the text to print. By convention, the Forth interpreter does not start output on a new line. Also by convention, the interpreter waits for input at the end of the previous line, after an ok prompt. There is no implied 'flush-buffer' action in Forth's CR, as sometimes is in other programming languages.

Mixing compilation state and interpretation state

Here is the definition of a word EMIT-Q which when executed emits the single character Q:

: EMIT-Q 81 ( the ASCII value for the character 'Q' ) EMIT ;

This definition was written to use the ASCII
ASCII
The American Standard Code for Information Interchange is a character-encoding scheme based on the ordering of the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that use text...

 value of the Q character (81) directly. The text between the parentheses is a comment and is ignored by the compiler. The word EMIT takes a value from the data stack and displays the corresponding character.

The following redefinition of EMIT-Q uses the words [ (left-bracket), ] (right-bracket), CHAR and LITERAL to temporarily switch to interpreter state, calculate the ASCII value of the Q character, return to compilation state and append the calculated value to the current colon definition:

: EMIT-Q [ CHAR Q ] LITERAL EMIT ;

The parsing word CHAR takes a space-delimited word as parameter and places the value of its first character on the data stack. The word [CHAR] is an immediate version of CHAR. Using [CHAR], the example definition for EMIT-Q could be rewritten like this:

: EMIT-Q [CHAR] Q EMIT ; \ Emit the single character 'Q'

This definition used \ (backslash) for the describing comment.

Both CHAR and [CHAR] are predefined in ANS Forth. Using IMMEDIATE and POSTPONE, [CHAR] could have been defined like this:

: [CHAR] CHAR POSTPONE LITERAL ; IMMEDIATE

A complete RC4 cipher program

In 1987, Ron Rivest
Ron Rivest
Ronald Linn Rivest is a cryptographer. He is the Andrew and Erna Viterbi Professor of Computer Science at MIT's Department of Electrical Engineering and Computer Science and a member of MIT's Computer Science and Artificial Intelligence Laboratory...

 developed the RC4
RC4
In cryptography, RC4 is the most widely used software stream cipher and is used in popular protocols such as Secure Sockets Layer and WEP...

 cipher-system for RSA Data Security, Inc. The code is extremely simple and can be written by most programmers from the description.

We have an array of 256 bytes, all different. Every time the array is used it changes by swapping two bytes. The swaps are controlled by counters i and j, each initially 0. To get a new i, add 1. To get a new j, add the array byte at the new i. Exchange the array bytes at i and j. The code is the array byte at the sum of the array bytes at i and j. This is XORed with a byte of the plaintext to encrypt, or the ciphertext to decrypt. The array is initialized by first setting it to 0 through 255. Then step through it using i and j, getting the new j by adding to it the array byte at i and a key byte, and swapping the array bytes at i and j. Finally, i and j are set to 0. All additions are modulo 256.

The following Standard Forth version uses Core and Core Extension words only.


0 value ii 0 value jj
0 value KeyAddr 0 value KeyLen
create SArray 256 allot \ state array of 256 bytes
KeyArray KeyLen mod KeyAddr ;

get_byte + c@ ;
set_byte + c! ;
as_byte 255 and ;
reset_ij 0 TO ii 0 TO jj ;
i_update 1 + as_byte TO ii ;
j_update ii SArray get_byte + as_byte TO jj ;
swap_s_ij

jj SArray get_byte
ii SArray get_byte jj SArray set_byte
ii SArray set_byte

rc4_init ( KeyAddr KeyLen -- )

256 min TO KeyLen TO KeyAddr
256 0 DO i i SArray set_byte LOOP
reset_ij
BEGIN
ii KeyArray get_byte jj + j_update
swap_s_ij
ii 255 < WHILE
ii i_update
REPEAT
reset_ij
rc4_byte

ii i_update jj j_update
swap_s_ij
ii SArray get_byte jj SArray get_byte + as_byte SArray get_byte xor



This is one of many tests to validate the code.

hex
create AKey 61 c, 8A c, 63 c, D2 c, FB c,
test cr 0 DO rc4_byte . LOOP cr ;

AKey 5 rc4_init
2C F9 4C EE DC 5 test \ output should be: F1 38 29 C9 DE

Implementations

Because the Forth virtual machine is simple to implement and has no standard reference implementation, there are numerous implementations of the language. In addition to supporting the standard varieties of desktop computer systems (POSIX
POSIX
POSIX , an acronym for "Portable Operating System Interface", is a family of standards specified by the IEEE for maintaining compatibility between operating systems...

, Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

, Mac OS X
Mac OS X
Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

), many of these Forth systems also target a variety of embedded systems. Listed here are the some of the more prominent systems which conform to the 1994 ANS Forth standard.
  • Gforth
    Gforth
    Gforth is a project to develop an implementation for ANS Forth. It is part of the GNU Project.-Goals:Gforth's goals can be split into several subgoals:* Gforth should conform to the ANS Forth standard....

     - a portable ANS Forth implementation from the GNU Project
    GNU Project
    The GNU Project is a free software, mass collaboration project, announced on September 27, 1983, by Richard Stallman at MIT. It initiated GNU operating system development in January, 1984...

  • FORTH, Inc. - founded by the originators of Forth, sells desktop (SwiftForth) and embedded (SwiftX) ANS Forth
  • MPE Ltd. - sells highly optimized desktop (VFX) and embedded ANS Forth compilers
  • Open Firmware
    Open Firmware
    Open Firmware, or OpenBoot in Sun Microsystems parlance, is a standard defining the interfaces of a computer firmware system, formerly endorsed by the Institute of Electrical and Electronics Engineers . It originated at Sun, and has been used by Sun, Apple, IBM, and most other non-x86 PCI chipset...

     - a bootloader and BIOS
    BIOS
    In IBM PC compatible computers, the basic input/output system , also known as the System BIOS or ROM BIOS , is a de facto standard defining a firmware interface....

     standard based on ANS Forth
  • pforth
    Pforth
    Portable Forth is a portable implementation of the Forth programming language written in ANSI C. It differs from other distributions of Forth in that it strives for portability over performance....

  • Freely available implementations
  • Commercial implementations
  • Win32 Forth

There is also a more up-to-date index of Forth systems, organized by platform, also including Forth systems that do not (or not completely) conform to the ANS 94 standard.

See also

  • Joy
  • Reva Forth
    Reva Forth
    Reva Forth is a free implementation of the Forth computer language for Windows, Linux and Mac.Reva is based on RetroForth, which is modeled after CMforth, Colorforth, Eforth and Pygmy. It uses some, but not all, of Chuck Moore's newer ideas. It is currently targeted at the Intel x86 CPU family,...

  • STOIC
    STOIC
    STOIC was a variant of Forth.It started out at the MIT and Harvard Biomedical Engineering Centre in Boston, and was written in the mid 1970s by Jonathan Sachs...

  • Stack machine
    Stack machine
    A stack machine may be* A real or emulated computer that evaluates each sub-expression of a program statement via a pushdown data stack and uses a reverse Polish notation instruction set....

  • Charles H. Moore
    Charles H. Moore
    Charles H. Moore is the inventor of the Forth programming language.- Biography :In 1968, while employed at the United States National Radio Astronomy Observatory , Moore invented the initial version of the Forth language to help control radio telescopes...


External links


External links

  • [news://comp.lang.forth comp.lang.forth] - Usenet
    Usenet
    Usenet is a worldwide distributed Internet discussion system. It developed from the general purpose UUCP architecture of the same name.Duke University graduate students Tom Truscott and Jim Ellis conceived the idea in 1979 and it was established in 1980...

     newsgroup with active Forth discussion
  • Forth Chips Page — Forth in hardware
  • Jupiter Ace Archive — Forth computer from 1983
  • fignition — Forth in DIY hardware
  • A Beginner's Guide to Forth by J.V. Noble
  • Forth Links
  • J2EE Forth — a Forth implementation in Java
  • Various Forth variants and ANSI docs
  • Thinking Forth Project includes the seminal (and previously out of print) book Thinking Forth by Leo Brodie published in 1984, now available both as a PDF and in hardcopy as a reprint, with some revisions to ensure current compatibility.
  • Starting Forth author Leo Brodie's homepage.
  • Computerworld Interview with Charles H. Moore on Forth
  • FORTH Retro Podcast in two parts (part I, part II)
  • gforth homepage. GNU version of forth.
  • Reva Forth homepage. x86 Forth for GNU/Linux, Mac(intel) and windows
  • Mitch Bradley's Forth Lessons (and Open Firmware lessons) supporting the OLPC XO-1
    OLPC XO-1
    The XO-1, previously known as the $100 Laptop, Children's Machine, and 2B1, is an inexpensive subnotebook computer intended to be distributed to children in developing countries around the world, to provide them with access to knowledge, and opportunities to "explore, experiment and express...

    project.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK