Forth


Forth is a structured, imperative, stack-based computer programming language and programming environment. Forth is sometimes spelled in all capital letters, following the customary usage during its earlier years, although the name is not an acronym.

A procedural, stack-oriented and reflective programming language without type checking, Forth features both interactive execution of commands (making it suitable as a shell for systems that lack a more formal operating system) and the ability to compile sequences of commands for later execution. Some Forth implementations (usually early versions or those written to be extremely portable) compile threaded code, but many implementations today generate optimized machine code like other language compilers.

Although not as popular as other programming systems, Forth has enough support to keep several language vendors and contractors in business. Forth is currently used in boot loaders such as Open Firmware, space applications, and other embedded systems. An implementation of Forth by the GNU Project is actively maintained, with the last release in November 2008. The 1994 standard is currently undergoing revision, provisionally titled Forth 200x.

Overview


A Forth environment combines the compiler with an interactive shell. The user interactively defines and runs subroutines, or "words," in a virtual machine similar to the runtime environment. Words can be tested, redefined, and debugged as the source is entered without recompiling or restarting the whole program. All syntactic elements, including variables and basic operators, appear as such procedures. Even if a particular word is optimized so as not to require a subroutine call, it is also still available as a subroutine. On the other hand, the shell may compile interactively typed commands into machine code before running them. (This behavior is common, but not required.) Forth environments vary in how the resulting program is stored, but ideally running the program has the same effect as manually re-entering the source. This contrasts with the combination of C with Unix shells, wherein compiled functions are a special class of program objects and interactive commands are strictly interpreted. Most of Forth's unique properties result from this principle.

By including interaction, scripting, and compilation, Forth was popular on computers with limited resources, such as the BBC Micro and Apple II series, and remains so in applications such as firmware and small microcontrollers. Where C compilers may now generate code with more compactness and performance, Forth retains the advantage of interactivity.
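
As a rough illustration of this interactive cycle, the session below defines a word, tries it immediately, and then builds a second word on top of it, all without a separate compile or restart step. The names SQUARE and CUBE are made up for this sketch, and the comments show what a typical system would print.

```forth
\ Define a word and test it straight away.
: SQUARE ( n -- n*n )  DUP * ;
7 SQUARE .            \ prints 49

\ Build a new word on the one just tested, then try that too.
: CUBE ( n -- n*n*n )  DUP SQUARE * ;
3 CUBE .              \ prints 27
```

Typing the same lines from a source file should have the same effect as entering them by hand, which is the property the overview describes.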

The stacks


Every programming environment with subroutines implements a stack for control flow. This structure typically also stores local variables, including subroutine parameters (in a call by value system such as C). Forth often does not have local variables, however, nor is it call-by-value. Instead, intermediate values are kept in a second stack. Words operate directly on the topmost values in this stack. It may therefore be called the "parameter" or "data" stack, but most often simply "the" stack. The function-call stack is then called the "linkage" or "return" stack, abbreviated rstack. Special rstack manipulation functions provided by the kernel allow it to be used for temporary storage within a word, but otherwise it cannot be used to pass parameters or manipulate data.
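
A minimal sketch of using the return stack for temporary storage inside a word, assuming the standard words >R and R>. The word SUM-TIMES is a hypothetical name chosen for this example.

```forth
\ Hypothetical word: add n1 and n2, then multiply the sum by n3.
\ n3 is parked on the return stack while the addition happens.
: SUM-TIMES ( n1 n2 n3 -- [n1+n2]*n3 )
  >R        \ move n3 from the data stack to the return stack
  +         \ n1+n2 is now on top of the data stack
  R>        \ bring n3 back to the data stack
  * ;
2 3 4 SUM-TIMES .    \ prints 20
```

Every >R must be matched by an R> before the word returns, because the return stack also holds the address the word returns to.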

Most words are specified in terms of their effect on the stack. Typically, parameters are placed on the top of the stack before the word executes. After execution, the parameters have been erased and replaced with any return values. For arithmetic operators, this follows the rule of reverse Polish notation. See below for examples illustrating stack usage.
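
The stack effect of a word is conventionally written as a comment of the form ( inputs -- outputs ), with the top of the stack on the right. The two words below are hypothetical examples written for this sketch, not part of any standard.

```forth
\ Both inputs are consumed and replaced by a single result.
: AVERAGE      ( a b -- avg )  + 2 / ;
: DIFF-SQUARED ( a b -- n )    - DUP * ;

10 20 AVERAGE .        \ prints 15
7 4 DIFF-SQUARED .     \ prints 9
```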

Maintenance


Forth is a simple yet extensible language; its modularity and extensibility permit the writing of high-level programs such as CAD systems. However, extensibility also helps poor programmers to write incomprehensible code, which has given Forth a reputation as a "write-only language". Forth has been used successfully in large, complex projects, while applications developed by competent, disciplined professionals have proven to be easily maintained on evolving hardware platforms over decades of use. Forth has a niche in both astronomical and space applications. Forth is still used today in many embedded systems (small computerized devices) because of its portability, efficient memory use, short development time, and fast execution speed. It has been implemented efficiently on modern RISC processors, and processors that use Forth as machine language have been produced. Other uses of Forth include the Open Firmware boot ROMs used by Apple, IBM, Sun, and OLPC XO-1, and the Forth-based first-stage boot controller of the FreeBSD operating system.

History

Forth evolved from Charles H. Moore's personal programming system, which had been in continuous development since 1958. Forth was first exposed to other programmers in the early 1970s, starting with Elizabeth Rather at the US National Radio Astronomy Observatory (NRAO). After their work at NRAO, Charles Moore and Elizabeth Rather formed FORTH, Inc. in 1973, refining and porting Forth systems to dozens of other platforms in the next decade.

Forth is so named because in 1968 "[t]he file holding the interpreter was labeled FOURTH, for 4th (next) generation software, but the IBM 1130 operating system restricted file names to 5 characters." Moore saw Forth as a successor to compile-link-go third-generation programming languages, or software for "fourth generation" hardware, not a fourth-generation programming language as the term has come to be used.

Because Charles Moore had frequently moved from job to job over his career, an early pressure on the developing language was ease of porting to different computer architectures. A Forth system has often been used to bring up new hardware. For example, Forth was the first resident software on the new Intel 8086 chip in 1978, and MacFORTH was the first resident development system for the first Apple Macintosh in 1984.

FORTH, Inc.'s microFORTH was developed for the Intel 8080, Motorola 6800, and Zilog Z80 microprocessors starting in 1976. MicroFORTH was later used by hobbyists to generate Forth systems for other architectures, such as the 6502 in 1978. Wide dissemination finally led to standardization of the language. Common practice was codified in the de facto standards FORTH-79 and FORTH-83 in the years 1979 and 1983, respectively. These standards were unified by ANSI in 1994 in a standard commonly referred to as ANS Forth.

Forth became very popular in the 1980s because it was well suited to the small microcomputers of that time, as it is compact and portable. At least one home computer, the British Jupiter ACE, had Forth in its ROM-resident operating system. The Canon Cat also used Forth for its system programming. Rockwell also produced single-chip microcomputers with resident Forth kernels, the R65F11 and R65F12.

Programmer's perspective

Forth relies heavily on explicit use of a data stack and reverse Polish notation (RPN or postfix notation), commonly used in calculators from Hewlett-Packard. In RPN, the operator is placed after its operands, as opposed to the more common infix notation where the operator is placed between its operands. Postfix notation makes the language easier to parse and extend; Forth does not use a BNF grammar, and does not have a monolithic compiler. Extending the compiler only requires writing a new word, instead of modifying a grammar and changing the underlying implementation.
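
To make the point about extending the compiler concrete, here is a minimal sketch that adds a new control-flow word using only the standard words POSTPONE and IMMEDIATE. UNLESS and SAFE-DIV are hypothetical names invented for this example; neither is a standard word.

```forth
\ UNLESS is IMMEDIATE, so it runs while another word is being compiled
\ and compiles "0= IF" into that word. It pairs with ELSE and THEN.
: UNLESS ( compile-time: -- orig )  POSTPONE 0=  POSTPONE IF ; IMMEDIATE

\ Hypothetical use: divide, but return 0 when the divisor is 0.
: SAFE-DIV ( n d -- q )
  DUP UNLESS  2DROP 0  ELSE  /  THEN ;

12 4 SAFE-DIV .    \ prints 3
12 0 SAFE-DIV .    \ prints 0
```

No grammar or compiler source was touched: the new construct is just another dictionary entry.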

Using RPN, one could get the result of the mathematical expression (25 * 10 + 50) this way:

25 10 * 50 + . 300 ok

This command line first puts the numbers 25 and 10 on the implied stack, which then holds 25 10 (top of the stack on the right).

The word * multiplies the two numbers on the top of the stack and replaces them with their product, leaving 250.

Then the number 50 is placed on the stack, which now holds 250 50.

The word + adds it to the previous product, leaving 300. Finally, the . command prints the result to the user's terminal.

Even Forth's structural features are stack-based. For example:

: FLOOR5 ( n -- n' ) DUP 6 < IF DROP 5 ELSE 1 - THEN ;

This code defines a new word (again, 'word' is the term used for a subroutine) called FLOOR5 using the following commands: DUP duplicates the number on the stack; 6 places a 6 on top of the stack; < compares the top two numbers on the stack (the duplicated input and 6) and replaces them with a true-or-false value; IF takes a true-or-false value and chooses to execute the commands immediately after it or to skip to the ELSE; DROP discards the value on the stack, and 5 puts a 5 in its place; after the ELSE, 1 - subtracts one from the input instead; and THEN ends the conditional. The text in parentheses is a comment, advising that this word expects a number on the stack and will return a possibly changed number. The FLOOR5 word is equivalent to this function written in the C programming language:

int floor5(int v) { return (v < 6) ? 5 : (v - 1); }

This function is written more succinctly as:

: FLOOR5 ( n -- n' ) 1- 5 MAX ;

You would run this word as follows:

1 FLOOR5 . 5 ok
8 FLOOR5 . 7 ok

First the interpreter pushes a number (1 or 8) onto the stack, then it calls FLOOR5, which pops off this number again and pushes the result. Finally, a call to "." pops the result and prints it to the user's terminal.

Facilities


Forth parsing is simple, as it has no explicit grammar. The interpreter reads a line of input from the user input device, which is then parsed for a word using spaces as a delimiter; some systems recognise additional whitespace characters. When the interpreter finds a word, it tries to look the word up in the dictionary. If the word is found, the interpreter executes the code associated with the word, and then returns to parse the rest of the input stream. If the word isn't found, it is assumed to be a number, and an attempt is made to convert it into a number and push it on the stack; if successful, the interpreter continues parsing the input stream. Otherwise, if both the lookup and the number conversion fail, the interpreter prints the word followed by an error message indicating the word is not recognised, flushes the input stream, and waits for new user input.
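
The lookup-then-convert rule described above can be sketched for a single token using the standard words WORD, FIND and >NUMBER. TRY-TOKEN is a hypothetical name, and this is not how any particular system implements its interpreter: it handles only unsigned numbers in the current base, and it prints a short notice instead of flushing the input stream on failure.

```forth
\ Hypothetical sketch: interpret the next space-delimited token.
: TRY-TOKEN ( "token" -- i*x )
  BL WORD DUP FIND               \ c-addr c-addr 0  |  c-addr xt flag
  IF  NIP EXECUTE EXIT  THEN     \ found in the dictionary: run it
  DROP DUP >R                    \ keep the token for the error message
  0 0 ROT COUNT >NUMBER  NIP     \ ud u'   u' = characters left unconverted
  IF    2DROP R> COUNT TYPE ."  ?"   \ neither a word nor a number
  ELSE  DROP R> DROP                 \ keep the low cell as the number
  THEN ;

TRY-TOKEN 42 .          \ not in the dictionary, converts as a number; prints 42
5 TRY-TOKEN DUP * .     \ DUP is found and executed; prints 25
TRY-TOKEN NO-SUCH       \ reports the unknown token
```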

The definition of a new word is started with the word : (colon) and ends with the word ; (semi-colon). For example

: X DUP 1+ . . ;

will compile the word X and make the name findable in the dictionary. When executed by typing 10 X at the console, this will print 11 10.

Most Forth systems include a specialized assembler that produces executable words. The assembler is a special dialect of the compiler. Forth assemblers often use a reverse-Polish syntax in which the parameters of an instruction precede the instruction. The usual design of a Forth assembler is to construct the instruction on the stack, then copy it into memory as the last step. Registers may be referenced by the name used by the manufacturer, numbered (0..n, as used in the actual operation code) or named for their purpose in the Forth system: e.g. "S" for the register used as a stack pointer.

Operating system, files and multitasking


Classic Forth systems traditionally use neither operating system nor file system. Instead of storing code in files, source code is stored in disk blocks written to physical disk addresses. The word BLOCK is employed to translate the number of a 1K-sized block of disk space into the address of a buffer containing the data, which is managed automatically by the Forth system. Some implement contiguous disk files using the system's disk access, where the files are located at fixed disk block ranges. Usually these are implemented as fixed-length binary records, with an integer number of records per disk block. Quick searching is achieved by hashed access on key data.
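
A minimal sketch of reading from a block, assuming the system provides the optional Block word set and has some block storage attached. .LINE is a hypothetical helper, and the layout of 16 lines of 64 characters per block is the traditional convention for source blocks rather than something BLOCK itself requires.

```forth
\ Hypothetical helper: show one 64-character line (0..15) of block blk.
: .LINE ( line blk -- )
  BLOCK            \ blk -> address of the 1024-byte buffer holding it
  SWAP 64 * +      \ step forward to the requested line
  64 TYPE ;        \ display that line
```

On such a system, 0 1 .LINE would show the first line of block 1. The buffer address returned by BLOCK is only valid until the system reuses the buffer, so the word uses it immediately.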

Multitasking, most commonly cooperative round-robin scheduling, is normally available (although multitasking words and support are not covered by the ANSI Forth Standard). The word PAUSE is used to save the current task's execution context, to locate the next task, and restore its execution context. Each task has its own stacks, private copies of some control variables and a scratch area. Swapping tasks is simple and efficient; as a result, Forth multitaskers are available even on very simple microcontrollers such as the Intel 8051, Atmel AVR, and TI MSP430.

By contrast, some Forth systems run under a host operating system such as Microsoft Windows, Linux or a version of Unix and use the host operating system's file system for source and data files; the ANSI Forth Standard describes the words used for I/O. Other non-standard facilities include a mechanism for issuing calls to the host OS or windowing systems, and many provide extensions that employ the scheduling provided by the operating system. Typically they have a larger and different set of words from the stand-alone Forth's PAUSE word for task creation, suspension, destruction and modification of priority.

Self compilation and cross compilation


A full-featured Forth system with all source code will compile itself, a technique commonly called meta-compilation by Forth programmers (although the term doesn't exactly match meta-compilation as it is normally defined). The usual method is to redefine the handful of words that place compiled bits into memory. The compiler's words use specially-named versions of fetch and store that can be redirected to a buffer area in memory. The buffer area simulates or accesses a memory area beginning at a different address than the code buffer. Such compilers define words to access both the target computer's memory, and the host (compiling) computer's memory.
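
The redirected fetch and store can be sketched as follows. This is only an illustration under stated assumptions: the names TARGET-ORIGIN, TARGET-IMAGE, T>HOST, TC! and TC@ are hypothetical, the target's code space is assumed to start at address 4096, and the image is kept deliberately small.

```forth
\ Hypothetical names; a sketch of byte fetch and store redirected
\ into a host buffer that stands in for the target's memory.
4096 CONSTANT TARGET-ORIGIN        \ assumed start of the target's code space
CREATE TARGET-IMAGE 256 ALLOT      \ host buffer simulating target memory

: T>HOST ( taddr -- addr )  TARGET-ORIGIN - TARGET-IMAGE + ;
: TC! ( char taddr -- )  T>HOST C! ;   \ store a byte at a target address
: TC@ ( taddr -- char )  T>HOST C@ ;   \ fetch a byte from a target address
```

For example, 65 4112 TC! followed by 4112 TC@ leaves 65 on the stack, even though the byte physically lives 16 bytes into TARGET-IMAGE on the host. No range checking is done here; a real cross compiler would need it.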

After the fetch and store operations are redefined for the code space, the compiler, assembler, etc. are recompiled using the new definitions of fetch and store. This effectively reuses all the code of the compiler and interpreter. Then, the Forth system's code is compiled, but this version is stored in the buffer. The buffer in memory is written to disk, and ways are provided to load it temporarily into memory for testing. When the new version appears to work, it is written over the previous version.

There are numerous variations of such compilers for different environments. For embedded systems, the code may instead be written to another computer, a technique known as cross compilation, over a serial port or even a single TTL bit, while keeping the word names and other non-executing parts of the dictionary in the original compiling computer. The minimum definitions for such a Forth compiler are the words that fetch and store a byte, and the word that commands a Forth word to be executed. Often the most time-consuming part of writing a remote port is constructing the initial program to implement fetch, store and execute, but many modern microprocessors have integrated debugging features (such as the Motorola CPU32) that eliminate this task.

Structure of the language

The basic data structure of Forth is the "dictionary" which maps "words" to executable code or named data structures. The dictionary is laid out in memory as a tree of linked lists, with the links proceeding from the latest (most recently) defined word to the oldest, until a sentinel, usually a NULL pointer, is found. A context switch causes a list search to start at a different leaf, and the linked list search continues as the branch merges into the main trunk, leading eventually back to the sentinel, the root. (In rare cases, such as meta-compilation, a dictionary might be isolated, so that there are several.) The effect is a sophisticated use of namespaces that, critically, can overload keywords: the meaning of a word is contextual.

A defined word generally consists of head and body with the head consisting of the name field (NF) and the link field (LF) and body consisting of the code field (CF) and the parameter field (PF).

Head and body of a dictionary entry are treated separately because they may not be contiguous. For example, when a Forth program is recompiled for a new platform, the head may remain on the compiling computer, while the body goes to the new platform. In some environments (such as embedded systems) the heads occupy memory unnecessarily. However, some cross-compilers may put heads in the target if the target itself is expected to support an interactive Forth.

Dictionary entry

The exact format of a dictionary entry is not prescribed, and implementations vary. However, certain components are almost always present, though the exact size and order may vary. Described as a structure, a dictionary entry might look this way:

structure
  byte:       flag           \ 3bit flags + length of word's name
  char-array: name           \ name's runtime length isn't known at compile time
  address:    previous       \ link field, backward ptr to previous word
  address:    codeword       \ ptr to the code to execute this word
  any-array:  parameterfield \ unknown length of data, words, or opcodes
end-structure forthword

The name field starts with a prefix giving the length of the word's name (typically up to 32 bytes), and several bits for flags. The character representation of the word's name then follows the prefix. Depending on the particular implementation of Forth, there may be one or more NUL ('\0') bytes for alignment.

The link field contains a pointer to the previously defined word. The pointer may be a relative displacement or an absolute address that points to the next oldest sibling.

The code field pointer will be either the address of the word which will execute the code or data in the parameter field, or the beginning of machine code that the processor will execute directly. For colon-defined words, the code field pointer points to the word that will save the current Forth instruction pointer (IP) on the return stack, and load the IP with the new address from which to continue execution of words. This is the same as what a processor's call/return instructions do.

Structure of the compiler

The compiler itself consists of Forth words visible to the system, not a monolithic program. This allows a programmer to change the compiler's words for special purposes.

The "compile time" flag in the name field is set for words with "compile time" behavior. Most simple words execute the same code whether they are typed on a command line, or embedded in code. When compiling these, the compiler simply places code or a threaded pointer to the word.

The classic examples of compile-time words are the control structures such as IF and WHILE. All of Forth's control structures, and almost all of its compiler, are implemented as compile-time words. All of Forth's control flow words are executed during compilation to compile various combinations of the primitive words BRANCH and ?BRANCH (branch if false). During compilation, the data stack is used to support control structure balancing, nesting, and backpatching of branch addresses. The snippet:

... DUP 6 < IF DROP 5 ELSE 1 - THEN ...

would be compiled to the following sequence inside of a definition:

... DUP LIT 6 < ?BRANCH 5 DROP LIT 5 BRANCH 3 LIT 1 - ...

The numbers after BRANCH represent relative jump addresses. LIT is the primitive word for pushing a "literal" number onto the data stack.
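
To see what that snippet does at run time, it can be wrapped in a complete definition. The word name ADJUST is a hypothetical choice for illustration:

```forth
: ADJUST ( n -- n' )
  DUP 6 < IF    \ is n below 6?
    DROP 5      \ yes: discard n and leave 5
  ELSE
    1 -         \ no: leave n-1
  THEN ;
```

Typing 3 ADJUST . displays 5, and 10 ADJUST . displays 9. The IF, ELSE and THEN in the source leave no trace in the compiled definition other than the two branches and their offsets.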

Compilation state and interpretation state
The word : (colon) parses a name as a parameter, creates a dictionary entry (a colon definition) and enters compilation state. The interpreter continues to read space-delimited words from the user input device. If a word is found, the interpreter executes the compilation semantics associated with the word, instead of the interpretation semantics. The default compilation semantics of a word are to append its interpretation semantics to the current definition.

The word ; (semi-colon) finishes the current definition and returns to interpretation state. It is an example of a word whose compilation semantics differ from the default. The interpretation semantics of ; (semi-colon), most control flow words, and several other words are undefined in ANS Forth, meaning that they must only be used inside of definitions and not on the interactive command line.

The interpreter state can be changed manually with the words [ (left-bracket) and ] (right-bracket) which enter interpretation state or compilation state, respectively. These words can be used with the word LITERAL to calculate a value during a compilation and to insert the calculated value into the current colon definition. LITERAL has the compilation semantics to take an object from the data stack and to append semantics to the current colon definition to place that object on the data stack.
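
A small sketch of this technique, using a hypothetical word SECONDS/HOUR: the multiplication is carried out once, while the definition is being compiled, and only the result is stored in the definition.

```forth
: SECONDS/HOUR ( -- n )
  [ 60 60 * ]     \ interpreted during compilation; leaves 3600 on the stack
  LITERAL ;       \ compiles that 3600 into the definition as a literal
```

At run time SECONDS/HOUR simply pushes 3600; no multiplication is performed.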

In ANS Forth, the current state of the interpreter can be read from the flag STATE which contains the value true when in compilation state and false otherwise. This allows the implementation of so-called state-smart words with behavior that changes according to the current state of the interpreter.
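
A minimal sketch of a state-smart word follows. The names ANSWER and USE-ANSWER are hypothetical. When interpreted, ANSWER leaves 42 on the stack; when met during compilation, it compiles 42 into the current definition instead.

```forth
: ANSWER ( -- 42 )          \ when compiling: ( -- ) and 42 is compiled
  42
  STATE @ IF                \ non-zero means compilation state
    POSTPONE LITERAL        \ compile the 42 as a literal
  THEN ; IMMEDIATE

: USE-ANSWER ( -- n )  ANSWER 1+ ;   \ ANSWER runs here, at compile time
```

Typing ANSWER . at the console displays 42, and USE-ANSWER . displays 43. State-smart words of this kind can behave unexpectedly when combined with POSTPONE or ' (tick), so this is an illustration of the mechanism rather than a recommended style.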

Immediate words
The word IMMEDIATE marks the most recent colon definition as an immediate word, effectively replacing its compilation semantics with its interpretation semantics. Immediate words are normally executed during compilation rather than compiled, but this can be overridden by the programmer, in either state. ; is an example of an immediate word. In ANS Forth, the word POSTPONE takes a name as a parameter and appends the compilation semantics of the named word to the current definition even if the word was marked immediate. Forth-83 defined separate words COMPILE and [COMPILE] to force the compilation of non-immediate and immediate words, respectively.
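
As an illustration, IMMEDIATE and POSTPONE together are enough to give an existing control word a second name. The sketch below defines a hypothetical ENDIF that behaves like THEN, and a hypothetical SIGNUM that uses it:

```forth
: ENDIF ( -- )  POSTPONE THEN ; IMMEDIATE   \ performs THEN's compile-time action

: SIGNUM ( n -- -1|1 )
  0< IF -1 ELSE 1 ENDIF ;                   \ ENDIF closes the IF as THEN would
```

Here -5 SIGNUM . displays -1 and 7 SIGNUM . displays 1. Without IMMEDIATE, ENDIF would be compiled into SIGNUM instead of being executed while SIGNUM is compiled, and the IF would be left unresolved.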

Unnamed words and execution tokens
In ANS Forth, unnamed words can be defined with the word :NONAME which compiles the following words up to the next ; (semi-colon) and leaves an execution token on the data stack. The execution token provides an opaque handle for the compiled semantics, similar to the function pointers of the C programming language.

Execution tokens can be stored in variables. The word EXECUTE takes an execution token from the data stack and performs the associated semantics. The word COMPILE, (compile-comma) takes an execution token from the data stack and appends the associated semantics to the current definition.

The word ' (tick) takes the name of a word as a parameter and returns the execution token associated with that word on the data stack. In interpretation state, ' RANDOM-WORD EXECUTE is equivalent to RANDOM-WORD.
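
The following sketch puts these words together. HANDLER and RUN-HANDLER are hypothetical names; the variable first holds the token of an unnamed word and is later given the token of the standard word NEGATE.

```forth
VARIABLE HANDLER                       \ holds an execution token

:NONAME ( n -- 2n )  2* ;  HANDLER !   \ store the token of an unnamed word
: RUN-HANDLER ( n -- n' )  HANDLER @ EXECUTE ;

21 RUN-HANDLER .                       \ displays 42
' NEGATE HANDLER !                     \ replace it with the token of NEGATE
21 RUN-HANDLER .                       \ displays -21
```

RUN-HANDLER itself never changes; only the token stored in HANDLER does, which is how deferred or vectored execution is commonly built.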

Parsing words and comments
The words : (colon), POSTPONE, ' (tick) and :NONAME are examples of parsing words that take their arguments from the user input device instead of the data stack. Another example is the word ( (paren) which reads and ignores the following words up to and including the next right parenthesis and is used to place comments in a colon definition. Similarly, the word \ (backslash) is used for comments that continue to the end of the current line. To be parsed correctly, ( (paren) and \ (backslash) must be separated by whitespace from the following comment text.

Structure of code

In most Forth systems, the body of a code definition consists of either machine language, or some form of threaded code. The original Forth, which follows the informal FIG (Forth Interest Group) standard, is a TIL (Threaded Interpretive Language). This is also called indirect-threaded code, but direct-threaded and subroutine-threaded Forths have also become popular in modern times. The fastest modern Forths use subroutine threading, insert simple words as macros, and perform peephole optimization or other optimizing strategies to make the code smaller and faster.

Data objects

When a word is a variable or other data object, the CF points to the runtime code associated with the defining word that created it. A defining word has a characteristic "defining behavior" (creating a dictionary entry plus possibly allocating and initializing data space) and also specifies the behavior of an instance of the class of words constructed by this defining word. Examples include:

VARIABLE
Names an uninitialized, one-cell memory location. Instance behavior of a VARIABLE returns its address on the stack.
CONSTANT
Names a value (specified as an argument to CONSTANT). Instance behavior returns the value.
CREATE
Names a location; space may be allocated at this location, or it can be set to contain a string or other initialized value. Instance behavior returns the address of the beginning of this space.
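
A short sketch of the three defining words in use. The names COUNTER, MAX-COUNT, SCORES and BUMP are hypothetical:

```forth
VARIABLE COUNTER                  \ one cell; contents undefined until stored
10 CONSTANT MAX-COUNT             \ MAX-COUNT returns 10
CREATE SCORES  3 CELLS ALLOT      \ SCORES returns the address of three cells

0 COUNTER !                       \ initialize the variable explicitly
: BUMP ( -- )  1 COUNTER +! ;     \ add one to COUNTER
BUMP BUMP  COUNTER @ .            \ displays 2
MAX-COUNT SCORES CELL+ !          \ store 10 in the second cell of SCORES
```

Note that COUNTER and SCORES both return addresses, so @ and ! are needed to reach the data, while MAX-COUNT returns its value directly.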


Forth also provides a facility by which a programmer can define new application-specific defining words, specifying both a custom defining behavior and instance behavior. Some examples include circular buffers, named bits on an I/O port, and automatically-indexed arrays.
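
In ANS Forth this facility is usually expressed with CREATE and DOES>. The sketch below defines a hypothetical defining word ARRAY for indexed arrays of cells; the part before DOES> is the defining behavior and the part after it is the instance behavior. No bounds checking is attempted.

```forth
: ARRAY ( n "name" -- )
  CREATE CELLS ALLOT          \ defining behavior: reserve n cells
  DOES> ( i -- addr )         \ instance behavior: body address is on top
    SWAP CELLS + ;            \ address of element i

5 ARRAY SAMPLES               \ SAMPLES is a new word owning five cells
42 2 SAMPLES !                \ store 42 in element 2
2 SAMPLES @ .                 \ displays 42
```

Every word created by ARRAY shares the same instance code after DOES>, and differs only in the data space allotted to it.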

Data objects defined by these and similar words are global in scope. The function provided by local variables in other languages is provided by the data stack in Forth (although Forth also has real local variables). Forth programming style uses very few named data objects compared with other languages; typically such data objects are used to contain data which is used by a number of words or tasks (in a multitasked implementation).

Forth does not enforce consistency of data type usage; it is the programmer's responsibility to use appropriate operators to fetch and store values or perform other operations on data.

Programming

Words written in Forth are compiled into an executable form. The classical "indirect threaded" implementations compile lists of addresses of words to be executed in turn; many modern systems generate actual machine code (including calls to some external words and code for others expanded in place). Some systems have optimizing compilers. Generally speaking, a Forth program is saved as the memory image of the compiled program with a single command (e.g., RUN) that is executed when the compiled version is loaded.

During development, the programmer uses the interpreter to execute and test each little piece as it is developed. Most Forth programmers therefore advocate a loose top-down design, and bottom-up development with continuous testing and integration.

The top-down design is usually separation of the program into "vocabularies" that are then used as high-level sets of tools to write the final program. A well-designed Forth program reads like natural language, and implements not just a single solution, but also sets of tools to attack related problems.

Code examples


Hello world

For an explanation of the tradition of programming "Hello World", see Hello world program.


One possible implementation:

: HELLO ( -- ) CR ." Hello, world!" ; HELLO

The word CR (Carriage Return) causes the following output to be displayed on a new line. The parsing word ." (dot-quote) reads a double-quote delimited string and appends code to the current definition so that the parsed string will be displayed on execution. The space character separating the word ." from the string Hello, world! is not included as part of the string. It is needed so that the parser recognizes ." as a Forth word.

A standard Forth system is also an interpreter, and the same output can be obtained by typing the following code fragment into the Forth console:

CR .( Hello, world!)

.( (dot-paren) is an immediate word that parses a parenthesis-delimited string and displays it. As with the word ." the space character separating .( from Hello, world! is not part of the string.

The word CR comes before the text to print. By convention, the Forth interpreter does not start output on a new line. Also by convention, the interpreter waits for input at the end of the previous line, after an ok prompt. There is no implied 'flush-buffer' action in Forth's CR, as sometimes is in other programming languages.

Mixing compilation state and interpretation state

Here is the definition of a word EMIT-Q which when executed emits the single character Q:

: EMIT-Q 81 ( the ASCII value for the character 'Q' ) EMIT ;

This definition was written to use the ASCII value of the Q character (81) directly. The text between the parentheses is a comment and is ignored by the compiler. The word EMIT takes a value from the data stack and displays the corresponding character.

The following redefinition of EMIT-Q uses the words [ (left-bracket), ] (right-bracket), CHAR and LITERAL to temporarily switch to interpreter state, calculate the ASCII value of the Q character, return to compilation state and append the calculated value to the current colon definition:

: EMIT-Q [ CHAR Q ] LITERAL EMIT ;

The parsing word CHAR takes a space-delimited word as parameter and places the value of its first character on the data stack. The word [CHAR] is an immediate version of CHAR. Using [CHAR], the example definition for EMIT-Q could be rewritten like this:

: EMIT-Q [CHAR] Q EMIT ; \ Emit the single character 'Q'

This definition used \ (backslash) for the describing comment.

Both CHAR and [CHAR] are predefined in ANS Forth. Using IMMEDIATE and POSTPONE, [CHAR] could have been defined like this:

: [CHAR] CHAR POSTPONE LITERAL ; IMMEDIATE

Implementations


Because the Forth virtual machine is simple to implement and has no standard reference implementation, there are a plethora of implementations of the language. In addition to supporting the standard varieties of desktop computer systems (POSIX, Microsoft Windows, Mac OS X), many of these Forth systems also target a variety of embedded systems. Listed here are some of the more prominent systems which conform to the 1994 ANS Forth standard.
  • - a portable ANS Forth implementation from the GNU Project
  • - founded by the originators of Forth, sells desktop (SwiftForth) and embedded (SwiftX) ANS Forth solutions
  • - sells highly-optimized desktop (VFX) and embedded ANS Forth compilers
  • Open Firmware - a bootloader and BIOS standard based on ANS Forth
  • A more up-to-date index of , organized by platform


See also


  • colorForth
  • Factor
  • FCode
  • Joy
  • STOIC


External links


  • [news://comp.lang.forth comp.lang.forth] - Usenet newsgroup with active Forth discussion
  • — Forth in hardware
  • by J.V. Noble
  • at the Open Directory Project