Extensible programming
Extensible programming is a term used in computer science to describe a style of computer programming that focuses on mechanisms to extend the programming language, compiler, and runtime environment. Extensible programming languages, supporting this style of programming, were an active area of work in the 1960s, but the movement was marginalized in the 1970s. Extensible programming has become a topic of renewed interest in the 21st century.

Historical movement

The first paper usually associated with the extensible programming language movement is M. Douglas McIlroy's 1960 paper on macros for higher-level programming languages. Another early description of the principle of extensibility occurs in Brooker and Morris's 1960 paper on the Compiler-Compiler. The peak of the movement was marked by two academic symposia, in 1969 and 1971. By 1975, a survey article on the movement by Thomas A. Standish was essentially a post mortem. The Forth programming language was an exception, but it went essentially unnoticed.

Character of the historical movement

As typically envisioned, an extensible programming language consisted of a base language providing elementary computing facilities, and a meta-language capable of modifying the base language. A program then consisted of meta-language modifications and code in the modified base language.

The most prominent language-extension technique used in the movement was macro definition. Grammar modification was also closely associated with the movement, resulting in the eventual development of adaptive grammar formalisms. The Lisp language community remained separate from the extensible language community, apparently because, as one researcher observed,

any programming language in which programs and data are essentially interchangeable can be regarded as an extendible [sic] language. ... this can be seen very easily from the fact that Lisp has been used as an extendible language for years.


At the 1969 conference, Simula was presented as an extensible programming language.

Standish described three classes of language extension, which he called paraphrase, orthophrase, and metaphrase (paraphrase and metaphrase otherwise being translation terms).
  • Paraphrase defines a facility by showing how to exchange it for something previously defined (or to be defined). As examples, he mentions macro definitions, ordinary procedure definitions, grammatical extensions, data definitions, operator definitions, and control structure extensions.

  • Orthophrase adds features to a language that could not be achieved using the base language, such as adding an I/O system to a base language that previously had no I/O primitives. Extensions must be understood as orthophrase relative to some given base language, since a feature not defined in terms of the base language must be defined in terms of some other language. Orthophrase corresponds to the modern notion of plug-ins.

  • Metaphrase modifies the interpretation rules used for pre-existing expressions. It corresponds to the modern notion of reflection.

Death of the historical movement

Standish attributed the failure of the extensibility movement to the difficulty of programming successive extensions. An ordinary programmer might build a single shell of macros around a base language, but if a second shell of macros was to be built around that, the programmer would have to be intimately familiar with both the base language and the first shell; a third shell would require familiarity with the base and both the first and second shells; and so on. (Note that shielding the programmer from lower-level details is the intent of the abstraction movement that supplanted the extensibility movement.)

Despite the earlier presentation of Simula as extensible, Standish's 1975 survey does not seem in practice to have included the newer abstraction-based technologies (though he used a very general definition of extensibility that technically could have included them). A 1978 history of programming abstraction from the invention of the computer to the (then) present day made no mention of macros, and gave no hint that the extensible languages movement had ever occurred. Macros were tentatively admitted into the abstraction movement by the late 1980s (perhaps due to the advent of hygienic macros), under the name syntactic abstractions.

Modern movement

In the modern sense, a system that supports extensible programming will provide all of the features described below.

Extensible syntax

This simply means that the source language(s) to be compiled must not be closed, fixed, or static. It must be possible to add new keywords, concepts, and structures to the source language(s). Languages that allow the addition of constructs with user-defined syntax include Camlp4, OpenC++, Seed7, and Felix. While it is acceptable for some fundamental and intrinsic language features to be immutable, the system must not rely solely on those language features. It must be possible to add new ones.
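As a rough illustration of what "not closed" syntax can mean, the following sketch shows a toy expression language whose set of infix operators can be extended while the program is running. It is written in Python purely for illustration; the class `ExtensibleCalc` and the method `define_operator` are hypothetical names invented for this example, and the sketch does not reflect how Camlp4, OpenC++, Seed7, or Felix implement syntax extension.

```python
import re


class ExtensibleCalc:
    """Toy expression language whose infix operators can be extended by the user."""

    def __init__(self):
        # Base language: symbol -> (precedence, implementation).
        self.ops = {
            "+": (1, lambda a, b: a + b),
            "*": (2, lambda a, b: a * b),
        }

    def define_operator(self, symbol, precedence, func):
        """Meta-level facility (hypothetical): add a new infix operator."""
        self.ops[symbol] = (precedence, func)

    def tokenize(self, text):
        # The token pattern is rebuilt from the current operator table, so
        # newly defined operators are recognized. Longer symbols come first.
        # Characters that match no token (such as spaces) are skipped.
        symbols = sorted(self.ops, key=len, reverse=True)
        parts = [r"\d+", r"\(", r"\)"] + [re.escape(s) for s in symbols]
        return re.findall("|".join(parts), text)

    def evaluate(self, text):
        value, rest = self._expr(self.tokenize(text), 0)
        if rest:
            raise SyntaxError(f"unexpected token {rest[0]!r}")
        return value

    def _atom(self, tokens):
        if not tokens:
            raise SyntaxError("unexpected end of input")
        head, rest = tokens[0], tokens[1:]
        if head == "(":
            value, rest = self._expr(rest, 0)
            if not rest or rest[0] != ")":
                raise SyntaxError("missing ')'")
            return value, rest[1:]
        if head.isdigit():
            return int(head), rest
        raise SyntaxError(f"unexpected token {head!r}")

    def _expr(self, tokens, min_prec):
        # Precedence climbing over whatever operators are currently defined.
        left, tokens = self._atom(tokens)
        while tokens and tokens[0] in self.ops and self.ops[tokens[0]][0] >= min_prec:
            prec, func = self.ops[tokens[0]]
            right, tokens = self._expr(tokens[1:], prec + 1)
            left = func(left, right)
        return left, tokens


calc = ExtensibleCalc()
print(calc.evaluate("1 + 2 * 3"))        # uses only the base operators
calc.define_operator("max", 0, max)      # extend the syntax with a new keyword
print(calc.evaluate("1 + 2 max 2 * 3"))  # the new operator binds loosest
```

The base operators stay fixed here, but the language as a whole does not depend on them alone: the tokenizer and the parser both consult the operator table, which the user can grow.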

Extensible compiler

In extensible programming, a compiler is not a monolithic program that converts source code input into binary executable output. The compiler itself must be extensible to the point that it is really a collection of plugins that assist with the translation of source language input into anything. For example, an extensible compiler will support the generation of object code, code documentation, re-formatted source code, or any other desired output. The architecture of the compiler must permit its users to "get inside" the compilation process and provide alternative processing tasks at every reasonable step in the compilation process.

For just the task of translating source code into something that can be executed on a computer, an extensible compiler should:
  • use a plug-in or component architecture for nearly every aspect of its function
  • determine which language or language variant is being compiled and locate the appropriate plug-in to recognize and validate that language
  • use formal language specifications to syntactically and structurally validate arbitrary source languages
  • assist with the semantic validation of arbitrary source languages by invoking an appropriate validation plug-in
  • allow users to select from different kinds of code generators so that the resulting executable can be targeted for different processors, operating systems, virtual machines, or other execution environments
  • provide facilities for error generation and extensions to it
  • allow new kinds of nodes in the abstract syntax tree (AST)
  • allow new values in nodes of the AST
  • allow new kinds of edges between nodes
  • support the transformation of the input AST, or portions thereof, by some external "pass"
  • support the translation of the input AST, or portions thereof, into another form by some external "pass"
  • assist with the flow of information between internal and external passes as they both transform and translate the AST into new ASTs or other representations
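The requirements above can be sketched as a small compiler core that does nothing except run registered passes and hand the result to a chosen back end. This is a minimal illustration in Python under stated assumptions: `Node`, `Compiler`, `add_pass`, and `add_backend` are hypothetical names made up for the example, and a real extensible compiler would also need parsing, validation, and error-reporting plug-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Node kinds are plain strings, so a plug-in can introduce new kinds
    # without changing the compiler core.
    kind: str
    value: object = None
    children: list = field(default_factory=list)


class Compiler:
    """Hypothetical compiler core: a registry of passes and back ends."""

    def __init__(self):
        self.passes = []    # each pass maps an AST to a new AST
        self.backends = {}  # target name -> function mapping an AST to output

    def add_pass(self, fn):
        self.passes.append(fn)
        return fn

    def add_backend(self, name):
        def register(fn):
            self.backends[name] = fn
            return fn
        return register

    def compile(self, tree, target):
        for transform in self.passes:
            tree = transform(tree)
        return self.backends[target](tree)


compiler = Compiler()


@compiler.add_pass
def lower_square(node):
    """External pass: rewrite the new 'square' node kind into a 'mul' node."""
    children = [lower_square(child) for child in node.children]
    if node.kind == "square":
        return Node("mul", children=[children[0], children[0]])
    return Node(node.kind, node.value, children)


@compiler.add_backend("text")
def to_text(node):
    """Back end that emits re-formatted source text."""
    if node.kind == "num":
        return str(node.value)
    symbol = {"add": "+", "mul": "*"}[node.kind]
    left, right = (to_text(child) for child in node.children)
    return f"({left} {symbol} {right})"


@compiler.add_backend("eval")
def to_value(node):
    """Back end that computes the value of the expression."""
    if node.kind == "num":
        return node.value
    left, right = (to_value(child) for child in node.children)
    return left + right if node.kind == "add" else left * right


tree = Node("add", children=[Node("num", 1),
                             Node("square", children=[Node("num", 3)])])
print(compiler.compile(tree, "text"))
print(compiler.compile(tree, "eval"))
```

Neither back end knows about the `square` node kind; the external pass removes it before either of them runs, which is one way new node kinds and new output forms can be added independently.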

Extensible runtime

At runtime, extensible programming systems must permit languages to extend the set of operations that the system supports. For example, if the system uses a byte-code interpreter, it must allow new byte-code values to be defined. As with extensible syntax, it is acceptable for there to be some (smallish) set of fundamental or intrinsic operations that are immutable. However, it must be possible to overload or augment those intrinsic operations so that new or additional behavior can be supported.
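The byte-code case can be sketched with a stack machine whose opcode table is open to additions. This Python example is only an illustration: the `VM` class, its `define_op` method, and the opcode names are assumptions made for the sketch, not part of any particular system.

```python
class VM:
    """Hypothetical stack machine with an extensible opcode table."""

    def __init__(self):
        self.stack = []
        self.handlers = {}
        # A small set of intrinsic operations.
        self.define_op("PUSH", lambda vm, arg: vm.stack.append(arg))
        self.define_op("ADD", lambda vm, arg: vm.stack.append(vm.stack.pop() + vm.stack.pop()))

    def define_op(self, name, handler):
        """Define a new opcode, or replace the handler of an existing one."""
        self.handlers[name] = handler

    def run(self, program):
        self.stack = []
        for opcode, arg in program:
            self.handlers[opcode](self, arg)
        return self.stack[-1] if self.stack else None


vm = VM()

# A language extension defines byte codes the base machine did not have.
vm.define_op("DUP", lambda vm, arg: vm.stack.append(vm.stack[-1]))
vm.define_op("MUL", lambda vm, arg: vm.stack.append(vm.stack.pop() * vm.stack.pop()))
print(vm.run([("PUSH", 3), ("DUP", None), ("MUL", None)]))

# An intrinsic operation is augmented by wrapping its original handler.
original_add = vm.handlers["ADD"]
trace = []

def traced_add(vm, arg):
    original_add(vm, arg)
    trace.append(vm.stack[-1])  # additional behavior: record each sum

vm.define_op("ADD", traced_add)
print(vm.run([("PUSH", 1), ("PUSH", 2), ("ADD", None)]), trace)
```

In this sketch every opcode is replaceable. A system that wants some intrinsics to be immutable could refuse to redefine them in `define_op` while still allowing wrappers like `traced_add`.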

Content separated from form

Extensible programming systems should regard programs as data to be processed. Those programs should be completely devoid of any kind of formatting information. The visual display and editing of programs to users should be a translation function, supported by the extensible compiler, that translates the program data into forms more suitable for viewing or editing. Naturally, this should be a two-way translation. This is important because it must be possible to easily process extensible programs in a variety of ways. It is unacceptable for the only uses of source language input to be editing, viewing and translation to machine code. The arbitrary processing of programs is facilitated by de-coupling the source input from specifications of how it should be processed (formatted, stored, displayed, edited, etc.).
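A rough analogy for this separation can be built with Python's standard `ast` module, which turns source text into a tree that carries no layout and can render such a tree back to text (`ast.unparse` requires Python 3.9 or later). The helper names below are hypothetical, and the analogy is imperfect: Python still stores programs as formatted text, and comments are lost in the round trip.

```python
import ast


def load(source_text):
    """Translate an editable text form into program data (an AST)."""
    return ast.parse(source_text)


def render(tree):
    """Translate program data back into a text form for viewing or editing."""
    return ast.unparse(tree)


def same_program(text_a, text_b):
    """Two differently formatted texts denote the same program data."""
    return ast.dump(load(text_a)) == ast.dump(load(text_b))


def names_used(tree):
    """One example of arbitrary processing: list the identifiers in a program."""
    return sorted({node.id for node in ast.walk(tree) if isinstance(node, ast.Name)})


print(same_program("x=1+2", "x  =  ( 1 + 2 )"))
print(render(load("x=( 1+2 )")))
print(names_used(load("total = price * qty + price")))
```

Because the tree holds no formatting, the same program data can feed a pretty-printer, an analysis such as `names_used`, or a compiler without any of them having to agree on layout.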

Source language debugging support

Extensible programming systems must support the debugging of programs using the constructs of the original source language regardless of the extensions or transformation the program has undergone in order to make it executable. Most notably, it cannot be assumed that the only way to display runtime data is in structures or arrays. The debugger, or more correctly 'program inspector', must permit the display of runtime data in forms suitable to the source language. For example, if the language supports a data structure for a business process or work flow, it must be possible for the debugger to display that data structure as a fishbone chart or other form provided by a plugin.
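One way to picture such a 'program inspector' is as a registry of display plug-ins keyed by data type, with a generic fallback. The Python sketch below is an assumption-laden illustration: `Inspector`, `register`, `show`, and the `Workflow` type are hypothetical, and a simple arrow-separated line of text stands in for a graphical form such as a fishbone chart.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Stand-in for a source-language work flow data structure."""
    steps: list


class Inspector:
    """Hypothetical program inspector with pluggable display forms."""

    def __init__(self):
        self.renderers = {}  # data type -> function producing a display form

    def register(self, data_type, renderer):
        self.renderers[data_type] = renderer

    def show(self, value):
        # Prefer the most specific registered renderer, walking up the
        # class hierarchy, and fall back to the generic representation.
        for cls in type(value).__mro__:
            if cls in self.renderers:
                return self.renderers[cls](value)
        return repr(value)


inspector = Inspector()
inspector.register(Workflow, lambda flow: " -> ".join(flow.steps))

print(inspector.show(Workflow(["receive", "approve", "ship"])))
print(inspector.show(42))  # no plug-in registered, so the generic form is used
```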

See also

  • Category: Extensible syntax programming languages
  • Adaptive grammar
  • Concept programming

General

  1. Greg Wilson's Article in ACM Queue
  2. Slashdot Discussion
  3. Modern Extensible Languages - A paper from Daniel Zingaro

Tools

  1. MetaLan extensible programming compiler engine implementation
  2. XPS — eXtensible Programming System (in development)
  3. MPS — JetBrains Metaprogramming system

Programming languages

  1. xtc — eXTensible C
  2. XLR: Extensible Language and Runtime
  3. Nemerle Macros
  4. Scala is extensible
  5. Boo Syntactic Macros
  6. Stanford University Intermediate Format compiler
  7. Seed7 - The extensible programming language
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 