Camlp4
Encyclopedia
Camlp4 is a software system for writing extensible parsers for programming languages. It provides a set of Objective Caml
Objective Caml
OCaml , originally known as Objective Caml, is the main implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996...

 libraries that are used to define grammars as well as loadable syntax extensions of such grammars. Camlp4 stands for Caml
Caml
Caml is a dialect of the ML programming language family, developed at INRIA and formerly at ENS....

 Preprocessor
Preprocessor
In computer science, a preprocessor is a program that processes its input data to produce output that is used as input to another program. The output is said to be a preprocessed form of the input data, which is often used by some subsequent programs like compilers...

 and Pretty-Printer and one of its most important applications is the definition of domain-specific extensions of the syntax of OCaml.

Camlp4 is part of the official OCaml distribution which is developed at the INRIA. Its original author is Daniel de Rauglaudre. OCaml version 3.10.0, released in May 2007, introduced a significantly modified and backwards-incompatible version of Camlp4. De Rauglaudre maintains a separate backwards-compatible version, which has been renamed Camlp5. All of the examples below are for Camlp5 or the previous version of Camlp4 (versions 3.09 and prior).

The current version of camlp4 doesn't have yet a manual. The "Camlp4 manual" is from 2003, and is not compatible with present version.

Concrete and abstract syntax

A Camlp4 preprocessor operates by loading a collection of compiled modules which define a parser as well as a pretty-printer: the parser converts an input program
Computer program
A computer program is a sequence of instructions written to perform a specified task with a computer. A computer requires programs to function, typically executing the program's instructions in a central processor. The program has an executable form that the computer can use directly to execute...

 into an internal representation. This internal representation constitutes the abstract syntax tree
Abstract syntax tree
In computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. The syntax is 'abstract' in the sense that it...

 (AST). It can be output in a binary form, e.g. it can be passed directly to one of the OCaml compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...

s, or it can be converted back into a clear text program. The notion of concrete syntax refers to the format in which the abstract syntax is represented.

For instance, the OCaml expression
Expression (programming)
An expression in a programming language is a combination of explicit values, constants, variables, operators, and functions that are interpreted according to the particular rules of precedence and of association for a particular programming language, which computes and then produces another value...

 (1 + 2) can also be written ((+) 1 2) or (((+) 1) 2). The difference is only at the level of the concrete syntax, since these three versions are equivalent representations of the same abstract syntax tree. As demonstrated by the definition of a revised syntax for OCaml, the same programming language can use different concrete syntaxes. They would all converge to an abstract syntax tree in a unique format that a compiler can handle.

The abstract syntax tree is at the center of the syntax extensions, which are in fact OCaml programs. Although the definition of grammars must be done in OCaml, the parser that is being defined or extended is not necessarily related to OCaml, in which case the syntax tree that is being manipulated is not the one of OCaml. Several libraries are provided which facilitate the specific manipulation of OCaml syntax trees.

Fields of application

Domain-specific languages are a major application of Camlp4. Since OCaml is a multi-paradigm language, with an interactive toplevel and a native code compiler, it can be used as a backend for any kind of original language. The only thing that the developer has to do is write a Camlp4 grammar which converts the domain-specific language in question into a regular OCaml program. Other target languages can also be used, such as C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

.

If the target language is OCaml, simple syntax add-ons or syntactic sugar
Syntactic sugar
Syntactic sugar is a computer science term that refers to syntax within a programming language that is designed to make things easier to read or to express....

 can be defined, in order to provide an expressivity which is not easy to achieve using the standard features of the OCaml language. A syntax extension is defined by a compiled OCaml module, which is passed to the camlp4o executable along with the program to process.

Interestingly, Camlp4 includes a domain-specific language as it provides syntax extensions which ease the development of syntax extensions.
These extensions allow a compact definition of grammars (EXTEND statements) and quotations such as <:expr< 1 + 1 >>, i.e. deconstructing and constructing abstract syntax trees in concrete syntax.

Example

The following example defines a syntax extension of OCaml. It provides a new keyword, memo, which can be used as a replacement for function and provides automatic memoization
Memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...

 of functions with pattern matching
Pattern matching
In computer science, pattern matching is the act of checking some sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact. The patterns generally have the form of either sequences or tree structures...

. Memoization
Memoization
In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs...

 consists in storing the results of previous computation
Computation
Computation is defined as any type of calculation. Also defined as use of computer technology in Information processing.Computation is a process following a well-defined model understood and expressed in an algorithm, protocol, network topology, etc...

s in a table so that the actual computation of the function for each possible argument occurs at most once.

This is pa_memo.ml, the file which defines the syntax extension:

let unique =
let n = ref 0 in
fun -> incr n; "__pa_memo" ^ string_of_int !n

EXTEND
GLOBAL: Pcaml.expr;

Pcaml.expr: LEVEL "expr1" [
[ "memo"; OPT "|"; pel = LIST1 match_case SEP "|" ->
let tbl = unique in
let x = unique in
let result = unique in
<:expr<
let $lid:tbl$ = Hashtbl.create 100 in
fun $lid:x$ ->
try Hashtbl.find $lid:tbl$ $lid:x$
with [ Not_found ->
let $lid:result$ = match $lid:x$ with [ $list:pel$ ] in
do { Hashtbl.replace $lid:tbl$ $lid:x$ $lid:result$;
$lid:result$ } ]
>> ]
];

match_case: [
[ p = Pcaml.patt; w = OPT [ "when"; e = Pcaml.expr -> e ];
"->"; e = Pcaml.expr ->
(p, w, e) ]
];
END


Example of program using this syntax extension:

let counter = ref 0 (* global counter of multiplications *)

(* factorial with memoization *)
let rec fac = memo
0 -> 1
| n when n > 0 ->
(incr counter;
n * fac (n - 1))
| _ -> invalid_arg "fac"

let run n =
let result = fac n in
let count = !counter in
Printf.printf "%i! = %i number of multiplications so far = %i\n"
n result count

let _ =
List.iter run [5; 4; 6]


The output of the program is as follows, showing that the fac function (factorial) only computes products that were not computed previously:

5! = 120 number of multiplications so far = 5
4! = 24 number of multiplications so far = 5
6! = 720 number of multiplications so far = 6

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK