Hygienic macro
Hygienic macros are macros whose expansion is guaranteed not to cause collisions with existing symbol definitions. They are a feature of programming languages such as Scheme and Dylan.

The hygiene problem

In a programming language that has unhygienic macros, it is possible for existing variable bindings to be hidden from a macro by variable bindings that are created during its expansion. In C, this problem can be illustrated by the following fragment:

#include <stdio.h>

/* The macro body declares its own local variable named a. */
#define INCI(i) {int a=0; ++i;}

int main(void)
{
    int a = 0, b = 0;
    INCI(a);
    INCI(b);
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}

Running the above through the C preprocessor expands main to:


int main(void)
{
    int a = 0, b = 0;
    {int a=0; ++a;};
    {int a=0; ++b;};
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}

So the variable a declared in main is never altered by the execution of the program: the ++a inside the expanded block increments the block-local a that shadows it. The output of the compiled program shows this:

a is now 0, b is now 1

Note that some C compilers, such as gcc, have an option like -Wshadow that warns when a local declaration shadows another variable, which would have caught the above problem. The simplest and least robust solution is to give the macro's variables unique names:

#include <stdio.h>

/* The macro's temporary is renamed so that it is unlikely to clash. */
#define INCI(i) {int INCIa=0; ++i;}

int main(void)
{
    int a = 0, b = 0;
    INCI(a);
    INCI(b);
    printf("a is now %d, b is now %d\n", a, b);
    return 0;
}

As long as no variable named INCIa is ever passed to the macro, this solution produces the correct output:

a is now 1, b is now 1
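
The fragility of the renaming approach shows up as soon as a caller happens to use the reserved name. The following is a minimal sketch of that failure; the helper functions increment_ordinary and increment_clashing are hypothetical names introduced only for illustration.

```c
/* Same macro as in the text: the temporary has been renamed to INCIa. */
#define INCI(i) {int INCIa=0; ++i;}

/* Hypothetical helper: the argument's name differs from the temporary. */
int increment_ordinary(void)
{
    int a = 0;
    INCI(a);            /* expands to {int INCIa=0; ++a;}; */
    return a;           /* the caller's a was incremented */
}

/* Hypothetical helper: the caller's variable is itself named INCIa. */
int increment_clashing(void)
{
    int INCIa = 0;
    INCI(INCIa);        /* expands to {int INCIa=0; ++INCIa;}; */
    return INCIa;       /* the caller's INCIa was shadowed and is unchanged */
}
```

Here increment_ordinary returns 1 while increment_clashing returns 0, so renaming only moves the collision to a less likely name rather than removing it.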

The "hygiene problem" can extend beyond variable bindings. Consider this Common Lisp macro:

(defmacro my-unless (condition &body body)
  `(if (not ,condition)
       (progn
         ,@body)))

While this macro contains no references to variables, it assumes that the symbols "if", "not", and "progn" all have their usual definitions. Suppose, however, that the macro is used in the following code:

(flet ((not (x) x))
  (my-unless t
    (format t "This should not be printed!")))

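Because the definition of "not" has been locally altered, the message "This should not be printed!" will be printed, which is probably not the intended behavior. (Strictly speaking, ANSI Common Lisp leaves the consequences of rebinding a symbol of the COMMON-LISP package as a function undefined, so the example is illustrative.) The problem can be fixed by manually inserting the desired function object into the expansion returned by the macro: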

(defmacro my-unless (condition &body body)
  `(if (funcall ',#'not ,condition)
       (progn
         ,@body)))

However, this approach can be problematic for the code marshaling done by some compilers, and it is not possible to "protect" uses of macros in the expansion this way. (E.g., the "funcall" and the "progn" in the above can be broken too, with no way of protecting them.)
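
The same kind of capture of a free name can be reproduced with C macros, since the preprocessor resolves nothing until the expansion is compiled at the point of use. The sketch below is an analogy, not a translation of the Lisp code: the macro UNLESS and the functions logical_not, identity, unless_normal and unless_captured are hypothetical names chosen for illustration.

```c
/* Hypothetical helper that the macro author intends to call. */
static int logical_not(int x) { return !x; }

/* Stand-in for the locally altered definition in the Lisp example. */
static int identity(int x) { return x; }

/* Unhygienic: the name logical_not is looked up where the macro is used. */
#define UNLESS(cond, stmt) if (logical_not(cond)) { stmt; }

/* Normal use: the condition is true, so the body is skipped. */
int unless_normal(void)
{
    int ran = 0;
    UNLESS(1, ran = 1)
    return ran;
}

/* A local function pointer named logical_not shadows the file-scope function. */
int unless_captured(void)
{
    int (*logical_not)(int) = identity;
    int ran = 0;
    UNLESS(1, ran = 1)      /* now calls identity(1), so the body runs */
    return ran;
}
```

In this sketch unless_normal returns 0 and unless_captured returns 1: the macro's reference to logical_not was captured by the binding at the use site, just as "not" was captured by flet.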

Strategies

In some languages such as Common Lisp, Scheme and others of the Lisp language family, macros provide a powerful means of extending the language. Here the lack of hygiene in conventional macros is resolved by several strategies.
  • Obfuscation. If the programmer needs to use temporary storage during the expansion of a macro, they can use a variable with an unusual name and hope that the same name will never be used in a program that uses the macro. A programmer who knows of gensym has no reason to do this (see the next point).
  • Temporary symbol creation. In some programming languages it is possible for a new variable name, or symbol, to be generated and bound to a temporary location. The language processing system ensures that this never clashes with another name or location in the execution environment. The responsibility for choosing to use this feature within the body of a macro definition is left to the programmer. This method was used in MacLisp, where a function named "gensym" could be used to generate a new symbol name. Similar functions (usually named gensym as well) exist in many Lisp-like languages, including the widely implemented Common Lisp standard (http://www.lispworks.com/documentation/HyperSpec/Body/f_gensym.htm#gensym).
  • Hygienic transformation. The processor responsible for transforming the patterns of the input form into an output form detects symbol clashes and resolves them by temporarily changing the names of symbols. This kind of processing is supported by Scheme's "let-syntax" and "define-syntax" macro creation systems. The basic strategy is to identify bindings in the macro definition and replace those names with gensyms, and to identify free variables in the macro definition and make sure those names are looked up in the scope of the macro definition instead of the scope where the macro was used.
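
C has no gensym, but the temporary symbol strategy can be loosely approximated by pasting the standard __LINE__ macro onto a prefix, so that each use of a macro gets a temporary whose name depends on where it appears. This is only a sketch under that assumption: all macro and function names below (CONCAT, UNIQUE_NAME, SWAP_IMPL, SWAP_UNIQUE, swap_with_temp_names) are hypothetical, and unlike a real gensym the generated name could still clash with a caller's variable that happens to be called swap_tmp_ followed by the same line number.

```c
/* Two-level concatenation so that __LINE__ is expanded before pasting. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)

/* Builds an identifier such as swap_tmp_17 from the line of the macro use. */
#define UNIQUE_NAME(prefix) CONCAT(prefix, __LINE__)

/* The name of the temporary is passed in rather than fixed by the macro. */
#define SWAP_IMPL(x, y, tmp_name) \
    do { int tmp_name = (x); (x) = (y); (y) = tmp_name; } while (0)

#define SWAP_UNIQUE(x, y) SWAP_IMPL(x, y, UNIQUE_NAME(swap_tmp_))

/* The caller uses the names temp and tmp, which a fixed-name macro might capture. */
int swap_with_temp_names(void)
{
    int temp = 1, tmp = 2;
    SWAP_UNIQUE(temp, tmp);
    return temp * 10 + tmp;     /* 21 once the two values are swapped */
}
```

The responsibility still lies with the macro writer, as in the gensym approach: nothing forces a macro to use UNIQUE_NAME, and no free names in the expansion are protected. A hygienic transformation, by contrast, performs the renaming automatically.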

Syntax-rules

Syntax-rules is the standard high-level macro system of R5RS.


(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((temp a))
       (set! a b)
       (set! b temp)))))

Syntax-case

Syntax-case is a low- and high-level macro system that is part of R6RS.


(define-syntax swap!
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (syntax
        (let ((temp a))
          (set! a b)
          (set! b temp)))))))

Syntactic closures

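Syntactic closures are another type of macro system. The arguments to a macro call are closed in the environment of the macro use, so that they cannot inadvertently reference bindings introduced by the macro itself.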


(define-syntax swap!
  (sc-macro-transformer
   (lambda (form environment)
     (let ((a (close-syntax (cadr form) environment))
           (b (close-syntax (caddr form) environment)))
       `(let ((temp ,a))
          (set! ,a ,b)
          (set! ,b temp))))))

Explicit renaming

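Explicit renaming is another type of macro system. The macro writer calls a rename procedure on each identifier in the expansion that should keep the meaning it has where the macro was defined.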


(define-syntax swap!
  (er-macro-transformer
   (lambda (form rename compare)
     (let ((a (cadr form))
           (b (caddr form))
           (temp (rename 'temp)))
       `(,(rename 'let) ((,temp ,a))
          (,(rename 'set!) ,a ,b)
          (,(rename 'set!) ,b ,temp))))))

See also

  • Macros
  • Syntactic closure
  • Preprocessor

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.