Metasyntax
A metasyntax describes the allowable structure and composition of phrases and sentences of a metalanguage, which is used to describe either a natural language or a computer programming language. Some of the widely used formal metalanguages for computer languages are Backus–Naur Form (BNF), Extended Backus–Naur Form (EBNF), Wirth syntax notation (WSN), and Augmented Backus–Naur Form (ABNF).

Each of these metalanguages has its own metasyntax, composed of terminals, nonterminals, and metasymbols. A terminal symbol, such as a word or a token, is a stand-alone structure in the language being defined. A nonterminal symbol represents a syntactic category, which defines one or more valid phrase or sentence structures, each consisting of an n-element subset. Metasymbols provide syntactic information for denotational purposes in a given metasyntax.

Terminals, nonterminals, and metasymbols do not apply across all metalanguages. Typically, the metalanguage for token-level languages (formally called “regular languages”) does not have nonterminals, because nesting is not an issue in regular languages. English, as a metalanguage for describing certain languages, does not contain metasymbols, since all explanation can be done using English expressions. Only certain formal metalanguages, those used for describing recursive languages (formally called context-free languages), have terminals, nonterminals, and metasymbols in their metasyntax.

Elements of metasyntax

  • Terminals: stand-alone syntactic structures. Terminals can be denoted by double quoting their names.
e.g. “else”, “if”, “then”, “while”
  • Nonterminals: symbolic representations, each defining a set of allowable syntactic structures composed of a subset of elements. Nonterminals can be denoted by angle bracketing their names.
eg. , ,

  • Metasymbols: symbolic representations denoting syntactic information.
eg. := , |, {}, , [], *
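The three kinds of element can be told apart mechanically. The following is a minimal sketch in Python, not a definitive tokenizer: it assumes double-quoted terminals, angle-bracketed nonterminals, and a small illustrative set of metasymbols that includes the BNF definition symbol "::=". The function name classify and the rule used in the example are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical token patterns for a BNF-style rule: "quoted" terminals,
# <bracketed> nonterminals, and an illustrative set of metasymbols.
TOKEN_PATTERN = re.compile(
    r'''
    (?P<terminal>"[^"]*")               # a double-quoted word
  | (?P<nonterminal><[^<>\s]+>)         # an angle-bracketed name
  | (?P<metasymbol>::=|[|{}\[\]()*])    # definition, alternation, brackets
  | (?P<space>\s+)                      # separators, not reported
    ''',
    re.VERBOSE,
)


def classify(rule):
    """Return (kind, text) pairs for each symbol in a rule string."""
    symbols = []
    position = 0
    while position < len(rule):
        match = TOKEN_PATTERN.match(rule, position)
        if match is None:
            raise ValueError(
                f"unrecognized text at offset {position}: {rule[position:]!r}"
            )
        if match.lastgroup != "space":
            symbols.append((match.lastgroup, match.group()))
        position = match.end()
    return symbols


# Example with a made-up rule: one nonterminal, three metasymbol
# occurrences, and two terminals.
for kind, text in classify('<sign> ::= "+" | "-"'):
    print(kind, text)
```

Quoting terminals is what lets a character such as "|" appear both as a metasymbol and, when quoted, as a terminal of the language being defined.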

Methods of phrase termination

  • Juxtaposition: e.g. A B
  • Alternation: e.g. A|B
  • Repetition: e.g. {A B}
  • Optional phrase: e.g. [A B]
  • Grouping: e.g. (A|B)
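The five constructs listed above can be sketched as small matcher combinators in Python. This is an illustrative sketch under stated assumptions, not part of any metalanguage standard: the input is assumed to be a list of already separated tokens, each matcher returns the set of positions at which a match may end, and all function names (terminal, sequence, alternation, optional, repetition, accepts) are hypothetical. Grouping needs no function of its own, because nesting the calls plays the role of the parentheses.

```python
def terminal(word):
    """Match a single token equal to word."""
    def match(tokens, start):
        if start < len(tokens) and tokens[start] == word:
            return {start + 1}
        return set()
    return match


def sequence(*parts):
    """Juxtaposition, A B: every part must match in order."""
    def match(tokens, start):
        positions = {start}
        for part in parts:
            positions = {end for p in positions for end in part(tokens, p)}
        return positions
    return match


def alternation(*choices):
    """Alternation, A|B: any one of the choices may match."""
    def match(tokens, start):
        return {end for choice in choices for end in choice(tokens, start)}
    return match


def optional(part):
    """Optional phrase, [A]: the part may match or be skipped."""
    def match(tokens, start):
        return {start} | part(tokens, start)
    return match


def repetition(part):
    """Repetition, {A}: the part may match zero or more times."""
    def match(tokens, start):
        reached = {start}
        frontier = {start}
        while frontier:
            frontier = {end for p in frontier for end in part(tokens, p)} - reached
            reached |= frontier
        return reached
    return match


def accepts(phrase, tokens):
    """True when the phrase matches the whole token list."""
    return len(tokens) in phrase(tokens, 0)


A, B = terminal("A"), terminal("B")
print(accepts(sequence(A, B), ["A", "B"]))                        # True
print(accepts(repetition(sequence(A, B)), ["A", "B", "A", "B"]))  # True
print(accepts(sequence(alternation(A, B), A), ["B", "A"]))        # True, grouping (A|B) A
print(accepts(optional(sequence(A, B)), ["A"]))                   # False
```

Returning a set of end positions, instead of a single one, keeps alternation and repetition correct when more than one way of matching exists.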

The standard convention

  • Backus–Naur Form denotes nonterminal symbols by angle bracketing the name of the syntactic category, while it denotes terminal symbols by double quoting the terminal words. Terminals can never appear on the left-hand side of the metasymbol "::=" in a derivation rule. The body of the definition on the right-hand side may be composed of several alternative forms, with each alternative syntactic construct separated by the metasymbol "|". Each of these alternative constructs may be either terminal or nonterminal.

  • Extended Backus–Naur Form uses all facilities in BNF and introduces two more metasymbols for additional features. The first denotes an optional phrase in a statement by square bracketing the phrase. The second denotes a phrase that is to be repeated zero or more times by curly bracketing the phrase.

  • Wirth syntax notation uses all facilities in EBNF, except that nonterminals are not necessarily angle bracketed; the definition of a nonterminal always appears on the right-hand side of "=" in its production rule. It also does not require every nonterminal to be explicitly defined: some nonterminals, such as those standing for an ASCII character and for optional white space, are implicitly defined.

  • Augmented Backus–Naur Form denotes nonterminal symbols by a one-word name that starts with a letter and serves as the name of the syntactic category. Angle brackets are not required. Terminal symbols are denoted either by double-quoted words or by the following numeric structure: a "%", followed by "b", "x", or "d", followed by a numeric value or a concatenation of numeric values separated by ".". The metasymbol "-" is placed between two numeric values to denote a value range. As in BNF, the terminals of ABNF never occur on the left-hand side of the metasymbol "=" in a derivation rule. The metasymbol "/" denotes alternation. White space separates the elements in the body of a definition. The metasyntax for repetition in ABNF has several forms. A "*" preceding an element denotes that the element is repeated zero or more times. A numeric value followed by "*", followed by a second numeric value, followed by an element denotes that the element is repeated at least the first value and at most the second value times. A single numeric value preceding an element denotes that the element is repeated exactly that many times. Comments may be expressed after the metasymbol ";". As in EBNF, square bracketing a phrase denotes that the phrase is optional.
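The ABNF numeric terminals and repetition prefixes described in the last item can be decoded with a short Python sketch. It is a simplified illustration, not a complete ABNF reader: the function names decode_numeric_terminal and parse_repetition are hypothetical, only lowercase "b", "d", and "x" are accepted, and element names are assumed to start with a letter.

```python
import re

# Number base selected by the letter that follows "%".
BASES = {"b": 2, "d": 10, "x": 16}


def decode_numeric_terminal(text):
    """Decode a numeric terminal such as %x41, %d13.10 or %x30-39.

    Returns a string for a single value or a "."-separated concatenation,
    and a (low, high) pair of character codes for a "-" value range.
    """
    match = re.fullmatch(r"%([bdx])([0-9A-Fa-f.\-]+)", text)
    if match is None:
        raise ValueError(f"not a numeric terminal: {text!r}")
    base = BASES[match.group(1)]
    body = match.group(2)
    if "-" in body:
        low, high = (int(part, base) for part in body.split("-"))
        return (low, high)
    return "".join(chr(int(part, base)) for part in body.split("."))


def parse_repetition(element):
    """Split a repetition prefix from an element: "*A", "2*4A", "3A", "A".

    Returns (minimum, maximum, name); maximum is None when unbounded.
    """
    match = re.fullmatch(r"(\d*)(\*?)(\d*)([A-Za-z].*)", element)
    if match is None:
        raise ValueError(f"not a repeated element: {element!r}")
    low_text, star, high_text, name = match.groups()
    if star:
        minimum = int(low_text) if low_text else 0
        maximum = int(high_text) if high_text else None
    elif low_text:
        minimum = maximum = int(low_text)   # a single value: exact count
    else:
        minimum = maximum = 1               # no prefix: exactly once
    return (minimum, maximum, name)


print(repr(decode_numeric_terminal("%x41")))     # 'A'
print(repr(decode_numeric_terminal("%d13.10")))  # '\r\n'
print(decode_numeric_terminal("%x30-39"))        # (48, 57)
print(parse_repetition("2*4DIGIT"))              # (2, 4, 'DIGIT')
print(parse_repetition("*DIGIT"))                # (0, None, 'DIGIT')
```

The element name DIGIT in the example is only a placeholder for any nonterminal name.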

Variations

The metasyntax conventions of these formal metalanguages are not yet formalized. Many metasyntactic variations or extensions exist in the reference manuals of various computer programming languages. One variation on the standard convention for denoting nonterminals and terminals is to remove metasymbols such as angle brackets and quotation marks and instead apply font types to the intended words. In Ada, for example, syntactic categories are denoted by applying a lower-case sans-serif font to the intended words or symbols. All terminal words or symbols in Ada consist of characters with code positions between 16#20# and 16#7E# (inclusive). The definition of each character set refers to the International Standard ISO/IEC 10646:2003. In C and Java, syntactic categories are denoted using an italic font, while terminal symbols are denoted by a gothic (sans-serif) font. The metasyntax of J does not apply metasymbols to describe J's syntax at all. Rather, all syntactic explanations are given in a metalanguage very similar to English called Dictionary, which is uniquely documented for J.

Advantages of the extensions

The purpose of the new extensions is to provide a simpler and unambiguous metasyntax. In terms of simplicity, BNF’s metanotation does not make the metasyntax easy to read, because the opening and closing metasymbols appear too abundantly. In terms of ambiguity, BNF’s metanotation generates unnecessary complexity when quotation marks, apostrophes, less-than signs, or greater-than signs serve as terminal symbols, which they often do. The extended metasyntax uses properties such as the case, font, and code position of characters to reduce this complexity. Moreover, some metalanguages use fonted separator categories to incorporate metasyntactic features for layout conventions, which are not formally supported by BNF.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.