Parboiled (Java)
parboiled is an open-source Java library released under an Apache License. It provides support for defining PEG (parsing expression grammar) parsers directly in Java source code.

parboiled is commonly used as an alternative to regular expressions or parser generators (such as ANTLR or JavaCC), especially in small and medium-sized applications.

Apart from providing the constructs for grammar definition, parboiled implements a complete recursive descent parser with support for abstract syntax tree construction, parse error reporting, and parse error recovery.

Example

Since parsing with parboiled does not require a separate lexing phase and there is no special syntax to learn for grammar definition, parboiled makes it comparatively easy to build custom parsers quickly.

Consider the following classic “calculator” example, with these rules in a simple pseudo-notation:

Expression ← Term (('+' / '-') Term)*
Term ← Factor (('*' / '/') Factor)*
Factor ← Number / '(' Expression ')'
Number ← [0-9]+


With parboiled this rule description can be translated directly into the following Java code:


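import org.parboiled.BaseParser;

// Rule-builder names (sequence, firstOf, ...) are kept as in the original listing;
// later parboiled releases are understood to capitalize them (Sequence, FirstOf, ...).
public class CalculatorParser extends BaseParser {

    // Expression ← Term (('+' / '-') Term)*
    public Rule expression() {
        return sequence(
            term(),
            zeroOrMore(
                sequence(
                    firstOf('+', '-'),
                    term()
                )
            )
        );
    }

    // Term ← Factor (('*' / '/') Factor)*
    public Rule term() {
        return sequence(
            factor(),
            zeroOrMore(
                sequence(
                    firstOf('*', '/'),
                    factor()
                )
            )
        );
    }

    // Factor ← Number / '(' Expression ')'
    public Rule factor() {
        return firstOf(
            number(),
            sequence('(', expression(), ')')
        );
    }

    // Number ← [0-9]+
    public Rule number() {
        return oneOrMore(charRange('0', '9'));
    }

}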


The class defines the parser rules for the language (without any actions yet). It could be used to parse actual input with code such as this:


String input = "1+2";
CalculatorParser parser = Parboiled.createParser(CalculatorParser.class);
// Rules are obtained by calling the rule methods on the created parser instance
ParsingResult result = ReportingParseRunner.run(parser.expression(), input);
String parseTreePrintOut = ParseTreeUtils.printNodeTree(result);
System.out.println(parseTreePrintOut);
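
For comparison, the four grammar rules of the calculator example can also be written by hand as a small recursive descent recognizer in plain Java. This is only a sketch of the general technique and makes no claim about how parboiled works internally; the class name CalcRecognizer and its methods are hypothetical and not part of parboiled. It shows the two PEG traits the article relies on: input is consumed character by character with no separate lexing phase, and a failed alternative backtracks to the position where it started.

```java
// Hypothetical illustration only: a hand-written recognizer for the calculator
// grammar. It does not use parboiled and is not parboiled's implementation.
final class CalcRecognizer {
    private final String input;
    private int pos = 0; // index of the next unread character

    private CalcRecognizer(String input) {
        this.input = input;
    }

    /** Returns true if the whole input matches the Expression rule. */
    static boolean matches(String input) {
        CalcRecognizer r = new CalcRecognizer(input);
        return r.expression() && r.pos == input.length();
    }

    // Expression <- Term (('+' / '-') Term)*
    private boolean expression() {
        if (!term()) return false;
        while (true) {
            int mark = pos;
            if ((ch('+') || ch('-')) && term()) continue;
            pos = mark; // backtrack: the repetition ends, the rule still succeeds
            return true;
        }
    }

    // Term <- Factor (('*' / '/') Factor)*
    private boolean term() {
        if (!factor()) return false;
        while (true) {
            int mark = pos;
            if ((ch('*') || ch('/')) && factor()) continue;
            pos = mark;
            return true;
        }
    }

    // Factor <- Number / '(' Expression ')'
    private boolean factor() {
        int mark = pos;
        if (number()) return true;   // first alternative
        pos = mark;
        if (ch('(') && expression() && ch(')')) return true; // second alternative
        pos = mark;                  // both alternatives failed: restore position
        return false;
    }

    // Number <- [0-9]+
    private boolean number() {
        int start = pos;
        while (pos < input.length()
                && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        return pos > start; // at least one digit is required
    }

    // Matches a single literal character and advances on success.
    private boolean ch(char c) {
        if (pos < input.length() && input.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }
}
```

Calling CalcRecognizer.matches("(1+2)*3") accepts the input, while CalcRecognizer.matches("1+") rejects it because the trailing operator is never followed by a Term. Since the grammar has no whitespace rule, an input such as "1 + 2" is rejected as well.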

See also

  • Parsing expression grammars
  • Regular expressions
  • ANTLR
  • JavaCC


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 