Home      Discussion      Topics      Dictionary      Almanac
Signup       Login
Parse tree

Parse tree

Overview
A concrete syntax tree or parse tree is an (ordered, rooted) tree
Tree (data structure)
In computer science, a tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.Mathematically, it is not a tree, but an arborescence: an acyclic connected graph where each node has zero or more children nodes and at most one parent node...

 that represents the syntactic
Syntax
In linguistics, syntax is the study of the principles and rules for constructing sentences in natural languages...

 structure of a string
String (computer science)
In mathematics, a string is an sequence of symbols that are chosen from a set or alphabet.In computer programming, a string is, essentially, a sequence of characters...

 according to some formal grammar
Formal grammar
A formal grammar is a set of rules for forming strings in a formal language. These rules that make up the grammar describe how to form strings from the language's alphabet that are valid according to the language's syntax...

. In a parse tree, the interior nodes are labeled by non-terminals of the grammar, while the leaf node
Leaf node
In computer science, a leaf node or external node is a node of a tree data structure that has zero child nodes. Often, leaf nodes are the nodes farthest from the root node. In the graph theory tree, a leaf node is a vertex of degree 1 other than the root...

s are labeled by terminals of the grammar. A program that produces such trees is called a parser
Parsing
In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar.Parsing is also an earlier term for the diagramming of sentences of...

. Parse trees may be generated for sentence
Sentence (linguistics)
In linguistics, a sentence is an expression in natural language—a grammatical and lexical unit consisting of one or more words, representing distinct and differentiated concepts, and combined to form a meaningful statement, question, request and command....

s in natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

s (see natural language processing
Natural language processing
Natural language processing is a field of computer science and linguistics concerned with the interactions between computers and human languages. Natural language generation systems convert information from computer databases into readable human language...

), as well as during processing
Compiler
A compiler is a computer program that transforms source code written in a computer language into another computer language...

 of computer languages, such as programming language
Programming language
A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human...

s.
Discussion
Ask a question about 'Parse tree'
Start a new discussion about 'Parse tree'
Answer questions from other users
Full Discussion Forum
 
Encyclopedia
A concrete syntax tree or parse tree is an (ordered, rooted) tree
Tree (data structure)
In computer science, a tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.Mathematically, it is not a tree, but an arborescence: an acyclic connected graph where each node has zero or more children nodes and at most one parent node...

 that represents the syntactic
Syntax
In linguistics, syntax is the study of the principles and rules for constructing sentences in natural languages...

 structure of a string
String (computer science)
In mathematics, a string is an sequence of symbols that are chosen from a set or alphabet.In computer programming, a string is, essentially, a sequence of characters...

 according to some formal grammar
Formal grammar
A formal grammar is a set of rules for forming strings in a formal language. These rules that make up the grammar describe how to form strings from the language's alphabet that are valid according to the language's syntax...

. In a parse tree, the interior nodes are labeled by non-terminals of the grammar, while the leaf node
Leaf node
In computer science, a leaf node or external node is a node of a tree data structure that has zero child nodes. Often, leaf nodes are the nodes farthest from the root node. In the graph theory tree, a leaf node is a vertex of degree 1 other than the root...

s are labeled by terminals of the grammar. A program that produces such trees is called a parser
Parsing
In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens , to determine its grammatical structure with respect to a given formal grammar.Parsing is also an earlier term for the diagramming of sentences of...

. Parse trees may be generated for sentence
Sentence (linguistics)
In linguistics, a sentence is an expression in natural language—a grammatical and lexical unit consisting of one or more words, representing distinct and differentiated concepts, and combined to form a meaningful statement, question, request and command....

s in natural language
Natural language
In the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...

s (see natural language processing
Natural language processing
Natural language processing is a field of computer science and linguistics concerned with the interactions between computers and human languages. Natural language generation systems convert information from computer databases into readable human language...

), as well as during processing
Compiler
A compiler is a computer program that transforms source code written in a computer language into another computer language...

 of computer languages, such as programming language
Programming language
A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine, to express algorithms precisely, or as a mode of human...

s. Parse trees are distinct from abstract syntax tree
Abstract syntax tree
In computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a certain programming language. Each node of the tree denotes a construct occurring in the source code...

s (also known simply as syntax trees), in that their structure and elements more concretely reflect the syntax of the input language.

Basic description


A parse tree is made up of nodes and branches. The image below represents a linguistic parse tree, here representing the English
English language
English is a West Germanic language that developed in England during the Anglo-Saxon era. As a result of the military, economic, scientific, political, and cultural influence of the British Empire during the 18th, 19th, and early 20th centuries, and of the United States since the mid 20th century,...

 sentence "John hit the ball". (The parse tree here is a greatly simplified one; for more information, see X-bar theory
X-bar theory
X-bar theory is a component of linguistic theory which attempts to identify syntactic features common to all languages. It claims that among their phrasal categories, all languages share certain structural similarities, including one known as the "X-bar", which does not appear in traditional phrase...

.) The parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, hit, the, ball). We use the following abbreviations in the example:
  • S for sentence
    Sentence (linguistics)
    In linguistics, a sentence is an expression in natural language—a grammatical and lexical unit consisting of one or more words, representing distinct and differentiated concepts, and combined to form a meaningful statement, question, request and command....

    , the top-level structure in this example.
  • NP for noun phrase
    Noun phrase
    In grammar, a noun phrase is a phrase whose head is a noun or a pronoun, optionally accompanied by a set of modifiers.Noun phrases are very common cross-linguistically, but some languages like Tuscarora and Cayuga have been argued to lack this category.- Form :Noun phrases normally consist of a...

    . The first (leftmost) NP, a single noun "John", serves as the subject
    Subject (grammar)
    The subject is one of the two main constituents of a clause, according to a tradition that can be tracked back to Aristotle. The other constituent is the predicate...

     of the sentence. The second one is the object
    Object (grammar)
    An object in grammar is a sentence element and is often part of the sentence predicate. It denotes somebody or something involved in the subject's "performance" of the verb...

     of the sentence.
  • VP for verb phrase
    Verb phrase
    In linguistics, a verb phrase or VP is a syntactic structure composed of the predicative elements of a sentence and functions in providing information about the subject of the sentence.-VPs in the generative grammar framework:...

     which serves as the predicate
    Predicate (grammar)
    In traditional grammar, a predicate is one of the two main parts of a sentence . For the simple sentence "John [is yellow]," John acts as the subject, and is yellow acts as the predicate, a subsequent description of the subject headed with a verb.In current linguistic semantics, a predicate is an...

    .
  • V for verb
    Verb
    kalleah hit meIn syntax, a verb is a word that usually denotes an action , an occurrence , or a state of being . Depending on the language, a verb may vary in form according to many factors, possibly including its tense, aspect, mood and voice...

    . In this case, it's a transitive verb
    Transitive verb
    In syntax, a transitive verb is a verb that requires both a direct subject and one or more objects.-Examples:Some examples of sentences with transitive verbs:*Harry sees Adam....

     "hit".
  • Det for determiner
    Determiner (class)
    A determiner is a noun modifier that expresses the reference of a noun or noun phrase in the context, including quantity, rather than attributes expressed by adjectives...

    , in this instance the definite article
    Article (grammar)
    An article is a word that combines with a noun to indicate the type of reference being made by the noun, and may also specify the volume or numerical scope of that reference. there are only three articles, a, an and and. The articles in the English language are the and a...

     "the".
  • and N for noun
    Noun
    In linguistics, a noun is a member of a large, open lexical category whose members can occur as the main word in the subject of a clause, the object of a verb, or the object of a preposition....

    .


In a parse tree, each node is either a root node, a branch node, or a leaf node. In the example to the right, S is a root node, NP and VP are branch nodes, while John, hit, the, and ball are all leaf nodes.

A node can also be referred to as parent node or a child node. A parent node is one that has at least one other node linked by a branch under it. In the example, S is a parent of both NP and VP. A child node is one that has at least one node directly above it to which it is linked by a branch of the tree. Again from our example, hit is a child node of V. The terms mother and daughter are also sometimes used for this relationship.

External links