A
concrete syntax tree or
parse tree or
parsing tree
is an ordered, rooted
treeIn computer science, a tree is a widely-used data structure that emulates a hierarchical tree structure with a set of linked nodes.Mathematically, it is an ordered directed tree, more specifically an arborescence: an acyclic connected graph where each node has zero or more children nodes and at...
that represents the
syntacticIn linguistics, syntax is the study of the principles and rules for constructing phrases and sentences in natural languages....
structure of a
stringIn formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet....
according to some
formal grammarA formal grammar is a set of formation rules for strings in a formal language. The rules describe how to form strings from the language's alphabet that are valid according to the language's syntax...
. In a parse tree, the interior nodes are labeled by non-terminals of the grammar, while the leaf nodes are labeled by terminals of the grammar. Parse trees may be generated for
sentenceIn the field of linguistics, a sentence is an expression in natural language, and often defined to indicate a grammatical unit consisting of one or more words that generally bear minimal syntactic relation to the words that precede or follow it...
s in
natural languageIn the philosophy of language, a natural language is any language which arises in an unpremeditated fashion as the result of the innate facility for language possessed by the human intellect. A natural language is typically used for communication, and may be spoken, signed, or written...
s (
see natural language processingNatural language processing is a field of computer science and linguistics concerned with the interactions between computers and human languages; it began as a branch of artificial intelligence....
), as well as during
processingA compiler is a computer program that transforms source code written in a programming language into another computer language...
of computer languages, such as
programming languageA programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....
s. Parse trees are distinct from
abstract syntax treeIn computer science, an abstract syntax tree , or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. Each node of the tree denotes a construct occurring in the source code. The syntax is 'abstract' in the sense that it...
s (also known simply as syntax trees), in that their structure and elements more concretely reflect the syntax of the input language.
If a formal grammar is
ambiguousIn computer science, a context-free grammar is said to be an ambiguous grammar if there exists a string that can be generated by the grammar in more than one way...
, there can be multiple parse trees for some given string (
syntactic ambiguitySyntactic ambiguity is a property of sentences which may be reasonably interpreted in more than one way, or reasonably interpreted to mean more than one thing...
).
Basic description
A parse tree is made up of nodes and branches. The image below represents a linguistic parse tree, here representing the
EnglishEnglish is a West Germanic language that arose in the Anglo-Saxon kingdoms of England and spread into what was to become south-east Scotland under the influence of the Anglian medieval kingdom of Northumbria...
sentence "John hit the ball". (The parse tree here is a greatly simplified one; for more information, see
X-bar theoryX-bar theory is a component of linguistic theory which attempts to identify syntactic features presumably common to all those human languages that fit in a presupposed framework...
.) The parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, hit, the, ball). We use the following abbreviations in the example:
- S for sentence
In the field of linguistics, a sentence is an expression in natural language, and often defined to indicate a grammatical unit consisting of one or more words that generally bear minimal syntactic relation to the words that precede or follow it...
, the top-level structure in this example
- NP for noun phrase
In grammar, a noun phrase, nominal phrase, or nominal group is a phrase based on a noun, pronoun, or other noun-like word optionally accompanied by modifiers such as adjectives....
. The first (leftmost) NP, a single noun "John", serves as the subjectThe subject is one of the two main constituents of a clause, according to a tradition that can be tracked back to Aristotle and that is associated with phrase structure grammars; the other constituent is the predicate. According to another tradition, i.e...
of the sentence. The second one is the objectAn object in grammar is part of a sentence, and often part of the predicate. It denotes somebody or something involved in the subject's "performance" of the verb. Basically, it is what or whom the verb is acting upon...
of the sentence.
- VP for verb phrase
In linguistics, a verb phrase or VP is a syntactic unit composed of at least one verb and the dependents of that verb. One can distinguish between two types of VPs, finite VPs and non-finite VPs . While phrase structure grammars acknowledge both, dependency grammars reject the existence of a...
, which serves as the predicateThere are two competing notions of the predicate in theories of grammar. Traditional grammar tends to view a predicate as one of two main parts of a sentence, the other being the subject, which the predicate modifies. The other understanding of predicates is inspired from work in predicate calculus...
- V for verb
A verb, from the Latin verbum meaning word, is a word that in syntax conveys an action , or a state of being . In the usual description of English, the basic form, with or without the particle to, is the infinitive...
. In this case, it's a transitive verbIn syntax, a transitive verb is a verb that requires both a direct subject and one or more objects. The term is used to contrast intransitive verbs, which do not have objects.-Examples:Some examples of sentences with transitive verbs:...
"hit".
- Det for determiner
A determiner is a noun-modifier that expresses the reference of a noun or noun-phrase in the context, rather than attributes expressed by adjectives...
, in this instance the definite articleAn article is a word that combines with a noun to indicate the type of reference being made by the noun. Articles specify the grammatical definiteness of the noun, in some languages extending to volume or numerical scope. The articles in the English language are the and a/an, and some...
"the"
- N for noun
In linguistics, a noun is a member of a large, open lexical category whose members can occur as the main word in the subject of a clause, the object of a verb, or the object of a preposition .Lexical categories are defined in terms of how their members combine with other kinds of...

In a parse tree, each node is either a
root node, a
branch node, or a
leaf node. In the example to the right, S is a root node, NP and VP are branch nodes, while John, hit, the, and ball are all leaf nodes. The leaves are the lexical tokens of the sentence.
A node can also be referred to as parent node or a child node. A
parent node is one that has at least one other node linked by a branch under it. In the example, S is a parent of both NP and VP. A
child node is one that has at least one node directly above it to which it is linked by a branch of the tree. Again from our example, hit is a child node of V. The terms
mother and
daughter are also sometimes used for this relationship.
External links