Lookahead - AbsoluteAstronomy.com

For information about the look-ahead feature of some audio compressors, see look-ahead.

Lookahead is a tool in algorithms for looking ahead a few more input items before making a cost effective decision at one stage of the algorithm.

Lookahead in search problems

In artificial intelligence

Artificial intelligence

Artificial intelligence is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as "the study and design of intelligent agents" where an intelligent agent is a system that perceives its environment and takes actions that maximize its...

, lookahead is an important component of combinatorial search which specifies, roughly, how deeply the graph

Graph (data structure)

In computer science, a graph is an abstract data structure that is meant to implement the graph and hypergraph concepts from mathematics.A graph data structure consists of a finite set of ordered pairs, called edges or arcs, of certain entities called nodes or vertices...

representing the problem is explored. The need for a specific limit on lookahead comes from the large problem graphs in many applications, such as computer chess

Computer chess

Computer chess is computer architecture encompassing hardware and software capable of playing chess autonomously without human guidance. Computer chess acts as solo entertainment , as aids to chess analysis, for computer chess competitions, and as research to provide insights into human...

and computer Go

Computer Go

Computer Go is the field of artificial intelligence dedicated to creating a computer program that plays Go, a traditional board game.-Performance:...

. A naive breadth-first search

Breadth-first search

In graph theory, breadth-first search is a graph search algorithm that begins at the root node and explores all the neighboring nodes...

of these graphs would quickly consume all the memory of any modern computer. By setting a specific lookahead limit, the algorithm's time can be carefully controlled; its time increases exponentially

Exponential growth

Exponential growth occurs when the growth rate of a mathematical function is proportional to the function's current value...

as the lookahead limit increases.

More sophisticated search techniques such as alpha-beta pruning

Alpha-beta pruning

Alpha-beta pruning is a search algorithm which seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm used commonly for machine playing of two-player games...

are able to eliminate entire subtrees of the search tree from consideration. When these techniques are used, lookahead is not a precisely defined quantity, but instead either the maximum depth searched or some type of average.

Lookahead in parsing

Lookahead establishes the maximum incoming tokens that a parser can use to decide which rule it should use.

Lookahead is especially relevant to LL

LL parser

An LL parser is a top-down parser for a subset of the context-free grammars. It parses the input from Left to right, and constructs a Leftmost derivation of the sentence...

, LR

LR parser

In computer science, an LR parser is a parser that reads input from Left to right and produces a Rightmost derivation. The term LR parser is also used; where the k refers to the number of unconsumed "look ahead" input symbols that are used in making parsing decisions...

, and LALR parser

LALR parser

In computer science, an LALR parser is a type of LR parser based on a finite-state-automata concept. The data structure used by an LALR parser is a pushdown automaton...

s, where it is often explicitly indicated by affixing the lookahead to the algorithm name in parentheses, such as LALR(1).

Most programming language

Programming language

A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

s, the primary target of parsers, are carefully defined in such a way that a parser with limited lookahead, typically one, can parse them, because parsers with limited lookahead are often more efficient. One important change to this trend came in 1990 when Terence Parr

Terence Parr

Terence John Parr is a professor of computer science at the University of San Francisco. He is best known for his ANTLR parser generator and contributions to parsing theory. He also developed the engine for Java and other programming languages...

created ANTLR

ANTLR

In computer-based language recognition, ANTLR , or ANother Tool for Language Recognition, is a parser generator that uses LL parsing. ANTLR is the successor to the Purdue Compiler Construction Tool Set , first developed in 1989, and is under active development...

for his Ph.D. thesis, a parser generator for efficient LL(k) parsers, where k is any fixed value.

Parsers typically have only a few actions after seeing each token. They are shift (add this token to the stack for later reduction), reduce (pop tokens from the stack and form a syntactic construct), end, error (no known rule applies) or conflict (does not know whether to shift or reduce).

Lookahead has two advantages.

It helps the parser take the correct action in case of conflicts. For example, parsing the if statement in the case of an else clause.
It eliminates many duplicate states and eases the burden of an extra stack. A C language non-lookahead parser will have around 10,000 states. A lookahead parser will have around 300 states.

Example: Parsing the Expression 1 + 2 * 3

Set of expression parsing rules (called grammar) is as follows,

Rule1: E → E + E Expression is the sum of two expressions.
Rule2: E → E * E Expression is the product of two expressions.
Rule3: E → number Expression is a simple number
Rule4: + has less precedence than *

Most programming languages (except for a few such as APL and Smalltalk) and algebraic formulas give higher precedence to multiplication than addition, in which case the correct interpretation of the example above is (1 + (2*3)).
Note that Rule4 above is a semantic rule. It is possible to rewrite the grammar to incorporate this into the syntax. However, not all such rules can be translated into syntax.

Simple non-lookahead parser actions

Reduces 1 to expression E on input 1 based on rule3.
Shift + onto stack on input 1 in anticipation of rule1.
Reduce stack element 2 to Expression E based on rule3.
Reduce stack items E+ and new input E to E based on rule1.
Shift * onto stack on input * in anticipation of rule2.
Shift 3 onto stack on input 3 in anticipation of rule3.
Reduce 3 to Expression E on input 3 based on rule3.
Reduce stack items E* and new input E to E based on rule2.

The parse tree and resulting code from it is not correct according to language semantics.

To correctly parse without lookahead, there are three solutions:

The user has to enclose expressions within parentheses. This often is not a viable solution.
The parser needs to have more logic to backtrack and retry whenever a rule is violated or not complete. The similar method is followed in LL parsers.
Alternatively, the parser or grammar needs to have extra logic to delay reduction and reduce only when it is absolutely sure which rule to reduce first. This method is used in LR parsers. This correctly parses the expression but with many more states and increased stack depth.

Lookahead parser actions

Shift 1 onto stack on input 1 in anticipation of rule3. It does not reduce immediately.
Reduce stack item 1 to simple Expression on input + based on rule3. The lookahead is +, so we are on path to E +, so we can reduce the stack to E.
Shift + onto stack on input + in anticipation of rule1.
Shift 2 onto stack on input 2 in anticipation of rule3.
Reduce stack item 2 to Expression on input * based on rule3. The lookahead * expects only E before it.
Now stack has E + E and still the input is *. It has two choices now, either to shift based on rule2 or reduction based on rule1. Since * has more precedence than + based on rule4, so shift * onto stack in anticipation of rule2.
Shift 3 onto stack on input 3 in anticipation of rule3.
Reduce stack item 3 to Expression after seeing end of input based on rule3.
Reduce stack items E * E to E based on rule2.
Reduce stack items E + E to E based on rule1.

The parse tree generated is correct and simply more efficient than non-lookahead parsers. This is the strategy followed in LALR parser

LALR parser

In computer science, an LALR parser is a type of LR parser based on a finite-state-automata concept. The data structure used by an LALR parser is a pushdown automaton...

Lookahead vs. Lazy evaluation

This is in contrast to another technique called lazy evaluation

Lazy evaluation

In programming language theory, lazy evaluation or call-by-need is an evaluation strategy which delays the evaluation of an expression until the value of this is actually required and which also avoids repeated evaluations...

that delays the computation until it is really needed. Both techniques are used for economical usage of space or time. Lookahead makes the right decision and so avoids backtracking from undesirable stages at later stages of the algorithm and so saves space, at the cost of a slight increase of time due to the overhead of extra lookups. Lazy evaluation normally skips the unexplored algorithmic paths and thus saves both the time and space in general. Some applications of lazy evaluations are demand paging in operating systems and lazy parse tables in compilers.

In search space exploration, both the techniques are used. When there are already promising paths in the algorithm to evaluate, lazy evaluation is used and to be explored paths will be saved in the queue or stack. When there are no promising paths to evaluate and to check whether the new path can be a more promising path in leading to solution, lookahead is used.

Compilers also use both the techniques. They will be lazy in generating parse tables from given rules, but they lookahead in parsing given input.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.