Common subexpression elimination
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, common subexpression elimination (CSE) is a compiler optimization
Compiler optimization
Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attributes of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied...

 that searches for instances of identical expressions
Expression (mathematics)
In mathematics, an expression is a finite combination of symbols that is well-formed according to rules that depend on the context. Symbols can designate numbers , variables, operations, functions, and other mathematical symbols, as well as punctuation, symbols of grouping, and other syntactic...

 (i.e., they all evaluate to the same value), and analyses whether it is worthwhile replacing them with a single variable holding the computed value.

Example

In the following code:

a = b * c + g;
d = b * c * d;

it may be worth transforming the code so that it is translated as if it had been written:

tmp = b * c;
a = tmp + g;
d = tmp * d;

"worth" means that the resulting code would execute faster.

Principle

The possibility to perform CSE is based on available expression
Available expression
In the field of compiler optimizations, available expressions is an analysis algorithm that determines for each point in the program the set of expressions that need not be recomputed. Those expressions are said to be available at such a point...

 analysis (a data flow analysis). An expression b*c is availabe at a point p in a program if:
  • every path from the initial node to p evaluates b*c before reaching p,
  • and there are no assignments to b or c after the evaluation but before p.


The cost/benefit analysis performed by an optimizer will calculate whether the cost of the store to tmp is less than the cost of the multiplication; in practice other factors such as which values are held in which registers are also significant.

Compiler writers distinguish two kinds of CSE:
  • local common subexpression elimination works within a single basic block
    Basic block
    In computing, a basic block is a portion of the code within a program with certain desirable properties that make it highly amenable to analysis. Compilers usually decompose programs into their basic blocks as a first step in the analysis process...

  • global common subexpression elimination works on an entire procedure,

Both kinds rely on data flow analysis of which expressions are available at which points in a program.

Benefits

The benefits of performing CSE are great enough that it is a commonly used optimization.

In simple cases like in the example above, programmers may manually eliminate the duplicate expressions while writing the code. The greatest source of CSEs are intermediate code sequences generated by the compiler, such as for array indexing calculations, where it is not possible for the developer to manually intervene. In some cases language features may create many duplicate expressions. For instance, C
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 macros, where macro expansions may result in common subexpressions not apparent in the original source code.

Compilers need to be judicious about the number of temporaries created to hold values. An excessive number of temporary values creates register pressure possibly resulting in spilling registers to memory, which may take longer than simply recomputing an arithmetic result when it is needed.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK