Use-define chain - AbsoluteAstronomy.com

A Use-Definition Chain is a data structure

Data structure

In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

that consists of a use, U, of a variable

Variable (programming)

In computer programming, a variable is a symbolic name given to some known or unknown quantity or information, for the purpose of allowing the name to be used independently of the information it represents...

, and all the definitions, D, of that variable that can reach that use without any other intervening definitions. A definition can have many forms, but is generally taken to mean the assignment

Assignment (computer science)

In computer programming, an assignment statement sets or re-sets the value stored in the storage location denoted by a variable name. In most imperative computer programming languages, assignment statements are one of the basic statements...

of some value to a variable (which is different from the use of the term that refers to the language construct involving a data type

Data type

In computer programming, a data type is a classification identifying one of various types of data, such as floating-point, integer, or Boolean, that determines the possible values for that type; the operations that can be done on values of that type; the meaning of the data; and the way values of...

and allocating storage).

A counterpart of a UD Chain is a Definition-Use Chain (DU Chain), which consists of a definition, D, of a variable and all the uses, U, reachable from that definition without any other intervening definitions.

Both UD and DU chains are created by using a form of static code analysis

Static code analysis

Static program analysis is the analysis of computer software that is performed without actually executing programs built from that software In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code...

known as data flow analysis. Knowing the use-def and def-use chains for a program or subprogram is a prerequisite for many compiler optimization

Compiler optimization

Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attributes of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied...

s, including constant propagation and common subexpression elimination

Common subexpression elimination

In computer science, common subexpression elimination is a compiler optimization that searches for instances of identical expressions , and analyses whether it is worthwhile replacing them with a single variable holding the computed value.- Example :In the following code: a = b * c + g; d = b * c...

Purpose

Making the use-define or define-use chains is a step in liveness analysis, so that logical representations of all the variables can be identified and tracked through the code.

Consider the following snippet of code:



 int x = 0;    /* A */

 x = x + y;    /* B */

 /* 1, some uses of x */

 x = 35;       /* C */

 /* 2, some more uses of x */

Notice that x is assigned a value at three points (marked A, B, and C). However, at the point marked "1", the use-def chain for x should indicate that its current value must have come from line B (and its value at line B must have come from line A). Contrariwise, at the point marked "2", the use-def chain for x indicates that its current value must have come from line C. Since the value of the x in block 2 does not depend on any definitions in block 1 or earlier, x might as well be a different variable there; practically speaking, it is a different variable — call it x2.



 int x = 0;    /* A */

 x = x + y;    /* B */

 /* 1, some uses of x */

 int x2 = 35;  /* C */

 /* 2, some uses of x2 */

The process of splitting x into two separate variables is called live range splitting. See also static single assignment form

Static single assignment form

In compiler design, static single assignment form is a property of an intermediate representation , which says that each variable is assigned exactly once...

Setup

The list of statements determines a strong order among statements.

Statements are labeled using the following conventions: s(i), where i is an integer in [1,n]; and n is the number of statements in the basic block
Basic block
In computing, a basic block is a portion of the code within a program with certain desirable properties that make it highly amenable to analysis. Compilers usually decompose programs into their basic blocks as a first step in the analysis process...
Variables are identified in italic (e.g., v,u and t)
Every variable is assumed to have a definition in the context or scope. (In static single assignment form, use-define chains are explicit because each chain contains a single element.)

For a variable, such as v, its declaration is identified as V (italic capital letter), and for short, its declaration is identified as s(0). In general, a declaration of a variable can be in an outer scope (e.g., a global variable

Global variable

In computer programming, a global variable is a variable that is accessible in every scope . Interaction mechanisms with global variables are called global environment mechanisms...

Definition of a Variable

When a variable, v, is on the LHS of an assignment statement, such as s(j), then s(j) is a definition of v. Every variable (v) has at least one definition by its declaration (V) (or initialization).

Use of a Variable

If variable, v, is on the RHS of statement s(j), there is a statement, s(i) with i < j and min(j-i), that it is a definition of v and it has a use at s(j) (or, in short, when a variable, v, is on the RHS of a statement s(j), then v has a use at statement s(j)).

Execution

Consider the sequential execution of the list of statements, s(i), and what can now be observed as the computation at statement, j:

A definition at statement s(i) with i < j is alive at j, if it has a use at a statement s(k) with k ≥ j. The set of alive definitions at statement i is denoted as A(i) and the number of alive definitions as |A(i)|. (A(i) is a simple but powerful concept: theoretical and practical results in space complexity theory, access complexity(I/O complexity), register allocation
Register allocation
In compiler optimization, register allocation is the process of assigning a large number of target program variables onto a small number of CPU registers...

and cache locality exploitation are based on A(i).)
A definition at statement s(i) kills all previous definitions (s(k) with k < i) for the same variables.

Execution example for def-use-chain

This example is based on a java algorithm for finding the ggt/gcd

Greatest common divisor

In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...

(it is not important to understand, what function the ggt / this code represents)



 int ggt(int a, int b){

     int c = a;

     int d = b;

     if(c  0)

        return d;

     while(d != 0){

        if(c > d)

           c = c - d;

        else

           d = d - c;

     }

     return c;

 }

To find out all def-use-chains for variable d, do the following steps:

1.Search for the first time, the variable is defined (write access).

In this case it is "d=b" (l.3)

2.Search for the first time, the variable is read.

In ths case it is "return d"

3.Write down this information in the following style:

[name of the variable you are creating a def-use-chain for, the concrete write access, the concrete read access]

In this case it is:

[d, d=b, return d]

Repeat this steps in the following style:

Combine each write access with each read access (but NOT the other way round)

The result should be:

[d, d=b, return d]
[d, d=b, while(d!=0)]
[d, d=b, if(c>d)]
[d, d=b, c=c-d]
[d, d=b, d=d-c]
[d, d=d-c, while(d!=0)]
[d, d=d-c, if(c>d)]
[d, d=d-c, c=c-d]
[d, d=d-c, d=d-c]

You have to take care, if the variable is changed by the time.

For example: From line 3 down to line 9, "d" is not redefined / changed.

At line 10, "d" could be redefined, this is, why you have to recombine this write access on "d" with all possible read access, which could be reached.

In this case, only the code beyond line 6 is relevant. Line 3 for example cannot be reached again.

For your understanding, you can imagine 2 different variables "d":

[d1, d1=b, return d1]
[d1, d1=b, while(d1!=0)]
[d1, d1=b, if(c>d1)]
[d1, d1=b, c=c-d1]
[d1, d1=b, d1=d1-c]
[d2, d2=d2-c, while(d2!=0)]
[d2, d2=d2-c, if(c>d2)]
[d2, d2=d2-c, c=c-d2]
[d2, d2=d2-c, d2=d2-c]

Method of building a use-def (or ud) chain

Set definitions in statement s(0)
For each i in [1,n], find live definitions that have use in statement s(i)
Make a link among definitions and uses
Set the statement s(i), as definition statement
Kill previous definitions

With this algorithm, two things are accomplished:

A directed acyclic graph
Directed acyclic graph
In mathematics and computer science, a directed acyclic graph , is a directed graph with no directed cycles. That is, it is formed by a collection of vertices and directed edges, each edge connecting one vertex to another, such that there is no way to start at some vertex v and follow a sequence of...

(DAG) is created on the variable uses and definitions. The DAG specifies a data dependency among assignment statements, as well as a partial order (therefore parallelism among statements).
When statement s(i) is reached, there is a list of live variable assignments. If only one assignment is live, for example, constant propagation might be used.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.