Aliasing (computing)
Encyclopedia
In computing
Computing
Computing is usually defined as the activity of using and improving computer hardware and software. It is the computer-specific part of information technology...

, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated to all aliased names, which may not be expected by the programmer. As a result, aliasing makes it particularly difficult to understand, analyze and optimize programs. Aliasing analysers
Alias analysis
Alias analysis is a technique in compiler theory, used to determine if a storage location may be accessed in more than one way. Two pointers are said to be aliased if they point to the same location....

 intend to make and compute useful information for understanding aliasing in programs.

Array bounds checking

For example, the C programming language
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 does not perform array bounds checking. One can then exploit the implementation of the programming language by the compiler, plus the computer architecture's assembly language conventions, to achieve aliasing effects.

If an array is created on the stack
Call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, control stack, run-time stack, or machine stack, and is often shortened to just "the stack"...

, with a variable laid out in memory directly beside that array, one could index outside that array and then directly change that variable by changing the relevant array element. For example, if we have an int array of size 10 (for this example's sake, calling it vector), next to another int variable (call it i), vector[10] (i.e. the 11th element) would be aliased to i if they are adjacent in memory.

This is possible in some implementations of C because an array is in reality a block of contiguous memory, and array elements are merely offsets off the address of the beginning of that block multiplied by the size of a single element. Since C has no bounds checking, indexing and addressing outside of the array is possible. Note that the aforementioned aliasing behaviour is implementation specific. Some implementations may leave space between arrays and variables on the stack, for instance, to align variables to memory locations that are a multiple of the architecture's native word size. The C standard does not generally specify how data is to be laid out in memory. (ISO/IEC 9899:1999, section 6.2.6.1).

It is not erroneous for a compiler to omit aliasing effects for accesses that fall outside the bounds of an array.

Aliased pointers

Another variety of aliasing can occur in any language that can refer to one location in memory with more than one name (for example, with pointers). See the C example of the xor swap algorithm that is a function; it assumes the two pointers passed to it are distinct, but if they are in fact equal (or aliases of each other), the function fails. This is a common problem with functions that accept pointer arguments, and their tolerance (or the lack thereof) for aliasing must be carefully documented, particularly for functions that perform complex manipulations on memory areas passed to them.

Specified aliasing

Controlled aliasing behaviour may be desirable in some cases (that is, aliasing behaviour that is specified, unlike that relevant to memory layout in C). It is common practice in Fortran
Fortran
Fortran is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing...

. The Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...

 programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

 specifies, in some constructs, aliasing behaviour, such as in foreach loops. This allows certain data structures to be modified directly with less code. For example,


my @array = (1, 2, 3);

foreach my $element (@array) {
# Increment $element, thus automatically
# modifying @array, since $element is aliased
# to each of @array's elements in turn.
$element++;
}

print "@array \n";

will print out "2 3 4" as a result. If one wanted to bypass aliasing effects, one could copy the contents of the index variable into another and change the copy.

Conflicts with optimization

Optimizers often have to make conservative assumptions about variables in the presence of pointers. For example, a constant propagation process that knows the value of variable x is 5 would not be able to keep using this information after an assignment to another variable (for example, *y = 10) because it could be that *y is an alias of x. This could be the case after an assignment like y = &x.

As an effect of the assignment to *y, the value of x would be changed as well, so propagating the information that x is 5 to the statements following *y = 10 would be potentially wrong (if *y is indeed an alias of x). However, if we have information about pointers, the constant propagation process could make a query like: can x be an alias of *y? Then, if the answer is no, x = 5 can be propagated safely.

Another optimization impacted by aliasing is code reordering. If the compiler decides that x is not an alias of *y, then code that uses or changes the value of x can be moved before the assignment *y = 10, if this would improve scheduling
Instruction scheduling
In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines...

 or enable more loop optimization
Loop optimization
In compiler theory, loop optimization plays an important role in improving cache performance, making effective use of parallel processing capabilities, and reducing overheads associated with executing loops. Most execution time of a scientific program is spent on loops...

s to be carried out.

To enable such optimizations in a predictable manner, the ISO standard for the C programming language
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 (including its newer C99 edition, see section 6.5, paragraph 7) specifies that it is illegal (with some exceptions) for pointers of different types to reference the same memory location. This rule, known as "strict aliasing", sometime allows for impressive increases in performance, but has been known to break some otherwise valid code. Several software projects intentionally violate this portion of the C99 standard. For example, Python 2.x
CPython
CPython is the default, most-widely used implementation of the Python programming language. It is written in C. In addition to CPython, there are two other production-quality Python implementations: Jython, written in Java, and IronPython, which is written for the Common Language Runtime. There...

 did so to implement reference counting, and required changes to the basic object structs in Python 3 to enable this optimisation. The Linux kernel
Linux kernel
The Linux kernel is an operating system kernel used by the Linux family of Unix-like operating systems. It is one of the most prominent examples of free and open source software....

 does this because strict aliasing causes problems with optimization of inlined code. In such cases, when compiled with gcc, the option -fno-strict-aliasing is invoked to prevent unwanted or invalid optimizations that could produce incorrect code.

External links


See also

  • Aliasing
    Aliasing
    In signal processing and related disciplines, aliasing refers to an effect that causes different signals to become indistinguishable when sampled...

     for uses of the word when applied to signal processing, including computer graphics.
  • Sense and reference
    Sense and reference
    Sinn and bedeutung are usually translated, respectively, as sense and reference. Two different aspects of some terms' meanings, a term's reference is the object that the term refers to, while the term's sense is the way that the term refers to that object.Sinn and bedeutung were introduced by...

  • Pointer alias
    Pointer alias
    In computer programming, aliasing refers to the situation where the same memory location can be accessed using different names.For instance, if a function takes two pointers A and B which have the same value, then the name A[0] aliases the name B[0]. In this case we say the pointers A and B alias...

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK