Type punning
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, type punning is a common term for any programming technique that subverts or circumvents the type system
Type system
A type system associates a type with each computed value. By examining the flow of these values, a type system attempts to ensure or prove that no type errors can occur...

 of a programming language
Programming language
A programming language is an artificial language designed to communicate instructions to a machine, particularly a computer. Programming languages can be used to create programs that control the behavior of a machine and/or to express algorithms precisely....

 in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal language.

In C and C++
C++
C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

, constructs such as type conversion
Type conversion
In computer science, type conversion, typecasting, and coercion are different ways of, implicitly or explicitly, changing an entity of one data type into another. This is done to take advantage of certain features of type hierarchies or type representations...

, union
Union (computer science)
In computer science, a union is a value that may have any of several representations or formats; or a data structure that consists of a variable which may hold such a value. Some programming languages support special data types, called union types, to describe such values and variables...

, and reinterpret_cast are provided in order to permit many kinds of type punning, although some kinds are not actually supported by the standard language. See for the instance the floating-point example below.

Sockets example

One classic example of type punning is found in the Berkeley sockets
Berkeley sockets
The Berkeley sockets application programming interface comprises a library for developing applications in the C programming language that perform inter-process communication, most commonly for communications across a computer network....

 interface. The function to bind an opened but uninitialized socket to an IP address
IP address
An Internet Protocol address is a numerical label assigned to each device participating in a computer network that uses the Internet Protocol for communication. An IP address serves two principal functions: host or network interface identification and location addressing...

 is declared as follows:


int bind(int sockfd, struct sockaddr *my_addr, socklen_t addrlen);


The bind function is usually called as follows:


struct sockaddr_in sa = {0};
int sockfd = ...;
sa.sin_family = AF_INET;
sa.sin_port = htons(port);
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);


The Berkeley sockets library fundamentally relies on the fact that in C, a pointer to struct sockaddr_in is freely convertible to a pointer to struct sockaddr; and, in addition, that the two structure types share the same memory layout. Therefore, a reference to the structure field my_addr->sin_family (where my_addr is of type struct sockaddr*) will actually refer to the field sa.sin_family (where sa is of type struct sockaddr_in). In other words, the sockets library uses type punning to implement a rudimentary form of inheritance
Inheritance (computer science)
In object-oriented programming , inheritance is a way to reuse code of existing objects, establish a subtype from an existing object, or both, depending upon programming language support...

.

Often seen in the programming world is the use of "padded" data structures to allow for the storage of different kinds of values in what is effectively the same storage space. This is often seen when two structures are used in mutual exclusivity for optimization.

Floating-point example

Not all examples of type punning involve structures, as the previous example did. Suppose we want to determine whether a floating-point number is negative. We could write:


bool is_negative(float x) {
return (x < 0.0);
}

However, supposing that floating-point comparisons are expensive, and also supposing that float is represented according to the IEEE floating-point standard
IEEE floating-point standard
IEEE 754–1985 was an industry standard for representingfloating-pointnumbers in computers, officially adopted in 1985 and superseded in 2008 byIEEE 754-2008. During its 23 years, it was the most widely used format for...

, and integers are 32 bits wide, we could engage in type punning to extract the sign bit
Sign bit
In computer science, the sign bit is a bit in a computer numbering format that indicates the sign of a number. In IEEE format, the sign bit is the leftmost bit...

 of the floating-point number using only integer operations:


bool is_negative(float x) {
unsigned int *ui = (unsigned int *)&x;
return ((*ui & 0x80000000) != 0);
}

This kind of type punning is more dangerous than most. Whereas the former example relied only on guarantees made by the C programming language about structure layout and pointer convertibility, the latter example relies on assumptions about a particular system's hardware. Some situations, such as time-critical
Real-time computing
In computer science, real-time computing , or reactive computing, is the study of hardware and software systems that are subject to a "real-time constraint"— e.g. operational deadlines from event to system response. Real-time programs must guarantee response within strict time constraints...

 code that the compiler otherwise fails to optimize
Compiler optimization
Compiler optimization is the process of tuning the output of a compiler to minimize or maximize some attributes of an executable computer program. The most common requirement is to minimize the time taken to execute a program; a less common one is to minimize the amount of memory occupied...

, may require dangerous code. In these cases, documenting all such assumptions in comment
Comment (computer programming)
In computer programming, a comment is a programming language construct used to embed programmer-readable annotations in the source code of a computer program. Those annotations are potentially significant to programmers but typically ignorable to compilers and interpreters. Comments are usually...

s, and introducing static assertions to verify portability expectations, helps to keep the code maintainable
Maintainability
In engineering, maintainability is the ease with which a product can be maintained in order to:* isolate defects or their cause* correct defects or their cause* meet new requirements* make future maintenance easier, or* cope with a changed environment...

.

Use of union

The above example violates the C language's constraints on how objects are accessed: the declared type of x is float but it is read through an expression of type unsigned int. See Aliasing (computing)
Aliasing (computing)
In computing, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated to all aliased names, which may not be expected by the programmer...

 for further discussion of this point and practical consequences.

The problem can be fixed by the use of a union (making the same assumptions about representation as above):


bool is_negative(float x) {
union {
unsigned int ui;
float d;
} my_union = { .d = x };
return ((my_union.ui & 0x80000000) != 0);
}

Here the relevant declared type is float for the store and unsigned for the read, thus complying with the s6.5 rules.

For another example of type punning, see Stride of an array
Stride of an array
In computer programming, the stride of an array refers to the number of locations in memory between successive array elements, measured in bytes or in units of the size of the array's elements....

.

External links

  • Section of the GCC
    GNU Compiler Collection
    The GNU Compiler Collection is a compiler system produced by the GNU Project supporting various programming languages. GCC is a key component of the GNU toolchain...

     manual on -fstrict-aliasing, which defeats some type punning
  • Defect Report 257 to the C99
    C99
    C99 is a modern dialect of the C programming language. It extends the previous version with new linguistic and library features, and helps implementations make better use of available computer hardware and compiler technology.-History:...

    standard, incidentally defining "type punning" in terms of union, and discussing the issues surrounding the implementation-defined behavior of the last example above
  • Defect Report 283 on the use of unions for type punning
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK