Union (computer science)
In computer science, a union is a value that may have any of several representations or formats; or a data structure that consists of a variable which may hold such a value. Some programming languages support special data types, called (somewhat confusingly) union types, to describe such values and variables. In other words, a union type definition specifies which of a number of permitted primitive types may be stored in its instances, e.g. "float or long integer". Contrast this with a record, which could be defined to contain a float and an integer; in a union, there is only one value at a time.
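The contrast between a union and a record can be sketched in C as follows. This is a minimal illustration, not taken from the article: the type names number and pair and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, chosen only for illustration. */
union number {          /* holds EITHER a float OR a long at any one time */
    float f;
    long  l;
};

struct pair {           /* a record: holds a float AND a long together */
    float f;
    long  l;
};

/* Store a long in the union and read it back through the same member. */
long store_long(union number *n, long value)
{
    assert(n != NULL);
    n->l = value;       /* whatever n->f held is no longer meaningful */
    return n->l;
}

/* Store a float, overwriting whatever the union held before. */
float store_float(union number *n, float value)
{
    assert(n != NULL);
    n->f = value;
    return n->f;
}
```

Because the two members of the union share storage, calling store_float after store_long replaces the long; in struct pair both members would keep their values.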

In type theory, a union has a sum type.

Depending on the language and type, a union value may be used in some operations, such as assignment and comparison for equality, without knowing its specific type. Other operations may require that knowledge, either by some external information, or by the use of a tagged union.

Note: The remainder of this article refers strictly to primitive untagged unions, as opposed to tagged unions.

Because of the limitations of their use, untagged unions are generally only provided in untyped languages or in an unsafe way (as in C). They have the advantage over simple tagged unions of not requiring space to store the tag.

The name "union" stems from the type's formal definition. If one sees a type as the set of all values that the type can take on, a union type is simply the mathematical union of its constituent types, since it can take on any value that any of its fields can. Also, because a mathematical union discards duplicates, if more than one field of the union can take on a single common value, it is impossible to tell from the value alone which field was last written.
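The last point can be sketched in C. In this hypothetical example (the type and function names are assumptions), the value 7 is written once through an int member and once through an unsigned int member; the stored bytes are expected to be identical, so nothing in the object records which member was used.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical union whose two members share many common values. */
union int_or_unsigned {
    int          i;
    unsigned int u;
};

/* Write the value 7 through the int member. */
union int_or_unsigned written_as_int(void)
{
    union int_or_unsigned v;
    v.i = 7;
    return v;
}

/* Write the value 7 through the unsigned member. */
union int_or_unsigned written_as_unsigned(void)
{
    union int_or_unsigned v;
    v.u = 7u;
    return v;
}

/* Returns 1 when the two objects hold identical bytes. */
int same_bytes(const union int_or_unsigned *a, const union int_or_unsigned *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof *a) == 0;
}
```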

However, one useful programming function of unions is to map smaller data elements to larger ones for easier manipulation. A data structure, consisting for example of 4 bytes and a 32-bit integer, can form a union (in this case with an unsigned 64-bit integer) and thus be more readily accessed for purposes of comparison etc.
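A minimal C sketch of this idea follows. The names record_key, make_key, and keys_equal are hypothetical, and the code assumes that the inner struct has no padding, so that the four bytes and the 32-bit integer occupy exactly the same 8 bytes as the 64-bit member.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: four bytes plus a 32-bit number, also viewed as one 64-bit word. */
union record_key {
    struct {
        uint8_t  bytes[4];
        uint32_t number;
    } parts;
    uint64_t whole;
};

/* Build a key from its parts. */
union record_key make_key(const uint8_t bytes[4], uint32_t number)
{
    union record_key k;
    int i;

    /* Assumption: no padding, so both views cover the same 8 bytes. */
    assert(sizeof k.parts == sizeof k.whole);

    k.whole = 0;
    for (i = 0; i < 4; i++)
        k.parts.bytes[i] = bytes[i];
    k.parts.number = number;
    return k;
}

/* Compare two keys with a single 64-bit comparison instead of five field comparisons. */
int keys_equal(const union record_key *a, const union record_key *b)
{
    return a->whole == b->whole;
}
```

The numeric value of whole depends on the byte order of the machine, so this sketch only uses it for equality tests, not for ordering.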

In C++, all of the members of a union are public by default, as in a structure. The keywords private, public, and protected may be used inside a struct or a union in exactly the same way they are used inside a class for defining private, public, and protected members.

C/C++

In C and C++, untagged unions are expressed nearly exactly like structures (structs), except that each data member begins at the same location in memory. The data members, as in structures, need not be primitive values, and in fact may be structures or even other unions. However, before C++11, C++ did not allow a data member to be of any type that has a non-trivial constructor, destructor, copy constructor, or copy assignment operator. For example, under those rules it is impossible to have the standard C++ string as a member of a union.

The primary usefulness of a union is to conserve space, since it provides a way of letting many different types be stored in the same space. Unions also provide crude polymorphism. However, there is no checking of types, so it is up to the programmer to be sure that the proper fields are accessed in different contexts. The relevant field of a union variable is typically determined by the state of other variables, possibly in an enclosing struct.

One common C programming idiom uses unions to perform what C++ calls a reinterpret_cast, by assigning to one field of a union and reading from another, as is done in code which depends on the raw representation of the values. A practical example is the method of computing square roots using the IEEE representation. This is not, however, a safe use of unions in general.
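The idiom can be sketched in C as follows. The names float_bits, float_to_bits, and bits_to_float are hypothetical, and the sketch assumes that float is a 32-bit IEEE 754 single-precision value. In C, reading a member other than the one last written reinterprets the stored bytes; C++ does not guarantee this behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union overlaying a float with a 32-bit unsigned integer. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Return the raw bit pattern of a float. */
uint32_t float_to_bits(float value)
{
    union float_bits fb;
    assert(sizeof(float) == sizeof(uint32_t));  /* assumption: 32-bit float */
    fb.f = value;       /* write one member ...        */
    return fb.u;        /* ... and read the other one  */
}

/* Build a float from a raw bit pattern. */
float bits_to_float(uint32_t bits)
{
    union float_bits fb;
    assert(sizeof(float) == sizeof(uint32_t));
    fb.u = bits;
    return fb.f;
}
```

Under the IEEE 754 assumption, float_to_bits(1.0f) yields 0x3F800000, which is the kind of raw representation that bit-level square root approximations manipulate.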

Anonymous union

Unions can also be anonymous; that is, they do not have a name. Their data members are accessed directly. Anonymous unions are also subject to certain restrictions:
  • They must be declared as static if declared in file scope. If declared in local scope, they must be static or automatic.
  • They can have only public members; private and protected members in anonymous unions generate errors.
  • They cannot have function members.


An important point to be noted is that simply omitting the class-name portion of the syntax does not make a union an anonymous union. For a union to qualify as an anonymous union, the declaration must not declare an object.
Example:

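// anonymous_unions.cpp
#include <iostream>

using namespace std;

int main() {
    // Anonymous union: d and f are accessed directly and share storage.
    union {
        int d;
        const char *f;
    };

    d = 4;
    cout << d << endl;

    f = "inside of union";  // overwrites the storage that held d
    cout << f << endl;
}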

COBOL

In COBOL, union data items are defined in two ways. The first uses the RENAMES (66 level) keyword, which effectively maps a second alphanumeric data item on top of the same memory location as a preceding data item. In the example code below, data item PERSON-REC is defined as a group containing another group and a numeric data item. PERSON-DATA is defined as an alphanumeric data item that renames PERSON-REC, treating the data bytes contained within it as character data.
01  PERSON-REC.
    05  PERSON-NAME.
        10  PERSON-NAME-LAST    PIC X(12).
        10  PERSON-NAME-FIRST   PIC X(16).
        10  PERSON-NAME-MID     PIC X.
    05  PERSON-ID               PIC 9(9) PACKED-DECIMAL.

01  PERSON-DATA                 RENAMES PERSON-REC.

The second way to define a union type is by using the REDEFINES keyword. In the example code below, data item VERS-NUM is defined as a 2-byte binary integer containing a version number. A second data item VERS-BYTES is defined as a two-character alphanumeric variable. Since the second item is redefined over the first item, the two items share the same address in memory, and therefore share the same underlying data bytes. The first item interprets the two data bytes as a binary value, while the second item interprets the bytes as character values.
01  VERS-INFO.
    05  VERS-NUM        PIC S9(4) COMP.
    05  VERS-BYTES      PIC X(2)
                        REDEFINES VERS-NUM.

Syntax and Example

In C and C++, the syntax is:

union <name>
{
    <datatype>  <1st variable name>;
    <datatype>  <2nd variable name>;
    .
    .
    .
    <datatype>  <nth variable name>;
} <union variable name>;


A structure can also be a member of a union, as the following example shows:

union name1
{
    struct name2
    {
        int a;
        float b;
        char c;
    } svar;
    int d;
} uvar;

This example defines a variable uvar as a union (tagged as name1), which contains two members, a structure (tagged as name2) named svar (which in turn contains three members), and an integer variable named d.

Unions may occur within structures and arrays, and vice versa:

struct
{
    int flags;
    char *name;
    int utype;
    union {
        int ival;
        float fval;
        char *sval;
    } u;
} symtab[NSYM];

The number ival is referred to as symtab[i].u.ival and the first character of string sval by either of *symtab[i].u.sval or symtab[i].u.sval[0].
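The access expressions above can be exercised with a short C sketch. The value of NSYM, the UTYPE_ constants stored in utype, and the helper functions are assumptions added for illustration; the article does not define them.

```c
#include <assert.h>
#include <stddef.h>

#define NSYM 8  /* assumed table size */

/* Hypothetical codes stored in utype to record which union member is valid. */
enum { UTYPE_INT, UTYPE_FLOAT, UTYPE_STRING };

struct
{
    int flags;
    char *name;
    int utype;
    union {
        int ival;
        float fval;
        char *sval;
    } u;
} symtab[NSYM];

/* Store an integer in entry i and record that fact in utype. */
void set_int(size_t i, int value)
{
    assert(i < NSYM);
    symtab[i].utype = UTYPE_INT;
    symtab[i].u.ival = value;
}

/* Store a string pointer in entry i and record that fact in utype. */
void set_string(size_t i, char *s)
{
    assert(i < NSYM);
    symtab[i].utype = UTYPE_STRING;
    symtab[i].u.sval = s;
}

/* First character of a string entry, or '\0' if entry i is not a string. */
char first_char(size_t i)
{
    assert(i < NSYM);
    if (symtab[i].utype != UTYPE_STRING || symtab[i].u.sval == NULL)
        return '\0';
    return symtab[i].u.sval[0];  /* same as *symtab[i].u.sval */
}
```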

Difference between Union and Structure

A union is a class all of whose data members are mapped to the same address within its object. The size of a union object is therefore at least the size of its largest data member (alignment may add padding).

In a structure, the data members are stored at distinct, successive memory locations. The size of a struct object is therefore at least the sum of the sizes of its data members, plus any padding inserted for alignment.

This gain in space efficiency, while valuable in certain circumstances, comes at a great cost in safety: the program logic must ensure that it only reads the field most recently written along all possible execution paths. The exception is when unions are used for type conversion: in this case, a certain field is written and the subsequently read field is deliberately different.

The difference in memory layout can be illustrated as follows:

                                  +-----+-----+
struct { int a; float b; }  gives |  a  |  b  |
                                  +-----+-----+
                                  ^     ^
                                  |     |
memory location:                 150   154
                                  |
                                  V
                                  +-----+
union { int a; float b; }  gives  |  a  |
                                  |  b  |
                                  +-----+
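The layout in the diagram can be checked with a small C sketch. The type names s_ab and u_ab and the two helper functions are hypothetical. On a typical platform with 4-byte int and float, member b sits 4 bytes into the struct (150 versus 154 in the diagram) but at offset 0 in the union; exact sizes depend on the platform.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types mirroring the diagram. */
struct s_ab { int a; float b; };
union  u_ab { int a; float b; };

/* Offset of member b inside the struct: it comes after a. */
size_t struct_offset_of_b(void)
{
    return offsetof(struct s_ab, b);
}

/* Offset of member b inside the union: every member starts at the beginning. */
size_t union_offset_of_b(void)
{
    union u_ab u;
    size_t offset = (size_t)((char *)&u.b - (char *)&u);
    assert(offset == 0);
    return offset;
}
```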


Structures are used where an "object" is composed of other objects, like a point object consisting of two integers, those being the x and y coordinates:

typedef struct {
    int x; // x and y are separate
    int y;
} tPoint;

Unions are typically used in situations where an object can be one of many things but only one at a time, such as a type-less storage system:

typedef enum { STR, INT } tType;
typedef struct {
    tType typ; // typ is separate.
    union {
        int ival; // ival and sval occupy same memory.
        char *sval;
    }; // anonymous union member; the semicolon is required
} tVal;
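How such a value might be created and read back can be sketched as follows. The helper functions make_int, make_str, and format_val are hypothetical, and the anonymous union member assumes a C11 compiler (or an equivalent compiler extension).

```c
#include <assert.h>
#include <stdio.h>

typedef enum { STR, INT } tType;
typedef struct {
    tType typ;          /* records which union member is valid */
    union {
        int ival;
        char *sval;
    };                  /* anonymous union (C11) */
} tVal;

/* Hypothetical constructors that keep typ and the union in step. */
tVal make_int(int i)   { tVal v; v.typ = INT; v.ival = i; return v; }
tVal make_str(char *s) { tVal v; v.typ = STR; v.sval = s; return v; }

/* Format the value into buf; returns what snprintf returns, or -1 for an unknown tag. */
int format_val(const tVal *v, char *buf, size_t n)
{
    assert(v != NULL && buf != NULL);
    switch (v->typ) {
    case INT: return snprintf(buf, n, "%d", v->ival);
    case STR: return snprintf(buf, n, "%s", v->sval);
    }
    return -1;
}
```

Always consulting typ before touching ival or sval is what keeps the program from reading a member that was not the one most recently written.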

See also

  • Tagged union

  • UNION operator
  • C++ Essentials by Sharam Hekmat

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 