Tagged pointer
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a tagged pointer is a common example of a tagged union
Tagged union
In computer science, a tagged union, also called a variant, variant record, discriminated union, or disjoint union, is a data structure used to hold a value that could take on several different, but fixed types. Only one of the types can be in use at any one time, and a tag field explicitly...

, where the primary type of data to be stored in the union is a pointer. Often, the tag in a tagged pointer will be "folded" into the data representing the pointer, taking advantage of one of the following properties of pointers:
  • Pointers to certain types of data will often be aligned
    Data structure alignment
    Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it will do this in word sized chunks...

    to the size of the data (4 bytes, 8 bytes, etc.), which leaves a few bits of the pointer unused. As long as code that uses the pointer properly masks out these bits, the pointer can be tagged with extra information.
  • The virtual memory
    Virtual memory
    In computing, virtual memory is a memory management technique developed for multitasking kernels. This technique virtualizes a computer architecture's various forms of computer data storage , allowing a program to be designed as though there is only one kind of memory, "virtual" memory, which...

     system in most modern operating system
    Operating system
    An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

    s reserves a block of logical memory around address 0 as unusable. This means that, for example, a pointer to 0 is never a valid pointer and can be used as a special null pointer value.


Use of 0 to represent a null pointer is extremely common, with many programming languages (such as Ada) explicitly relying on this behavior. In theory, other values in an operating system-reserved block of logical memory could be used to tag conditions other than a null pointer, but these uses appear to be rare, perhaps because they are at best non-portable
Porting
In computer science, porting is the process of adapting software so that an executable program can be created for a computing environment that is different from the one for which it was originally designed...

. It is generally accepted practice in software design that if a special pointer value distinct from null (such as a sentinel
Sentinel value
In computer programming, a sentinel value is a special value whose presence guarantees termination of a loop that processes structured data...

 in certain data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

s) is needed, the programmer should explicitly provide for it.

Taking advantage of the alignment of pointers provides more flexibility than null pointers/sentinels because it allows pointers to be tagged with information about the type of data pointed to, conditions under which it may be accessed, or other similar information about the pointer's use. This information can be provided along with every valid pointer. In contrast, null pointers/sentinels provide only a finite number of tagged values distinct from valid pointers.

In a tagged architecture
Tagged architecture
In computer science, a tagged architecture is a particular type of computer architecture where every word of memory constitutes a tagged union, being divided into a number of bits of data, and a tag section that describes the type of the data: how it is to be interpreted, and, if it is a reference,...

, a number of bits in every word of memory are reserved to act as a tag. Tagged architectures, such as the Lisp machine
Lisp machine
Lisp machines were general-purpose computers designed to efficiently run Lisp as their main software language. In a sense, they were the first commercial single-user workstations...

s, often have hardware support for interpreting and processing tagged pointers.

GNU libc
GNU C Library
The GNU C Library, commonly known as glibc, is the C standard library released by the GNU Project. Originally written by the Free Software Foundation for the GNU operating system, the library's development has been overseen by a committee since 2001, with Ulrich Drepper from Red Hat as the lead...

 malloc always returns 8-byte aligned memory addresses. Larger alignment values can be obtained with posix_memalign.

Example 1

In the following C code, we see 0 used to indicate a null pointer:

void optionally_return_a_value (int* pOptionalReturnValue) {
// ...
if (pOptionalReturnValue)
*pOptionalReturnValue = 01;
}

Example 2

Here, the programmer has provided a global variable, whose address is then used as a sentinel:

node g_sentinel;
#define SENTINEL &g_sentinel

void do_something_to_a_node (node* p) {
if (p

NULL) {
// do something
} else if (p

SENTINEL) {
// do something else
} else {
// treat p as a valid pointer to a node
}
}

Example 3

Assume we have a data structure table_entry that is always aligned to a 16 byte boundary. In other words, the least significant 4 bits of a table entry's address are always 0 (). We could use these 4 bits to mark the table entry with extra information. For example, bit 0 might mean read only, bit 1 might mean dirty (the table entry needs to be updated), and so on.

If pointers are 16-bit values, then:
  • 0x3421 is a read-only pointer to the table_entry at address 0x3420
  • 0xf472 is a pointer to a dirty table_entry at address 0xf470

Advantages

The major advantage of tagged pointers is that they take up less space than a pointer along with a separate tag field. This can be especially important when a pointer is a return value from a function. It can also be important in large tables of pointers.

A more subtle advantage is that by storing a tag in the same place as the pointer, it is often possible to guarantee the atomicity
Linearizability
In concurrent programming, an operation is atomic, linearizable, indivisible or uninterruptible if it appears to the rest of the system to occur instantaneously. Atomicity is a guarantee of isolation from concurrent processes...

 of an operation that updates both the pointer and its tag without external synchronization
Synchronization
Synchronization is timekeeping which requires the coordination of events to operate a system in unison. The familiar conductor of an orchestra serves to keep the orchestra in time....

 mechanisms. This can be an extremely large performance gain, especially in operating systems.

Disadvantages

Tagged pointers have some of the same difficulties as xor linked list
XOR linked list
An XOR linked list is a data structure used in computer programming. They take advantage of the bitwise exclusive disjunction operation, here denoted by ⊕, to decrease storage requirements for doubly linked lists. An ordinary doubly linked list stores addresses of the previous and next list items...

s, although to a lesser extent. For example, not all debugger
Debugger
A debugger or debugging tool is a computer program that is used to test and debug other programs . The code to be examined might alternatively be running on an instruction set simulator , a technique that allows great power in its ability to halt when specific conditions are encountered but which...

s will be able to properly follow tagged pointers; however, this is not an issue for a debugger that is designed with tagged pointers in mind.

The use of 0 to represent a null pointer does not suffer from these disadvantages: it is pervasive, and most programming languages treat 0 as a special null value. It has thoroughly proven its robustness. Usually when "tagged pointers" are spoken of, this common use is excluded.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK