XOR linked list
An XOR linked list is a data structure used in computer programming. It takes advantage of the bitwise exclusive disjunction (XOR) operation, here denoted by ⊕, to decrease storage requirements for doubly linked lists. An ordinary doubly linked list stores the addresses of the previous and next list items in each list node, requiring two address fields:

 ...  A        B         C         D         E  ...
          ->  next  ->  next  ->  next  ->
          <-  prev  <-  prev  <-  prev  <-

An XOR linked list compresses the same information into one address field by storing the bitwise XOR of the address for previous and the address for next in one field:

 ...  A        B         C         D         E  ...
         <->  A⊕C  <->  B⊕D  <->  C⊕E  <->

When traversing the list from left to right, suppose you are at C: take the address of the previous item, B, and XOR it with the value in C's link field (B⊕D). The result is the address of D, and the traversal can continue. The same pattern applies in the other direction.

To start traversing the list in either direction from some point, you need the address of two consecutive items, not just one. If the addresses of the two consecutive items are reversed, you will end up traversing the list in the opposite direction.
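
The traversal rule above can be sketched in C. This is a minimal illustration, not a reference implementation: the names xnode, xlist_link, xlist_step and xlist_walk are hypothetical, the nodes are assumed to live in one array, and an absent neighbour is assumed to count as address zero (which in turn assumes that a null pointer converts to the integer 0 and back, as it does on mainstream platforms).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout: a payload and one combined link field. */
struct xnode {
    int value;
    uintptr_t link;   /* address(prev) XOR address(next); absent neighbour = 0 */
};

/* XOR of two node addresses, done on integers because C has no ^ for pointers. */
static uintptr_t xor_addr(const struct xnode *a, const struct xnode *b)
{
    return (uintptr_t)a ^ (uintptr_t)b;
}

/* Link nodes[0..n-1] into one list, in array order. */
static void xlist_link(struct xnode *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct xnode *prev = (i > 0) ? &nodes[i - 1] : NULL;
        struct xnode *next = (i + 1 < n) ? &nodes[i + 1] : NULL;
        nodes[i].link = xor_addr(prev, next);
    }
}

/* Given two consecutive items (prev, cur), return the item on the far side of cur. */
static struct xnode *xlist_step(const struct xnode *prev, const struct xnode *cur)
{
    assert(cur != NULL);
    return (struct xnode *)((uintptr_t)prev ^ cur->link);
}

/* Walk from the pair (prev, cur) to the end of the list, copying values into out.
   Returns the number of values written (at most cap). */
static size_t xlist_walk(struct xnode *prev, struct xnode *cur, int *out, size_t cap)
{
    size_t count = 0;
    while (cur != NULL && count < cap) {
        out[count++] = cur->value;
        struct xnode *next = xlist_step(prev, cur);
        prev = cur;
        cur = next;
    }
    return count;
}
```

Calling xlist_walk(&n[1], &n[2], ...) visits n[2] and everything after it, while xlist_walk(&n[2], &n[1], ...) uses the same two addresses in reversed order and walks towards the front of the list instead.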

This form of linked list may be inadvisable:
  • General-purpose debugging tools cannot follow the XOR chain, making debugging more difficult;
  • The price for the decrease in memory usage is an increase in code complexity, making maintenance more expensive;
  • Most garbage collection schemes do not work with data structures that do not contain literal pointers;
  • XOR of pointers is not defined in some contexts (e.g., the C language), although many languages provide some kind of type conversion between pointers and integers;
  • The pointers will be unreadable if one isn't traversing the list — for example, if the pointer to a list item was contained in another data structure;
  • While traversing the list you need to remember the address of the previously accessed node in order to calculate the next node's address.


Computer systems have increasingly cheap and plentiful memory, and storage overhead is not generally an overriding issue outside specialized embedded systems. Where it is still desirable to reduce the overhead of a linked list, unrolling provides a more practical approach (as well as other advantages, such as increasing cache performance and speeding random access).

Features

  • Given only one list item, one cannot immediately obtain the addresses of the other elements of the list.
  • Two XOR operations suffice to do the traversal from one item to the next, and the same instructions suffice in both directions. Consider a list with items {…B C D…}, with R1 and R2 being registers containing, respectively, the address of the current list item (say C) and the XOR of the current address with the previous address (say C⊕D). Cast as System/360 instructions:


    X  R2,Link    R2 <- C⊕D ⊕ B⊕D (i.e. B⊕C, "Link" being the link field
                  in the current record, containing B⊕D)
    XR R1,R2      R1 <- C ⊕ B⊕C (i.e. B, voilà: the next record)

  • End of list is signified by imagining a list item at address zero placed adjacent to an end point, as in {0 A B C…}. The link field at A would be 0⊕B. An additional instruction is needed in the above sequence, after the two XOR operations, to detect a zero result when developing the address of the current item.
  • A list end point can be made reflective by setting its link field to zero. A zero link is a mirror. (The XOR of the left and right neighbor addresses, being the same, is zero.)
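
The two-instruction step, the end-of-list test and the mirror behaviour can be tried out in C. The sketch below is only an illustration under stated assumptions: record, cursor and cursor_advance are hypothetical names, the fields r1 and r2 stand in for the registers R1 and R2, and addresses are held as uintptr_t integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record layout; only the link field matters here. */
struct record {
    uintptr_t link;   /* address(left neighbour) XOR address(right neighbour) */
};

/* Traversal state mirroring the two registers in the text. */
struct cursor {
    uintptr_t r1;     /* address of the current record */
    uintptr_t r2;     /* current address XOR previous address */
};

/* Advance by one record.
   Returns 0 if the new current address is zero (end of list), 1 otherwise. */
static int cursor_advance(struct cursor *c)
{
    assert(c->r1 != 0);
    const struct record *cur = (const struct record *)c->r1;
    c->r2 ^= cur->link;   /* X  R2,Link : R2 becomes current XOR next      */
    c->r1 ^= c->r2;       /* XR R1,R2   : R1 becomes the next address      */
    return c->r1 != 0;    /* the additional end-of-list test               */
}
```

For a list {0 A B C 0}, starting with r1 = C and r2 = C⊕0 and advancing three times visits B, then A, then reports the end. If A's link field is set to zero instead of 0⊕B, a cursor arriving at A from B is sent back to B on the next step, which is the mirror effect described above.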

Why does it work?

The key is the first operation, and the properties of XOR:
  • X⊕X=0
  • X⊕0=X
  • X⊕Y=Y⊕X
  • (X⊕Y)⊕Z=X⊕(Y⊕Z)


The R2 register always contains the XOR of the address of the current item C with the address of the predecessor item P: C⊕P. The Link fields in the records contain the XOR of the left and right neighbor addresses, say L⊕R. XOR of R2 (C⊕P) with the current link field (L⊕R) yields C⊕P⊕L⊕R.
  • If the predecessor was L, the P(=L) and L cancel out leaving C⊕R.
  • If the predecessor had been R, the P(=R) and R cancel, leaving C⊕L.


In each case, the result is the XOR of the current address with the next address. XOR of this with the current address in R1 leaves the next address. R2 is left with the requisite XOR pair of the (now) current address and the predecessor.

Variations

The underlying principle of the XOR linked list can be applied to any reversible binary operation. Replacing XOR by addition or subtraction gives slightly different, but largely equivalent, formulations:

Addition linked list


 ...  A        B         C         D         E  ...
         <->  A+C  <->  B+D  <->  C+E  <->

This kind of list has exactly the same properties as the XOR linked list, except that a zero link field is not a "mirror". The address of the next node in the list is given by subtracting the previous node's address from the current node's link field.
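
As a rough sketch of the addition variant in C: the names anode, alist_link and alist_step are hypothetical, sums are taken on uintptr_t so that overflow wraps around harmlessly, and an absent neighbour is assumed to count as address zero (with a null pointer assumed to convert to the integer 0 and back).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout for an addition linked list. */
struct anode {
    int value;
    uintptr_t link;   /* address(prev) + address(next), modulo 2^N */
};

/* Link nodes[0..n-1] into one list, in array order. */
static void alist_link(struct anode *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uintptr_t prev = (i > 0) ? (uintptr_t)&nodes[i - 1] : 0;
        uintptr_t next = (i + 1 < n) ? (uintptr_t)&nodes[i + 1] : 0;
        nodes[i].link = prev + next;   /* unsigned arithmetic wraps, never traps */
    }
}

/* Next node = link field of cur minus the address of the node we came from.
   Because addition is commutative, the same step serves both directions. */
static struct anode *alist_step(const struct anode *prev, const struct anode *cur)
{
    assert(cur != NULL);
    return (struct anode *)(cur->link - (uintptr_t)prev);
}
```

With four linked nodes n[0..3], alist_step(&n[1], &n[2]) yields &n[3] and alist_step(&n[3], &n[2]) yields &n[1].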

Subtraction linked list


 ...  A        B         C         D         E  ...
         <->  C-A  <->  D-B  <->  E-C  <->

This kind of list differs from the "traditional" XOR linked list in that the instruction sequence needed to traverse the list forwards is different from the sequence needed to traverse it in reverse. The address of the next node, going forwards, is given by adding the link field to the previous node's address; the address of the preceding node is given by subtracting the link field from the next node's address.

The subtraction linked list is also special in that the entire list can be relocated in memory without needing any patching of pointer values, since adding a constant offset to each address in the list will not require any changes to the values stored in the link fields. (See also Serialization.) This is an advantage over both XOR linked lists and traditional linked lists.
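
The relocation property can be demonstrated with a small C sketch. Everything named here (snode, slist_link, slist_forward, slist_backward, slist_sum) is hypothetical, and the nodes are assumed to sit in a single array so that links can be plain pointer differences. The end-of-list convention is also an assumption made for this sketch, not something the text specifies: a missing neighbour is replaced by the node itself, so reaching an end shows up as a step that returns the current node.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node layout for a subtraction linked list. */
struct snode {
    int value;
    ptrdiff_t link;   /* next - prev, counted in nodes within one array */
};

/* Link nodes[0..n-1] in array order.
   Assumed convention: a missing neighbour is replaced by the node itself. */
static void slist_link(struct snode *nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct snode *prev = (i > 0) ? &nodes[i - 1] : &nodes[i];
        struct snode *next = (i + 1 < n) ? &nodes[i + 1] : &nodes[i];
        nodes[i].link = next - prev;
    }
}

/* Going forwards: next = prev + link. */
static struct snode *slist_forward(struct snode *prev, struct snode *cur)
{
    return prev + cur->link;
}

/* Going backwards: prev = next - link. */
static struct snode *slist_backward(struct snode *next, struct snode *cur)
{
    return next - cur->link;
}

/* Sum all values from the head forwards; the walk starts with prev == cur == head. */
static int slist_sum(struct snode *head)
{
    assert(head != NULL);
    struct snode *prev = head, *cur = head;
    int sum = 0;
    for (;;) {
        sum += cur->value;
        struct snode *next = slist_forward(prev, cur);
        if (next == cur)      /* stepping returned cur: this was the last node */
            break;
        prev = cur;
        cur = next;
    }
    return sum;
}
```

After slist_link(a, 4), copying the whole array with memcpy(b, a, sizeof a) gives a second list that can be traversed from &b[0] straight away, because every link field holds a difference rather than an absolute address.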

Note about implementations in C:

The subtraction linked list also does not require casting C pointers to integers, provided the whole list structure is inside a single contiguous block of memory. In that case the subtraction of two C pointers yields an integer (of type ptrdiff_t). Note that on most platforms the maximum size of a contiguous block of memory will be considerably smaller than the total available memory, so large lists will typically not fit into a single contiguous block. This is not a problem as long as the platform provides the optional C99 type uintptr_t, because pointers can then be cast to uintptr_t and back again.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 