Linked list

 




 
 


In computer science, a linked list is one of the fundamental data structures, and can be used to implement other data structures. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references ("links") pointing to the next and/or previous nodes. The principal benefit of a linked list over a conventional array is that the order of the linked items may differ from the order in which the data items are stored in memory or on disk, so the list can be traversed in an order that is independent of its physical layout. A linked list is a self-referential datatype because each node contains a pointer or link to another datum of the same type. Linked lists permit insertion and removal of nodes at any point in the list in constant time, given a reference to the node at that point (or, in a singly-linked list, to its predecessor), but do not allow random access. Several different types of linked list exist: singly-linked lists, doubly-linked lists, and circularly-linked lists.

Linked lists can be implemented in most languages. Languages such as Lisp and Scheme have the data structure built in, along with operations to access the linked list. Procedural or object-oriented languages such as C, C++, and Java typically rely on mutable references to create linked lists.

History

Linked lists were developed in 1955-56 by Allen Newell, Cliff Shaw and Herbert Simon at RAND Corporation as the primary data structure for their Information Processing Language. IPL was used by the authors to develop several early artificial intelligence programs, including the Logic Theory Machine, the General Problem Solver, and a computer chess program. Reports on their work appeared in IRE Transactions on Information Theory in 1956, and several conference proceedings from 1957-1959, including Proceedings of the Western Joint Computer Conference in 1957 and 1958, and Information Processing (Proceedings of the first UNESCO International Conference on Information Processing) in 1959. The now-classic diagram consisting of blocks representing list nodes with arrows pointing to successive list nodes appears in "Programming the Logic Theory Machine" by Newell and Shaw in Proc. WJCC, February 1957. Newell and Simon were recognized with the ACM Turing Award in 1975 for having "made basic contributions to artificial intelligence, the psychology of human cognition, and list processing".

The problem of machine translation for natural language processing led Victor Yngve at Massachusetts Institute of Technology (MIT) to use linked lists as data structures in his COMIT programming language for computer research in the field of linguistics. A report on this language entitled "A programming language for mechanical translation" appeared in Mechanical Translation in 1958.

LISP, standing for list processor, was created by John McCarthy in 1958 while he was at MIT, and in 1960 he published its design in a paper in the Communications of the ACM, entitled "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I". One of LISP's major data structures is the linked list.

By the early 1960s, the utility of both linked lists and languages which use these structures as their primary data representation was well established. Bert Green of the MIT Lincoln Laboratory published a review article entitled "Computer languages for symbol manipulation" in IRE Transactions on Human Factors in Electronics in March 1961, which summarized the advantages of the linked list approach. A later review article, "A Comparison of list-processing computer languages" by Bobrow and Raphael, appeared in Communications of the ACM in April 1964.

Several operating systems developed by Technical Systems Consultants (originally of West Lafayette, Indiana, and later of Chapel Hill, North Carolina) used singly linked lists as file structures. A directory entry pointed to the first sector of a file, and succeeding portions of the file were located by traversing pointers. Systems using this technique included Flex (for the Motorola 6800 CPU), mini-Flex (same CPU), and Flex9 (for the Motorola 6809 CPU). A variant developed by TSC for, and marketed by, Smoke Signal Broadcasting in California used doubly linked lists in the same manner.

The TSS operating system, developed by IBM for the System 360/370 machines, used a doubly linked list for its file system catalog. The directory structure was similar to Unix, where a directory could contain files and/or other directories and extend to any depth. A utility named flea was created to fix file system problems after a crash, since modified portions of the file catalog were sometimes in memory when a crash occurred. Problems were detected by comparing the forward and backward links for consistency. If a forward link was corrupt and a backward link to the affected node was found, the forward link was set to the node with the backward link. A humorous comment in the source code where this utility was invoked stated "Everyone knows a flea collar gets rid of bugs in cats".

Types of linked lists


Linearly linked list


Singly-linked list
The simplest kind of linked list is a singly-linked list (or slist for short), in which each node has two fields: an information field and a link field. The link points to the next node in the list, and the link of the last node holds a null value.

A singly-linked list containing two values: the value of the current node and a link to the next node
A singly linked list's node is divided into two parts. The first part holds or points to information about the node, and the second part holds the address of the next node. A singly linked list can be traversed in only one direction.

Doubly-linked list
A more sophisticated kind of linked list is a doubly-linked list or two-way linked list. Each node has two links: one points to the previous node, or points to a null value or empty list if it is the first node; and one points to the next, or points to a null value or empty list if it is the final node.


A doubly-linked list containing three integer values: the value, the link forward to the next node, and the link backward to the previous node
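
As a rough illustration of the two links per node, here is a minimal Python sketch. The names `DNode`, `build`, `forward` and `backward` are hypothetical and chosen only for this example; `None` stands in for the null value at either end.

```python
class DNode:
    """Node of a doubly-linked list (hypothetical names for illustration)."""

    def __init__(self, value):
        self.value = value
        self.prev = None  # link backward; None for the first node
        self.next = None  # link forward; None for the final node


def build(values):
    """Link the values in order and return (first node, last node)."""
    first = last = None
    for v in values:
        node = DNode(v)
        if last is None:
            first = node
        else:
            last.next = node
            node.prev = last
        last = node
    return first, last


def forward(first):
    """Collect values by following the next links."""
    out, node = [], first
    while node is not None:
        out.append(node.value)
        node = node.next
    return out


def backward(last):
    """Collect values by following the prev links."""
    out, node = [], last
    while node is not None:
        out.append(node.value)
        node = node.prev
    return out


first, last = build([12, 99, 37])
print(forward(first))   # [12, 99, 37]
print(backward(last))   # [37, 99, 12]
```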


In some very low level languages, XOR-linking offers a way to implement doubly-linked lists using a single word for both links, although the use of this technique is usually discouraged.

Circularly-linked list

In a circularly-linked list, the first and final nodes are linked together. This can be done for both singly and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in either direction until you return to the original node. Viewed another way, circularly-linked lists can be seen as having no beginning or end. This type of list is most useful for managing buffers for data ingest, and in cases where you have one object in a list and wish to iterate through all other objects in the list in no particular order.

The pointer pointing to the whole list may be called the access pointer.


A circularly-linked list built on a singly linked list
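
The traversal rule for circular lists (start at any node and stop on returning to it) can be sketched in Python as follows. This is only an illustration for the singly linked case; `Node`, `make_circle` and `visit_all` are hypothetical names, and `make_circle` assumes a non-empty input.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = self  # a single node forms a circle with itself


def make_circle(values):
    """Build a circular singly-linked list; return the node for values[0].

    Assumes values is non-empty.
    """
    head = Node(values[0])
    tail = head
    for v in values[1:]:
        node = Node(v)
        tail.next = node
        tail = node
    tail.next = head  # close the circle
    return head


def visit_all(start):
    """Collect every value, beginning at any node, until we are back at it."""
    out = []
    node = start
    while True:
        out.append(node.value)
        node = node.next
        if node is start:
            break
    return out


head = make_circle(["a", "b", "c", "d"])
print(visit_all(head))            # ['a', 'b', 'c', 'd']
print(visit_all(head.next.next))  # ['c', 'd', 'a', 'b']
```

Note that the loop tests for the starting node rather than for a null link, since a circular list has no end marker.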


Sentinel nodes

Linked lists sometimes have a special dummy or sentinel node at the beginning and/or at the end of the list, which is not used to store data. Its purpose is to simplify or speed up some operations, by ensuring that every data node always has a previous and/or next node, and that every list (even one that contains no data elements) always has a "first" and "last" node. Lisp has such a design: the special value nil is used to mark the end of a 'proper' singly-linked list, or chain of cons cells as they are called. A list does not have to end in nil, but a list that did not would be termed 'improper'.

Applications of linked lists

Linked lists are used as a building block for many other data structures, such as stacks, queues and their variations.

The "data" field of a node can be another linked list. By this device, one can construct many linked data structures with lists; this practice originated in the Lisp programming language, where linked lists are a primary (though by no means the only) data structure, and is now a common feature of the functional programming style.

Sometimes, linked lists are used to implement associative arrays, and are in this context called association lists. This use of linked lists is easily outperformed by other data structures such as self-balancing binary search trees even on small data sets (see the discussion in associative array). However, sometimes a linked list is dynamically created out of a subset of nodes in such a tree, and used to more efficiently traverse that set.
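
To make the association-list idea concrete, here is a small Python sketch. It is one possible design rather than a standard API: the names `Pair`, `put` and `get` are hypothetical, and new entries are simply pushed on the front so that they shadow older entries with the same key.

```python
class Pair:
    """One cell of an association list (hypothetical names)."""

    def __init__(self, key, value, next=None):
        self.key = key
        self.value = value
        self.next = next


def put(alist, key, value):
    """Return a list with (key, value) in front; older entries are shadowed."""
    return Pair(key, value, alist)


def get(alist, key, default=None):
    """Linear search from the front: O(n) in the number of entries."""
    cell = alist
    while cell is not None:
        if cell.key == key:
            return cell.value
        cell = cell.next
    return default


alist = None
alist = put(alist, "x", 1)
alist = put(alist, "y", 2)
alist = put(alist, "x", 3)   # shadows the earlier "x"
print(get(alist, "x"), get(alist, "y"), get(alist, "z"))  # 3 2 None
```

The linear scan in `get` is the reason lookups cost O(n), which is where tree-based structures gain their advantage.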

Tradeoffs

As with most choices in computer programming and design, no method is well suited to all circumstances. A linked list data structure might work well in one case, but cause problems in another. This is a list of some of the common tradeoffs involving linked list structures. In general, if you have a dynamic collection, where elements are frequently being added and deleted, and the location of new elements added to the list is significant, then the benefits of a linked list increase.

Linked lists vs. arrays

                                                 Array    Linked list
Indexing                                         O(1)     O(n)
Inserting / deleting at end                      O(1)     O(1) or O(n)
Inserting / deleting in middle (with iterator)   O(n)     O(1)
Persistent                                       No       Singly yes
Locality                                         Great    Bad

Linked lists have several advantages over arrays. Arbitrarily many elements may be inserted into a linked list (limited only by available memory), while an array will eventually either fill up or need to be resized, an expensive operation that may not even be possible if memory is fragmented. Similarly, an array from which many elements are removed may become wastefully empty or need to be made smaller.

Further memory savings can be achieved, in certain cases, by sharing the same "tail" of elements among two or more lists; that is, the lists end in the same sequence of elements. In this way, one can add new elements to the front of the list while keeping a reference to both the new and the old versions, a simple example of a persistent data structure.
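
Tail sharing can be sketched in a few lines of Python. The names `Node`, `cons` and `to_pylist` are hypothetical and used only for illustration; the sketch assumes nodes are never mutated after creation, which is what makes the sharing safe.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def cons(value, rest):
    """Return a new list with value in front of rest; rest is not modified."""
    return Node(value, rest)


def to_pylist(node):
    """Copy the values into a Python list, for printing."""
    out = []
    while node is not None:
        out.append(node.value)
        node = node.next
    return out


shared = cons(2, cons(3, None))
a = cons(1, shared)   # a new version with 1 in front
b = cons(0, shared)   # another version, built from the same old list

print(to_pylist(a))       # [1, 2, 3]
print(to_pylist(b))       # [0, 2, 3]
print(to_pylist(shared))  # [2, 3]  (the old version is still intact)
print(a.next is b.next)   # True    (both lists share one tail)
```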

On the other hand, arrays allow random access, while linked lists allow only sequential access to elements. Singly-linked lists, in fact, can only be traversed in one direction. This makes linked lists unsuitable for applications where it's useful to look up an element by its index quickly, such as heapsort. Sequential access on arrays is also faster than on linked lists on many machines due to locality of reference and data caches. Linked lists receive almost no benefit from the cache, since consecutive nodes need not be adjacent in memory.

Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values. It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools.

A number of linked list variants exist that aim to ameliorate some of the above problems. Unrolled linked lists store several elements in each list node, increasing cache performance while decreasing memory overhead for references. CDR coding does both of these as well, by replacing references with the actual data referenced, which extends off the end of the referencing record.

A good example that highlights the pros and cons of using arrays vs. linked lists is a program that solves the Josephus problem. The Josephus problem is an election method that works by having a group of people stand in a circle. Starting at a predetermined person, you count n people around the circle. Once you reach the nth person, take them out of the circle and have the members close the circle. Then count n people around the circle again, starting from the next person, and repeat the process until only one person is left. That person wins the election. This shows the strengths and weaknesses of a linked list vs. an array. If you view the people as connected nodes in a circular linked list, the list can delete nodes easily, as it only has to rearrange the links between the neighbouring nodes. However, the linked list is poor at finding the next person to remove and must step through the list node by node until it reaches that person. An array, on the other hand, is poor at deleting nodes (or elements), as it cannot remove one element without individually shifting all the following elements up the list by one. However, it is exceptionally easy to find the nth person in the circle by directly referencing them by their position in the array.
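
The comparison can be sketched in Python with both representations side by side. This is only an illustration under stated assumptions: people are labelled 1 to `people`, counting starts at person 1, and the function names `josephus_linked` and `josephus_array` are hypothetical.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def josephus_linked(people, n):
    """Survivor's label using a circular singly-linked list."""
    head = Node(1)
    prev = head
    for i in range(2, people + 1):
        prev.next = Node(i)
        prev = prev.next
    prev.next = head            # close the circle; prev precedes person 1
    remaining = people
    while remaining > 1:
        for _ in range(n - 1):  # step node by node: O(n) to find the victim
            prev = prev.next
        prev.next = prev.next.next  # unlink the n-th person: O(1)
        remaining -= 1
    return prev.next.value


def josephus_array(people, n):
    """Survivor's label using a Python list as the array."""
    circle = list(range(1, people + 1))
    idx = 0
    while len(circle) > 1:
        idx = (idx + n - 1) % len(circle)  # direct indexing: O(1)
        circle.pop(idx)                    # shifts later elements: O(len)
    return circle[0]


print(josephus_linked(7, 3))  # 4
print(josephus_array(7, 3))   # 4
```

The comments mark where each representation pays its cost: the linked version walks to the victim but unlinks cheaply, while the array version indexes directly but shifts elements on removal.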

The list ranking problem concerns the efficient conversion of a linked list representation into an array. Although trivial for a conventional computer, solving this problem by a parallel algorithm is complicated and has been the subject of much research.

Doubly-linked vs. singly-linked

Doubly-linked lists require more space per node (unless one uses XOR-linking), and their elementary operations are more expensive; but they are often easier to manipulate because they allow sequential access to the list in both directions. In particular, one can insert or delete a node in a constant number of operations given only that node's address. A singly-linked list, by comparison, requires the previous node's address in order to insert or delete correctly. Some algorithms require access in both directions. On the other hand, doubly-linked lists do not allow tail-sharing, and cannot be used as persistent data structures.
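
The claim that a doubly-linked node can be deleted given only its own address can be sketched as follows. This is a minimal Python illustration, not a reference implementation; `DNode`, `DList` and their method names are hypothetical.

```python
class DNode:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class DList:
    def __init__(self):
        self.first = None
        self.last = None

    def append(self, node):
        """Link node at the end of the list."""
        node.prev = self.last
        node.next = None
        if self.last is None:
            self.first = node
        else:
            self.last.next = node
        self.last = node

    def remove(self, node):
        """Unlink node in O(1); no search for its predecessor is needed."""
        if node.prev is None:
            self.first = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self.last = node.prev
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def values(self):
        out, node = [], self.first
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


lst = DList()
nodes = [DNode(v) for v in "abcd"]
for n in nodes:
    lst.append(n)
lst.remove(nodes[2])   # a middle node
lst.remove(nodes[0])   # the first node
print(lst.values())    # ['b', 'd']
```

The price is visible in `remove`: two links must be repaired per operation, and both ends of the list need special handling.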

Circularly-linked vs. linearly-linked

Circular linked lists are most useful for describing naturally circular structures, and have the advantage of regular structure and being able to traverse the list starting at any point. They also allow quick access to the first and last records through a single pointer (the address of the last element). Their main disadvantage is the complexity of iteration, which has subtle special cases.

Sentinel nodes (header nodes, starter nodes)

Doubly linked lists can be structured without using a front and NULL pointer to the ends of the list. Instead, a node of object type T set with specified default values is used to indicate the "beginning" of the list. This node is known as a Sentinel node and is commonly referred to as a "header" node. Common searching and sorting algorithms are made less complicated through the use of a header node, as every element now points to another element, and never to NULL. The header node, like any other, contains a "next" pointer that points to what is considered by the linked list to be the first element. It also contains a "previous" pointer which points to the last element in the linked list. In this way, a doubly linked list structured around a Sentinel Node is circular.

The Sentinel node is defined as any other node in a doubly linked list would be, but the allocation of a front pointer is unnecessary, as the next and previous pointers of the Sentinel node initially point to the node itself. This is defined in the default constructor of the list.

    next = this; prev = this;

If the previous and next pointers point to the Sentinel node, the list is considered empty. Otherwise, if one or more elements is added, both pointers will point to another node, and the list will contain those elements.

Sentinel nodes may simplify certain list operations by ensuring that the next and/or previous nodes exist for every element. However, sentinel nodes use up extra space (especially in applications that use many short lists), and they may complicate other operations. To avoid the extra space requirement, the sentinel nodes can often be reused as references to the first and/or last node of the list.

The Sentinel node eliminates the need to keep track of a pointer to the beginning of the list, and also eliminates any errors that could result from the deletion of the first pointer, or from any accidental relocation.
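
A circular doubly linked list built around a sentinel node might be sketched in Python like this. The class and method names (`Node`, `SentinelList`, `push_front`, `push_back`) are hypothetical and chosen for illustration; the point is that no operation needs to test for a null link.

```python
class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = self   # a fresh node points to itself, as the
        self.prev = self   # sentinel does in an empty list


class SentinelList:
    def __init__(self):
        self.sentinel = Node()   # holds no data

    def is_empty(self):
        return self.sentinel.next is self.sentinel

    def _insert_between(self, node, before, after):
        node.prev, node.next = before, after
        before.next = node
        after.prev = node

    def push_front(self, value):
        self._insert_between(Node(value), self.sentinel, self.sentinel.next)

    def push_back(self, value):
        self._insert_between(Node(value), self.sentinel.prev, self.sentinel)

    def remove(self, node):
        # No None checks: every data node has real neighbours.
        node.prev.next = node.next
        node.next.prev = node.prev

    def values(self):
        out, node = [], self.sentinel.next
        while node is not self.sentinel:
            out.append(node.value)
            node = node.next
        return out


lst = SentinelList()
print(lst.is_empty())          # True
lst.push_back(2)
lst.push_back(3)
lst.push_front(1)
print(lst.values())            # [1, 2, 3]
lst.remove(lst.sentinel.next)  # remove the first data node
print(lst.values())            # [2, 3]
```

Here `sentinel.next` plays the role of the first element and `sentinel.prev` the last, so no separate front pointer is kept.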

Linked list operations


When manipulating linked lists in-place, care must be taken not to use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

Linearly-linked lists


Singly-linked lists

Our node data structure will have two fields. We also keep a variable firstNode which always points to the first node in the list, or is null for an empty list.

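record Node {
    data // The data being stored in the node
    next // A reference to the next node, null for last node
}

record List {
    Node firstNode // points to first node of list; null for empty list
}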

Traversal of a singly-linked list is simple, beginning at the first node and following each next link until we come to the end:

node := list.firstNode
while node not null
    (do something with the node's data)
    node := node.next

The following code inserts a node after an existing node in a singly linked list. The diagram shows how it works. Inserting a node before an existing one cannot be done directly; instead, one must locate it while keeping track of the previous node.

Singly Linked List Insert After
function insertAfter(Node node, Node newNode) // insert newNode after node
    newNode.next := node.next
    node.next    := newNode

Inserting at the beginning of the list requires a separate function. This requires updating firstNode.

function insertBeginning(List list, Node newNode) // insert node before current first node
    newNode.next   := list.firstNode
    list.firstNode := newNode

Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. The diagram demonstrates the former. To find and remove a particular node, one must again keep track of the previous element.

Singly Linked List Delete After
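function removeAfter(Node node) // remove node past this one
    obsoleteNode := node.next
    node.next := node.next.next
    destroy obsoleteNode

function removeBeginning(List list) // remove first node
    obsoleteNode := list.firstNode
    list.firstNode := list.firstNode.next // point past deleted node
    destroy obsoleteNode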

Notice that removeBeginning sets list.firstNode to null when removing the last node in the list.

Since we can't iterate backwards, efficient "insertBefore" or "removeBefore" operations are not possible.
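Finding a node and then removing it, or inserting before it, means remembering the link that leads to it. One way to sketch this in C is to walk a pointer to the previous link instead of a separate "previous node" variable. The function names remove_value and insert_before_value and the int payload are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/*
 * Unlink the first node holding `key` and return it (NULL if absent).
 * `link` always addresses the pointer that leads to the current node:
 * first the list head, then the `next` field of the previous node.
 */
static struct node *remove_value(struct node **head, int key)
{
    struct node **link = head;
    assert(head != NULL);
    while (*link != NULL && (*link)->data != key)
        link = &(*link)->next;
    if (*link == NULL)
        return NULL;
    struct node *found = *link;
    *link = found->next;      /* previous link now skips the node */
    found->next = NULL;
    return found;
}

/* Insert new_node before the first node holding `key`; returns 0 if absent. */
static int insert_before_value(struct node **head, int key, struct node *new_node)
{
    struct node **link = head;
    while (*link != NULL && (*link)->data != key)
        link = &(*link)->next;
    if (*link == NULL)
        return 0;
    new_node->next = *link;
    *link = new_node;
    return 1;
}
```

Because the head pointer is treated like any other link, removing or inserting at the front needs no separate code path. Both operations still take linear time, since the list must be searched from the start.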

Appending one linked list to another can be inefficient unless a reference to the tail is kept as part of the List structure, because we must traverse the entire first list in order to find the tail, and then append the second list to this. Thus, if two linearly-linked lists are each of length n, list appending has asymptotic time complexity of O(n). In the Lisp family of languages, list appending is provided by the append procedure.
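If the list structure does keep a reference to its tail, appending no longer needs a traversal. The following C sketch shows one way this could look; the first and last fields and the function names list_push_back and list_append are assumptions for the example, not part of the pseudocode above.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/* List header that remembers both ends; both are NULL when empty. */
struct list {
    struct node *first;
    struct node *last;
};

/* Add one node at the tail in constant time. */
static void list_push_back(struct list *l, struct node *n)
{
    assert(l != NULL && n != NULL);
    n->next = NULL;
    if (l->last == NULL)
        l->first = n;
    else
        l->last->next = n;
    l->last = n;
}

/* Splice all of src onto the end of dst in constant time; src becomes empty. */
static void list_append(struct list *dst, struct list *src)
{
    if (src->first == NULL)
        return;
    if (dst->last == NULL)
        dst->first = src->first;
    else
        dst->last->next = src->first;
    dst->last = src->last;
    src->first = src->last = NULL;
}
```

Note that this version moves the nodes of the second list rather than copying them, so the second list is emptied.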

Many of the special cases of linked list operations can be eliminated by including a dummy element at the front of the list. This ensures that there are no special cases for the beginning of the list and renders both insertBeginning and removeBeginning unnecessary. In this case, the first useful data in the list will be found at list.firstNode.next.
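A dummy first element might be sketched in C as follows. Here the dummy node is embedded in the list structure (an assumption of this example; the text above reaches it through list.firstNode), and the names list_init, insert_after and remove_after are illustrative.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/* The list owns a dummy node; real data starts at dummy.next. */
struct list {
    struct node dummy;
};

static void list_init(struct list *l)
{
    l->dummy.next = NULL;
}

/* Works for every position, including the front (pass &l->dummy). */
static void insert_after(struct node *node, struct node *new_node)
{
    assert(node != NULL && new_node != NULL);
    new_node->next = node->next;
    node->next = new_node;
}

/* Removes and returns the node after `node`, or NULL if there is none. */
static struct node *remove_after(struct node *node)
{
    struct node *obsolete = node->next;
    if (obsolete != NULL)
        node->next = obsolete->next;
    return obsolete;
}
```

Inserting at the beginning is simply insert_after(&l.dummy, n), and removing the first element is remove_after(&l.dummy).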

Doubly-linked lists

With doubly-linked lists there are even more pointers to update, but also less information is needed, since we can use backwards pointers to observe preceding elements in the list. This enables new operations, and eliminates special-case functions. We will add a prev field to our nodes, pointing to the previous element, and a lastNode field to our list structure which always points to the last node in the list. Both list.firstNode and list.lastNode are null for an empty list.

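record Node {
    data // The data being stored in the node
    next // A reference to the next node; null for last node
    prev // A reference to the previous node; null for first node
}

record List {
    Node firstNode // points to first node of list; null for empty list
    Node lastNode  // points to last node of list; null for empty list
}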

Iterating through a doubly linked list can be done in either direction. In fact, direction can change many times, if desired.

Forwards

node := list.firstNode
while node ≠ null
    node := node.next

Backwards

node := list.lastNode
while node ≠ null
    node := node.prev

These symmetric functions add a node either after or before a given node, with the diagram demonstrating after:

Doubly Linked List Insert After
function insertAfter(List list, Node node, Node newNode)
    newNode.prev := node
    newNode.next := node.next
    if node.next = null
        list.lastNode := newNode
    else
        node.next.prev := newNode
    node.next := newNode

function insertBefore(List list, Node node, Node newNode)
    newNode.prev := node.prev
    newNode.next := node
    if node.prev = null
        list.firstNode := newNode
    else
        node.prev.next := newNode
    node.prev := newNode

We also need a function to insert a node at the beginning of a possibly-empty list:

function insertBeginning(List list, Node newNode)
    if list.firstNode = null
        list.firstNode := newNode
        list.lastNode  := newNode
        newNode.prev := null
        newNode.next := null
    else
        insertBefore(list, list.firstNode, newNode)

A symmetric function inserts at the end:

function insertEnd(List list, Node newNode)
    if list.lastNode = null
        insertBeginning(list, newNode)
    else
        insertAfter(list, list.lastNode, newNode)

Removing a node is easier, only requiring care with the firstNode and lastNode:

function remove(List list, Node node)
    if node.prev = null
        list.firstNode := node.next
    else
        node.prev.next := node.next
    if node.next = null
        list.lastNode := node.prev
    else
        node.next.prev := node.prev
    destroy node

One subtle consequence of this procedure is that deleting the last element of a list sets both firstNode and lastNode to null, and so it handles removing the last node from a one-element list correctly. Notice that we also don't need separate "removeBefore" or "removeAfter" methods, because in a doubly-linked list we can just use "remove(node.prev)" or "remove(node.next)" where these are valid.

Circularly-linked list


Circularly-linked lists can be either singly or doubly linked. In a circularly linked list, all nodes are linked in a continuous circle, without using null. For lists with a front and a back (such as a queue), one stores a reference to the last node in the list. The next node after the last node is the first node. Elements can be added to the back of the list and removed from the front in constant time.

Both types of circularly-linked lists benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing firstNode and lastNode, although if the list may be empty we need a special representation for the empty list, such as a lastNode variable which points to some node in the list or is null if it's empty; we use such a lastNode here. This representation significantly simplifies adding and removing nodes with a non-empty list, but empty lists are then a special case.
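The queue arrangement described above, where only a reference to the last node is stored, can be sketched for a singly linked circular list in C. The names cqueue, cqueue_push_back and cqueue_pop_front are assumptions for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* A queue kept as a circular list: only the last node is stored,
 * and last->next is the first node. last is NULL when empty. */
struct cqueue {
    struct node *last;
};

/* Add n at the back in constant time. */
static void cqueue_push_back(struct cqueue *q, struct node *n)
{
    assert(q != NULL && n != NULL);
    if (q->last == NULL) {
        n->next = n;                 /* single node points to itself */
    } else {
        n->next = q->last->next;     /* new back points to the front */
        q->last->next = n;
    }
    q->last = n;
}

/* Remove and return the front node in constant time (NULL if empty). */
static struct node *cqueue_pop_front(struct cqueue *q)
{
    if (q->last == NULL)
        return NULL;
    struct node *front = q->last->next;
    if (front == q->last)
        q->last = NULL;              /* list becomes empty */
    else
        q->last->next = front->next;
    front->next = NULL;
    return front;
}
```

The empty list is the only special case, which matches the trade-off noted above.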

Doubly-circularly-linked lists
Assuming that someNode is some node in a non-empty list, this code iterates through that list starting with someNode (any node will do):

Forwards

node := someNode
do
    do something with node.value
    node := node.next
while node ≠ someNode

Backwards

node := someNode
do
    do something with node.value
    node := node.prev
while node ≠ someNode

Notice the postponing of the test to the end of the loop. This is important for the case where the list contains only the single node someNode.

This simple function inserts a node into a doubly-linked circularly-linked list after a given element:

function insertAfter(Node node, Node newNode)
    newNode.next := node.next
    newNode.prev := node
    node.next.prev := newNode
    node.next      := newNode

To do an "insertBefore", we can simply "insertAfter(node.prev, newNode)". Inserting an element in a possibly empty list requires a special function:

function insertEnd(List list, Node node)
    if list.lastNode = null
        node.prev := node
        node.next := node
    else
        insertAfter(list.lastNode, node)
    list.lastNode := node

To insert at the beginning we simply "insertAfter(list.lastNode, node)". Finally, removing a node must deal with the case where the list empties:

function remove(List list, Node node)
    if node.next = node
        list.lastNode := null
    else
        node.next.prev := node.prev
        node.prev.next := node.next
        if node = list.lastNode
            list.lastNode := node.prev
    destroy node

As in doubly-linked lists, "removeAfter" and "removeBefore" can be implemented with "remove(list, node.next)" and "remove(list, node.prev)".

For example, in a circular list the next link of node 'tom' points to node 'jerry', and so on up to the last node, 'fix'. The node 'fix' is in turn the previous node of 'tom', so following the links loops around the list.

Linked lists using arrays of nodes


Languages that do not support any type of reference can still create links by replacing pointers with array indices. The approach is to keep an array of records, where each record has integer fields indicating the index of the next (and possibly previous) node in the array. Not all nodes in the array need be used. If records are not supported as well, parallel arrays can often be used instead.

As an example, consider the following linked list record that uses arrays instead of pointers:

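record Entry {
    integer next    // index of next entry in array
    integer prev    // index of previous entry (if doubly linked)
    string name
    real balance
}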

By creating an array of these structures, and an integer variable to store the index of the first element, a linked list can be built:

integer listHead
Entry Records[1000]

Links between elements are formed by placing the array index of the next (or previous) cell into the Next or Prev field within a given element. For example:

Index          Next   Prev   Name                Balance
0              1      4      Jones, John         123.45
1              -1     0      Smith, Joseph       234.56
2 (listHead)   4      -1     Adams, Adam         0.00
3                            Ignore, Ignatius    999.99
4              0      2      Another, Anita      876.54
5
6
7

In the above example, listHead would be set to 2, the location of the first entry in the list. Notice that entries 3 and 5 through 7 are not part of the list. These cells are available for any additions to the list. By creating a ListFree integer variable, a free list could be created to keep track of what cells are available. If all entries are in use, the size of the array would have to be increased or some elements would have to be deleted before new entries could be stored in the list.

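The following code would traverse the list and display names and account balances:

i := listHead
while i ≥ 0 // loop through the list
    print i, Records[i].name, Records[i].balance // print entry
    i := Records[i].next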
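As a rough C sketch of the same idea, the table above can be stored in an array of structures, with the unused cells chained into a free list. The capacity of 8, the NIL constant, and the names list_free, free_list_init, cell_alloc and list_print are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

#define CAPACITY 8
#define NIL (-1)              /* end-of-list marker, as in the table */

struct entry {
    int next;                 /* index of next entry, or NIL */
    int prev;                 /* index of previous entry, or NIL */
    const char *name;
    double balance;
};

static struct entry records[CAPACITY] = {
    [0] = {1, 4, "Jones, John", 123.45},
    [1] = {NIL, 0, "Smith, Joseph", 234.56},
    [2] = {4, NIL, "Adams, Adam", 0.00},
    [3] = {NIL, NIL, "Ignore, Ignatius", 999.99},   /* not linked in */
    [4] = {0, 2, "Another, Anita", 876.54},
};
static int list_head = 2;
static int list_free = NIL;   /* head of the free list (hypothetical name) */

/* Chain the unused cells 3, 5, 6, 7 together through their next fields. */
static void free_list_init(void)
{
    records[3].next = 5;
    records[5].next = 6;
    records[6].next = 7;
    records[7].next = NIL;
    list_free = 3;
}

/* Take a cell from the free list; returns NIL when the array is full. */
static int cell_alloc(void)
{
    int i = list_free;
    if (i != NIL)
        list_free = records[i].next;
    return i;
}

/* Walk the list from list_head, printing each entry; returns the count. */
static int list_print(void)
{
    int count = 0;
    for (int i = list_head; i >= 0; i = records[i].next) {
        assert(i < CAPACITY);
        printf("%d %s %.2f\n", i, records[i].name, records[i].balance);
        count++;
    }
    return count;
}
```

The traversal visits cells 2, 4, 0 and 1 in that order, following the Next column rather than the array order.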

When faced with a choice, the advantages of this approach include:
  • The linked list is relocatable, meaning it can be moved about in memory at will, and it can also be quickly and directly serialized for storage on disk or transfer over a network.
  • Especially for a small list, array indexes can occupy significantly less space than a full pointer on many architectures.
  • Locality of reference can be improved by keeping the nodes together in memory and by periodically rearranging them, although this can also be done in a general store.
  • Naïve dynamic memory allocators can produce an excessive amount of overhead storage for each node allocated; almost no allocation overhead is incurred per node in this approach.
  • Seizing an entry from a pre-allocated array is faster than using dynamic memory allocation for each node, since dynamic memory allocation typically requires a search for a free memory block of the desired size.


This approach has one main disadvantage, however: it creates and manages a private memory space for its nodes. This leads to the following issues:
  • It increases the complexity of the implementation.
  • Growing a large array when it is full may be difficult or impossible, whereas finding space for a new linked list node in a large, general memory pool may be easier.
  • Adding elements to a dynamic array will occasionally (when it is full) unexpectedly take linear (O(n)) instead of constant time (although it is still amortized constant).
  • Using a general memory pool leaves more memory for other data if the list is smaller than expected or if many nodes are freed.
For these reasons, this approach is mainly used for languages that do not support dynamic memory allocation. These disadvantages are also mitigated if the maximum size of the list is known at the time the array is created.

Language support


Many programming languages such as Lisp and Scheme have singly linked lists built in. In many functional languages, these lists are constructed from nodes, each called a cons or cons cell. The cons has two fields: the car, a reference to the data for that node, and the cdr, a reference to the next node. Although cons cells can be used to build other data structures, this is their primary purpose.

In languages that support abstract data types or templates, linked list ADTs or templates are available for building linked lists. In other languages, linked lists are typically built using references together with records. Here is an example in C:

#include <stdio.h>   /* for printf */
#include <stdlib.h>  /* for malloc */

struct node;

struct node *list_add(struct node **p, int i);

void list_remove(struct node **p); /* remove head */

struct node **list_search(struct node **n, int i);

void list_print(struct node *n);

int main(void);
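The function bodies are not shown above. As a minimal sketch of how these functions might be written, assuming a node that holds a single int (the field names data and next are assumptions, as is the exact output format of list_print):

```c
#include <assert.h>
#include <stdio.h>   /* for printf */
#include <stdlib.h>  /* for malloc, free */

struct node {
    int data;            /* assumed payload: a single int */
    struct node *next;   /* next element, or NULL at the end */
};

/* Push a new node holding i onto the front of the list at *p. */
struct node *list_add(struct node **p, int i)
{
    assert(p != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = i;
    n->next = *p;
    *p = n;
    return n;
}

/* Remove the node at *p (the head, or any node found by list_search). */
void list_remove(struct node **p)
{
    if (p != NULL && *p != NULL) {
        struct node *n = *p;
        *p = n->next;
        free(n);
    }
}

/* Return the address of the link to the first node holding i, or NULL. */
struct node **list_search(struct node **n, int i)
{
    while (*n != NULL) {
        if ((*n)->data == i)
            return n;
        n = &(*n)->next;
    }
    return NULL;
}

/* Print every element in order. */
void list_print(struct node *n)
{
    if (n == NULL)
        printf("list is empty\n");
    for (; n != NULL; n = n->next)
        printf("%d\n", n->data);
}
```

Because list_search returns the address of the link that leads to the matching node, its result can be passed straight to list_remove to delete a node from the middle of the list.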



Internal and external storage


When constructing a linked list, one is faced with the choice of whether to store the data of the list directly in the linked list nodes, called internal storage, or merely to store a reference to the data, called external storage. Internal storage has the advantage of making access to the data more efficient, requiring less storage overall, having better locality of reference, and simplifying memory management for the list (its data is allocated and deallocated at the same time as the list nodes).

External storage, on the other hand, has the advantage of being more generic, in that the same data structure and machine code can be used for a linked list no matter what the size of the data is. It also makes it easy to place the same data in multiple linked lists. Although with internal storage the same data can be placed in multiple lists by including multiple next references in the node data structure, it would then be necessary to create separate routines to add or delete cells based on each field. It is possible to create additional linked lists of elements that use internal storage by using external storage, and having the cells of the additional linked lists store references to the nodes of the linked list containing the data.

In general, if a set of data structures needs to be included in multiple linked lists, external storage is the best approach. If a set of data structures needs to be included in only one linked list, then internal storage is slightly better, unless a generic linked list package using external storage is available. Likewise, if different sets of data that can be stored in the same data structure are to be included in a single linked list, then internal storage would be fine.

Another approach that can be used with some languages involves having different data structures that all share the same initial fields, including the next (and, for a doubly linked list, prev) references, in the same location. After defining separate structures for each type of data, a generic structure can be defined that contains the minimum amount of data shared by all the other structures and contained at the top (beginning) of the structures. Then generic routines can be created that use the minimal structure to perform linked list type operations, while separate routines handle the specific data. This approach is often used in message parsing routines, where several types of messages are received, but all start with the same set of fields, usually including a field for message type. The generic routines are used to add new messages to a queue when they are received, and remove them from the queue in order to process the message. The message type field is then used to call the correct routine to process the specific type of message.
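In C, the shared initial fields can be expressed by placing a common header structure first in each message structure. The sketch below is illustrative only; the message types, the structure names and the functions queue_put, queue_get and process are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum msg_type { MSG_TEXT, MSG_NUMBER };     /* hypothetical message types */

/* Minimal structure: the fields every message starts with. */
struct msg_header {
    struct msg_header *next;
    enum msg_type type;
};

/* Specific messages embed the header as their first member. */
struct text_msg   { struct msg_header hdr; const char *text; };
struct number_msg { struct msg_header hdr; int number; };

struct msg_queue { struct msg_header *first, *last; };

/* Generic routine: works on any message through its header. */
static void queue_put(struct msg_queue *q, struct msg_header *m)
{
    assert(q != NULL && m != NULL);
    m->next = NULL;
    if (q->last == NULL)
        q->first = m;
    else
        q->last->next = m;
    q->last = m;
}

/* Generic routine: take the oldest message, or NULL if the queue is empty. */
static struct msg_header *queue_get(struct msg_queue *q)
{
    struct msg_header *m = q->first;
    if (m != NULL) {
        q->first = m->next;
        if (q->first == NULL)
            q->last = NULL;
    }
    return m;
}

/* Dispatch on the type field; returns a value only to make the result visible. */
static int process(struct msg_header *m)
{
    switch (m->type) {
    case MSG_TEXT:   return ((struct text_msg *)m)->text[0];
    case MSG_NUMBER: return ((struct number_msg *)m)->number;
    }
    return -1;
}
```

The casts in process rely on the header being the first member of each message structure, so a pointer to the header is also a pointer to the enclosing message.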

Example of internal and external storage


Suppose you wanted to create a linked list of families and their members. Using internal storage, the structure might look like the following:

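record member { // member of a family
    member next
    (data fields describing the member)
}

record family { // the family itself
    family next
    (data fields describing the family)
    member members // head of list of members of this family
}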

To print a complete list of families and their members using internal storage, we could write:

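aFamily := Families // start at head of families list
while aFamily ≠ null // loop through list of families
    print information about family
    aMember := aFamily.members // get head of list of this family's members
    while aMember ≠ null // loop through list of members
        print information about member
        aMember := aMember.next
    aFamily := aFamily.next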

Using external storage, we would create the following structures:

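record node { // generic link structure
    node next
    pointer data // generic pointer for data at node
}

record member { // structure for family member
    (data fields describing the member)
}

record family { // structure for family
    (data fields describing the family)
    node members // head of list of members of this family
}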

To print a complete list of families and their members using external storage, we could write:

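famNode := Families // start at head of families list
while famNode ≠ null // loop through list of families
    aFamily := (family) famNode.data // extract family from node
    print information about family
    memNode := aFamily.members // get list of family members
    while memNode ≠ null // loop through list of members
        aMember := (member) memNode.data // extract member from node
        print information about member
        memNode := memNode.next
    famNode := famNode.next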

Notice that when using external storage, an extra step is needed to extract the record from the node and cast it into the proper data type. This is because both the list of families and the list of members within the family are stored in two linked lists using the same data structure (node), and this language does not have parametric types.
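The external storage variant might look like this in C, where the generic data reference is a void pointer. The field names first_name, age and last_name and the function print_families are hypothetical, chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Generic link structure: the node refers to its data, it does not contain it. */
struct node {
    struct node *next;
    void *data;
};

struct member { const char *first_name; int age; };          /* hypothetical fields */
struct family { const char *last_name; struct node *members; };

/* Print every family and its members; returns the number of members seen. */
static int print_families(const struct node *families)
{
    int members_seen = 0;
    for (const struct node *fn = families; fn != NULL; fn = fn->next) {
        const struct family *f = fn->data;      /* extract family from node */
        assert(f != NULL);
        printf("%s\n", f->last_name);
        for (const struct node *mn = f->members; mn != NULL; mn = mn->next) {
            const struct member *m = mn->data;  /* extract member from node */
            printf("  %s (%d)\n", m->first_name, m->age);
            members_seen++;
        }
    }
    return members_seen;
}
```

In C the conversion from void pointer to a structure pointer is implicit, but it is the same extraction step: the compiler cannot check that a given node really refers to a family or a member.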

As long as the number of families that a member can belong to is known at compile time, internal storage works fine. If, however, a member needed to be included in an arbitrary number of families, with the specific number known only at run time, external storage would be necessary.

Speeding up search


Finding a specific element in a linked list, even if it is sorted, normally requires O(n) time (linear search). This is one of the primary disadvantages of linked lists over other data structures. In addition to the variants discussed above, below are two simple ways to improve search time.

In an unordered list, one simple heuristic for decreasing average search time is the move-to-front heuristic, which simply moves an element to the beginning of the list once it is found. This scheme, handy for creating simple caches, ensures that the most recently used items are also the quickest to find again.
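A minimal C sketch of the move-to-front heuristic on a singly linked list follows. The function name find_move_to_front and the int key are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int key;
    struct node *next;
};

/*
 * Search for key; when found, unlink the node and reinsert it at the head
 * so that the next search for the same key stops at the first node.
 */
static struct node *find_move_to_front(struct node **head, int key)
{
    assert(head != NULL);
    for (struct node **link = head; *link != NULL; link = &(*link)->next) {
        if ((*link)->key == key) {
            struct node *found = *link;
            *link = found->next;     /* unlink from current position */
            found->next = *head;     /* relink in front of old head */
            *head = found;
            return found;
        }
    }
    return NULL;
}
```

A search for an element that is already first leaves the list unchanged, and an unsuccessful search still costs a full traversal.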

Another common approach is to "index" a linked list using a more efficient external data structure. For example, one can build a red-black tree or hash table whose elements are references to the linked list nodes. Multiple such indexes can be built on a single list. The disadvantage is that these indexes may need to be updated each time a node is added or removed (or at least, before that index is used again).

Related data structures


Both stacks and queues are often implemented using linked lists, and simply restrict the type of operations which are supported.

The skip list is a linked list augmented with layers of pointers for quickly jumping over large numbers of elements, and then descending to the next layer. This process continues down to the bottom layer, which is the actual list.

A binary tree can be seen as a type of linked list where the elements are themselves linked lists of the same nature. The result is that each node may include a reference to the first node of one or two other linked lists, which, together with their contents, form the subtrees below that node.

An unrolled linked list is a linked list in which each node contains an array of data values. This leads to improved cache performance, since more list elements are contiguous in memory, and reduced memory overhead, because less metadata needs to be stored for each element of the list.

A hash table may use linked lists to store the chains of items that hash to the same position in the hash table.

A heap shares some of the ordering properties of a linked list, but is almost always implemented using an array. Instead of references from node to node, the next and previous data indexes are calculated using the current data's index.
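For a binary heap stored in a 0-based array (an assumption of this sketch; other layouts use slightly different formulas), the index calculations could be written in C as follows. The function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Index arithmetic for a binary heap stored in a 0-based array. */
static size_t heap_parent(size_t i) { assert(i > 0); return (i - 1) / 2; }
static size_t heap_left(size_t i)   { return 2 * i + 1; }
static size_t heap_right(size_t i)  { return 2 * i + 2; }

/* Check the max-heap property: no element exceeds its parent. */
static int is_max_heap(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i] > a[heap_parent(i)])
            return 0;
    return 1;
}
```

No links are stored at all: the position of an element in the array determines which elements are its parent and children.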
