All Topics  
Self-balancing binary search tree

 

   Email Print
   Bookmark   Link






 

Self-balancing binary search tree



 
 
In computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
, a self-balancing binary search tree or height-balanced binary search tree is a binary search tree
Binary search tree

In computer science, a binary search tree is a binary tree data structurewhich has the following properties:* Each node has a distinct value....
 (BST) that attempts to keep its height, or the number of levels of nodes beneath the root, as small as possible at all times, automatically. It is one of the most efficient ways of implementing ordered lists and can be used for other data structures such as associative arrays and sets
Set (computer science)

In computer science, a set is a collection of certain values, without any particular Canonical order, and no repeated values. It corresponds with a finite set in mathematics....
.

Overview
Most operations on a binary search tree take time directly proportional to the tree's height, so it is desirable to keep the height small.






Discussion
Ask a question about 'Self-balancing binary search tree'
Start a new discussion about 'Self-balancing binary search tree'
Answer questions from other users
Full Discussion Forum



Encyclopedia


In computer science
Computer science

Computer science is the study of the theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems....
, a self-balancing binary search tree or height-balanced binary search tree is a binary search tree
Binary search tree

In computer science, a binary search tree is a binary tree data structurewhich has the following properties:* Each node has a distinct value....
 (BST) that attempts to keep its height, or the number of levels of nodes beneath the root, as small as possible at all times, automatically. It is one of the most efficient ways of implementing ordered lists and can be used for other data structures such as associative arrays and sets
Set (computer science)

In computer science, a set is a collection of certain values, without any particular Canonical order, and no repeated values. It corresponds with a finite set in mathematics....
.

Overview


Most operations on a binary search tree take time directly proportional to the tree's height, so it is desirable to keep the height small. Ordinary binary search tree
Binary search tree

In computer science, a binary search tree is a binary tree data structurewhich has the following properties:* Each node has a distinct value....
s have the primary disadvantage that they can attain very large heights in rather ordinary situations, such as when the keys are inserted in order. The result is a data structure similar to a linked list
Linked list

In computer science, a linked list is one of the fundamental data structures, and can be used to implement other data structures. It consists of a sequence of node s, each containing arbitrary data Field s and one or two reference s pointing to the next and/or previous nodes....
, making all operations on the tree expensive. If we know all the data ahead of time, we can keep the height small on average by adding values in a random order, but we don't always have this luxury, particularly in online algorithm
Online algorithm

In computer science, an online algorithm is one that can process its input piece-by-piece in a serial fashion, i.e. in the order that the input is fed to the algorithm, without having the entire input available from the start....
s.

Self-balancing binary trees solve this problem by performing transformations on the tree (such as tree rotation
Tree rotation

A tree rotation is an operation on a binary search tree that changes the structure without interfering with the order of the elements. A tree rotation moves one node up in the tree and one node down....
s) at key times, in order to reduce the height. Although a certain overhead
Overhead

Overhead may be:* Overhead , the ongoing operating costs of running a business* Engineering overhead, ancillary design features required by a component of a device...
 is involved, it is justified in the long run by ensuring fast execution of later operations.

The height must always be at most the ceiling
Floor function

In mathematics and computer science, the floor and ceiling function s map a real number to the next smallest or next largest integer. More precisely, floor is the largest integer not greater than x and ceiling is the smallest integer not less than x....
 of log2 n, since there are at most 2k nodes on the kth level; a complete or full binary tree has exactly this many levels. Balanced BSTs are not always so precisely balanced, since it can be expensive to keep a tree at minimum height at all times; instead, most algorithms keep the height within a constant factor of this lower bound.

Times for various operations in terms of number of nodes in the tree n:

For some implementations these times are worst-case, while for others they are amortized
Amortized analysis

In computer science, especially analysis of algorithms, amortized analysis finds the average running time per operation over a worst-case sequence of operations....
.

Implementations


Popular data structures implementing this type of tree include:

  • AA tree
    AA tree

    An AA tree in computer science is a form of Self-balancing binary search tree used for storing and retrieving ordered data efficiently. AA trees are named for Arne Andersson , their inventor....
  • AVL tree
    AVL tree

    In computer science, an AVL tree is a self-balancing binary search tree, and it is the first such data structure to be invented. In an AVL tree, the tree height of the two child node subtrees of any node differ by at most one; therefore, it is also said to be height-balanced tree....
  • Red-black tree
    Red-black tree

    A red-black tree is a type of self-balancing binary search tree, a data structure used in computer science, typically used to implement associative arrays....
  • Scapegoat tree
    Scapegoat tree

    In computer science, a scapegoat tree is a self-balancing binary search tree, invented by Igal Galperin and Ronald L. Rivest. It provides worst-case big O notation lookup time, and O amortized analysis insertion and deletion time....
  • Splay tree
    Splay tree

    A splay tree is a self-balancing binary search tree with the additional property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in big O notation amortized analysis time....
  • Treap
    Treap

    In computer science, a treap is a binary search tree that orders the nodes by adding a priority attribute to a node, as well as a key. The nodes are ordered so that the keys form a binary search tree and the priorities obey the...


Applications


Self-balancing binary search trees can be used in a natural way to construct and maintain ordered list
Ordered list

In HTML an ordered list .. is a HTML element#Lists for a list of items where each item is automatically prefixed by an indication of its position in the list....
s, such as priority queue
Priority queue

A priority queue is an abstract data type in computer programming that supports the following three operations:* InsertWithPriority: add an element to the Queue_ with an associated priority...
s.

They can also be used for associative array
Associative array

An associative array is an abstract data type composed of a Collection of unique keys and a collection of values, where each key is associated with one value ....
s; key-value pairs are simply inserted with an ordering based on the key alone. In this capacity, self-balancing BSTs have a number of advantages and disadvantages
Associative array

An associative array is an abstract data type composed of a Collection of unique keys and a collection of values, where each key is associated with one value ....
 over their main competitor, hash table
Hash table

In computer science, a hash table, or a hash map, is a data structure that associates Unique key with value .The primary operation that hash functions support efficiently is a lookup: given a key , find the corresponding value ....
s. Lookup is somewhat complicated in the case where the same key can be used multiple times.

Many algorithms can exploit self-balancing BSTs to achieve good worst-case bounds with very little effort. For example, if binary tree sort is done with a BST, we have a very simple-to-describe yet asymptotically optimal
Asymptotically optimal

In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor worse than the best possible algorithm....
 O(n log n) sorting algorithm (although such an algorithm has practical disadvantages due to bad cache behavior). Similarly, many algorithms in computational geometry
Computational geometry

Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems are also considered to be part of computational geometry....
 exploit variations on self-balancing BSTs to solve problems such as the line segment intersection
Line segment intersection

In computational geometry, the line segment intersection problem supplies a list of line segments in the plane and asks us to determine whether any two of them intersect, or cross....
 problem and the point location
Point location

The point location problem is a fundamental topic of computational geometry. It finds applications in areas that deal with processing geometrical data: computer graphics, geographic information systems , motion planning, and computer aided design ....
 problem efficiently.

Self-balancing BSTs are a flexible data structure, in that it's easy to extend them to efficiently record additional information or perform new operations. For example, one can record the number of nodes in each subtree having a certain property, allowing one to count the number of nodes in a certain key range with that property in O(log n) time. These extensions can be used, for example, to optimize database queries or other list-processing algorithms.

See also

  • DSW algorithm
    DSW algorithm

    The DSW algorithm, or in full Day/Stout/Warren algorithm, is a method for efficiently balancing binary search trees — that is, decreasing their height to Big-O notation nodes, where n is the total number of nodes....
  • Fusion tree
    Fusion tree

    A fusion tree is a tree data structure that implements an associative array with integer keys up to a fixed size; by exploiting the constant-time machine word multiplication operation available on many real processors, it is able to achieve all operations in...
  • Skip list
    Skip list

    A skip list is a probabilistic data structure, based on multiple parallel, sorted linked lists, with algorithmic efficiency comparable to a binary search tree ....
  • Sorting
    Sorting

    Sorting is any process of arranging items in some sequence and/or in different sets, and accordingly, it has two common, yet distinct meanings:...


External links