Reverse index
Encyclopedia
Database management systems provide multiple types of indexes
Index (information technology)
In computer science, an index can be:# an integer that identifies an array element# a data structure that enables sublinear-time lookup -Array element identifier:...

 to improve performance and data integrity across diverse application. Index types include b-tree
B-tree
In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children...

s, bitmaps
Bitmap Index
A bitmap index is a special kind of database index that uses bitmaps.Bitmap indexes have traditionally been considered to work well for data such as gender, which has a small number of distinct values, for example male and female, but many occurrences of those values. This would happen if, for...

, and r-tree
R-tree
R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both research and real-world applications...

s.

In database management systems, a reverse key index strategy reverses the key
Unique key
In relational database design, a unique key can uniquely identify each row in a table, and is closely related to the Superkey concept. A unique key comprises a single column or a set of columns. No two distinct rows in a table can have the same value in those columns if NULL values are not used...

 value before entering it in the index
Index (database)
A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space...

. E.g., the value 24538 becomes 83542 in the index. Reversing the key value is particularly useful for indexing data such as sequence numbers, where each new key value is greater than the prior value, i.e., values monotonically increase. Reverse key indexes have become particularly important in high volume transaction processing system
Transaction Processing System
A transaction processing system is a type of information system. TPSs collect, store, modify, and retrieve the transactions of an organization. A transaction is an event that generates or modifies data that is eventually stored in an information system. To be considered a transaction processing...

s because they reduce contention for index blocks
Block (data storage)
In computing , a block is a sequence of bytes or bits, having a nominal length . Data thus structured are said to be blocked. The process of putting data into blocks is called blocking. Blocking is used to facilitate the handling of the data-stream by the computer program receiving the data...

.

Creating data

Reversed key indexes use b-tree structures, but preprocess key values before inserting them. Simplifying, b-trees place similar values on a single index block, e.g., storing 24538 on the same block as 24539. This makes them efficient both for looking up a specific value and for finding values within a range. However if the application inserts values in sequence, each insert must have access to the newest block in the index in order to add the new value. If many users attempt to insert at the same time, they all must write to that block and have to get in line, slowing down the application. This is particularly a problem in clustered databases
Cluster (computing)
A computer cluster is a group of linked computers, working together closely thus in many respects forming a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks...

, which may require the block to be copied from one computer's memory to another's to allow the next user to perform their insert.

Reversing the key spreads similar new values across the entire index instead of concentrating them in any one leaf block. This means that 24538 appears on the same block as 14538 while 24539 goes to a different block, eliminating this cause of contention
Shared resource
In computing, a shared resource or network share is a device or piece of information on a computer that can be remotely accessed from another computer, typically via a local area network or an enterprise Intranet, transparently as if it were a resource in the local machine.Examples are shared file...

. (Since 14538 would have been created long before 24538, their inserts don't interfere with each other.)

Querying data

Reverse indexes are just as efficient as unreversed indexes for finding specific values, although they aren't helpful for range queries. Range queries are uncommon for artificial values such as sequence numbers. When searching the index, the query processor simply reverses the search target before looking it up.

Deleting data

Typically, applications delete data that is older on average before deleting newer data. Thus, data with lower sequence numbers generally go before those with higher values. As time passes, in standard b-tree
B-tree
In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children...

s, index blocks for lower values end up containing few values, with a commensurate increase in unused space, referred to as "rot". Rot not only wastes space, but slows query speeds, because a smaller fraction of a rotten index's blocks fit in memory at any one time. In a b-tree, if 14538 gets deleted, its index space remains empty. In a reverse index, if 14538 goes before 24538 arrives, 24538 can reuse 14538's space.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK