Grid file
Encyclopedia
In computer science
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...

, a grid file or bucket grid is a point access method which splits a space into a non-periodic grid where one or more cells of the grid refer to a small set of points. Grid files (a symmetric data structure
Data structure
In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks...

) provide an efficient method of storing these indexes on disk to perform complex data lookups.

It provides a grid of n-dimensions where n represents how many keys can be used to reference a single point.

Grid files do not contain any data themselves but instead contain references to the correct bucket
Bucket (computing)
In computing, the term bucket can have several meanings. It is used both as a live metaphor, and as a generally accepted technical term in some specialised areas. A bucket is most commonly a type of data buffer or a type of document in which data is divided into regions.-Features of a...

.

Uses

A grid file is usually used in cases where a single value can be referenced by multiple keys.

A grid file began being used because "traditional file structures that provide multikey access to records, for example, inverted files, are extensions of file structures originally designed for single-key access. They manifest various deficiencies in particular for multikey access to highly dynamic files."

In a traditional single dimensional data structure (e.g. hash
Associative array
In computer science, an associative array is an abstract data type composed of a collection of pairs, such that each possible key appears at most once in the collection....

), a search on a single criterion is usually very simple but searching for a second criterion can be much more complex.

Grid files represent a special kind of hashing, where the traditional hash is replaced by a grid directory.

Consensus Database

Consider a database containing data from a census. A single record represents a single household, and all records are grouped into buckets. All records in a bucket can be indexed by either their city (which is the same for all records in the bucket), and the streets in that city whose names begin with the same letter.

A grid file can be used to provide an efficient index for this structure, where records come in groupings of 26, each of them relating to street names in a city starting with one of the letters of the alphabet. This structure can be thought of as an array, table
Table (database)
In relational databases and flat file databases, a table is a set of data elements that is organized using a model of vertical columns and horizontal rows. A table has a specified number of columns, but can have any number of rows...

, or grid
Grid (spatial index)
In the context of a spatial index, a grid is a regular tessellation of a manifold or 2-D surface that divides it into a series of contiguous cells, which can then be assigned unique identifiers and used for spatial indexing purposes...

 with two dimensions which we will call the x and y axes.

One may consider the x-axis to be the city and the y-axis to be each of the letters in the alphabet, or alternatively, the first letter of each street.

Each record in this structure is known as a cell. Each cell will contain a pointer to the appropriate bucket in the database where the actual data is stored. An extra cell, or record header, may be required to store the name of the city. Other cells grouped with it will only need to contain the pointer to their respective bucket, since the first cell corresponds to street names beginning with "A", the second to "B", and so on.

The database can be further extended to contain a continent field to expand the census to other continents. This would cause records in the same bucket to correspond to households on a street beginning with the same letter, in the same city, in the same continent.

The cells in the grid file would then consist of a city header, and six (one for each continent, not including Antarctica) groupings of 26 cells relating to the streets with the same starting letter, in the same city, on the same continent and could now be thought of as a three-dimensional array.

Advantages

Since a single entry in the grid file contains pointers to all records indexed by the specified keys:
  • No special computations are required
  • Only the right records are retrieved
  • Can also be used for single search key queries
  • Easy to extend to queries on n search keys
  • Significant improvement in processing time for multiple-key queries
  • Has a two-disk-access upper bound for accessing data.

Disadvantages

However, because of the nature of the grid file, which gives it its advantages, there are also some disadvantages:
  • Imposes space overhead
  • Performance overhead on insertion and deletion

See also

  • Lattice graph
    Lattice graph
    The terms lattice graph, mesh graph, or grid graph refer to a number of categories of graphs whose drawing corresponds to some grid/mesh/lattice, i.e., its vertices correspond to the nodes of the mesh and its edges correspond to the ties between the nodes.-Square grid graph:A common type of a...

  • Grid (spatial index)
    Grid (spatial index)
    In the context of a spatial index, a grid is a regular tessellation of a manifold or 2-D surface that divides it into a series of contiguous cells, which can then be assigned unique identifiers and used for spatial indexing purposes...

  • Index (database)
    Index (database)
    A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of slower writes and increased storage space...

    , Quadtree
    Quadtree
    A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two dimensional space by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes. This...

    , Kd-tree
    Kd-tree
    In computer science, a k-d tree is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key...

    , UB-tree
    UB-tree
    The UB-tree as proposed by Rudolf Bayer and Volker Markl is a balanced tree for storing and efficiently retrieving multidimensional data. It is basically a B+ tree with records stored according to Z-order, also called Morton order...

    , R-tree
    R-tree
    R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both research and real-world applications...

    , range tree
    Range tree
    In computer science, a range tree is an ordered tree data structure to hold a list of points. It allows all points within a given range to be efficiently retrieved, and is typically used in two or higher dimensions....

    as alternatives.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK