Geohash
Encyclopedia
Geohash is a latitude
Latitude
In geography, the latitude of a location on the Earth is the angular distance of that location south or north of the Equator. The latitude is an angle, and is usually measured in degrees . The equator has a latitude of 0°, the North pole has a latitude of 90° north , and the South pole has a...

/longitude
Longitude
Longitude is a geographic coordinate that specifies the east-west position of a point on the Earth's surface. It is an angular measurement, usually expressed in degrees, minutes and seconds, and denoted by the Greek letter lambda ....

 geocode
Geocoding
Geocoding is the process of finding associated geographic coordinates from other geographic data, such as street addresses, or zip codes...

 system invented
by Gustavo Niemeyer when writing the web service at geohash.org, and
put into the public domain. It is a hierarchical spatial data structure
which subdivides space into buckets of grid
Grid (spatial index)
In the context of a spatial index, a grid is a regular tessellation of a manifold or 2-D surface that divides it into a series of contiguous cells, which can then be assigned unique identifiers and used for spatial indexing purposes...

 shape.

Geohashes offer properties like arbitrary precision and the possibility of
gradually removing characters from the end of the code to reduce its size
(and gradually lose precision).

As a consequence of the gradual precision degradation, nearby places will
often (but not always) present similar prefixes. Conversely, the longer a
shared prefix is, the closer the two places are.

Service

The purpose of the geohash.org service, launched in February 2008, is to offer short URL
Uniform Resource Locator
In computing, a uniform resource locator or universal resource locator is a specific character string that constitutes a reference to an Internet resource....

s which uniquely identify positions on the Earth
Earth
Earth is the third planet from the Sun, and the densest and fifth-largest of the eight planets in the Solar System. It is also the largest of the Solar System's four terrestrial planets...

, so that referencing them in email
Email
Electronic mail, commonly known as email or e-mail, is a method of exchanging digital messages from an author to one or more recipients. Modern email operates across the Internet or other computer networks. Some early email systems required that the author and the recipient both be online at the...

s, forums
Internet forum
An Internet forum, or message board, is an online discussion site where people can hold conversations in the form of posted messages. They differ from chat rooms in that messages are at least temporarily archived...

, and website
Website
A website, also written as Web site, web site, or simply site, is a collection of related web pages containing images, videos or other digital assets. A website is hosted on at least one web server, accessible via a network such as the Internet or a private local area network through an Internet...

s is more convenient.

To obtain the Geohash, the user provides an address to be geocoded
Geocoding
Geocoding is the process of finding associated geographic coordinates from other geographic data, such as street addresses, or zip codes...

, or latitude
Latitude
In geography, the latitude of a location on the Earth is the angular distance of that location south or north of the Equator. The latitude is an angle, and is usually measured in degrees . The equator has a latitude of 0°, the North pole has a latitude of 90° north , and the South pole has a...

 and longitude
Longitude
Longitude is a geographic coordinate that specifies the east-west position of a point on the Earth's surface. It is an angular measurement, usually expressed in degrees, minutes and seconds, and denoted by the Greek letter lambda ....

 coordinates, in a single input box (most commonly used formats for latitude and longitude pairs are accepted), and performs the request.

Besides showing the latitude and longitude corresponding to the given Geohash, users who navigate to a Geohash at geohash.org are also presented with an embedded map, and may download a GPX file, or transfer the waypoint directly to certain GPS receivers. Links are also provided to external sites that may provide further details around the specified
location.

For example, the coordinate pair 57.64911,10.40744 (near the tip of the peninsula
Peninsula
A peninsula is a piece of land that is bordered by water on three sides but connected to mainland. In many Germanic and Celtic languages and also in Baltic, Slavic and Hungarian, peninsulas are called "half-islands"....

 of Jutland
Jutland
Jutland , historically also called Cimbria, is the name of the peninsula that juts out in Northern Europe toward the rest of Scandinavia, forming the mainland part of Denmark. It has the North Sea to its west, Kattegat and Skagerrak to its north, the Baltic Sea to its east, and the Danish–German...

, in Denmark
Denmark
Denmark is a Scandinavian country in Northern Europe. The countries of Denmark and Greenland, as well as the Faroe Islands, constitute the Kingdom of Denmark . It is the southernmost of the Nordic countries, southwest of Sweden and south of Norway, and bordered to the south by Germany. Denmark...

) produces a hash of u4pruydqqvj, which can be used in the URL http://geohash.org/u4pruydqqvj

Uses

The main usages of Geohashes are
  • as a unique identifier.
  • represent point data e.g. in databases.


Geohashes have also been proposed to be used for geotagging
GeoTagging
Geotagging is the process of adding geographical identification metadata to various media such as a geotagged photograph or video, websites, SMS messages, QR Codes or RSS feeds and is a form of geospatial metadata...

.

When used in a database, the structure of geohashed data has two advantages. First, data indexed by geohash will have all points for a given rectangular area in contiguous slices (the number of slices depends on the precision required and the presence of geohash "fault lines"). This is especially useful in database systems where queries on a single index are much easier or faster than multiple-index queries. Second, this index structure can be used for a quick-and-dirty proximity search - the closest points are often among the closest geohashes.

Example

Using the hash ezs42 as an example, here is how it is decoded into a decimal latitude and longitude

Decode from base 32

The first step is decoding it from base 32 using the following character map:
Decimal 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Base 32 0 1 2 3 4 5 6 7 8 9 b c d e f g
 
Decimal 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Base 32 h j k m n p q r s t u v w x y z


This operation results in the bit
Bit
A bit is the basic unit of information in computing and telecommunications; it is the amount of information stored by a digital device or other physical system that exists in one of two possible distinct states...

s 01101 11111 11000 00100 00010. Assuming that counting starts at 0 in the left side, the even bits are taken for the longitude code (0111110000000), while the odd bits are taken for the latitude code (101111001001).

Decode binary to decimal

Each binary code is then used in a series of divisions, considering one bit at a time,
again from the left to the right side. For the latitude value, the interval
-90 to +90 is divided by 2, producing two intervals: -90 to 0, and 0 to +90. Since
the first bit is 1, the higher interval is chosen, and becomes the current interval.
The procedure is repeated for all bits in the code. Finally, the latitude value is
the center of the resulting interval. Longitudes are processed in an equivalent way,
keeping in mind that the initial interval is -180 to +180.

Finishing the procedure should yield approximately latitude 42.6 and
longitude -5.6.

Worked example

Here's a worked example decoding 101111001001 into 42.6. To start with, we know the latitude is somewhere in the range -90 to 90. With no bits, we'd have to guess the latitude was 0, giving us an error of ±90. With one bit, we can decide whether its in the range -90 to 0, or 0 to 90. The first bit is high, so we know our latitude is somewhere between 0 and 90. Without any more bits, we'd guess the latitude was 45, giving us an error of ±45

Each subsequent bit halves this error. This table shows the effect of each bit. At each stage, the relevant half of the range is highlighted in green - a low bit selects the lower range, a high bit the upper range.

The last column shows the latitude, simply the mean value of the range. Each subsequent bit makes this value more precise.
bit min mid max val err
1 -90.000 0.000 90.000 45.000 45.000
0 0.000 45.000 90.000 22.500 22.500
1 0.000 22.500 45.000 33.750 11.250
1 22.500 33.750 45.000 39.375 5.625
1 33.750 39.375 45.000 42.188 2.813
1 39.375 42.188 45.000 43.594 1.406
0 42.188 43.594 45.000 42.891 0.703
0 42.188 42.891 43.594 42.539 0.352
1 42.188 42.539 42.891 42.715 0.176
0 42.539 42.715 42.891 42.627 0.088
0 42.539 42.627 42.715 42.583 0.044
1 42.539 42.583 42.627 42.605 0.022


(The numbers in the above table have been rounded to 3 decimal places for clarity)

Final rounding should be done carefully in a way that


So if rounding 42.605 to 42.61 or 42.6 is correct, rounding to 43 it is not.
geohash length lat bits lng bits lat error lng error km error
1 2 3 ±23 ±23 ±2500
2 5 5 ± 2.8 ± 5.6 ±630
3 7 8 ± 0.70 ± 0.7 ±78
4 10 10 ± 0.087 ± 0.18 ±20
5 12 13 ± 0.022 ± 0.022 ±2.4
6 15 15 ± 0.0027 ± 0.0055 ±0.61
7 17 18 ±0.00068 ±0.00068 ±0.076
8 20 20 ±0.000085 ±0.00017 ±0.019

Limitations

One limitation of the Geohash algorithm is in attempting to utilize it to find points in proximity to each other based on a common prefix. Edge case
Edge case
An edge case is a problem or situation that occurs only at an extreme operating parameter.For example, a stereo speaker might distort audio when played at its maximum rated volume, even in the absence of other extreme settings or conditions.An edge case can be expected or unexpected...

 locations close to each other but on opposite sides of the Equator
Equator
An equator is the intersection of a sphere's surface with the plane perpendicular to the sphere's axis of rotation and containing the sphere's center of mass....

 or a meridian
Meridian (geography)
A meridian is an imaginary line on the Earth's surface from the North Pole to the South Pole that connects all locations along it with a given longitude. The position of a point along the meridian is given by its latitude. Each meridian is perpendicular to all circles of latitude...

 can result in Geohash codes with no common prefix.

Secondly a geohash essentially defines a bounding box within which a location lies, therefore two locations may be spatially very close but have different geohashes. In order to be useful to proximity searches, the surrounding eight geohashes of a geohash must be calculated and the locations matching these pulled out, therefore complicating potential usage in proximity search
Proximity search
In text processing, a proximity search looks for documents where two or more separately matching term occurrences are within a specified distance, where distance is the number of intermediate words or characters...

es.

Despite those issues, there are possible workarounds, and the algorithm has been successfully used in MongoDB to implement proximity searches.

License and patents

The Geohash geocode has been put in the public domain
Public domain
Works are in the public domain if the intellectual property rights have expired, if the intellectual property rights are forfeited, or if they are not covered by intellectual property rights at all...

 by its inventor in
the public announcement date, on February 26, 2008.

While comparable algorithms have been successfully
patented and
had copyright claimed upon, GeoHash is based on an entirely different algorithm and approach.

See also

  • Grid (spatial index)
    Grid (spatial index)
    In the context of a spatial index, a grid is a regular tessellation of a manifold or 2-D surface that divides it into a series of contiguous cells, which can then be assigned unique identifiers and used for spatial indexing purposes...

  • Morton number (number theory)
  • Natural Area Code
    Natural Area Code
    The Natural Area Code is a proprietary geocode system for identifying an area anywhere on the Earth, or a volume of space anywhere around the Earth...

  • Maidenhead Locator System
    Maidenhead Locator System
    The Maidenhead Locator System is a geographic coordinate system used by amateur radio operators. Dr. John Morris, G4ANB, originally devised the system, and a group of VHF managers, meeting in Maidenhead, England in 1980, adopted it...

  • Military grid reference system
    Military grid reference system
    The Military Grid Reference System is the geocoordinate standard used by NATO militaries for locating points on the earth. The MGRS is derived from the UTM grid system and the UPS grid system, but uses a different labeling convention...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK