# Geostatistics

Overview
Geostatistics is a branch of statistics focusing on spatial or spatiotemporal datasets. Developed originally to predict probability distributions of ore grades for mining operations, it is currently applied in diverse disciplines including petroleum geology, hydrogeology, hydrology, meteorology, oceanography, geochemistry, geometallurgy, geography, forestry, environmental control, landscape ecology, soil science, and agriculture (especially in precision farming). Geostatistics is applied in varied branches of geography, particularly those involving the spread of diseases (epidemiology), the practice of commerce and military planning (logistics), and the development of efficient spatial networks.
Geostatistical algorithms are incorporated in many places, including geographic information systems (GIS) and the R statistical environment.

## Background

Geostatistics is intimately related to interpolation methods, but extends far beyond simple interpolation problems. It consists of a collection of numerical and mathematical techniques for characterizing spatial phenomena. Geostatistical techniques rely on statistical models based on random function (or random variable) theory to model the uncertainty associated with spatial estimation and simulation.

A number of simpler interpolation methods and algorithms, such as inverse distance weighting, bilinear interpolation and nearest-neighbor interpolation, were already well known before geostatistics. Geostatistics goes beyond the interpolation problem by considering the studied phenomenon at unknown locations as a set of correlated random variables.
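
For comparison with the geostatistical methods discussed below, here is a minimal sketch of inverse distance weighting, one of the simpler deterministic interpolators mentioned above. The function name `idw_interpolate`, the `power` parameter and the sample data are illustrative assumptions, not part of any particular library. Note that the method returns a single value per location and says nothing about its uncertainty.

```python
import numpy as np

def idw_interpolate(xy_known, values, xy_query, power=2.0, eps=1e-12):
    """Inverse distance weighting (hypothetical helper for illustration).

    Each query value is a weighted mean of the known values,
    with weights proportional to 1 / distance**power.
    """
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_query = np.atleast_2d(np.asarray(xy_query, dtype=float))
    # Pairwise distances between query and known points: shape (n_query, n_known)
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    # Clamp distances to avoid division by zero when a query hits a data point
    w = 1.0 / np.maximum(d, eps) ** power
    w /= w.sum(axis=1, keepdims=True)  # normalize weights to sum to 1
    return w @ values

# Illustrative data: two measurements on a line, queried at the midpoint
known_xy = [[0.0, 0.0], [2.0, 0.0]]
known_values = [10.0, 20.0]
midpoint_estimate = idw_interpolate(known_xy, known_values, [1.0, 0.0])
```

Because both data points are equally distant from the query location, they receive equal weights and the estimate is their plain average.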

Let $Z(\mathbf{x})$ be the value of the variable of interest at a certain location $\mathbf{x}$. This value is unknown (e.g. temperature, rainfall, piezometric level, geological facies, etc.). Although there exists a value at location $\mathbf{x}$ that could be measured, geostatistics considers this value as random since it was not measured, or has not been measured yet. However, the randomness of $Z(\mathbf{x})$ is not complete, but defined by a cumulative distribution function (cdf) that depends on certain information that is known about the value $Z(\mathbf{x})$:

$$F(z, \mathbf{x}) = \operatorname{Prob}\{ Z(\mathbf{x}) \leq z \mid \text{information} \}$$

Typically, if the value of $Z$ is known at locations close to $\mathbf{x}$ (or in the neighborhood of $\mathbf{x}$), one can constrain the probability density function (pdf) of $Z(\mathbf{x})$ by this neighborhood: if a high spatial continuity is assumed, $Z(\mathbf{x})$ can only have values similar to the ones found in the neighborhood. Conversely, in the absence of spatial continuity $Z(\mathbf{x})$ can take any value. The spatial continuity of the random variables is described by a model of spatial continuity that can be either a parametric function in the case of variogram-based geostatistics, or have a non-parametric form when using other methods such as multiple-point simulation or pseudo-genetic techniques.
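
In variogram-based geostatistics, spatial continuity is usually first measured from the data before a parametric model is fitted. The sketch below computes the classical experimental semivariogram, i.e. half the mean squared difference between all pairs of values whose separation falls in a given lag bin. The function name, the binning convention (lag in the half-open interval `(lower, upper]`) and the sample data are assumptions made for illustration.

```python
import numpy as np

def experimental_semivariogram(coords, values, bin_edges):
    """Classical experimental semivariogram (illustrative sketch).

    For each lag bin, returns 0.5 * mean((z_i - z_j)**2) over all point
    pairs whose separation distance falls in that bin, plus the pair count.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # All unordered pairs (i < j)
    i, j = np.triu_indices(len(values), k=1)
    lags = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)       # NaN marks bins without any pair
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (lags > bin_edges[k]) & (lags <= bin_edges[k + 1])
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = 0.5 * sq_diff[in_bin].mean()
    return gamma, counts

# Illustrative data: three samples along a line
sample_coords = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
sample_values = [1.0, 3.0, 2.0]
gamma, counts = experimental_semivariogram(sample_coords, sample_values, [0.0, 1.5, 2.5])
```

Semivariance that stays low at short lags and grows with distance indicates high spatial continuity; a flat semivariogram indicates its absence.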

By applying a single spatial model on an entire domain, one makes the assumption that $Z$ is a stationary process. This means that the same statistical properties apply over the entire domain. Several geostatistical methods provide ways of relaxing this stationarity assumption.

In this framework, one can distinguish two modeling goals:
• 1) Estimating the value for $Z(\mathbf{x})$, typically by the expectation, the median or the mode of the pdf $f(z, \mathbf{x})$. This is usually denoted as an estimation problem.

• 2) Sampling from the entire probability density function $f(z, \mathbf{x})$ by actually considering each possible outcome of it at each location. This is generally done by creating several alternative maps of $Z$, called realizations. Consider a domain discretized in $N$ grid nodes (or pixels). Each realization is a sample of the complete $N$-dimensional joint distribution function

$$F(\mathbf{z}, \mathbf{x}) = \operatorname{Prob}\{ Z(\mathbf{x}_1) \leq z_1, Z(\mathbf{x}_2) \leq z_2, \ldots, Z(\mathbf{x}_N) \leq z_N \}$$

In this approach, the presence of multiple solutions to the interpolation problem is acknowledged. Each realization is considered as a possible scenario of what the real variable could be. All associated workflows then consider an ensemble of realizations, and consequently an ensemble of predictions, which allows for probabilistic forecasting. Therefore, geostatistics is often used to generate or update spatial models when solving inverse problems.
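
The idea of drawing realizations from a joint distribution over all grid nodes can be sketched as follows. This is a minimal, unconditional example under strong assumptions that are not taken from the text: the field is assumed Gaussian and stationary, its covariance is assumed exponential, and the sill, range, grid size and threshold are arbitrary illustrative values. Sampling uses a Cholesky factorization of the covariance matrix, which is only practical for small grids.

```python
import numpy as np

def exponential_covariance(h, sill=1.0, corr_range=10.0):
    """Assumed covariance model: C(h) = sill * exp(-h / corr_range)."""
    return sill * np.exp(-h / corr_range)

def simulate_realizations(coords, n_realizations, mean=0.0,
                          sill=1.0, corr_range=10.0, seed=0):
    """Draw unconditional Gaussian realizations at the given locations."""
    coords = np.asarray(coords, dtype=float)
    # N x N matrix of distances between all grid nodes
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    cov = exponential_covariance(h, sill, corr_range)
    # Small diagonal jitter keeps the factorization numerically stable
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(coords)))
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((len(coords), n_realizations))
    # Correlate the white noise; one row per realization
    return mean + (chol @ white).T

# Domain discretized into N = 20 * 20 = 400 grid nodes
gx, gy = np.meshgrid(np.arange(20.0), np.arange(20.0))
grid = np.column_stack([gx.ravel(), gy.ravel()])

realizations = simulate_realizations(grid, n_realizations=50)

# Ensemble summaries at each node
ensemble_mean = realizations.mean(axis=0)
ensemble_median = np.median(realizations, axis=0)
prob_exceed = (realizations > 1.0).mean(axis=0)  # P(Z > 1) estimated per node
```

Each row of `realizations` is one alternative map. Summaries taken across rows, such as the exceedance probability, are what make probabilistic forecasting possible. Conditioning the realizations on measured data requires additional steps that are not shown here.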

A number of methods exist for both geostatistical estimation and multiple-realization approaches. Several reference books provide a comprehensive overview of the discipline.

### Simulation

• Aggregation
• Disaggregation
• Turning bands
• Spectral simulation
• Sequential Gaussian simulation (SGS)
• Transition probabilities
• Markov chain geostatistics
• Markov mesh models
• Support vector machine
• Boolean simulation
• Genetic models
• Pseudo-genetic models
• Cellular automata
• Multiple-point geostatistics (MPS)

## Definitions and tools

• Regionalized variable theory
• Covariance function
• Semi-variance
• Variogram
• Kriging
• Range (geostatistics)
• Sill (geostatistics)
• Nugget effect
• Training image
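
Several of the tools listed above fit together in a small worked sketch: a variogram model parameterized by nugget, sill and range, and ordinary kriging, which uses that model to estimate a value at an unobserved location from nearby observations. The spherical model, the parameter values and the function names are illustrative assumptions. Here `sill` denotes the total sill (nugget included) and `corr_range` the distance at which the variogram reaches it.

```python
import numpy as np

def spherical_variogram(h, nugget, sill, corr_range):
    """Spherical semivariogram model; gamma(0) = 0, gamma(h >= range) = sill."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / corr_range, 1.0)
    gamma = nugget + (sill - nugget) * (1.5 * s - 0.5 * s**3)
    return np.where(h > 0, gamma, 0.0)

def ordinary_kriging(coords, values, target, nugget=0.0, sill=1.0, corr_range=10.0):
    """Ordinary kriging estimate and kriging variance at one target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Left-hand side: data-to-data semivariances bordered by the
    # unbiasedness constraint (weights must sum to 1)
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = spherical_variogram(h, nugget, sill, corr_range)
    lhs[n, n] = 0.0

    # Right-hand side: data-to-target semivariances, then the constraint value
    rhs = np.ones(n + 1)
    rhs[:n] = spherical_variogram(
        np.linalg.norm(coords - target, axis=1), nugget, sill, corr_range
    )

    solution = np.linalg.solve(lhs, rhs)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values
    variance = weights @ rhs[:n] + lagrange
    return estimate, variance

# Illustrative data: two observations, target halfway between them
obs_coords = [[0.0, 0.0], [2.0, 0.0]]
obs_values = [10.0, 20.0]
estimate, variance = ordinary_kriging(obs_coords, obs_values, [1.0, 0.0])
```

Unlike a purely deterministic interpolator, kriging returns a variance alongside the estimate, which quantifies the uncertainty implied by the chosen variogram model. With a zero nugget, the estimate at a data location reproduces the observed value with zero variance.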

## Related software

• gslib is an open-source set of Fortran 77 routines implementing most of the classical geostatistical estimation and simulation algorithms.
• sgems is cross-platform (Windows, Unix), open-source software that implements most of the classical geostatistical algorithms (kriging, Gaussian and indicator simulation, etc.) as well as newer developments (multiple-point geostatistics). It also provides interactive 3D visualization and Python scripting capabilities.
• mgstat is a free MATLAB toolbox that allows calling sgems from MATLAB and transparent import/export of objects.
• gstat is open-source computer code for multivariable geostatistical modelling, prediction and simulation. It is also available as an R package.
• R has around 20 other packages dedicated to geostatistics, and around 30 dedicated to other areas of spatial statistics.

## See also

• Inverse distance weighting
• Multivariate interpolation
• Nearest-neighbor interpolation
• Spline interpolation
• Geology
• Geodemographic segmentation
• Geographic information system (GIS)
• Remote sensing
• Pedometrics