Segmentation (image processing)
Encyclopedia
In computer vision
Computer vision
Computer vision is a field that includes methods for acquiring, processing, analysing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the forms of decisions...

, segmentation refers to the process of partitioning a digital image
Digital image
A digital image is a numeric representation of a two-dimensional image. Depending on whether or not the image resolution is fixed, it may be of vector or raster type...

 into multiple segments
Image segment
In computer vision segmentation of an image is the division of a given image into contiguous regions. In current computer vision algorithms the similarity of image parts is usually defined in terms of color and texture. The goal to automatically segment images into semantically meaningful parts is...

 (sets of pixel
Pixel
In digital imaging, a pixel, or pel, is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled....

s, also known as superpixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics.

The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contour
Contour line
A contour line of a function of two variables is a curve along which the function has a constant value. In cartography, a contour line joins points of equal elevation above a given level, such as mean sea level...

s extracted from the image (see edge detection
Edge detection
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities...

). Each of the pixels in a region are similar with respect to some characteristic or computed property, such as color
Color
Color or colour is the visual perceptual property corresponding in humans to the categories called red, green, blue and others. Color derives from the spectrum of light interacting in the eye with the spectral sensitivities of the light receptors...

, intensity
Luminous intensity
In photometry, luminous intensity is a measure of the wavelength-weighted power emitted by a light source in a particular direction per unit solid angle, based on the luminosity function, a standardized model of the sensitivity of the human eye...

, or texture
Image texture
An image texture is a set of metrics calculated in image processing designed to quantify the perceived texture of an image. Image Texture gives us information about the spatial arrangement of color or intensities in an image or selected region of an image....

. Adjacent
Adjacent
Adjacent is an adjective meaning contiguous, adjoining or abuttingIn geometry, adjacent is when sides meet to make an angle.In graph theory adjacent nodes in a graph are linked by an edge....

 regions are significantly different with respect to the same characteristic(s).
When applied to a stack of images, typical in Medical imaging
Medical imaging
Medical imaging is the technique and process used to create images of the human body for clinical purposes or medical science...

, the resulting contours after image segmentation can be used to create 3D reconstructions with the help of interpolation algorithms like Marching cubes
Marching cubes
Marching cubes is a computer graphics algorithm, published in the 1987 SIGGRAPH proceedings by Lorensen and Cline, for extracting a polygonal mesh of an isosurface from a three-dimensional scalar field...

.

Applications

Some of the practical applications of image segmentation are:
  • Medical imaging
    Medical imaging
    Medical imaging is the technique and process used to create images of the human body for clinical purposes or medical science...

    • Locate tumors and other pathologies
    • Measure tissue volumes
    • Computer-guided surgery
    • Diagnosis
    • Treatment planning
    • Study of anatomical structure
  • Locate objects in satellite images (roads, forests, etc.)
  • Face recognition
  • Iris recognition
    Iris recognition
    Iris recognition is an automated method of biometric identification that uses mathematical pattern-recognition techniques on video images of the irides of an individual's eyes, whose complex random patterns are unique and can be seen from some distance....

  • Fingerprint recognition
  • Traffic control systems
  • Brake light detection
  • Machine vision
    Machine vision
    Machine vision is the process of applying a range of technologies and methods to provide imaging-based automatic inspection, process control and robot guidance in industrial applications. While the scope of MV is broad and a comprehensive definition is difficult to distil, a "generally accepted...

  • Agricultural imaging – crop disease detection


Several general-purpose algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

s and techniques have been developed for image segmentation. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to effectively solve an image segmentation problem for a problem domain.

Thresholding

The simplest method of image segmentation is called the thresholding
Thresholding (image processing)
Thresholding is the simplest method of image segmentation. From a grayscale image, thresholding can be used to create binary images Thresholding is the simplest method of image segmentation. From a grayscale image, thresholding can be used to create binary images Thresholding is the simplest method...

 method. This method is based on a clip-level (or a threshold value) to turn a gray-scale image into a binary image.

The key of this method is to select the threshold value (or values when multiple-levels are selected). Several popular methods are used in industry including the maximum entropy method, Otsu's method
Otsu's method
In computer vision and image processing, Otsu's method is used to automatically perform histogram shape-based image thresholding, or, the reduction of a graylevel image to a binary image. The algorithm assumes that...

 (maximum variance), and et al. k-means clustering can also be used.

Clustering methods

The K-means algorithm
K-means algorithm
In statistics and data mining, k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean...

 is an iterative technique that is used to partition an image into K clusters. The basic algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

 is:
  1. Pick K cluster centers, either randomly or based on some heuristic
    Heuristic
    Heuristic refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a satisfactory solution, where an exhaustive search is impractical...

  2. Assign each pixel in the image to the cluster that minimizes the distance
    Distance
    Distance is a numerical description of how far apart objects are. In physics or everyday discussion, distance may refer to a physical length, or an estimation based on other criteria . In mathematics, a distance function or metric is a generalization of the concept of physical distance...

     between the pixel and the cluster center
  3. Re-compute the cluster centers by averaging all of the pixels in the cluster
  4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters)


In this case, distance
Distance
Distance is a numerical description of how far apart objects are. In physics or everyday discussion, distance may refer to a physical length, or an estimation based on other criteria . In mathematics, a distance function or metric is a generalization of the concept of physical distance...

 is the squared or absolute difference between a pixel and a cluster center. The difference is typically based on pixel color
Hue
Hue is one of the main properties of a color, defined technically , as "the degree to which a stimulus can be describedas similar to or different from stimuli that are described as red, green, blue, and yellow,"...

, intensity
Brightness
Brightness is an attribute of visual perception in which a source appears to be radiating or reflecting light. In other words, brightness is the perception elicited by the luminance of a visual target...

, texture, and location, or a weighted combination of these factors. K can be selected manually, randomly, or by a heuristic
Heuristic
Heuristic refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a satisfactory solution, where an exhaustive search is impractical...

.

This algorithm is guaranteed to converge, but it may not return the optimal
Global optimum
In mathematics, a global optimum is a selection from a given domain which yields either the highest value or lowest value , when a specific function is applied. For example, for the function...

 solution. The quality of the solution depends on the initial set of clusters and the value of K.

In statistics and machine learning, the k-means algorithm is a clustering algorithm to partition n objects into k clusters, where k < n. It is similar to the expectation-maximization algorithm
Expectation-maximization algorithm
In statistics, an expectation–maximization algorithm is an iterative method for finding maximum likelihood or maximum a posteriori estimates of parameters in statistical models, where the model depends on unobserved latent variables...

 for mixtures of Gaussians
Gaussian filter
In electronics and signal processing, a Gaussian filter is a filter whose impulse response is a Gaussian function. Gaussian filters are designed to give no overshoot to a step function input while minimizing the rise and fall time. This behavior is closely connected to the fact that the Gaussian...

 in that they both attempt to find the centers of natural clusters in the data.
The model requires that the object attributes correspond to elements of a vector space. The objective it tries to achieve is to minimize total intra-cluster variance, or, the squared error function.
The k-means clustering was invented in 1956. The most common form of the algorithm uses an iterative refinement heuristic known as Lloyd's algorithm
Lloyd's algorithm
In computer science and electrical engineering, Lloyd's algorithm, also known as Voronoi iteration or relaxation, is an algorithm for grouping data points into a given number of categories, used for k-means clustering....

. Lloyd's algorithm starts by partitioning the input points into k initial sets, either at random or using some heuristic data. It then calculates the mean point, or centroid, of each set. It constructs a new partition by associating each point with the closest centroid. Then the centroids are recalculated for the new clusters, and algorithm repeated by alternate application of these two steps until convergence, which is obtained when the points no longer switch clusters (or alternatively centroids are no longer changed).
Lloyd's algorithm and k-means are often used synonymously, but in reality Lloyd's algorithm is a heuristic for solving the k-means problem, as with certain combinations of starting points and centroids, Lloyd's algorithm can in fact converge to the wrong answer. Other variations exist, but Lloyd's algorithm has remained popular, because it converges extremely quickly in practice.
In terms of performance the algorithm is not guaranteed to return a global optimum. The quality of the final solution depends largely on the initial set of clusters, and may, in practice, be much poorer than the global optimum. Since the algorithm is extremely fast, a common method is to run the algorithm several times and return the best clustering found.
A drawback of the k-means algorithm is that the number of clusters k is an input parameter. An inappropriate choice of k may yield poor results. The algorithm also assumes that the variance is an appropriate measure of cluster scatter.

Compression-based methods

Compression based methods postulate that the optimal segmentation is the one that minimizes, over all possible segmentations, the coding length of the data. The connection between these two concepts is that segmentation tries to find patterns in an image and any regularity in the image can be used to compress it. The method describes each segment by its texture and boundary shape. Each of these components is modeled by a probability distribution function and its coding length is computed as follows:
  1. The boundary encoding leverages the fact that regions in natural images tend to have a smooth contour. This prior is used by huffman coding
    Huffman coding
    In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol where the variable-length code table has been derived in a particular way based on...

     to encode the difference chain code
    Chain code
    A chain code is a lossless compression algorithm for monochrome images. The basic principle of chain codes is to separately encode each connected component, or "blot", in the image. For each such region, a point on the boundary is selected and its coordinates are transmitted...

     of the contours in an image. Thus, the smoother a boundary is, the shorter coding length it attains.
  2. Texture is encoded by lossy compression in a way similar to minimum description length
    Minimum description length
    The minimum description length principle is a formalization of Occam's Razor in which the best hypothesis for a given set of data is the one that leads to the best compression of the data. MDL was introduced by Jorma Rissanen in 1978...

     (MDL) principle, but here the length of the data given the model is approximated by the number of samples times the entropy
    Entropy
    Entropy is a thermodynamic property that can be used to determine the energy available for useful work in a thermodynamic process, such as in energy conversion devices, engines, or machines. Such devices can only be driven by convertible energy, and have a theoretical maximum efficiency when...

     of the model. The texture in each region is modeled by a multivariate normal distribution whose entropy has closed form expression. An interesting property of this model is that the estimated entropy bounds the true entropy of the data from above. This is because among all distributions with a given mean and covariance, normal distribution has the largest entropy. Thus, the true coding length cannot be more than what the algorithm tries to minimize.


For any given segmentation of an image, this scheme yields the number of bits required to encode that image based on the given segmentation. Thus, among all possible segmentations of an image, the goal is to find the segmentation which produces the shortest coding length. This can be achieved by a simple agglomerative clustering method. The distortion in the lossy compression determines the coarseness of the segmentation and its optimal value may differ for each image. This parameter can be estimated heuristically from the contrast of textures in an image. For example, when the textures in an image are similar, such as in camouflage images, stronger sensitivity and thus lower quantization is required.

Histogram-based methods

Histogram
Histogram
In statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...

-based methods are very efficient when compared to other image segmentation methods because they typically require only one pass through the pixel
Pixel
In digital imaging, a pixel, or pel, is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled....

s. In this technique, a histogram is computed from all of the pixels in the image, and the peaks and valleys in the histogram are used to locate the clusters in the image. Color
Hue
Hue is one of the main properties of a color, defined technically , as "the degree to which a stimulus can be describedas similar to or different from stimuli that are described as red, green, blue, and yellow,"...

 or intensity
Brightness
Brightness is an attribute of visual perception in which a source appears to be radiating or reflecting light. In other words, brightness is the perception elicited by the luminance of a visual target...

 can be used as the measure.

A refinement of this technique is to recursive
Recursive
Recursive may refer to:*Recursion, the technique of functions calling themselves*Recursive function, a total computable function*Recursive language, a language which is decidable...

ly apply the histogram-seeking method to clusters in the image in order to divide them into smaller clusters. This is repeated with smaller and smaller clusters until no more clusters are formed.

One disadvantage of the histogram-seeking method is that it may be difficult to identify significant peaks and valleys in the image. In this technique of image classification distance metric and integrated region matching are familiar.

Histogram-based approaches can also be quickly adapted to occur over multiple frames, while maintaining their single pass efficiency. The histogram can be done in multiple fashions when multiple frames are considered. The same approach that is taken with one frame can be applied to multiple, and after the results are merged, peaks and valleys that were previously difficult to identify are more likely to be distinguishable. The histogram can also be applied on a per pixel basis where the information result are used to determine the most frequent color for the pixel location. This approach segments based on active objects and a static environment, resulting in a different type of segmentation useful in Video tracking
Video tracking
Video tracking is the process of locating a moving object over time using a camera. It has a variety of uses, some of which are: human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging and video editing...

.

Edge detection

Edge detection
Edge detection
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities...

 is a well-developed field on its own within image processing.
Region boundaries and edges are closely related,
since there is often a sharp adjustment in intensity at the region boundaries.
Edge detection techniques have therefore been used as the base of another segmentation technique.

The edges identified by edge detection are often disconnected.
To segment an object from an image however, one needs closed region boundaries.
Edge is nothing but boundary between two images.
Edge detection technique refers to the identification and locating the sharp dis-continues in the image .

Region growing methods

The first region growing
Region growing
Region growing is a simple region-based image segmentation method. It is also classified as a pixel-based image segmentation method since it involves the selection of initial seed points....

 method was the seeded region growing method. This method takes a set of seeds as input along with the image. The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparing all unallocated neighbouring pixels to the regions. The difference between a pixel's intensity value and the region's mean, , is used as a measure of similarity. The pixel with the smallest difference measured this way is allocated to the respective region. This process continues until all pixels are allocated to a region.

Seeded region growing requires seeds as additional input. The segmentation results are dependent on the choice of seeds. Noise in the image can cause the seeds to be poorly placed. Unseeded region growing is a modified algorithm that doesn't require explicit seeds. It starts off with a single region – the pixel chosen here does not significantly influence final segmentation. At each iteration it considers the neighbouring pixels in the same way as seeded region growing. It differs from seeded region growing in that if the minimum is less than a predefined threshold then it is added to the respective region . If not, then the pixel is considered significantly different from all current regions and a new region is created with this pixel.

One variant of this technique, proposed by Haralick and Shapiro (1985), is based on pixel intensities
Brightness
Brightness is an attribute of visual perception in which a source appears to be radiating or reflecting light. In other words, brightness is the perception elicited by the luminance of a visual target...

. The mean
Arithmetic mean
In mathematics and statistics, the arithmetic mean, often referred to as simply the mean or average when the context is clear, is a method to derive the central tendency of a sample space...

 and scatter
Scatter
In ordinary English, to scatter is to distribute randomly. Scatter also has the following meanings:*In physics, scattering is the study of collisions, especially of waves and particles...

 of the region and the intensity of the candidate pixel is used to compute a test statistic. If the test statistic is sufficiently small, the pixel is added to the region, and the region’s mean and scatter are recomputed. Otherwise, the pixel is rejected, and is used to form a new region.

A special region growing method is called -connected segmentation (see also lambda-connectedness
Lambda-connectedness
In applied mathematics, lambda-connectedness deals with partial connectivity for a discrete space.Assume that a function on a discrete space is given. A degree of connectivity will be defined to measure the connectedness of the space with respect to the function...

). It is based on pixel intensities
Brightness
Brightness is an attribute of visual perception in which a source appears to be radiating or reflecting light. In other words, brightness is the perception elicited by the luminance of a visual target...

 and neighborhood linking paths. A degree of connectivity (connectedness) will be calculated based on a path that is formed by pixels. For a certain value of , two pixels are called -connected if there is a path linking those two pixels and the connectedness of this path is at least . -connectedness is an equivalence relation.

Split-and-merge methods

Split-and-merge segmentation is based on a quadtree
Quadtree
A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two dimensional space by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes. This...

 partition of an image. It is sometimes called quadtree segmentation.

This method starts at the root of the tree that represents the whole image. If it is found non-uniform (not homogeneous), then it is split into four son-squares (the splitting process), and so on so forth. Conversely, if four son-squares are homogeneous, they can be merged as several connected components (the merging process). The node in the tree is a segmented node. This process continues recursively until no further splits or merges are possible. When a special data structure is involved in the implementation of the algorithm of the method, its time complexity can reach , an optimal algorithm of the method.

Partial differential equation-based methods

Using a partial differential equation
Partial differential equation
In mathematics, partial differential equations are a type of differential equation, i.e., a relation involving an unknown function of several independent variables and their partial derivatives with respect to those variables...

 (PDE)-based method and solving the PDE equation by a numerical scheme, one can segment the image.

Level set methods

Curve propagation is a popular technique in image analysis for object extraction, object tracking, stereo reconstruction, etc. The central idea behind such an approach is to evolve a curve towards the lowest potential of a cost function, where its definition reflects the task to be addressed and imposes certain smoothness constraints. Lagrangian
Lagrangian
The Lagrangian, L, of a dynamical system is a function that summarizes the dynamics of the system. It is named after Joseph Louis Lagrange. The concept of a Lagrangian was originally introduced in a reformulation of classical mechanics by Irish mathematician William Rowan Hamilton known as...

 techniques are based on parameterizing the contour according to some sampling strategy and then evolve each element according to image and internal terms. While such a technique can be very efficient, it suffers from various limitations like deciding on the sampling strategy, estimating the internal geometric properties of the curve, changing its topology, addressing problems in higher dimensions, etc. In each case, a partial differential equation (PDE) called the level set equation is solved by finite differences.

The level set method was initially proposed to track moving interfaces by Osher and Sethian in 1988 and has spread across various imaging domains in the late nineties. It can be used to efficiently address the problem of curve/surface/etc. propagation in an implicit manner. The central idea is to represent the evolving contour using a signed function, where its zero level corresponds to the actual contour. Then, according to the motion equation of the contour, one can easily derive a similar flow for the implicit surface that when applied to the zero-level will reflect the propagation of the contour. The level set method encodes numerous advantages: it is implicit, parameter free, provides a direct way to estimate the geometric properties of the evolving structure, can change the topology and is intrinsic. Furthermore, they can be used to define an optimization framework as proposed by Zhao, Merriman and Osher in 1996. Therefore, one can conclude that it is a very convenient framework to address numerous applications of computer vision and medical image analysis. Furthermore, research into various level set data structures has led to very efficient implementations of this method.

Graph partitioning methods

Graph
Graph (data structure)
In computer science, a graph is an abstract data structure that is meant to implement the graph and hypergraph concepts from mathematics.A graph data structure consists of a finite set of ordered pairs, called edges or arcs, of certain entities called nodes or vertices...

 partitioning methods can effectively be used for image segmentation. In these methods, the image is modeled as a weighted, undirected graph. Usually a pixel or a group of pixels are associated with nodes
Vertex (graph theory)
In graph theory, a vertex or node is the fundamental unit out of which graphs are formed: an undirected graph consists of a set of vertices and a set of edges , while a directed graph consists of a set of vertices and a set of arcs...

 and edge weights define the (dis)similarity between the neighborhood pixels. The graph (image) is then partitioned according to a criterion designed to model "good" clusters. Each partition of the nodes (pixels) output from these algorithms are considered an object segment in the image. Some popular algorithms of this category are normalized cuts, random walker
Random Walker (Computer Vision)
The random walker algorithm is an algorithm for image segmentation. In the first description of the algorithm, a user interactively labels a small number of pixels with known labels , e.g., "object" and "background"...

, minimum cut, isoperimetric partitioning and minimum spanning tree-based segmentation
Minimum spanning tree-based segmentation
-Image segmentation introduction:Image segmentation strives to partition a digital image into regions of pixels with similar properties, e.g. homogeneity. The higher-level region representation simplifies image analysis tasks such as counting objects or detecting changes, because region attributes...

.

Watershed transformation

The watershed transformation
Watershed (algorithm)
A grey-level image may be seen as a topographic relief, where the grey level of a pixel is interpreted as its altitude in the relief.A drop of water falling on a topographic relief flows along a path to finally reach a local minimum...

 considers the gradient magnitude of an image as a topographic surface. Pixels having the highest gradient magnitude intensities (GMIs) correspond to watershed lines, which represent the region boundaries. Water placed on any pixel enclosed by a common watershed line flows downhill to a common local intensity minimum (LIM). Pixels draining to a common minimum form a catch basin, which represents a segment.

Model based segmentation

The central assumption of such an approach is that structures of interest/organs have a repetitive form of geometry. Therefore, one can seek for a probabilistic model towards explaining the variation of the shape of the organ and then when segmenting an image impose constraints using this model as prior. Such a task involves (i) registration of the training examples to a common pose, (ii) probabilistic representation of the variation of the registered samples, and (iii) statistical inference between the model and the image. State of the art methods in the literature for knowledge-based segmentation involve active shape and appearance models, active contours and deformable templates and level-set based methods.

Multi-scale segmentation

Image segmentations are computed at multiple scales in scale-space and sometimes propagated from coarse to fine scales; see scale-space segmentation
Scale-space segmentation
Scale-space segmentation or multi-scale segmentation is a general framework for signal and image segmentation, based on the computation of image descriptors at multiple scales of smoothing.-One-dimensional hierarchical signal segmentation:...

.

Segmentation criteria can be arbitrarily complex and may take into account global as well as local criteria. A common requirement is that each region must be connected in some sense.

One-dimensional hierarchical signal segmentation

Witkin's seminal work in scale space included the notion that a one-dimensional signal could be unambiguously segmented into regions, with one scale parameter controlling the scale of segmentation.

A key observation is that the zero-crossings of the second derivatives (minima and maxima of the first derivative or slope) of multi-scale-smoothed versions of a signal form a nesting tree, which defines hierarchical relations between segments at different scales. Specifically, slope extrema at coarse scales can be traced back to corresponding features at fine scales. When a slope maximum and slope minimum annihilate each other at a larger scale, the three segments that they separated merge into one segment, thus defining the hierarchy of segments.

Image segmentation and primal sketch

There have been numerous research works in this area, out of which a few have now reached a state where they can be applied either with interactive manual intervention (usually with application to medical imaging) or fully automatically. The following is a brief overview of some of the main research ideas that current approaches are based upon.

The nesting structure that Witkin described is, however, specific for one-dimensional signals and does not trivially transfer to higher-dimensional images. Nevertheless, this general idea has inspired several other authors to investigate coarse-to-fine schemes for image segmentation. Koenderink proposed to study how iso-intensity contours evolve over scales and this approach was investigated in more detail by Lifshitz and Pizer.
Unfortunately, however, the intensity of image features changes over scales, which implies that it is hard to trace coarse-scale image features to finer scales using iso-intensity information.

Lindeberg studied the problem of linking local extrema and saddle points over scales, and proposed an image representation called the scale-space primal sketch which makes explicit the relations between structures at different scales, and also makes explicit which image features are stable over large ranges of scale including locally appropriate scales for those. Bergholm proposed to detect edges at coarse scales in scale-space and then trace them back to finer scales with manual choice of both the coarse detection scale and the fine localization scale.

Gauch and Pizer studied the complementary problem of ridges and valleys at multiple scales and developed a tool for interactive image segmentation based on multi-scale watersheds. The use of multi-scale watershed with application to the gradient map has also been investigated by Olsen and Nielsen and been carried over to clinical use by Dam
Vincken et al. proposed a hyperstack for defining probabilistic relations between image structures at different scales. The use of stable image structures over scales has been furthered by Ahuja and his co-workers into a fully automated system.

More recently, these ideas for multi-scale image segmentation by linking image structures over scales have been picked up by Florack and Kuijper. Bijaoui and Rué associate structures detected in scale-space above a minimum noise threshold into an object tree which spans multiple scales and corresponds to a kind of feature in the original signal. Extracted features are accurately reconstructed using an iterative conjugate gradient matrix method.

Semi-automatic segmentation

In this kind of segmentation, the user outlines the region of interest with the mouse clicks and algorithms are applied so that the path that best fits the edge of the image is shown.

Techniques like SIOX
Simple Interactive Object Extraction
Simple interactive object extraction is an algorithm for extracting foreground objects from color images and videos with very little user interaction. It has been implemented as "foreground selection" tool in the GIMP , as part of the tracer tool in Inkscape , and as function in ImageJ and Fiji...

, Livewire
Livewire Segmentation Technique
Livewire, also known as Intelligent Scissors, is a segmentation technique which allows a user to select regions of interest to be extracted quickly and accurately, using simple mouse clicks. It is based on the lowest cost path algorithm, by Edsger W. Dijkstra. Firstly Convolve the image with a...

, Intelligent Scissors or IT-SNAPS are used in this kind of segmentation.

Neural networks segmentation

Neural Network segmentation relies on processing small areas of an image using an artificial neural network
Artificial neural network
An artificial neural network , usually called neural network , is a mathematical model or computational model that is inspired by the structure and/or functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes...

 or a set of neural networks. After such processing the decision-making mechanism marks the areas of an image accordingly to the category recognized by the neural network. A type of network designed especially for this is the Kohonen map.

Pulse-coupled neural networks (PCNNs) are neural models proposed by modeling a cat’s visual cortex and developed for high-performance biomimetic image processing.
In 1989, Eckhorn introduced a neural model to emulate the mechanism of cat’s visual cortex. The Eckhorn model provided a simple and effective tool for studying small mammal’s visual cortex, and was soon recognized as having significant application potential in image processing. In 1994, the Eckhorn model was adapted to be an image processing algorithm by Johnson, who termed this algorithm Pulse-Coupled Neural Network. Over the past decade, PCNNs have been utilized for a variety of image processing applications, including: image segmentation, feature generation, face extraction, motion detection, region growing, noise reduction, and so on.
A PCNN is a two-dimensional neural network. Each neuron in the network corresponds to one pixel in an input image, receiving its corresponding pixel’s color information (e.g. intensity) as an external stimulus. Each neuron also connects with its neighboring neurons, receiving local stimuli from them. The external and local stimuli are combined in an internal activation system, which accumulates the stimuli until it exceeds a dynamic threshold, resulting in a pulse output. Through iterative computation, PCNN neurons produce temporal series of pulse outputs. The temporal series of pulse outputs contain information of input images and can be utilized for various image processing applications, such as image segmentation and feature generation. Compared with conventional image processing means, PCNNs have several significant merits, including robustness against noise, independence of geometric variations in input patterns, capability of bridging minor intensity variations in input patterns, etc.

Proprietary software

Several proprietary
Proprietary software
Proprietary software is computer software licensed under exclusive legal right of the copyright holder. The licensee is given the right to use the software under certain conditions, while restricted from other uses, such as modification, further distribution, or reverse engineering.Complementary...

 software packages are available for performing image segmentation
  • Pac-n-Zoom Color has a proprietary software that color segments over 16 million colors at photographic quality. Manual available at http://www.accelerated-io.com/usr_man.htm http://www.accelerated-io.com/usr_man.htm
  • TurtleSeg is a free interactive 3D image segmentation tool. It supports many common medical image file formats and allows the user to export their segmentation as a binary mask or a 3D surface.


Several proprietary software packages are available for comparing the performance of segmentation methods with the state-of-the-art segmentation methods on standardized sets

Software

Several open source
Open source
The term open source describes practices in production and development that promote access to the end product's source materials. Some consider open source a philosophy, others consider it a pragmatic methodology...

 software packages are available for performing image segmentation
  • GIMP
    GIMP
    GIMP is a free software raster graphics editor. It is primarily employed as an image retouching and editing tool and is freely available in versions tailored for most popular operating systems including Microsoft Windows, Apple Mac OS X, and Linux.In addition to detailed image retouching and...

     which includes among other tools SIOX (see Simple Interactive Object Extraction
    Simple Interactive Object Extraction
    Simple interactive object extraction is an algorithm for extracting foreground objects from color images and videos with very little user interaction. It has been implemented as "foreground selection" tool in the GIMP , as part of the tracer tool in Inkscape , and as function in ImageJ and Fiji...

    ).
  • ImageMagick
    ImageMagick
    ImageMagick is an open source software suite for displaying, converting, and editing raster image files. It can read and write over 100 image file formats. ImageMagick is licensed under the Apache 2.0 license.- Features and capabilities:...

     segments using the Fuzzy C-Means  algorithm.
  • MITK has a program module for manual segmentation of gray-scale images.
  • GRASS GIS
    GRASS GIS
    GRASS GIS is a free, open source geographical information system capable of handling raster, topological vector, image processing, and graphic data....

     has the program module i.smap for image segmentation.
  • AForge.NET
    AForge.NET
    AForge.NET is a computer vision and artificial intelligence library originally developed by Andrew Kirillov for the .NET Framework.The source code and binaries of the project are available under the terms of the Lesser GPL license.-Features:...

     - an open source C# framework.
  • Population open-source image processing library in C++


There are also software packages available free of charge for academic purposes:

See also

  • Computer Vision
    Computer vision
    Computer vision is a field that includes methods for acquiring, processing, analysing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the forms of decisions...

  • Data clustering
    Data clustering
    Cluster analysis or clustering is the task of assigning a set of objects into groups so that the objects in the same cluster are more similar to each other than to those in other clusters....

  • Graph Theory
    Graph theory
    In mathematics and computer science, graph theory is the study of graphs, mathematical structures used to model pairwise relations between objects from a certain collection. A "graph" in this context refers to a collection of vertices or 'nodes' and a collection of edges that connect pairs of...

  • Histogram
    Histogram
    In statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...

    s
  • Image-based meshing
    Image-based meshing
    Image-based meshing is the automated process of creating computer models for computational fluid dynamics and finite element analysis from 3D image data...

  • K-means algorithm
    K-means algorithm
    In statistics and data mining, k-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean...

  • Pulse-coupled networks
  • Range image segmentation
  • Region growing
    Region growing
    Region growing is a simple region-based image segmentation method. It is also classified as a pixel-based image segmentation method since it involves the selection of initial seed points....

  • Balanced histogram thresholding
    Balanced histogram thresholding
    In image processing, the balanced histogram thresholding method , is a very simple method used for automatic image thresholding. Like Otsu's Method and the Iterative Selection Thesholding Method, this is a histogram based thresholding method. This approach assumes that the image is divided in two...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK