Hough transform
Encyclopedia
The Hough transform is a feature extraction
Feature extraction
In pattern recognition and in image processing, feature extraction is a special form of dimensionality reduction.When the input data to an algorithm is too large to be processed and it is suspected to be notoriously redundant then the input data will be transformed into a reduced representation...

 technique used in image analysis
Image analysis
Image analysis is the extraction of meaningful information from images; mainly from digital images by means of digital image processing techniques...

, computer vision
Computer vision
Computer vision is a field that includes methods for acquiring, processing, analysing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the forms of decisions...

, and digital image processing
Digital image processing
Digital image processing is the use of computer algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing...

. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space
Parameter space
In science, a parameter space is the set of values of parameters encountered in a particular mathematical model. Often the parameters are inputs of a function, in which case the technical term for the parameter space is domain of a function....

, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform.

The classical Hough transform was concerned with the identification of line
Line (mathematics)
The notion of line or straight line was introduced by the ancient mathematicians to represent straight objects with negligible width and depth. Lines are an idealization of such objects...

s in the image, but later the Hough transform has been extended to identifying positions of arbitrary shapes, most commonly circles or ellipses. The Hough transform as it is universally used today was invented by Richard Duda and Peter Hart
Peter E. Hart
Peter E. Hart is an American computer scientist and entrepreneur. He was chairman and president of Ricoh Innovations, which he founded in 1997...

 in 1972, who called it a "generalized Hough transform" after the related 1962 patent of Paul Hough. The transform was popularized in the computer vision
Computer vision
Computer vision is a field that includes methods for acquiring, processing, analysing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the forms of decisions...

 community by Dana H. Ballard
Dana H. Ballard
Dana Henry Ballard is a professor of computer science currently at the University of Texas at Austin and formerly with the University of Rochester....

 through a 1981 journal article titled "Generalizing the Hough transform to detect arbitrary shapes
Generalised Hough transform
The Generalised Hough Transform or GHT, introduced by D.H. Ballard in 1981, is the modification of the Hough Transform using the principle of template matching. This modification enables the Hough Transform to be used for not only the detection of an object described with an analytic equation...

".

Theory

In automated analysis of digital images, a subproblem often arises of detecting simple shapes, such as straight lines, circles or ellipses. In many cases an edge detector
Edge detection
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities...

 can be used as a pre-processing stage to obtain image points or image pixels that are on the desired curve in the image space. Due to imperfections in either the image data or the edge detector, however, there may be missing points or pixels on the desired curves as well as spatial deviations between the ideal line/circle/ellipse and the noisy edge points as they are obtained from the edge detector. For these reasons, it is often non-trivial to group the extracted edge features to an appropriate set of lines, circles or ellipses. The purpose of the Hough transform is to address this problem by making it possible to perform groupings of edge points into object candidates by performing an explicit voting procedure over a set of parameterized image objects (Shapiro and Stockman, 304).

The simplest case of Hough transform is the linear transform for detecting straight lines. In the image space, the straight line can be described as y = mx + b and can be graphically plotted for each pair of image points (x, y). In the Hough transform, a main idea is to consider the characteristics of the straight line not as image points (x1, y1), (x2, y2), etc., but instead, in terms of its parameters, i.e., the slope parameter m and the intercept parameter b. Based on that fact, the straight line y = mx + b can be represented as a point (b, m) in the parameter space. However, one faces the problem that vertical lines give rise to unbounded values of the parameters m and b. For computational reasons, it is therefore better to use a different pair of parameters, denoted and (theta), for the lines in the Hough transform.

The parameter represents the distance between the line and the origin
Origin (mathematics)
In mathematics, the origin of a Euclidean space is a special point, usually denoted by the letter O, used as a fixed point of reference for the geometry of the surrounding space. In a Cartesian coordinate system, the origin is the point where the axes of the system intersect...

, while is the angle of the vector from the origin to this closest point (see Coordinates). Using this parameterization, the equation of the line can be written as


which can be rearranged to (Shapiro and Stockman, 304).

It is therefore possible to associate with each line of the image a pair (r,θ) which is unique if and , or if and . The (r,θ) plane is sometimes referred to as Hough space for the set of straight lines in two dimensions. This representation makes the Hough transform conceptually very close to the two-dimensional Radon transform
Radon transform
thumb|right|Radon transform of the [[indicator function]] of two squares shown in the image below. Lighter regions indicate larger function values. Black indicates zero.thumb|right|Original function is equal to one on the white region and zero on the dark region....

. (They can be seen as different ways of looking at the same transform.)

For an arbitrary point on the image plane with coordinates, e.g., (x0, y0), the lines that go through it are

,

where (the distance between the line and the origin
Origin (mathematics)
In mathematics, the origin of a Euclidean space is a special point, usually denoted by the letter O, used as a fixed point of reference for the geometry of the surrounding space. In a Cartesian coordinate system, the origin is the point where the axes of the system intersect...

) is determined by θ.

This corresponds to a sinusoidal curve in the (r,θ) plane, which is unique to that point. If the curves corresponding to two points are superimposed, the location (in the Hough space) where they cross corresponds to a line (in the original image space) that passes through both points. More generally, a set of points that form a straight line will produce sinusoids which cross at the parameters for that line. Thus, the problem of detecting collinear points can be converted to the problem of finding concurrent
Concurrent lines
In geometry, two or more lines are said to be concurrent if they intersect at a single point.In a triangle, four basic types of sets of concurrent lines are altitudes, angle bisectors, medians, and perpendicular bisectors:...

 curves.

Implementation

The Hough transform algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

 uses an array, called an accumulator, to detect the existence of a line y = mx + b. The dimension
Dimension
In physics and mathematics, the dimension of a space or object is informally defined as the minimum number of coordinates needed to specify any point within it. Thus a line has a dimension of one because only one coordinate is needed to specify a point on it...

 of the accumulator is equal to the number of unknown parameters of the Hough transform problem. For example, the linear Hough transform problem has two unknown parameters: m and b. The two dimensions of the accumulator array would correspond to quantized values for m and b. For each pixel and its neighborhood, the Hough transform algorithm determines if there is enough evidence of an edge at that pixel. If so, it will calculate the parameters of that line, and then look for the accumulator's bin that the parameters fall into, and increase the value of that bin.
By finding the bins with the highest values, typically by looking for local maxima in the accumulator space, the most likely lines can be extracted, and their (approximate) geometric definitions read off. (Shapiro and Stockman, 304) The simplest way of finding these peaks is by applying some form of threshold, but different techniques may yield better results in different circumstances - determining which lines are found as well as how many. Since the lines returned do not contain any length information, it is often next necessary to find which parts of the image match up with which lines. Moreover, due to imperfection errors in the edge detection step, there will usually be errors in the accumulator space, which may make it non-trivial to find the appropriate peaks, and thus the appropriate lines.

Example

Consider three data points, shown here as black dots.


  • For each data point, a number of lines are plotted going through it, all at different angles. These are shown here as solid lines.
  • For each solid line a line is plotted which is perpendicular
    Perpendicular
    In geometry, two lines or planes are considered perpendicular to each other if they form congruent adjacent angles . The term may be used as a noun or adjective...

     to it and which intersects the origin
    Origin (mathematics)
    In mathematics, the origin of a Euclidean space is a special point, usually denoted by the letter O, used as a fixed point of reference for the geometry of the surrounding space. In a Cartesian coordinate system, the origin is the point where the axes of the system intersect...

    . These are shown as dashed lines.
  • The length (i.e. perpendicular distance to the origin) and angle of each dashed line is measured. In the diagram above, the results are shown in tables.
  • This is repeated for each data point.
  • A graph of the line lengths for each angle, known as a Hough space graph, is then created.




The point where the curves intersect gives a distance and angle. This distance and angle indicate the line which intersects the points being tested. In the graph shown the lines intersect at the pink point; this corresponds to the solid pink line in the diagrams above, which passes through all three points.

The following is a different example showing the results of a Hough transform on a raster image containing two thick lines.



The results of this transform were stored in a matrix. Cell value represents the number of curves through any point. Higher cell values are rendered brighter. The two distinctly bright spots are the Hough parameters of the two lines. From these spots' positions, angle and distance from image center of the two lines in the input image can be determined.

Using the gradient direction to reduce the number of votes

An improvement suggested by O'Gorman and Clowes can be used to detect lines if one takes into account that the local gradient
Gradient
In vector calculus, the gradient of a scalar field is a vector field that points in the direction of the greatest rate of increase of the scalar field, and whose magnitude is the greatest rate of change....

 of the image intensity will necessarily be orthogonal to the edge. Since edge detection
Edge detection
Edge detection is a fundamental tool in image processing and computer vision, particularly in the areas of feature detection and feature extraction, which aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities...

 generally involves computing the intensity gradient
Gradient
In vector calculus, the gradient of a scalar field is a vector field that points in the direction of the greatest rate of increase of the scalar field, and whose magnitude is the greatest rate of change....

 magnitude, the gradient direction is often found as a side effect. If a given point of coordinates (x,y) happens to indeed be on a line, then the local direction of the gradient gives the θ parameter corresponding to said line, and the r parameter is then immediately obtained. (Shapiro and Stockman, 305) The gradient direction can be estimated to within 20°, which shortens the sinusoid trace from the full 180° to roughly 45°. This reduces the computation time and has the interesting effect of reducing the number of useless votes, thus enhancing the visibility of the spikes corresponding to real lines in the image.

Kernel-based Hough transform

Fernandes and Oliveira suggested an improved voting scheme for the Hough transform that allows a software implementation to achieve real-time performance even on relatively large images (e.g., 1280×960). The Kernel-based Hough transform uses the same parameterization proposed by Duda and Hart but operates on clusters of approximately collinear pixels. For each cluster, votes are cast using an oriented elliptical-Gaussian kernel that models the uncertainty associated with the best-fitting line with respect to the corresponding cluster. The approach not only significantly improves the performance of the voting scheme, but also produces a much cleaner accumulator and makes the transform more robust to the detection of spurious lines.

Hough transform of curves, and Generalised Hough transform

Although the version of the transform described above applies only to finding straight lines, a similar transform can be used for finding any shape which can be represented by a set of parameters. A circle, for instance, can be transformed into a set of three parameters, representing its center and radius, so that the Hough space becomes three dimensional. Arbitrary ellipses and curves can also be found this way, as can any shape easily expressed as a set of parameters. For more complicated shapes, the Generalised Hough transform
Generalised Hough transform
The Generalised Hough Transform or GHT, introduced by D.H. Ballard in 1981, is the modification of the Hough Transform using the principle of template matching. This modification enables the Hough Transform to be used for not only the detection of an object described with an analytic equation...

 is used, which allows a feature to vote for a particular position, orientation and/or scaling of the shape using a predefined look-up table.

Detection of 3D objects (Planes and cylinders)

Hough transform can also be used for the detection of 3D objects in range data or 3D point cloud
Point cloud
A point cloud is a set of vertices in a three-dimensional coordinate system. These vertices are usually defined by X, Y, and Z coordinates, and typically are intended to be representative of the external surface of an object....

s. The extension of classical Hough transform for plane detection is quite straight forward. A plane is represented by its explicit equation for which we can use a 3D Hough space corresponding to , and . This extension suffers from the same problems as its 2D counter part i.e., near horizontal planes can be reliably detected, while the performance deteriorates as planar direction becomes vertical (big values of and amplify the noise in the data). This formulation of the plane has been used for the detection of planes in the point cloud
Point cloud
A point cloud is a set of vertices in a three-dimensional coordinate system. These vertices are usually defined by X, Y, and Z coordinates, and typically are intended to be representative of the external surface of an object....

s acquired from airborne laser scanning and works very well because in that domain all planes are nearly horizontal.

For generalized plane detection using Hough transform, the plane can be parametrized by its normal vector (using spherical coordinates) and its distance from the origin resulting in a three dimensional Hough space. This results in each point in the input data voting for a sinusoidal surface in the Hough space. The intersection of these sinusoidal surfaces indicates presence of a plane.

Hough transform has also been used to find cylindrical objects in point cloud
Point cloud
A point cloud is a set of vertices in a three-dimensional coordinate system. These vertices are usually defined by X, Y, and Z coordinates, and typically are intended to be representative of the external surface of an object....

s using a two step approach. The first step finds the orientation of the cylinder and the second step finds the position and radius.

Using weighted features

One common variation detail. That is, finding the bins with the highest count in one stage can be used to constrain the range of values searched in the next.

Carefully chosen parameter space

A high-dimensional parameter space for the Hough Transform is not only slow, but if implemented without forethought can easily overrun the available memory. Even if the programming environment allows the allocation of an array larger than the available memory space through virtual memory, the number of page swaps required for this will be very demanding because the accumulator array is used in a randomly accessed fashion, rarely stopping in contiguous memory as it skips from index to index.

Consider the task of finding ellipses in an 800x600 image. Assuming that the radii of the ellipses are oriented along principal axes, the parameter space is four-dimensional. (x,y) defines the center of the ellipse, and a and b denote the two radii. Allowing the center to be anywhere in the image, adds the constraint 0Matlab
MATLAB
MATLAB is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages,...

 has functions aimed specifically for sparse matrices, but they only handle two dimensional matrices, not four dimensional.

A program thus conceived is unlikely to be allowed to allocate sufficient memory. This doesn't mean that the problem can't be solved, but only that new ways to constrain the size of the accumulator array are to be found, which makes it feasible. For instance:
  1. If it is reasonable to assume that the ellipses are each contained entirely within the image, the range of the radii can be reduced. The largest the radii can be is if the center of the ellipse is in the center of the image, allowing the edges of the ellipse to stretch to the edges. In this extreme case, the radii can only each be half the magnitude of the image size oriented in the same direction. Reducing the range of a and b in this fashion reduces the accumulator array to 57 billion values.
  2. Trade accuracy for space in the estimation of the center: If the center is predicted to be off by 3 on both the x and y axis this reduces the size of the accumulator array to about 6 billion values.
  3. Trade accuracy for space in the estimation of the radii: If the radii are estimated to each be off by 5 further reduction of the size of the accumulator array occurs, by about 256 million values.
  4. Crop the image to areas of interest. This is image dependent, and therefore unpredictable, but imagine a case where all of the edges of interest in an image are in the upper left quadrant of that image. The accumulator array can be reduced even further in this case by constraining all 4 parameters by a factor of 2, for a total reduction factor of 16.

By applying just the first three of these constraints to the example stated about, the size of the accumulator array is reduced by almost a factor of 1000, bringing it down to a size that is much more likely to fit within a modern computer's memory.

Efficient Ellipse Detection Algorithm

Yonghong Xie and Qiang Ji give an efficient way of implementing Hough Transform for Ellipse detection by overcoming the memory issues. As discussed in the algorithm (on Page 2 of the paper), this approach uses only a one dimensional accumulator (for minor axis) in order to detect ellipses in the image. The complexity is O(N3) in the number of non-zero points in the image.

Limitations

The Hough Transform is only efficient if a high number of votes fall in the right bin, so that the bin can be easily detected amid the background noise. This means that the bin must not be too small, or else some votes will fall in the neighboring bins, thus reducing the visibility of the main bin.

Also, when the number of parameters is large (that is, when we are using the Hough Transform with typically more than three parameters), the average number of votes cast in a single bin is very low, and those bins corresponding to a real figure in the image do not necessarily appear to have a much higher number of votes than their neighbors. The complexity increases at a rate of with each additional parameter, where is the image space and is the number of parameters. (Shapiro and Stockman, 310) Thus, the Hough Transform must be used with great care to detect anything other than lines or circles.

Finally, much of the efficiency of the Hough Transform is dependent on the quality of the input data: the edges must be detected well for the Hough Transform to be efficient. Use of the Hough Transform on noisy images is a very delicate matter and generally, a denoising stage must be used before. In the case where the image is corrupted by speckle, as is the case in radar images, the Radon transform
Radon transform
thumb|right|Radon transform of the [[indicator function]] of two squares shown in the image below. Lighter regions indicate larger function values. Black indicates zero.thumb|right|Original function is equal to one on the white region and zero on the dark region....

 is sometimes preferred to detect lines, because it attenuates the noise through summation.

History

It was initially invented for machine analysis of bubble chamber
Bubble chamber
A bubble chamber is a vessel filled with a superheated transparent liquid used to detect electrically charged particles moving through it. It was invented in 1952 by Donald A. Glaser, for which he was awarded the 1960 Nobel Prize in Physics...

 photographs (Hough, 1959).

The Hough transform was patented as in 1962 and assigned to the U.S. Atomic Energy Commission with the name "Method and Means for Recognizing Complex Patterns". This patent uses a slope-intercept parametrization for straight lines, which awkwardly leads to an unbounded transform space since the slope can go to infinity.

The rho-theta parametrization universally used today was first described in
Duda, R. O. and P. E. Hart, "Use of the Hough Transformation to Detect Lines and Curves in Pictures," Comm. ACM, Vol. 15, pp. 11–15 (January, 1972),


although it was already standard for the Radon transform
Radon transform
thumb|right|Radon transform of the [[indicator function]] of two squares shown in the image below. Lighter regions indicate larger function values. Black indicates zero.thumb|right|Original function is equal to one on the white region and zero on the dark region....

 since at least the 1930s.

O'Gorman and Clowes' variation is described in
Frank O'Gorman, MB Clowes: Finding Picture Edges Through Collinearity of Feature Points. IEEE Trans. Computers 25(4): 449-456 (1976)


The story of how the modern form of the Hough transform was invented is given in
Hart, P. E., "How the Hough Transform was Invented", IEEE Signal Processing Magazine, Vol 26, Issue 6, pp 18 - 22 (November, 2009) .

See also

  • Generalised Hough transform
    Generalised Hough transform
    The Generalised Hough Transform or GHT, introduced by D.H. Ballard in 1981, is the modification of the Hough Transform using the principle of template matching. This modification enables the Hough Transform to be used for not only the detection of an object described with an analytic equation...

  • Randomized Hough Transform
    Randomized Hough Transform
    Randomized Hough transform is a probabilistic variant to the classical Hough transform, which is a commonly used technique for detecting curves The basic idea of Hough transform is to implement a voting procedure for all potential curves in the image, and at the termination of the algorithm...

  • Radon transform
    Radon transform
    thumb|right|Radon transform of the [[indicator function]] of two squares shown in the image below. Lighter regions indicate larger function values. Black indicates zero.thumb|right|Original function is equal to one on the white region and zero on the dark region....

  • Fourier transform
    Fourier transform
    In mathematics, Fourier analysis is a subject area which grew from the study of Fourier series. The subject began with the study of the way general functions may be represented by sums of simpler trigonometric functions...


External links

  • hough_transform.cpp - C++ code - example of CImg library (open source library, C++
    C++
    C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

     source code, Grayscale
    Grayscale
    In photography and computing, a grayscale or greyscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information...

     images)
  • http://www.rob.cs.tu-bs.de/content/04-teaching/06-interactive/Hough.html - Java Applet
    Java applet
    A Java applet is an applet delivered to users in the form of Java bytecode. Java applets can run in a Web browser using a Java Virtual Machine , or in Sun's AppletViewer, a stand-alone tool for testing applets...

     + Source for learning the Hough transformation in slope-intercept form
  • http://www.rob.cs.tu-bs.de/content/04-teaching/06-interactive/HNF.html - Java Applet
    Java applet
    A Java applet is an applet delivered to users in the form of Java bytecode. Java applets can run in a Web browser using a Java Virtual Machine , or in Sun's AppletViewer, a stand-alone tool for testing applets...

     + Source for learning the Hough-Transformation in normal form
  • http://www.sydlogan.com/deskew.html - Deskew images using Hough transform (Grayscale
    Grayscale
    In photography and computing, a grayscale or greyscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information...

     images, C++
    C++
    C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

     source code)
  • http://imaging.gmse.net/articledeskew.html - Deskew images using Hough transform (Visual Basic
    Visual Basic
    Visual Basic is the third-generation event-driven programming language and integrated development environment from Microsoft for its COM programming model...

     source code)
  • http://www.mitov.com/products/visionlab - Delphi, C++
    C++
    C++ is a statically typed, free-form, multi-paradigm, compiled, general-purpose programming language. It is regarded as an intermediate-level language, as it comprises a combination of both high-level and low-level language features. It was developed by Bjarne Stroustrup starting in 1979 at Bell...

     and .NET
    .NET Framework
    The .NET Framework is a software framework that runs primarily on Microsoft Windows. It includes a large library and supports several programming languages which allows language interoperability...

    free for educational purposes library containing Line, Circle and Line segment Hough transform components.
  • Tarsha-Kurdi, F., Landes, T., Grussenmeyer, P., 2007a. Hough-transform and extended RANSAC algorithms for automatic detection of 3d building roof planes from Lidar data. ISPRS Proceedings. Workshop Laser scanning. Espoo, Finland, September 12–14, 2007.
  • Into contains open source implementations of linear and circular Hough transform in C++
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK