Editing Hough transform (section)

==Variations and extensions==

===Using the gradient direction to reduce the number of votes===
An improvement suggested by O'Gorman and Clowes can be used to detect lines if one takes into account that the local [[gradient]] of the image intensity will necessarily be orthogonal to the edge. Since [[edge detection]] generally involves computing the intensity gradient magnitude, the gradient direction is often found as a side effect. If a given point of coordinates (''x,y'') happens to indeed be on a line, then the local direction of the gradient gives the ''θ'' parameter corresponding to said line, and the ''r'' parameter is then immediately obtained. (Shapiro and Stockman, 305) The gradient direction can be estimated to within 20°, which shortens the sinusoid trace from the full 180° to roughly 45°. This reduces the computation time and has the interesting effect of reducing the number of useless votes, thus enhancing the visibility of the spikes corresponding to real lines in the image.

===Kernel-based Hough transform (KHT) ===
Fernandes and Oliveira <ref>{{cite journal | last1 = Fernandes | first1 = L.A.F. | last2 = Oliveira | first2 = M.M. | year = 2008 | title = Real-time line detection through an improved Hough transform voting scheme | doi = 10.1016/j.patcog.2007.04.003 | journal = Pattern Recognition | volume = 41 | issue = 1| pages = 299–314 | bibcode = 2008PatRe..41..299F | s2cid = 5996185 }}</ref> suggested an improved voting scheme for the Hough transform that allows a software implementation to achieve real-time performance even on relatively large images (e.g., 1280×960). The Kernel-based Hough transform uses the same <math>(r,\theta)</math> parameterization proposed by Duda and Hart but operates on clusters of approximately collinear pixels. For each cluster, votes are cast using an oriented elliptical-Gaussian kernel that models the uncertainty associated with the best-fitting line with respect to the corresponding cluster. The approach not only significantly improves the performance of the voting scheme, but also produces a much cleaner accumulator and makes the transform more robust to the detection of spurious lines.

===3-D kernel-based Hough transform for plane detection (3DKHT)===
Limberger and Oliveira<ref>{{cite journal | last1 = Limberger | first1 = F.&nbsp;A. | last2 = Oliveira | first2 = M.&nbsp;M. | year = 2015 | title = Real-Time Detection of Planar Regions in Unorganized Point Clouds | doi = 10.1016/j.patcog.2014.12.020 | journal = Pattern Recognition | volume = 48 | issue = 6 | pages = 2043–2053 | bibcode = 2015PatRe..48.2043L | hdl = 10183/97001 | url = https://lume.ufrgs.br/bitstream/10183/97001/1/000919556.pdf | hdl-access = free }}</ref> suggested a deterministic technique for plane detection in unorganized point clouds whose cost is <math>n \log(n)</math> in the number of samples, achieving real-time performance for relatively large datasets (up to <math>10^5</math> points on a 3.4&nbsp;GHz CPU). It is based on a fast Hough-transform voting strategy for planar regions, inspired by the Kernel-based Hough transform (KHT). This 3D kernel-based Hough transform (3DKHT) uses a fast and robust algorithm to segment clusters of approximately co-planar samples, and casts votes for individual clusters (instead of for individual samples) on a (<math>\theta, \phi, \rho</math>) spherical accumulator using a trivariate Gaussian kernel. The approach is several orders of magnitude faster than existing (non-deterministic) techniques for plane detection in point clouds, such as [[Randomized Hough transform|RHT]] and [[RANSAC]], and scales better with the size of the datasets. It can be used with any application that requires fast detection of planar features on large datasets.

===Hough transform of curves, and its generalization for analytical and non-analytical shapes===
Although the version of the transform described above applies only to finding straight lines, a similar transform can be used for finding any shape which can be represented by a set of parameters. A circle, for instance, can be transformed into a set of three parameters, representing its center and radius, so that the Hough space becomes three dimensional. Arbitrary ellipses and curves can also be found this way, as can any shape easily expressed as a set of parameters.

The generalization of the Hough transform for detecting analytical shapes in spaces having any dimensionality was proposed by Fernandes and Oliveira.<ref>{{cite journal | last1 = Fernandes | first1 = L.A.F. | last2 = Oliveira | first2 = M.M. | year = 2012 | title = A general framework for subspace detection in unordered multidimensional data | doi = 10.1016/j.patcog.2012.02.033 | journal = Pattern Recognition | volume = 45 | issue = 9| pages = 3566–3579 | bibcode = 2012PatRe..45.3566F }}</ref> In contrast to other Hough transform-based approaches for analytical shapes, Fernandes' technique does not depend on the shape one wants to detect nor on the input data type. The detection can be driven to a type of analytical shape by changing the assumed model of geometry where data have been encoded (e.g., [[euclidean space]], [[projective space]], [[conformal geometry]], and so on), while the proposed formulation remains unchanged. Also, it guarantees that the intended shapes are represented with the smallest possible number of parameters, and it allows the concurrent detection of different kinds of shapes that best fit an input set of entries with different dimensionalities and different geometric definitions (e.g., the concurrent detection of planes and spheres that best fit a set of points, straight lines and circles).

For more complicated shapes in the plane (i.e., shapes that cannot be represented analytically in some 2D space), the [[Generalised Hough transform]]<ref>{{cite journal | last1 = Ballard | first1 = D.H. | year = 1981 | title = Generalizing the Hough transform to detect arbitrary shapes | journal = Pattern Recognition | volume = 13 | issue = 2| pages = 111–122 | doi = 10.1016/0031-3203(81)90009-1 | bibcode = 1981PatRe..13..111B | hdl = 1802/13802 | hdl-access = free }}</ref> is used, which allows a [[Feature (computer vision)|feature]] to vote for a particular position, orientation and/or scaling of the shape using a predefined look-up table.The Hough transform accumulates contributions from all pixels in the detected edge.

=== Circle detection process ===
{{main|Circle Hough Transform}}
Altering the algorithm to detect circular shapes instead of lines is relatively straightforward.

* First, we create the accumulator space, which is made up of a cell for each pixel. Initially each cell is set to 0.
* For each edge point (i, j) in the image, increment all cells which according to the equation of a circle <math>(i - a)^2 + (j - b)^2 = r^2</math> could be the center of a circle. These cells are represented by the letter <math>a</math> in the equation.
* For each possible value of <math>a</math> found in the previous step, find all possible values of <math>b</math> which satisfy the equation.
* Search for local maxima in the accumulator space. These cells represent circles that were detected by the algorithm.

If we do not know the radius of the circle we are trying to locate beforehand, we can use a three-dimensional accumulator space to search for circles with an arbitrary radius. Naturally, this is more computationally expensive.

This method can also detect circles that are partially outside of the accumulator space, as long as enough of the circle's area is still present within it.

=== Detection of 3D objects (planes and cylinders) ===
Hough transform can also be used for the detection of 3D objects in [[Range imaging|range data]] or 3D [[point cloud]]s. The extension of classical Hough transform for plane detection is quite straightforward. A plane is represented by its explicit equation <math>z = a_x x + a_y y + d</math> for which we can use a 3D Hough space corresponding to <math>a_x</math>, <math>a_y</math> and <math>d</math>. This extension suffers from the same problems as its 2D counterpart i.e., near horizontal planes can be reliably detected, while the performance deteriorates as planar direction becomes vertical (big values of <math>a_x</math> and <math>a_y</math> amplify the noise in the data). This formulation of the plane has been used for the detection of planes in the point clouds acquired from airborne laser scanning <ref>Vosselman, G., Dijkman, S: "[https://pdfs.semanticscholar.org/bb77/54f8e928bd79baf56a1da501484a4567d8ab.pdf 3D Building Model Reconstruction from Point Clouds and Ground Plans]", International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol 34, part 3/W4, October 22–24, 2001, Annapolis, MA, USA, pp. 37–44.</ref> and works very well because in that domain all planes are nearly horizontal.

For generalized plane detection using Hough transform, the plane can be parametrized by its normal vector <math>n</math> (using spherical coordinates) and its distance from the origin <math>\rho</math> resulting in a three dimensional Hough space. This results in each point in the input data voting for a sinusoidal surface in the Hough space. The intersection of these sinusoidal surfaces indicates presence of a plane.<ref>Tahir Rabbani: [http://www.ncg.knaw.nl/Publicaties/Geodesy/62Rabbani.html "Automatic reconstruction of industrial installations – Using point clouds and images"] {{Webarchive|url=https://web.archive.org/web/20081201052227/http://www.ncg.knaw.nl/Publicaties/Geodesy/62Rabbani.html |date=2008-12-01 }}, pages 43–44, Publications on Geodesy 62, Delft, 2006. {{ISBN|978-90-6132-297-9}}.</ref> A more general approach for more than 3 dimensions requires search heuristics to remain feasible.<ref>{{cite journal | last1 = Achtert | first1 = Elke | last2 = Böhm | first2 = Christian | last3 = David | first3 = Jörn | last4 = Kröger | first4 = Peer | last5 = Zimek | first5 = Arthur |author-link5=Arthur Zimek| year = 2008 | title = Global Correlation Clustering Based on the Hough Transform | journal = Statistical Analysis and Data Mining | volume = 1 | issue = 3| pages = 111–127 | doi = 10.1002/sam.10012 | s2cid = 5111283 | citeseerx = 10.1.1.716.6006 }}</ref>

Hough transform has also been used to find cylindrical objects in point clouds using a two step approach. The first step finds the orientation of the cylinder and the second step finds the position and radius.<ref>Tahir Rabbani and Frank van den Heuvel, "[https://pdfs.semanticscholar.org/d205/b6e6f743bc62e31257b47d18b36ada95f4f1.pdf Efficient hough transform for automatic detection of cylinders in point clouds]" in Proceedings of the 11th Annual Conference of the Advanced School for Computing and Imaging (ASCI '05), The Netherlands, June 2005.</ref>

===Using weighted features===
One common variation detail. That is, finding the bins with the highest count in one stage can be used to constrain the range of values searched in the next.

=== Carefully chosen parameter space ===
A high-dimensional parameter space for the Hough transform is not only slow, but if implemented without forethought can easily overrun the available memory. Even if the programming environment allows the allocation of an array larger than the available memory space through virtual memory, the number of [[Page swapping|page swaps]] required for this will be very demanding because the accumulator array is used in a randomly accessed fashion, rarely stopping in contiguous memory as it skips from index to index.

Consider the task of finding ellipses in an 800x600 image. Assuming that the radii of the ellipses are oriented along principal axes, the parameter space is four-dimensional. (''x'', ''y'') defines the center of the ellipse, and ''a'' and ''b'' denote the two radii. Allowing the center to be anywhere in the image, adds the constraint 0<x<800 and 0<y<600. If the radii are given the same values as constraints, what is left is a sparsely filled accumulator array of more than 230 billion values.

A program thus conceived is unlikely to be allowed to allocate sufficient memory. This doesn't mean that the problem can't be solved, but only that new ways to constrain the size of the accumulator array are to be found, which makes it feasible. For instance:

# If it is reasonable to assume that the ellipses are each contained entirely within the image, the range of the radii can be reduced. The largest the radii can be is if the center of the ellipse is in the center of the image, allowing the edges of the ellipse to stretch to the edges. In this extreme case, the radii can only each be half the magnitude of the image size oriented in the same direction. Reducing the range of a and b in this fashion reduces the accumulator array to 57 billion values.
# Trade accuracy for space in the estimation of the center: If the center is predicted to be off by 3 on both the x and y axis this reduces the size of the accumulator array to about 6 billion values.
# Trade accuracy for space in the estimation of the radii: If the radii are estimated to each be off by 5 further reduction of the size of the accumulator array occurs, by about 256 million values.
# Crop the image to areas of interest. This is image dependent, and therefore unpredictable, but imagine a case where all of the edges of interest in an image are in the upper left quadrant of that image. The accumulator array can be reduced even further in this case by constraining all 4 parameters by a factor of 2, for a total reduction factor of 16.
By applying just the first three of these constraints to the example stated about, the size of the accumulator array is reduced by almost a factor of 1000, bringing it down to a size that is much more likely to fit within a modern computer's memory.

==== Efficient ellipse detection algorithm ====

Yonghong Xie and Qiang Ji give an efficient way of implementing the Hough transform for ellipse detection by overcoming the memory issues.<ref name="XieJi2002">{{Cite book| doi = 10.1109/ICPR.2002.1048464| chapter = A new efficient ellipse detection method| year = 2002| volume = 2| last1 = Yonghong Xie| last2 = Qiang Ji| pages = 957–960| title = Object recognition supported by user interaction for service robots| isbn = 978-0-7695-1695-0| citeseerx = 10.1.1.1.8792| s2cid = 9276255}}</ref> As discussed in the algorithm (on page 2 of the paper), this approach uses only a one-dimensional accumulator (for the minor axis) in order to detect ellipses in the image. The complexity is O(N<sup>3</sup>) in the number of non-zero points in the image.