==== Eliminating edge responses ====
The DoG function responds strongly along edges, even where the location along the edge is poorly determined and therefore unstable under small amounts of noise. To increase stability, keypoints that have a strong edge response but a poorly determined location must be eliminated.

A poorly defined peak in the DoG function has a [[principal curvature]] across the edge that is much larger than the principal curvature along it. Finding these principal curvatures amounts to solving for the [[Eigenvalues and eigenvectors|eigenvalues]] of the 2×2 [[Hessian matrix]], '''H''', of second-order partial derivatives of D:

:<math> \textbf{H} =  \begin{bmatrix}
  D_{xx} & D_{xy} \\
  D_{xy} & D_{yy}
\end{bmatrix} </math>
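
As a minimal sketch of how the entries of '''H''' might be estimated in practice, the following Python snippet takes central finite differences of neighbouring samples in a single DoG layer. The function name <code>hessian_2d</code>, the nested-list image layout (row index is y, column index is x) and the unit pixel spacing are assumptions made for illustration only.

```python
def hessian_2d(dog, y, x):
    """Estimate (D_xx, D_xy, D_yy) of a DoG layer at pixel (y, x).

    Hypothetical helper: uses central differences with unit pixel spacing
    and assumes (y, x) does not lie on the image border.
    """
    dxx = dog[y][x + 1] - 2.0 * dog[y][x] + dog[y][x - 1]
    dyy = dog[y + 1][x] - 2.0 * dog[y][x] + dog[y - 1][x]
    dxy = (dog[y + 1][x + 1] - dog[y + 1][x - 1]
           - dog[y - 1][x + 1] + dog[y - 1][x - 1]) / 4.0
    return dxx, dxy, dyy


# Synthetic test surface D(x, y) = x^2 + 3xy + 2y^2, for which the exact
# second derivatives are D_xx = 2, D_xy = 3 and D_yy = 4 at every point.
dog = [[x * x + 3 * x * y + 2 * y * y for x in range(5)] for y in range(5)]
dxx, dxy, dyy = hessian_2d(dog, 2, 2)
```

For a quadratic surface such as this one the central differences are exact; on real image data they are only approximations of the derivatives.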

The eigenvalues of '''H''' are proportional to the principal curvatures of D. For SIFT's purposes only the ratio of the two eigenvalues matters, not their individual values. Let <math>\alpha</math> be the larger eigenvalue and <math>\beta</math> the smaller one, so that <math>r = \alpha/\beta</math>. The trace of '''H''', i.e., <math>D_{xx} + D_{yy}</math>, gives the sum of the two eigenvalues, while its determinant, i.e., <math>D_{xx} D_{yy} - D_{xy}^2</math>, gives their product, so the eigenvalues never need to be computed explicitly. Substituting <math>\alpha = r\beta</math> shows that the ratio <math> \text{R} = \operatorname{Tr}(\textbf{H})^2 / \operatorname{Det}(\textbf{H})</math> depends only on <math>r</math>:

:<math> \text{R} = \frac{(\alpha + \beta)^2}{\alpha\beta} = \frac{(r\beta + \beta)^2}{r\beta^2} = \frac{(r+1)^2}{r} </math>

R is at its minimum when the two eigenvalues are equal (<math>r = 1</math>) and increases with <math>r</math>. Therefore, the larger the ratio between the two eigenvalues, and equivalently between the two principal curvatures of D, the higher the value of R. It follows that, for some threshold eigenvalue ratio <math>r_{\text{th}}</math>, if R for a candidate keypoint is larger than <math>(r_{\text{th}} + 1)^2/r_{\text{th}}</math>, that keypoint is poorly localized and hence rejected. Lowe's method uses <math>r_{\text{th}} = 10</math>.<ref name="Lowe2004" />
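
The threshold test above can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the function name <code>passes_edge_test</code> is hypothetical, and the rejection of candidates with a non-positive determinant is an added assumption (in that case the principal curvatures differ in sign or one of them is zero, so the ratio is not meaningful).

```python
def passes_edge_test(dxx, dxy, dyy, r_th=10.0):
    """Return True if a candidate keypoint is kept, False if it is rejected.

    dxx, dxy, dyy are the entries of the 2x2 Hessian of D at the keypoint;
    r_th is the threshold on the eigenvalue ratio.
    """
    trace = dxx + dyy               # sum of the two eigenvalues
    det = dxx * dyy - dxy * dxy     # product of the two eigenvalues
    if det <= 0.0:
        # Assumption: curvatures of opposite sign (or a zero curvature)
        # are treated as poorly localized and rejected.
        return False
    ratio = trace * trace / det     # R = Tr(H)^2 / Det(H)
    return ratio <= (r_th + 1.0) ** 2 / r_th


# Blob-like point: eigenvalues 2 and 1, R = 9 / 2 = 4.5, below 12.1, kept.
blob_kept = passes_edge_test(2.0, 0.0, 1.0)
# Edge-like point: eigenvalues 20 and 1, R = 441 / 20 = 22.05, rejected.
edge_kept = passes_edge_test(20.0, 0.0, 1.0)
```

With <math>r_{\text{th}} = 10</math> the threshold on R is <math>11^2/10 = 12.1</math>, so the test needs only a handful of arithmetic operations per candidate and no explicit eigenvalue decomposition.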

This step for suppressing responses at edges is adapted from a corresponding approach in the [[Harris corner detector|Harris operator]] for corner detection. The difference is that the thresholding measure is computed from the Hessian matrix instead of a [[Structure tensor|second-moment matrix]].