Scale-invariant feature transform
=== Scale-space extrema detection ===
We begin by detecting points of interest, which are termed ''keypoints'' in the SIFT framework. The image is [[Convolution|convolved]] with Gaussian filters at different scales, and then the differences of successive [[Gaussian blur|Gaussian-blurred]] images are taken. Keypoints are then taken as maxima/minima of the [[Difference of Gaussians]] (DoG) that occur at multiple scales. Specifically, a DoG image <math>D \left( x, y, \sigma \right)</math> is given by
:<math>D \left( x, y, \sigma \right) = L \left( x, y, k_i\sigma \right) - L \left( x, y, k_j\sigma \right)</math>,
:where <math>L \left( x, y, k\sigma \right)</math> is the convolution of the original image <math>I \left( x, y \right)</math> with the [[Gaussian blur]] <math>G \left( x, y, k\sigma \right)</math> at scale <math>k\sigma</math>, i.e.,
:<math>L \left( x, y, k\sigma \right) = G \left( x, y, k\sigma \right) * I \left( x, y \right)</math>

Hence a DoG image between scales <math>k_i\sigma</math> and <math>k_j\sigma</math> is just the difference of the Gaussian-blurred images at scales <math>k_i\sigma</math> and <math>k_j\sigma</math>.

For [[scale space]] extrema detection in the SIFT algorithm, the image is first convolved with Gaussian blurs at different scales. The convolved images are grouped by octave (an octave corresponds to doubling the value of <math>\sigma</math>), and the value of <math>k_i</math> is selected so that a fixed number of convolved images is obtained per octave. The Difference-of-Gaussian images are then computed from adjacent Gaussian-blurred images within each octave.

Once DoG images have been obtained, keypoints are identified as local minima/maxima of the DoG images across scales. This is done by comparing each pixel in the DoG images to its eight neighbors at the same scale and to the nine corresponding neighboring pixels in each of the two neighboring scales.
If the pixel value is the maximum or minimum among all compared pixels, it is selected as a candidate keypoint.

This keypoint detection step is a variation of one of the [[blob detection]] methods developed by Lindeberg by detecting scale-space extrema of the scale-normalized Laplacian;<ref name="Lin94Book" /><ref name="Lindeberg1998" /> that is, detecting points that are local extrema with respect to both space and scale, in the discrete case by comparisons with the nearest 26 neighbors in a discretized scale-space volume. The difference of Gaussians operator can be seen as an approximation to the Laplacian, with the implicit normalization in the [[pyramid (image processing)|pyramid]] also constituting a discrete approximation of the scale-normalized Laplacian.<ref name="Lindeberg2012" /> Another real-time implementation of scale-space extrema of the Laplacian operator has been presented by Lindeberg and Bretzner based on a hybrid pyramid representation,<ref name="Lindenberg2003" /> which was used for human-computer interaction by real-time gesture recognition in Bretzner et al. (2002).<ref>Lars Bretzner, Ivan Laptev, Tony Lindeberg [http://kth.diva-portal.org/smash/record.jsf?pid=diva2%3A462620&dswid=608 "Hand gesture recognition using multi-scale colour features, hierarchical models and particle filtering"], Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, Washington, DC, USA, 21–21 May 2002, pages 423–428. {{ISBN|0-7695-1602-5}}, {{doi|10.1109/AFGR.2002.1004190}}</ref>
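
The detection procedure described in this section (Gaussian blurring at scales <math>k\sigma</math>, differencing of adjacent blurred images within an octave, and the 26-neighbor comparison) can be illustrated with the following minimal pure-Python sketch. It is not a reference implementation. The function names, the clamped border handling, the kernel truncation at three standard deviations, the strict inequality in the extremum test, and the defaults (<code>sigma=1.6</code>, <code>intervals + 3</code> blurred images per octave with <math>k = 2^{1/\text{intervals}}</math>, a common convention) are all assumptions made for illustration and are not specified by the text above.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at 3 sigma (assumption), normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """L(x, y, sigma) = G(x, y, sigma) * I(x, y), computed as two separable 1-D passes.

    Pixels outside the image are replaced by the nearest border pixel (assumption).
    """
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    # Horizontal pass.
    rows = [[sum(kernel[j + radius] * image[y][clamp(x + j, width)]
                 for j in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]
    # Vertical pass.
    return [[sum(kernel[j + radius] * rows[clamp(y + j, height)][x]
                 for j in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]


def dog_octave(image, sigma=1.6, intervals=2):
    """DoG images for one octave: D_i = L(k^(i+1) * sigma) - L(k^i * sigma)."""
    k = 2.0 ** (1.0 / intervals)  # sigma doubles after `intervals` steps
    blurred = [blur(image, sigma * k ** i) for i in range(intervals + 3)]
    return [[[b - a for a, b in zip(row_lo, row_hi)]
             for row_lo, row_hi in zip(lower, upper)]
            for lower, upper in zip(blurred, blurred[1:])]


def local_extrema(dog):
    """Return (scale_index, y, x) of pixels strictly above or below all 26 neighbours.

    The 26 neighbours are 8 at the same scale plus 9 in each adjacent scale.
    Border pixels and the first and last DoG image are skipped because they
    lack a full neighbourhood.
    """
    found = []
    for s in range(1, len(dog) - 1):
        for y in range(1, len(dog[s]) - 1):
            for x in range(1, len(dog[s][0]) - 1):
                centre = dog[s][y][x]
                neighbours = [dog[s + ds][y + dy][x + dx]
                              for ds in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (ds, dy, dx) != (0, 0, 0)]
                if centre > max(neighbours) or centre < min(neighbours):
                    found.append((s, y, x))
    return found


# Usage: a small synthetic image containing one bright square on a dark background.
size = 15
image = [[1.0 if 6 <= y <= 8 and 6 <= x <= 8 else 0.0 for x in range(size)]
         for y in range(size)]
candidates = local_extrema(dog_octave(image))
print(candidates)  # candidate keypoints as (scale_index, y, x) tuples
```

A full detector would repeat this for each octave on a downsampled image; the sketch covers only the candidate detection within a single octave.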