== Trainable segmentation ==

Most of the aforementioned segmentation methods are based only on the color information of pixels in the image. Humans use much more knowledge when performing image segmentation, but implementing this knowledge would cost considerable human engineering and computational time, and would require a huge [[domain knowledge]] database which does not currently exist. Trainable segmentation methods, such as [[neural network]] segmentation, overcome these issues by modeling the domain knowledge from a dataset of labeled pixels.

An image segmentation [[neural network]] can process small areas of an image to extract simple features such as edges.<ref name="Transactions on Engineering, Computing and Technology">[[Mahinda Pathegama]] & Ö Göl (2004): "Edge-end pixel extraction for edge-based image segmentation", ''Transactions on Engineering, Computing and Technology,'' vol. 2, pp. 213–216, ISSN 1305-5313</ref> Another neural network, or any decision-making mechanism, can then combine these features to label the areas of an image accordingly. A type of network designed this way is the [[Kohonen map]].

[[Pulse-coupled networks|Pulse-coupled neural networks (PCNNs)]] are neural models proposed by modeling a cat's visual cortex and developed for high-performance [[biomimetic]] [[image processing]]. In 1989, Reinhard Eckhorn introduced a neural model to emulate the mechanism of a cat's visual cortex. The Eckhorn model provided a simple and effective tool for studying the visual cortex of small mammals, and was soon recognized as having significant application potential in image processing. In 1994, the Eckhorn model was adapted to be an image processing algorithm by John L. Johnson, who termed this algorithm Pulse-Coupled Neural Network.<ref>{{cite journal|last1=Johnson|first1=John L.|date=September 1994|title=Pulse-coupled neural nets: translation, rotation, scale, distortion, and intensity signal invariance for images|doi=10.1364/AO.33.006239|pmid=20936043|publisher=OSA|volume=33|journal=Applied Optics|number=26|pages=6239–6253|bibcode=1994ApOpt..33.6239J}}</ref> Over the past decade, PCNNs have been utilized for a variety of image processing applications, including image segmentation, feature generation, face extraction, motion detection, region growing, and noise reduction.

A PCNN is a two-dimensional neural network. Each neuron in the network corresponds to one pixel in an input image, receiving its corresponding pixel's color information (e.g. intensity) as an external stimulus. Each neuron also connects with its neighboring neurons, receiving local stimuli from them. The external and local stimuli are combined in an internal activation system, which accumulates the stimuli until it exceeds a dynamic threshold, resulting in a pulse output. Through iterative computation, PCNN neurons produce temporal series of pulse outputs. These temporal series contain information about the input images and can be utilized for various image processing applications, such as image segmentation and feature generation. Compared with conventional image processing methods, PCNNs have several significant merits, including robustness against noise, independence from geometric variations in input patterns, and the capability of bridging minor intensity variations in input patterns.
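The following minimal sketch illustrates one common simplified PCNN formulation in Python. The update rules, coupling kernel, and parameter values shown here are illustrative choices for this example, not those of any single published model:

<syntaxhighlight lang="python">
import numpy as np
from scipy.ndimage import convolve

def pcnn_segment(image, steps=20, alpha_f=0.1, alpha_l=1.0, alpha_t=0.3,
                 beta=0.2, v_f=0.5, v_l=0.2, v_t=20.0):
    """Run a simplified pulse-coupled neural network over a 2-D image
    with intensities in [0, 1]; return the binary pulse map of each step."""
    kernel = np.array([[0.5, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.5]])      # local coupling weights (illustrative)
    feed = np.zeros_like(image)               # feeding input (external stimulus)
    link = np.zeros_like(image)               # linking input (neighbour stimulus)
    thresh = np.ones_like(image)              # dynamic threshold
    pulse = np.zeros_like(image)              # binary pulse output
    maps = []
    for _ in range(steps):
        coupled = convolve(pulse, kernel, mode="constant")
        feed = np.exp(-alpha_f) * feed + v_f * coupled + image
        link = np.exp(-alpha_l) * link + v_l * coupled
        act = feed * (1.0 + beta * link)      # combined internal activation
        pulse = (act > thresh).astype(float)  # fire where activation beats threshold
        thresh = np.exp(-alpha_t) * thresh + v_t * pulse  # firing raises the threshold
        maps.append(pulse.copy())
    return maps

# Toy usage: the bright square pulses in unison, separating it from the
# background after a few iterations.
image = np.zeros((32, 32))
image[8:24, 8:24] = 0.9
pulse_maps = pcnn_segment(image)
</syntaxhighlight>

Neurons covering regions of similar intensity tend to fire on the same iterations, so each pulse map groups such regions together; this synchrony is what makes the pulse trains usable as segmentation candidates.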
In 2015, [[convolutional neural network|convolutional neural networks]] reached the state of the art in semantic segmentation.<ref>{{Cite conference|last1=Long|first1=Jonathan|last2=Shelhamer|first2=Evan|last3=Darrell|first3=Trevor|date=2015|title=Fully Convolutional Networks for Semantic Segmentation|url=https://openaccess.thecvf.com/content_cvpr_2015/html/Long_Fully_Convolutional_Networks_2015_CVPR_paper.html|conference=Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition|pages=3431–3440}}</ref> [[U-Net]] is an architecture which takes an image as input and outputs a label for each pixel.<ref>{{cite arXiv|last1=Ronneberger|first1=Olaf|last2=Fischer|first2=Philipp|last3=Brox|first3=Thomas|title=U-Net: Convolutional Networks for Biomedical Image Segmentation|eprint=1505.04597|date=2015|class=cs.CV}}</ref> U-Net was initially developed to detect cell boundaries in biomedical images. It follows the classical [[autoencoder]] architecture and as such contains two sub-structures. The encoder structure follows the traditional stack of convolutional and max-pooling layers to increase the receptive field as it goes through the layers; it is used to capture the context in the image. The decoder structure utilizes transposed convolution layers for upsampling so that the end dimensions are close to those of the input image. Skip connections are placed between convolution and transposed convolution layers of the same shape in order to preserve details that would otherwise have been lost.

In addition to pixel-level semantic segmentation tasks, which assign a given category to each pixel, modern segmentation applications include instance-level semantic segmentation tasks, in which each individual in a given category must be uniquely identified, as well as panoptic segmentation tasks, which combine these two tasks to provide a more complete scene segmentation.<ref name="Panoptic Segmentation"/>
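The U-Net encoder/decoder pattern described above can be illustrated with the following minimal sketch in Python using [[PyTorch]]. The framework, channel counts, and network depth are illustrative choices for this example, not those of the original paper:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.enc1 = double_conv(in_ch, 16)    # encoder: captures context
        self.enc2 = double_conv(16, 32)
        self.pool = nn.MaxPool2d(2)           # downsampling grows the receptive field
        self.bottleneck = double_conv(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)  # decoder: upsampling
        self.dec2 = double_conv(64, 32)       # 64 = 32 upsampled + 32 skipped channels
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = double_conv(32, 16)
        self.head = nn.Conv2d(16, n_classes, 1)  # per-pixel class scores

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        # Skip connections concatenate same-shape encoder features into the
        # decoder, preserving detail that pooling would otherwise discard.
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                  # one score vector per pixel

# Usage: per-pixel logits with the same spatial size as the input.
logits = TinyUNet()(torch.randn(1, 1, 64, 64))  # shape (1, 2, 64, 64)
</syntaxhighlight>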