{{Short description|Computer recognition of visual text}}
{{EngvarB|date=January 2019}}
{{Use mdy dates|date=January 2019}}
[[File:Portable scanner and OCR (video).webm|thumb|300px|Video of the process of scanning and real-time optical character recognition (OCR) with a portable scanner]]
'''Optical character recognition''' or '''optical character reader''' ('''OCR''') is the [[electronics|electronic]] or [[machine|mechanical]] conversion of [[image]]s of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo) or subtitle text superimposed on an image (for example, from a television broadcast).<ref>{{cite web|title=OCR Document|website=[[HP Autonomy#Products and services|Haven OnDemand]]|url=https://dev.havenondemand.com/apis/ocrdocument#overview|url-status=dead|archive-url=https://web.archive.org/web/20160415060125/https://dev.havenondemand.com/apis/ocrdocument|archive-date=April 15, 2016}}</ref>

Widely used as a form of [[data entry]] from printed paper data records{{snd}}whether passport documents, invoices, [[bank statement]]s, computerized receipts, business cards, mail, printed data, or any suitable documentation{{snd}}it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as [[cognitive computing]], [[machine translation]], [[text-to-speech]], extraction of key data, and [[text mining]]. OCR is a field of research in [[pattern recognition]], [[artificial intelligence]] and [[computer vision]].

Early versions needed to be trained with images of each character, and worked on one font at a time.
Advanced systems capable of producing a high degree of accuracy for most fonts are now common, and they support a variety of [[image file format]] inputs.<ref>{{cite web|title=Supported Media Formats|website=Haven OnDemand|url=https://dev.havenondemand.com/docs/ImageFormats.html|url-status=dead|archive-url=https://web.archive.org/web/20160419063444/https://dev.havenondemand.com/docs/ImageFormats.html|archive-date=April 19, 2016}}</ref> Some systems are capable of reproducing formatted output that closely approximates the original page, including images, columns, and other non-textual components.
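
The idea of training on images of each character in a single font can be illustrated with a toy template matcher. The following is a minimal sketch, not a description of any historical or production system: the 3×5 glyph bitmaps, the fixed character width and gap, and all function names are assumptions made up for this illustration.

```python
from typing import Dict, List

Glyph = List[str]  # rows of '#' (ink) and '.' (background)

# Hypothetical "training" templates: one bitmap per character, one font only.
TEMPLATES: Dict[str, Glyph] = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def mismatch(a: Glyph, b: Glyph) -> int:
    """Count the pixels at which two equally sized glyphs differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize_glyph(glyph: Glyph) -> str:
    """Return the character whose template differs from the glyph in the fewest pixels."""
    return min(TEMPLATES, key=lambda ch: mismatch(glyph, TEMPLATES[ch]))


def split_glyphs(rows: List[str], width: int = 3, gap: int = 1) -> List[Glyph]:
    """Cut a text-line bitmap into fixed-width glyphs (assumes monospaced input)."""
    step = width + gap
    return [[row[x:x + width] for row in rows] for x in range(0, len(rows[0]), step)]


def recognize_line(rows: List[str]) -> str:
    """Convert a bitmap of one text line into machine-encoded text."""
    return "".join(recognize_glyph(g) for g in split_glyphs(rows))


if __name__ == "__main__":
    line = [
        "#...###.###",
        "#....#...#.",
        "#....#...#.",
        "#....#...#.",
        "###.###..#.",
    ]
    print(recognize_line(line))
```

Because the matcher only knows the bitmaps it was given, a glyph drawn in a different font or at a different size would need its own set of templates, which is the limitation the text attributes to early systems.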