OCR
Optical Character Recognition — AI that converts text inside images, scanned documents, or PDFs into editable, searchable text.
In plain English
OCR (Optical Character Recognition) is the technology that reads text from images and turns it into actual characters a computer can search, copy, or edit. It's how scanned PDFs, photos of receipts, and screenshots become useful data.
Common uses:
- Document digitisation — scan books, contracts, archives
- Invoice and receipt processing — extract amounts, dates, vendors
- ID and form capture — fill forms automatically from a photo
- Accessibility — read the text in images aloud for visually impaired users
- Translation — translate signs or menus from a phone photo
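For invoice and receipt processing, OCR is only the first step: the recognised text still has to be parsed into fields. The sketch below shows one way that could look, assuming the OCR engine has already returned plain text. The sample receipt, the `extract_receipt_fields` helper, and the regular expressions are all hypothetical and would need adapting to real layouts.

```python
import re

# Hypothetical OCR output for a receipt; in practice this string
# would come from an OCR engine rather than being hard-coded.
ocr_text = """ACME STORES LTD
Date: 2024-03-15
Coffee beans 12.50
Oat milk 3.20
TOTAL 15.70
"""

def extract_receipt_fields(text):
    """Pull vendor, date and total out of OCR'd receipt text (illustrative patterns only)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Assumption: the vendor name is the first non-empty line.
    vendor = lines[0] if lines else None
    # Assumption: dates are printed in ISO format (YYYY-MM-DD).
    date_match = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)
    # Assumption: the total sits on its own line starting with "total".
    total_match = re.search(r"(?im)^\s*total\D*(\d+\.\d{2})\s*$", text)
    return {
        "vendor": vendor,
        "date": date_match.group(1) if date_match else None,
        "total": float(total_match.group(1)) if total_match else None,
    }

fields = extract_receipt_fields(ocr_text)
print(fields)
```

Fixed patterns like these break easily when a layout changes, which is one reason complex documents are often handed to more flexible models.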
Modern OCR: Traditional OCR matched characters against hand-crafted templates, so it worked best on clean, printed text. Modern OCR uses deep learning and copes with messy, multilingual, handwritten, or low-quality images. Multimodal LLMs such as GPT-4o and Claude can also read text from images directly, and for complex documents they are often used in place of dedicated OCR tools.
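To make the idea of hand-crafted character templates concrete, here is a toy sketch of template matching on tiny binary bitmaps. It is a deliberate simplification and not how any particular OCR engine works: the `TEMPLATES` table, the 5x3 glyph size, and the pixel-agreement score are assumptions chosen for illustration.

```python
# Hand-crafted 5x3 binary templates for three glyphs (hypothetical, for illustration).
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}

def pixel_agreement(glyph, template):
    """Count the pixels on which a glyph bitmap and a template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )

def classify(glyph):
    """Return the template character with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda ch: pixel_agreement(glyph, TEMPLATES[ch]))

# A noisy "T": one pixel in the top row is missing.
noisy_t = ["110", "010", "010", "010", "010"]
print(classify(noisy_t))  # prints "T"
```

A matcher like this tolerates a little noise but has no answer for a glyph that is slanted, resized, or handwritten, which is the gap that learned models fill.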