Tasks

Tasks define what you want to extract. Models define how.
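One way to picture this split is an interface (the task) with interchangeable backends (the models). The sketch below is illustrative only: `TextExtractor`, `FastModel`, `AccurateModel`, and `run_text_extraction` are hypothetical names chosen for this example, not this library's API.

```python
from typing import Protocol


class TextExtractor(Protocol):
    """The task: fixes what comes out (Markdown text from a document image)."""

    def extract(self, image_bytes: bytes) -> str: ...


class FastModel:
    """Hypothetical backend standing in for a small, quick model."""

    def extract(self, image_bytes: bytes) -> str:
        return f"# fast result ({len(image_bytes)} bytes read)"


class AccurateModel:
    """Hypothetical backend standing in for a larger, slower model."""

    def extract(self, image_bytes: bytes) -> str:
        return f"# accurate result ({len(image_bytes)} bytes read)"


def run_text_extraction(model: TextExtractor, image_bytes: bytes) -> str:
    # The caller depends only on the task interface, so the model behind it
    # can be swapped without changing this function.
    return model.extract(image_bytes)
```

Under this assumption, switching models means passing a different object to `run_text_extraction`; the input and output types stay the same because they belong to the task.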


Available Tasks

| Task | Input | Output | Status |
|------|-------|--------|--------|
| Text Extraction | Image / PDF | Markdown, HTML | ✅ Ready |
| Layout Analysis | Image | Bounding boxes + labels | ✅ Ready |
| OCR | Image | Text + coordinates | ✅ Ready |
| Table Extraction | Table image | Structured table data | ✅ Ready |
| Reading Order | Layout + OCR | Ordered elements | ✅ Ready |
| Structured Extraction | Image + Schema | Typed Pydantic objects | ✅ Ready |
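For Structured Extraction, the schema is a Pydantic model describing the fields you expect. A minimal sketch of what such a schema could look like follows. The `Invoice` and `LineItem` classes and the `raw` dictionary are made up for illustration, and the extraction call itself is not shown because its exact signature is not given here.

```python
from typing import List

from pydantic import BaseModel


class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float


class Invoice(BaseModel):
    invoice_number: str
    total: float
    items: List[LineItem]


# Made-up stand-in for the fields a model might read off one invoice image.
raw = {
    "invoice_number": "INV-001",
    "total": "59.5",  # a string on purpose: Pydantic coerces it to float
    "items": [{"description": "Widget", "quantity": 2, "unit_price": 29.75}],
}

# Validation turns loosely typed data into typed objects, or raises
# pydantic.ValidationError if a field is missing or has the wrong type.
invoice = Invoice(**raw)
```

After validation, `invoice.total` is a `float` and each entry in `invoice.items` is a `LineItem`, which is what "typed Pydantic objects" refers to in the table.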

Choosing a Task

"I want readable text from a PDF" → Text Extraction

"I need to know where tables and figures are" → Layout Analysis

"I need word positions for downstream processing" → OCR

"I want structured data from a table" → Table Extraction

"I need elements in reading order" → Reading Order

"I want typed data from invoices/forms" → Structured Extraction


Upcoming Tasks

| Task | Description | Status |
|------|-------------|--------|
| Math Recognition | LaTeX from equations | 🔜 Soon |
| Chart Understanding | Data extraction from charts | 🔜 Planned |
| Image Captioning | Caption figures and images | 🔜 Planned |

See the Roadmap for full tracking.