Tasks

Tasks define what you want to extract. Models define how.
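One way to picture this split is as an interface and its implementations. The sketch below is illustrative only: `TextResult`, `TextExtractor`, `EchoExtractor`, and `run` are hypothetical names invented for this example, not this library's API. The task fixes the input and output types, and any model that satisfies the interface can be swapped in.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class TextResult:
    """Hypothetical output of a text extraction task."""
    content: str
    format: str  # e.g. "markdown" or "html"


class TextExtractor(Protocol):
    """The task: fixes input and output, says nothing about the model."""

    def extract(self, image: bytes, output_format: str = "markdown") -> TextResult:
        ...


class EchoExtractor:
    """A stand-in 'model' (hypothetical): one possible way to fulfil the task."""

    def extract(self, image: bytes, output_format: str = "markdown") -> TextResult:
        # A real model would run inference here; this one only reports input size.
        return TextResult(content=f"{len(image)} bytes received", format=output_format)


def run(extractor: TextExtractor, image: bytes) -> TextResult:
    # The caller depends only on the task interface, so models are interchangeable.
    return extractor.extract(image)


result = run(EchoExtractor(), b"\x89PNG")
print(result)
```

Because `run` only relies on the `extract` signature, replacing `EchoExtractor` with another class that has the same method would not require any change to the calling code.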

Available Tasks

Task	Input	Output	Status
Text Extraction	Image / PDF	Markdown, HTML	✅ Ready
Layout Analysis	Image	Bounding boxes + labels	✅ Ready
OCR	Image	Text + coordinates	✅ Ready
Table Extraction	Table image	Structured table data	✅ Ready
Reading Order	Layout + OCR	Ordered elements	✅ Ready
Structured Extraction	Image + Schema	Typed Pydantic objects	✅ Ready
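
The table lists "Layout + OCR" as the input to Reading Order, which suggests that the outputs of those two tasks can be combined. The following is a rough sketch of that idea, not the library's implementation: `Region`, `Word`, and `order_elements` are hypothetical names, and the ordering rule (top to bottom, then left to right) is a deliberately naive heuristic chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical layout result: a label plus an (x0, y0, x1, y1) box."""
    label: str
    bbox: tuple[float, float, float, float]
    words: list[str] = field(default_factory=list)


@dataclass
class Word:
    """Hypothetical OCR result: text plus its (x0, y0, x1, y1) box."""
    text: str
    bbox: tuple[float, float, float, float]


def contains(bbox, x, y):
    """Return True if point (x, y) lies inside the box."""
    x0, y0, x1, y1 = bbox
    return x0 <= x <= x1 and y0 <= y <= y1


def order_elements(regions, words):
    """Attach OCR words to layout regions, then sort the regions.

    Note: this fills in `region.words` on the regions passed in.
    """
    # Visit words top to bottom, left to right, so each region's text stays in order.
    for word in sorted(words, key=lambda w: (w.bbox[1], w.bbox[0])):
        cx = (word.bbox[0] + word.bbox[2]) / 2
        cy = (word.bbox[1] + word.bbox[3]) / 2
        # Assign the word to the first region that contains its center point.
        for region in regions:
            if contains(region.bbox, cx, cy):
                region.words.append(word.text)
                break
    # Naive reading order: top edge first, then left edge.
    return sorted(regions, key=lambda r: (r.bbox[1], r.bbox[0]))


regions = [
    Region("paragraph", (0, 50, 200, 100)),
    Region("title", (0, 0, 200, 40)),
]
words = [
    Word("world", (60, 60, 100, 80)),
    Word("Tasks", (10, 10, 60, 30)),
    Word("Hello", (10, 60, 50, 80)),
]
ordered = order_elements(regions, words)
for region in ordered:
    print(region.label, region.words)
```

A real reading-order model has to handle multi-column pages, multi-line paragraphs, and overlapping regions, none of which this heuristic attempts.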

"I want readable text from a PDF" → Text Extraction

"I need to know where tables and figures are" → Layout Analysis

"I need word positions for downstream processing" → OCR

"I want structured data from a table" → Table Extraction

"I need elements in reading order" → Reading Order

"I want typed data from invoices/forms" → Structured Extraction

See the Roadmap for the full status of each task.