
Usage

Everything you need to use OmniDocs in your projects.


Tasks & Models

Text Extraction

Convert documents to Markdown/HTML.

| Model | Speed | Backends |
|---|---|---|
| Qwen | 2-3s/page | PyTorch, VLLM, MLX, API |
| DotsOCR | 3-5s/page | PyTorch, VLLM, API |

Layout Analysis

Detect structure (titles, tables, figures).

| Model | Speed | Labels |
|---|---|---|
| DocLayoutYOLO | 0.1-0.2s/page | Fixed (11) |
| RT-DETR | 0.3-0.5s/page | Fixed (11) |
| Qwen Layout | 2-3s/page | Custom |

OCR

Extract text with coordinates.

| Model | Speed | Languages |
|---|---|---|
| Tesseract | 0.5-1s/page | 100+ |
| EasyOCR | 1-2s/page | 80+ |
| PaddleOCR | 0.5-1s/page | 80+ |
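
The per-page speeds listed above can be turned into a rough time budget for a batch job. The sketch below is plain Python and does not call OmniDocs. The `SPEEDS` table and the `estimate_seconds` helper are hypothetical names introduced here for illustration. It assumes each model runs once per page, one after another, and that the listed ranges hold on your hardware, which may not be the case.

```python
# Per-page latency ranges in seconds, copied from the tables above.
# SPEEDS and estimate_seconds are illustrative names, not part of OmniDocs.
SPEEDS = {
    "Qwen": (2.0, 3.0),
    "DotsOCR": (3.0, 5.0),
    "DocLayoutYOLO": (0.1, 0.2),
    "RT-DETR": (0.3, 0.5),
    "Qwen Layout": (2.0, 3.0),
    "Tesseract": (0.5, 1.0),
    "EasyOCR": (1.0, 2.0),
    "PaddleOCR": (0.5, 1.0),
}


def estimate_seconds(models, pages):
    """Return (low, high) total seconds for running each model once per page.

    Assumes sequential execution with no batching or parallelism.
    """
    low = sum(SPEEDS[name][0] for name in models) * pages
    high = sum(SPEEDS[name][1] for name in models) * pages
    return low, high


if __name__ == "__main__":
    # Example: layout detection followed by OCR on a 200-page document.
    low, high = estimate_seconds(["DocLayoutYOLO", "Tesseract"], pages=200)
    print(f"Estimated time: {low:.0f}-{high:.0f} s")
```

An estimate like this is mainly useful for comparing pipelines: swapping `DocLayoutYOLO` for `Qwen Layout` in the example raises the per-page cost by roughly an order of magnitude, which may or may not be worth it for custom labels.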

Workflows


Upcoming

Tasks: Table Extraction, Math Recognition, Chart Understanding

Models: Chandra, LightOnOCR-2, MinerU, SuryaOCR, SuryaLayout

See Roadmap for full tracking.