Models

All supported models, with their typical speed, available backends, and current status.

Available Models

Text extraction model	Speed	Backends	Status
MinerU VL	3-6s/page	PyTorch, VLLM, MLX, API	✅ Ready
Qwen	2-3s/page	PyTorch, VLLM, MLX, API	✅ Ready
DotsOCR	3-5s/page	PyTorch, VLLM, API	✅ Ready
Nanonets OCR2	2-4s/page	PyTorch, VLLM, MLX	✅ Ready

Layout model	Speed	Backends	Status
MinerU VL	3-6s/page	PyTorch, VLLM, MLX, API	✅ Ready
DocLayoutYOLO	0.1-0.2s/page	PyTorch	✅ Ready
RT-DETR	0.3-0.5s/page	PyTorch	✅ Ready
Qwen Layout	2-3s/page	PyTorch, VLLM, MLX, API	✅ Ready

OCR model	Speed	Backends	Status
Tesseract	0.5-1s/page	CPU	✅ Ready
EasyOCR	1-2s/page	PyTorch	✅ Ready
PaddleOCR	0.5-1s/page	PaddlePaddle	✅ Ready

Table extraction model	Speed	Backends	Status
TableFormer	0.5-1s/table	PyTorch	✅ Ready

Reading order model	Speed	Backends	Status
Rule-based	<0.1s/page	CPU	✅ Ready

Backend	Models
PyTorch	MinerU VL, Qwen, DotsOCR, Nanonets, DocLayoutYOLO, RT-DETR, EasyOCR, TableFormer
VLLM	MinerU VL, Qwen, DotsOCR, Nanonets
MLX	MinerU VL, Qwen, Nanonets
API	MinerU VL, Qwen, DotsOCR
CPU	Tesseract, PaddleOCR, Rule-based Reading Order
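
The backend table above can also be expressed as a small lookup structure, which makes it easy to answer questions such as "which backends list this model?". The following is a minimal sketch: the names `BACKEND_MODELS`, `backends_for`, and `models_on_all` are hypothetical and not part of any library API, and the data is simply transcribed from the table.

```python
# Backend -> models, transcribed from the backend table above.
# All names here are illustrative assumptions, not a real library API.
BACKEND_MODELS = {
    "PyTorch": [
        "MinerU VL", "Qwen", "DotsOCR", "Nanonets",
        "DocLayoutYOLO", "RT-DETR", "EasyOCR", "TableFormer",
    ],
    "VLLM": ["MinerU VL", "Qwen", "DotsOCR", "Nanonets"],
    "MLX": ["MinerU VL", "Qwen", "Nanonets"],
    "API": ["MinerU VL", "Qwen", "DotsOCR"],
    "CPU": ["Tesseract", "PaddleOCR", "Rule-based Reading Order"],
}


def backends_for(model):
    """Return the backends that list `model`, in table order."""
    return [
        backend
        for backend, models in BACKEND_MODELS.items()
        if model in models
    ]


def models_on_all(*backends):
    """Return the models listed under every one of the given backends."""
    if not backends:
        return []
    first, rest = backends[0], backends[1:]
    return [
        model
        for model in BACKEND_MODELS[first]
        if all(model in BACKEND_MODELS[b] for b in rest)
    ]
```

For example, `backends_for("DotsOCR")` yields the PyTorch, VLLM, and API entries, and `models_on_all("VLLM", "MLX")` narrows the list to models that appear under both backends.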

Model	Parameters	Description	Status
Granite Docling	258M	Edge deployment, fast inference	🔜 Scripts ready
Chandra	9B	High accuracy text extraction	🔜 Planned

Model	Description	Status
SuryaLayout	Modern layout detection	🔜 Planned

Model	Description	Status
SuryaOCR	Modern multilingual OCR	🔜 Planned

Task	Models	Status
Math Recognition	UniMERNet, Qwen	🔜 Planned
Structured Output	VLM (GPT-4V, Gemini)	🔜 Planned

See the Roadmap for full status tracking.