OmniDocs

adithya-s-k/OmniDocs

Usage

Everything you need to use OmniDocs in your projects.

Tasks & Models

Text Extraction

Convert documents to Markdown/HTML.

Model	Speed	Backends
Qwen	2-3s/page	PyTorch, VLLM, MLX, API
DotsOCR	3-5s/page	PyTorch, VLLM, API

Layout Analysis

Detect document structure (titles, tables, figures).

Model	Speed	Labels
DocLayoutYOLO	0.1-0.2s/page	Fixed (11)
RT-DETR	0.3-0.5s/page	Fixed (11)
Qwen Layout	2-3s/page	Custom

OCR

Extract text with coordinates.

Model	Speed	Languages
Tesseract	0.5-1s/page	100+
EasyOCR	1-2s/page	80+
PaddleOCR	0.5-1s/page	80+
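
The per-page speeds in the tables above can be turned into rough time budgets for a whole document. The following is a minimal sketch in plain Python that does not call OmniDocs at all: the `SPEEDS` dictionary, the task keys, and the two helper functions are hypothetical names introduced here for illustration, and the numbers are simply copied from the tables. Real throughput will depend on hardware, backend, and page content.

```python
from typing import Dict, Tuple

# Per-page latency ranges in seconds (best case, worst case),
# copied from the tables above. The dictionary layout and the
# task keys are assumptions made for this sketch.
SPEEDS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "text_extraction": {
        "Qwen": (2.0, 3.0),
        "DotsOCR": (3.0, 5.0),
    },
    "layout_analysis": {
        "DocLayoutYOLO": (0.1, 0.2),
        "RT-DETR": (0.3, 0.5),
        "Qwen Layout": (2.0, 3.0),
    },
    "ocr": {
        "Tesseract": (0.5, 1.0),
        "EasyOCR": (1.0, 2.0),
        "PaddleOCR": (0.5, 1.0),
    },
}


def estimate_seconds(task: str, model: str, pages: int) -> Tuple[float, float]:
    """Return a (best case, worst case) time estimate in seconds."""
    low, high = SPEEDS[task][model]
    return (low * pages, high * pages)


def fastest_model(task: str) -> str:
    """Return the model with the lowest listed latency for a task.

    Models are ranked by best-case latency, then by worst-case latency.
    Ties go to the model listed first.
    """
    return min(SPEEDS[task], key=lambda name: SPEEDS[task][name])


if __name__ == "__main__":
    # A 100-page document through Qwen text extraction.
    print(estimate_seconds("text_extraction", "Qwen", 100))
    print(fastest_model("layout_analysis"))
```

For a 100-page document, the Qwen text extraction row works out to roughly 200 to 300 seconds when pages are processed one after another.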

Workflows

Batch Processing - Process multiple documents
Deployment - Deploy on Modal GPUs
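
As a generic illustration of what processing multiple documents can look like, here is a minimal sketch built only on the Python standard library. It is not the OmniDocs batch API: `process_batch` is a hypothetical helper, and `process_document` is a placeholder for whatever per-document call you use (for example, a text extraction step). The sketch collects failures separately so that one bad file does not stop the rest of the batch.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional, Tuple


def process_batch(
    paths: Iterable[Path],
    process_document: Callable[[Path], str],
    max_workers: int = 4,
) -> Tuple[Dict[Path, str], Dict[Path, Exception]]:
    """Run process_document over many paths and collect results and errors.

    process_document is a caller-supplied placeholder, not an OmniDocs
    function. Results are gathered in input order.
    """

    def _run(path: Path) -> Tuple[Path, Optional[str], Optional[Exception]]:
        # Catch per-document failures so the batch keeps going.
        try:
            return path, process_document(path), None
        except Exception as exc:
            return path, None, exc

    results: Dict[Path, str] = {}
    errors: Dict[Path, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for path, output, exc in pool.map(_run, paths):
            if exc is not None:
                errors[path] = exc
            else:
                results[path] = output
    return results, errors
```

Threads suit this sketch when the per-document call spends its time waiting on an API or on a GPU backend. For CPU-bound work in pure Python, a process pool may be a better fit.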

Upcoming

Tasks: Table Extraction, Math Recognition, Chart Understanding

Models: Chandra, LightOnOCR-2, MinerU, SuryaOCR, SuryaLayout

See Roadmap for full tracking.