# Nanonets OCR2
Nanonets OCR2-3B is a vision-language model optimized for document text extraction, with excellent accuracy across diverse document types.
## Overview

| Property | Value |
|----------|-------|
| Model | `nanonets/Nanonets-OCR-s` |
| Parameters | 3B |
| Task | Text Extraction |
| Backends | PyTorch, VLLM, MLX |
| License | Apache 2.0 |
## Installation

```bash
# PyTorch backend
pip install omnidocs[pytorch]

# VLLM backend (high throughput)
pip install omnidocs[vllm]

# MLX backend (Apple Silicon)
pip install omnidocs[mlx]
```
## Quick Start

### PyTorch Backend

```python
from PIL import Image

from omnidocs.tasks.text_extraction import NanonetsTextExtractor
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextPyTorchConfig

extractor = NanonetsTextExtractor(
    backend=NanonetsTextPyTorchConfig(device="cuda")
)

image = Image.open("document.png")  # page image to extract text from
result = extractor.extract(image, output_format="markdown")
print(result.content)
```
### VLLM Backend (High Throughput)

```python
from omnidocs.tasks.text_extraction import NanonetsTextExtractor
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextVLLMConfig

extractor = NanonetsTextExtractor(
    backend=NanonetsTextVLLMConfig(
        gpu_memory_utilization=0.85,
        max_model_len=8192,
    )
)

result = extractor.extract(image)
```
### MLX Backend (Apple Silicon)

```python
from omnidocs.tasks.text_extraction import NanonetsTextExtractor
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextMLXConfig

extractor = NanonetsTextExtractor(
    backend=NanonetsTextMLXConfig()
)

result = extractor.extract(image)
```
## Configuration

### PyTorch Config

```python
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextPyTorchConfig

config = NanonetsTextPyTorchConfig(
    model="nanonets/Nanonets-OCR-s",            # Model ID
    device="cuda",                              # "cuda", "cpu", or "mps"
    torch_dtype="bfloat16",                     # "float16", "bfloat16", "float32"
    attn_implementation="flash_attention_2",    # or "sdpa", "eager"
)
```
### VLLM Config

```python
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextVLLMConfig

config = NanonetsTextVLLMConfig(
    model="nanonets/Nanonets-OCR-s",
    gpu_memory_utilization=0.85,    # GPU memory fraction
    max_model_len=8192,             # Max sequence length
    tensor_parallel_size=1,         # Multi-GPU parallelism
)
```
### MLX Config

```python
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextMLXConfig

config = NanonetsTextMLXConfig(
    model="nanonets/Nanonets-OCR-s",
    max_tokens=4096,
)
```
## Output

```python
result = extractor.extract(image, output_format="markdown")

# Access content
print(result.content)        # Extracted Markdown text
print(result.model_name)     # "nanonets/Nanonets-OCR-s"
print(result.output_format)  # "markdown"
```
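As a follow-up, here is a minimal sketch of persisting the extracted Markdown to disk. It relies only on the `result.content` attribute shown above; the `document.md` filename is illustrative.

```python
from pathlib import Path

result = extractor.extract(image, output_format="markdown")

# Write the extracted Markdown to a file (filename is illustrative)
Path("document.md").write_text(result.content, encoding="utf-8")
```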
## Performance

| Backend | Device | Load Time | Inference Time |
|---------|--------|-----------|----------------|
| PyTorch | L4 GPU | ~44s | ~6.3s |
| VLLM | L4 GPU | ~194s | ~8.4s |
| MLX | M1/M2/M3 | ~8s | ~12s |

Times measured on a single-page document with default settings.
Comparison with Other Models
Model
Speed
Accuracy
Memory
Nanonets OCR2
Fast
High
6-8 GB
Qwen3-VL
Medium
High
8-16 GB
DotsOCR
Medium
High
6-8 GB
## Use Cases

- Document digitization - Convert scanned documents to editable text
- Invoice processing - Extract text from invoices and receipts
- Form processing - Extract text from forms and applications
- OCR pipelines - High-throughput batch processing (see the sketch below)
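The batch-processing case can look roughly like the sketch below. It uses only the `NanonetsTextExtractor` API shown in the Quick Start and assumes, as there, that `extract` accepts a PIL image; the input/output directories and the `*.png` glob are illustrative, not part of the omnidocs API.

```python
from pathlib import Path

from PIL import Image

from omnidocs.tasks.text_extraction import NanonetsTextExtractor
from omnidocs.tasks.text_extraction.nanonets import NanonetsTextVLLMConfig

# VLLM backend for higher throughput across many documents (see Tips)
extractor = NanonetsTextExtractor(
    backend=NanonetsTextVLLMConfig(gpu_memory_utilization=0.85, max_model_len=8192)
)

in_dir = Path("scans")       # illustrative input directory of page images
out_dir = Path("extracted")  # illustrative output directory
out_dir.mkdir(exist_ok=True)

for path in sorted(in_dir.glob("*.png")):
    result = extractor.extract(Image.open(path), output_format="markdown")
    (out_dir / f"{path.stem}.md").write_text(result.content, encoding="utf-8")
```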
## Tips

- Use VLLM for batch processing - Higher throughput for multiple documents
- Use MLX on Mac - Native performance on Apple Silicon
- Set `output_format` - Use `"markdown"` for formatted output, `"text"` for plain text (see the sketch below)
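To illustrate the last tip, a minimal sketch of switching between the two documented output formats, assuming `extractor` and `image` are set up as in the Quick Start:

```python
# "markdown": formatted output
md_result = extractor.extract(image, output_format="markdown")

# "text": plain text output
txt_result = extractor.extract(image, output_format="text")

print(md_result.output_format)   # "markdown"
print(txt_result.output_format)  # "text"
```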