Overview
Granite Docling text extraction with multi-backend support.
GraniteDoclingTextAPIConfig
Bases: BaseModel
Configuration for Granite Docling text extraction via API.
Uses litellm for provider-agnostic API access. Supports OpenRouter, Gemini, Azure, OpenAI, and any other litellm-compatible provider.
API keys can be passed directly or read from environment variables.
Example
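A minimal sketch of the key handling described above, assuming a key passed directly takes precedence over the environment. The helper `resolve_api_key` and the variable name `DEMO_API_KEY` are hypothetical and are not part of omnidocs or litellm; the actual config may resolve keys differently.

```python
import os
from typing import Optional


def resolve_api_key(explicit_key: Optional[str], env_var: str) -> Optional[str]:
    """Hypothetical helper: prefer an explicitly passed key, else read the environment."""
    if explicit_key:
        return explicit_key
    # Returns None when the variable is unset, so callers can fail with a clear message.
    return os.environ.get(env_var)


# Assumed variable name, for illustration only.
os.environ["DEMO_API_KEY"] = "key-from-env"
print(resolve_api_key(None, "DEMO_API_KEY"))
print(resolve_api_key("key-passed-directly", "DEMO_API_KEY"))
```

Reading the key from the environment keeps secrets out of source files, while the explicit argument is convenient for tests and short scripts.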
GraniteDoclingTextExtractor
Bases: BaseTextExtractor
Granite Docling text extractor supporting PyTorch, VLLM, MLX, and API backends.
Granite Docling is IBM's compact vision-language model optimized for document conversion. It outputs the DocTags format, which is converted to Markdown using the docling_core library.
Example
```python
from omnidocs.tasks.text_extraction.granitedocling import (
    GraniteDoclingTextExtractor,
    GraniteDoclingTextPyTorchConfig,
)

# Hypothetical input path; a PIL Image or numpy array is also accepted.
image = "page.png"

config = GraniteDoclingTextPyTorchConfig(device="cuda")
extractor = GraniteDoclingTextExtractor(backend=config)
result = extractor.extract(image, output_format="markdown")
print(result.content)
```
Initialize Granite Docling extractor with backend configuration.
| PARAMETER | DESCRIPTION |
|---|---|
| backend | Backend configuration (PyTorch, VLLM, MLX, or API config) |
Source code in omnidocs/tasks/text_extraction/granitedocling/extractor.py
extract
extract(
image: Union[Image, ndarray, str, Path],
output_format: Literal["html", "markdown"] = "markdown",
) -> TextOutput
Extract text from an image using Granite Docling.
| PARAMETER | DESCRIPTION |
|---|---|
| image | Input image (PIL Image, numpy array, or file path). TYPE: Union[Image, ndarray, str, Path] |
| output_format | Output format ("markdown" or "html"). TYPE: Literal["html", "markdown"], DEFAULT: "markdown" |
| RETURNS | DESCRIPTION |
|---|---|
| TextOutput | TextOutput with extracted content |
Source code in omnidocs/tasks/text_extraction/granitedocling/extractor.py
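As an illustration of the `extract` signature above, the following sketch writes the returned `content` to a file whose suffix matches `output_format`. The helper `extract_to_file`, the `SUFFIXES` mapping, and `FakeExtractor` are hypothetical and not part of omnidocs; the fake stands in for a real extractor so the sketch runs without loading a model.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace

# Assumed mapping from output_format to a file suffix (not defined by omnidocs).
SUFFIXES = {"markdown": ".md", "html": ".html"}


def extract_to_file(extractor, image, out_stem, output_format="markdown"):
    """Hypothetical helper: run extract() and write result.content to disk."""
    if output_format not in SUFFIXES:
        raise ValueError(f"output_format must be one of {sorted(SUFFIXES)}")
    result = extractor.extract(image, output_format=output_format)
    out_path = Path(out_stem).with_suffix(SUFFIXES[output_format])
    out_path.write_text(result.content, encoding="utf-8")
    return out_path


class FakeExtractor:
    """Stand-in so the sketch runs without a model; mimics extract() only."""

    def extract(self, image, output_format="markdown"):
        body = "# Title" if output_format == "markdown" else "<h1>Title</h1>"
        return SimpleNamespace(content=body)


with tempfile.TemporaryDirectory() as tmp:
    path = extract_to_file(FakeExtractor(), "page.png", Path(tmp) / "page", "html")
    print(path.name, path.read_text(encoding="utf-8"))
```

With a real `GraniteDoclingTextExtractor`, the same helper would be called with the extractor instance in place of `FakeExtractor()`.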
GraniteDoclingTextMLXConfig
Bases: BaseModel
Configuration for Granite Docling text extraction with MLX backend.
This backend is optimized for Apple Silicon Macs (M1/M2/M3/M4). Uses the MLX-optimized model variant.
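Because the MLX backend targets Apple Silicon, a caller may want to check the platform before choosing it. The sketch below uses only the standard library; `is_apple_silicon` and `pick_backend_name` are hypothetical helpers, and falling back to PyTorch is an assumption rather than omnidocs behavior.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True on macOS running natively on an arm64 (M-series) CPU."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"


def pick_backend_name(system=None, machine=None):
    """Hypothetical selector: prefer MLX on Apple Silicon, else PyTorch."""
    return "mlx" if is_apple_silicon(system, machine) else "pytorch"


print(pick_backend_name())
```

Note that a Python interpreter running under Rosetta reports `x86_64`, so this check would select the fallback in that case.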
GraniteDoclingTextPyTorchConfig
Bases: BaseModel
Configuration for Granite Docling text extraction with PyTorch backend.
GraniteDoclingTextVLLMConfig
Bases: BaseModel
Configuration for Granite Docling text extraction with VLLM backend.
IMPORTANT: This config uses revision="untied" by default, which is required for VLLM compatibility with Granite Docling's tied weights.
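To illustrate what the note above refers to, here is a conceptual sketch in plain Python. It assumes "tied weights" carries its usual meaning, namely that the input embedding and the output head share one parameter matrix, and that an untied checkpoint stores them separately. The entry names and the `untie` helper are illustrative only and do not reflect the model's actual checkpoint layout.

```python
import copy

# One shared matrix referenced under two names: the "tied" layout.
shared = [[0.1, 0.2], [0.3, 0.4]]
tied = {"embed_tokens": shared, "lm_head": shared}


def untie(state):
    """Return a copy in which every entry owns its own storage."""
    return {name: copy.deepcopy(weights) for name, weights in state.items()}


untied = untie(tied)

# Tied entries are the same object; untied entries are equal but separate.
print(tied["lm_head"] is tied["embed_tokens"])
print(untied["lm_head"] is untied["embed_tokens"])
print(untied["lm_head"] == untied["embed_tokens"])
```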
api
API backend configuration for Granite Docling text extraction.
Uses litellm for provider-agnostic inference (OpenRouter, Gemini, Azure, etc.).
extractor
Granite Docling text extractor with multi-backend support.
mlx
MLX backend configuration for Granite Docling text extraction (Apple Silicon).
pytorch
PyTorch backend configuration for Granite Docling text extraction.
vllm
VLLM backend configuration for Granite Docling text extraction.