Extractor
Granite Docling text extractor with multi-backend support.
GraniteDoclingTextExtractor
Bases: BaseTextExtractor
Granite Docling text extractor supporting PyTorch, VLLM, MLX, and API backends.
Granite Docling is IBM's compact vision-language model optimized for document conversion. It outputs the DocTags format, which is converted to Markdown using the docling_core library.
Example
    from omnidocs.tasks.text_extraction.granitedocling import (
        GraniteDoclingTextExtractor,
        GraniteDoclingTextPyTorchConfig,
    )

    config = GraniteDoclingTextPyTorchConfig(device="cuda")
    extractor = GraniteDoclingTextExtractor(backend=config)
    # image: a PIL Image, numpy array, or file path
    result = extractor.extract(image, output_format="markdown")
    print(result.content)
Initialize the Granite Docling extractor with a backend configuration.
| PARAMETER | DESCRIPTION |
|---|---|
| backend | Backend configuration (PyTorch, VLLM, MLX, or API config) |
Source code in omnidocs/tasks/text_extraction/granitedocling/extractor.py
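
The constructor takes a single `backend` object, so the type of that config decides which inference backend is used. The following is a minimal sketch of that dispatch-by-config-type pattern, not the omnidocs implementation: `PyTorchConfig`, `APIConfig`, their fields, and `pick_backend` are hypothetical stand-ins introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical stand-ins for the real config classes; fields are illustrative.
@dataclass
class PyTorchConfig:
    device: str = "cuda"


@dataclass
class APIConfig:
    base_url: str = "http://localhost:8000/v1"


BackendConfig = Union[PyTorchConfig, APIConfig]


def pick_backend(config: BackendConfig) -> str:
    """Return a backend name based on the type of the config object."""
    if isinstance(config, PyTorchConfig):
        return "pytorch"
    if isinstance(config, APIConfig):
        return "api"
    # Anything else is rejected early, before any model is loaded.
    raise TypeError(f"Unsupported backend config: {type(config).__name__}")


print(pick_backend(PyTorchConfig(device="cpu")))
```

Keeping one config class per backend lets each backend expose only the options that apply to it, while the extractor's public interface stays the same.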
extract
extract(
image: Union[Image, ndarray, str, Path],
output_format: Literal["html", "markdown"] = "markdown",
) -> TextOutput
Extract text from an image using Granite Docling.
| PARAMETER | DESCRIPTION |
|---|---|
| image | Input image (PIL Image, numpy array, or file path). TYPE: Union[Image, ndarray, str, Path] |
| output_format | Output format ("markdown" or "html"). TYPE: Literal["html", "markdown"]. DEFAULT: "markdown" |
| RETURNS | DESCRIPTION |
|---|---|
| TextOutput | TextOutput with extracted content |
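
The `output_format` parameter is typed as `Literal["html", "markdown"]`, but Python does not enforce `Literal` hints at runtime. When the format comes from user input or a config file, a caller may want to check it before calling `extract`. The helper below is a hypothetical sketch, not part of omnidocs; whether the extractor validates the value itself is not stated here.

```python
from typing import Literal, get_args

# Mirrors the annotation in the extract() signature.
OutputFormat = Literal["html", "markdown"]


def check_output_format(value: str) -> str:
    """Return value unchanged if it is an allowed format, else raise ValueError."""
    allowed = get_args(OutputFormat)  # the literal values as a tuple
    if value not in allowed:
        raise ValueError(f"output_format must be one of {allowed}, got {value!r}")
    return value


print(check_output_format("markdown"))
```

The checked value can then be passed as `output_format` to `extract`.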