Overview
MinerU VL layout detection module.
MinerU VL can be used for standalone layout detection, returning detected regions with types and bounding boxes.
For full document extraction (layout + content), use MinerUVLTextExtractor from the text_extraction module instead.
Example
from omnidocs.tasks.layout_extraction import MinerUVLLayoutDetector
from omnidocs.tasks.layout_extraction.mineruvl import MinerUVLLayoutPyTorchConfig

detector = MinerUVLLayoutDetector(
    backend=MinerUVLLayoutPyTorchConfig(device="cuda")
)
# `image` may be a PIL Image, a numpy array, or a file path
result = detector.extract(image)
for box in result.bboxes:
    print(f"{box.label}: {box.confidence:.2f}")
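The loop above prints every detection. A common next step is to drop low-confidence regions and summarize what remains. The following is a minimal sketch under stated assumptions: `Box` is a stand-in dataclass that carries only the `label` and `confidence` attributes used in the example, and `keep_confident`, `count_by_label`, and the 0.5 threshold are hypothetical helpers and values, not part of omnidocs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Box:
    """Stand-in for one detected region (hypothetical, not the omnidocs class)."""
    label: str
    confidence: float


def keep_confident(boxes, threshold=0.5):
    """Return the boxes whose confidence is at least `threshold`."""
    return [box for box in boxes if box.confidence >= threshold]


def count_by_label(boxes):
    """Count how many boxes carry each label."""
    return Counter(box.label for box in boxes)


# Made-up detections standing in for `result.bboxes`
boxes = [
    Box("title", 0.98),
    Box("text", 0.91),
    Box("text", 0.42),
    Box("table", 0.77),
]

confident = keep_confident(boxes, threshold=0.5)
counts = count_by_label(confident)
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

With a real detector, the same two helpers could be applied to `result.bboxes` directly, provided each box exposes `label` and `confidence` as in the example.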
MinerUVLLayoutAPIConfig
Bases: BaseModel
API backend config for MinerU VL layout detection.
Example
MinerUVLLayoutDetector
Bases: BaseLayoutExtractor
MinerU VL layout detector.
Uses MinerU2.5-2509-1.2B for document layout detection. Detects 22+ element types including text, titles, tables, equations, figures, code, and more.
For full document extraction (layout + content), use MinerUVLTextExtractor from the text_extraction module instead.
Example
from omnidocs.tasks.layout_extraction import MinerUVLLayoutDetector
from omnidocs.tasks.layout_extraction.mineruvl import MinerUVLLayoutPyTorchConfig

detector = MinerUVLLayoutDetector(
    backend=MinerUVLLayoutPyTorchConfig(device="cuda")
)
# `image` may be a PIL Image, a numpy array, or a file path
result = detector.extract(image)
for box in result.bboxes:
    print(f"{box.label}: {box.confidence:.2f}")
Initialize MinerU VL layout detector.
| PARAMETER | DESCRIPTION |
|---|---|
| backend | Backend configuration (PyTorch, VLLM, MLX, or API) |
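Since the detector accepts one of four backend configurations, a caller usually has to decide which one suits the machine at hand. The sketch below shows one possible heuristic and nothing more: `choose_backend` and its rules (MLX on Apple Silicon, PyTorch or VLLM when a CUDA GPU is present, API otherwise) are assumptions made for illustration, not behavior documented by omnidocs. It returns a plain string; mapping that string to the matching config class is left to the caller.

```python
import platform


def choose_backend(system, machine, has_cuda, batch_serving=False):
    """Pick a backend name from basic platform facts (illustrative heuristic)."""
    # Apple Silicon reports system "Darwin" and machine "arm64"
    if system == "Darwin" and machine == "arm64":
        return "mlx"
    if has_cuda:
        # Assumption: prefer VLLM only when serving many pages in bulk
        return "vllm" if batch_serving else "pytorch"
    # No local accelerator: fall back to a remote API backend
    return "api"


# CUDA availability is passed in rather than probed, to keep the sketch
# free of third-party imports.
print(choose_backend(platform.system(), platform.machine(), has_cuda=False))
```

Passing the platform facts as arguments keeps the decision easy to test and avoids importing a GPU library just to select a configuration.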
Source code in omnidocs/tasks/layout_extraction/mineruvl/detector.py
extract
Detect layout elements in the image.
| PARAMETER | DESCRIPTION |
|---|---|
| image | Input image (PIL Image, numpy array, or file path) |
| RETURNS | DESCRIPTION |
|---|---|
| LayoutOutput | LayoutOutput with standardized labels and bounding boxes |
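To make "standardized labels" concrete, the sketch below shows the general idea of folding model-specific label names into a small fixed vocabulary. Everything here is hypothetical: `STANDARD_LABELS` is a subset based on the element types listed for the detector, while `RAW_TO_STANDARD`, the raw names in it, and `standardize` are invented for illustration. The actual mapping used by the library is not shown in this reference.

```python
# Standard vocabulary: a subset based on the element types named in this page
STANDARD_LABELS = {"text", "title", "table", "equation", "figure", "code"}

# Hypothetical raw-to-standard aliases; the real mapping lives in the library
RAW_TO_STANDARD = {
    "header": "title",
    "paragraph": "text",
    "image": "figure",
    "formula": "equation",
}


def standardize(raw_label, default="unknown"):
    """Map a raw label to the standard vocabulary, or to `default`."""
    key = raw_label.strip().lower()
    if key in STANDARD_LABELS:
        return key
    return RAW_TO_STANDARD.get(key, default)


for raw in ["Paragraph", "table", "Formula", "stamp"]:
    print(f"{raw} -> {standardize(raw)}")
```

Normalizing labels once, at the output boundary, lets downstream code compare against a single vocabulary regardless of which model produced the detections.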
Source code in omnidocs/tasks/layout_extraction/mineruvl/detector.py
MinerUVLLayoutMLXConfig
Bases: BaseModel
MLX backend config for MinerU VL layout detection on Apple Silicon.
Example
MinerUVLLayoutPyTorchConfig
Bases: BaseModel
PyTorch/HuggingFace backend config for MinerU VL layout detection.
Example
MinerUVLLayoutVLLMConfig
Bases: BaseModel
VLLM backend config for MinerU VL layout detection.
Example
api
API backend configuration for MinerU VL layout detection.
MinerUVLLayoutAPIConfig
Bases: BaseModel
API backend config for MinerU VL layout detection.
Example
detector
MinerU VL layout detector.
Uses MinerU2.5-2509-1.2B for document layout detection. Detects 22+ element types including text, titles, tables, equations, figures, and code.
MinerUVLLayoutDetector
Bases: BaseLayoutExtractor
MinerU VL layout detector.
Uses MinerU2.5-2509-1.2B for document layout detection. Detects 22+ element types including text, titles, tables, equations, figures, code, and more.
For full document extraction (layout + content), use MinerUVLTextExtractor from the text_extraction module instead.
Example
from omnidocs.tasks.layout_extraction import MinerUVLLayoutDetector
from omnidocs.tasks.layout_extraction.mineruvl import MinerUVLLayoutPyTorchConfig

detector = MinerUVLLayoutDetector(
    backend=MinerUVLLayoutPyTorchConfig(device="cuda")
)
# `image` may be a PIL Image, a numpy array, or a file path
result = detector.extract(image)
for box in result.bboxes:
    print(f"{box.label}: {box.confidence:.2f}")
Initialize MinerU VL layout detector.
| PARAMETER | DESCRIPTION |
|---|---|
| backend | Backend configuration (PyTorch, VLLM, MLX, or API) |
Source code in omnidocs/tasks/layout_extraction/mineruvl/detector.py
extract
Detect layout elements in the image.
| PARAMETER | DESCRIPTION |
|---|---|
| image | Input image (PIL Image, numpy array, or file path) |
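Because `image` can arrive in three forms, calling code sometimes needs to tell them apart before logging or preprocessing. The sketch below does this by duck typing, so it runs without Pillow or numpy installed. `input_kind` is a hypothetical helper, and the attribute checks are an assumption about how such a dispatch could be written, not a description of what `extract` does internally.

```python
from pathlib import Path
from types import SimpleNamespace


def input_kind(image):
    """Classify an input as "path", "array", or "pil" by duck typing."""
    if isinstance(image, (str, Path)):
        return "path"
    # Check arrays before PIL-style objects: numpy arrays also expose `size`
    if hasattr(image, "shape") and hasattr(image, "dtype"):
        return "array"
    # PIL images expose `size` (width, height) and `mode` (e.g. "RGB")
    if hasattr(image, "size") and hasattr(image, "mode"):
        return "pil"
    raise TypeError(f"unsupported image input: {type(image).__name__}")


# Lightweight stand-ins so the sketch runs without third-party packages
fake_array = SimpleNamespace(shape=(32, 32, 3), dtype="uint8", size=3072)
fake_pil = SimpleNamespace(size=(32, 32), mode="RGB")

print(input_kind("page.png"), input_kind(fake_array), input_kind(fake_pil))
```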
| RETURNS | DESCRIPTION |
|---|---|
| LayoutOutput | LayoutOutput with standardized labels and bounding boxes |
Source code in omnidocs/tasks/layout_extraction/mineruvl/detector.py
mlx
MLX backend configuration for MinerU VL layout detection (Apple Silicon).
MinerUVLLayoutMLXConfig
Bases: BaseModel
MLX backend config for MinerU VL layout detection on Apple Silicon.
Example
pytorch
PyTorch backend configuration for MinerU VL layout detection.
MinerUVLLayoutPyTorchConfig
Bases: BaseModel
PyTorch/HuggingFace backend config for MinerU VL layout detection.