DocLayout YOLO
This example uses the YOLO layout detector (`YOLOLayoutDetector`) to analyze the layout of a document image and shows what a typical run looks like.
In [2]:
from omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo import YOLOLayoutDetector
c:\Users\laxma\OneDrive\Desktop\CogLab\11-07-2025\Omnidocs\new\Lib\site-packages\transformers\utils\hub.py:111: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead. warnings.warn(
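The `FutureWarning` above is raised by Transformers, and it appears to be triggered when the deprecated `TRANSFORMERS_CACHE` environment variable is set. One way to follow the warning's advice is to switch to `HF_HOME` before anything imports `transformers`. This is a minimal sketch; the cache location used here is a hypothetical path, not one taken from this notebook.

```python
import os
from pathlib import Path

# Hypothetical cache location; substitute the directory you actually use.
cache_dir = Path.home() / ".cache" / "huggingface"

# Drop the deprecated variable (if it is set) and point HF_HOME at the cache.
os.environ.pop("TRANSFORMERS_CACHE", None)
os.environ.setdefault("HF_HOME", str(cache_dir))

print(os.environ["HF_HOME"])
```

Run this in the first cell, before the `omnidocs` import. Note that `HF_HOME` names the root Hugging Face directory (the hub cache lives in a `hub` subfolder beneath it by default), so it is not a drop-in equivalent of the old variable's value.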
In [5]:
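# Create the detector with logging enabled
detector = YOLOLayoutDetector(show_log=True)
image_path = "assets/news_paper.png"
# detect() returns the annotated image together with the layout result
annotated_image, layout_output = detector.detect(image_path)
print(f"Detected {len(layout_output.bboxes)} elements")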
INFO [timestamp]2025-07-29 20:19:30[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]doc_layout_yolo.py:53[/] | [info]Initializing YOLOLayoutDetector[/]
INFO [timestamp]2025-07-29 20:19:30[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]doc_layout_yolo.py:58[/] | [info]Using device: cuda[/]
INFO [timestamp]2025-07-29 20:19:30[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]doc_layout_yolo.py:66[/] | [info]Model directory: C:\Users\laxma\OneDrive\Desktop\CogLab\11-07-2025\Omnidocs\omnidocs\models\yolo_layout\juliozhao_DocLayout-YOLO-DocStructBench[/]
INFO [timestamp]2025-07-29 20:19:30[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]doc_layout_yolo.py:141[/] | [info]Loading YOLO model from C:\Users\laxma\OneDrive\Desktop\CogLab\11-07-2025\Omnidocs\omnidocs\models\yolo_layout\juliozhao_DocLayout-YOLO-DocStructBench\doclayout_yolo_docstructbench_imgsz1024.pt[/]
SUCCESS [timestamp]2025-07-29 20:19:30[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]logging.py:196[/] | [success]YOLO model loaded successfully on cuda[/]
SUCCESS [timestamp]2025-07-29 20:19:30[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]logging.py:196[/] | [success]Model initialized successfully[/]
0: 1024x672 17 titles, 59 plain texts, 2 abandons, 7 figures, 1 figure_caption, 274.8ms
Speed: 12.0ms preprocess, 274.8ms inference, 219.8ms postprocess per image at shape (1, 3, 1024, 672)
INFO [timestamp]2025-07-29 20:19:32[/] | [logger.name]omnidocs.tasks.layout_analysis.extractors.doc_layout_yolo[/] | [function]logging.py:150[/] | [info]detect completed in 2.25s[/]
Detected 84 elements
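Beyond the total count, it is often useful to see how many elements of each type were found. The sketch below tallies boxes per label. It assumes each entry in `layout_output.bboxes` exposes a `label` attribute, which this notebook does not confirm, so stand-in objects are used here in place of real detector output.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-in for layout_output.bboxes; the `label` attribute is an assumption.
bboxes = [
    SimpleNamespace(label="title"),
    SimpleNamespace(label="plain text"),
    SimpleNamespace(label="plain text"),
    SimpleNamespace(label="figure"),
]


def count_by_label(boxes):
    """Return a Counter mapping each label to the number of boxes carrying it."""
    # Boxes without a `label` attribute are grouped under "unknown".
    return Counter(getattr(box, "label", "unknown") for box in boxes)


counts = count_by_label(bboxes)
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

If the attribute name matches, calling `count_by_label(layout_output.bboxes)` on the real result would give a per-class breakdown of the detections.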
In [ ]:
output_path = "output/yolo_result.png"
detector.visualize((annotated_image, layout_output), output_path)
print(f"Saved visualization to {output_path}")
Saved visualization & json to output/yolo_result.png
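The cell above writes into an `output/` directory. Whether `visualize` creates missing directories is not shown here, so creating the folder up front is a cheap precaution. A minimal sketch using only the standard library:

```python
from pathlib import Path

output_path = Path("output/yolo_result.png")

# Create the parent directory if it does not exist yet; no error if it does.
output_path.parent.mkdir(parents=True, exist_ok=True)

print(output_path.parent.is_dir())
```

If `visualize` expects a plain string, pass `str(output_path)` instead of the `Path` object.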
In [7]:
from IPython.display import Image, display
# Display in notebook
display(Image(output_path))