SuryaOCR
In [3]:
from omnidocs.tasks.ocr_extraction.extractors.surya_ocr import SuryaOCRExtractor
In [5]:
image_path = "../../../../tests/ocr_extraction/assets/invoice.jpg"
extractor = SuryaOCRExtractor()
result = extractor.extract(image_path)
print(f"'{result.full_text[:200]}...'")
Detecting bboxes: 100%|██████████| 1/1 [00:00<00:00, 1.74it/s]
Recognizing Text: 100%|██████████| 86/86 [00:05<00:00, 14.63it/s]
INFO 2025-07-31 12:49:20 | omnidocs.tasks.ocr_extraction.extractors.surya_ocr | logging.py:150 | extract completed in 6.61s
[2025-07-31 12:49:20,411] [ INFO] logging.py:150 - extract completed in 6.61s
'Invoice Account number: PAT20-32 Need help? Normal business hours are: Invoice number: 6312 Monday - Friday 8:00 am to 5:00 pm LABOR <b>HOURLY RATE</b> <b>HOURS</b> <b>AMOUNT</b> Jamie M. $45.00 82 $3...'
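The preview above still contains inline tags such as `<b>...</b>`, and the `print` call appends `...` even when the text is shorter than 200 characters. The sketch below shows one way to tidy the string before displaying it. Both helpers (`strip_inline_tags` and `preview`) are hypothetical and are not part of omnidocs or Surya; the set of tags handled is an assumption based only on the `<b>` tags visible in this output.

```python
import re

# Hypothetical helpers for post-processing OCR text; not part of omnidocs or Surya.
# Assumption: the text may contain simple inline tags like <b>, <i>, <u>, <sup>, <sub>.
_INLINE_TAG_RE = re.compile(r"</?(?:b|i|u|sup|sub)>")


def strip_inline_tags(text: str) -> str:
    """Remove simple inline formatting tags while keeping the enclosed text."""
    return _INLINE_TAG_RE.sub("", text)


def preview(text: str, limit: int = 200) -> str:
    """Return at most `limit` characters, adding '...' only if the text was cut."""
    return text if len(text) <= limit else text[:limit] + "..."


# Sample taken from the OCR output shown above.
sample = "LABOR <b>HOURLY RATE</b> <b>HOURS</b> <b>AMOUNT</b> Jamie M. $45.00"
clean = strip_inline_tags(sample)
print(preview(clean, limit=30))
```

With the extractor result from the cell above, the same helpers could be applied as `preview(strip_inline_tags(result.full_text))`.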