Base

Base class for reading order predictors.

Defines the abstract interface that all reading order predictors must implement.

BaseReadingOrderPredictor

Bases: ABC

Abstract base class for reading order predictors.

Reading order predictors take layout detection results and, optionally, OCR results, and produce an ordered sequence of document elements.

Example
predictor = RuleBasedReadingOrderPredictor()

# Get layout and OCR
layout = layout_extractor.extract(image)
ocr = ocr_extractor.extract(image)

# Predict reading order
result = predictor.predict(layout, ocr)

# Or with multiple pages (layouts and ocrs are lists with one entry per page)
results = predictor.predict_multi_page(layouts, ocrs)
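
As a rough illustration of what implementing this interface involves, the sketch below subclasses a simplified copy of the base class. The `Box`, `LayoutOutput`, and `ReadingOrderOutput` dataclasses and the `TopToBottomPredictor` class are hypothetical stand-ins invented for this example, not the actual omnidocs types, and the sorting rule is a toy heuristic rather than the library's algorithm.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Box:
    """Hypothetical layout element: a label plus an (x1, y1, x2, y2) bbox."""
    label: str
    bbox: Tuple[float, float, float, float]


@dataclass
class LayoutOutput:
    """Hypothetical stand-in for the library's layout result."""
    boxes: List[Box] = field(default_factory=list)


@dataclass
class ReadingOrderOutput:
    """Hypothetical stand-in for the library's reading order result."""
    ordered: List[Box]
    page_no: int


class BaseReadingOrderPredictor(ABC):
    """Simplified copy of the interface documented on this page."""

    @abstractmethod
    def predict(self, layout: LayoutOutput, ocr: Optional[object] = None,
                page_no: int = 0) -> ReadingOrderOutput:
        ...

    def predict_multi_page(self, layouts, ocrs=None):
        # Same delegation as the documented method: one predict() call per page.
        return [
            self.predict(layout, ocrs[i] if ocrs else None, page_no=i)
            for i, layout in enumerate(layouts)
        ]


class TopToBottomPredictor(BaseReadingOrderPredictor):
    """Toy predictor: sort by top edge, then by left edge. Ignores OCR."""

    def predict(self, layout, ocr=None, page_no=0):
        ordered = sorted(layout.boxes, key=lambda b: (b.bbox[1], b.bbox[0]))
        return ReadingOrderOutput(ordered=ordered, page_no=page_no)


page = LayoutOutput(boxes=[
    Box("footer", (0, 900, 600, 950)),
    Box("title", (0, 10, 600, 60)),
    Box("text", (0, 100, 600, 800)),
])
result = TopToBottomPredictor().predict(page)
labels = [b.label for b in result.ordered]
```

Because `predict` is abstract, only the subclass can be instantiated; `predict_multi_page` is inherited unchanged.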

predict abstractmethod

predict(
    layout: LayoutOutput,
    ocr: Optional[OCROutput] = None,
    page_no: int = 0,
) -> ReadingOrderOutput

Predict reading order for a single page.

PARAMETER DESCRIPTION
layout

Layout detection results with bounding boxes

TYPE: LayoutOutput

ocr

Optional OCR results. If provided, text will be matched to layout elements by bbox overlap.

TYPE: Optional[OCROutput] DEFAULT: None

page_no

Page number (for multi-page documents). predict_multi_page passes the zero-based index of the page in its input list.

TYPE: int DEFAULT: 0

RETURNS DESCRIPTION
ReadingOrderOutput

ReadingOrderOutput with ordered elements and associations

Example
layout = layout_extractor.extract(page_image)
ocr = ocr_extractor.extract(page_image)
order = predictor.predict(layout, ocr, page_no=0)

Source code in omnidocs/tasks/reading_order/base.py
@abstractmethod
def predict(
    self,
    layout: "LayoutOutput",
    ocr: Optional["OCROutput"] = None,
    page_no: int = 0,
) -> ReadingOrderOutput:
    """
    Predict reading order for a single page.

    Args:
        layout: Layout detection results with bounding boxes
        ocr: Optional OCR results. If provided, text will be
             matched to layout elements by bbox overlap.
        page_no: Page number (for multi-page documents)

    Returns:
        ReadingOrderOutput with ordered elements and associations

    Example:
        ```python
        layout = layout_extractor.extract(page_image)
        ocr = ocr_extractor.extract(page_image)
        order = predictor.predict(layout, ocr, page_no=0)
        ```
    """
    pass
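
The `ocr` parameter is described as matching text to layout elements by bbox overlap. One way such matching could work is sketched below, under stated assumptions: boxes are plain `(x1, y1, x2, y2)` tuples, each OCR item is a `(text, bbox)` pair, and `match_text_to_layout` with its `min_overlap` threshold is a hypothetical helper. The actual matching rule is up to each concrete predictor and may differ.

```python
from typing import Dict, List, Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def intersection_area(a: BBox, b: BBox) -> float:
    """Area of the rectangle shared by a and b (0.0 if they do not overlap)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def match_text_to_layout(
    layout_boxes: Sequence[BBox],
    ocr_items: Sequence[Tuple[str, BBox]],
    min_overlap: float = 0.5,  # assumed threshold, not a library default
) -> Dict[int, List[str]]:
    """Assign each OCR item to the layout box covering most of its area."""
    matched: Dict[int, List[str]] = {i: [] for i in range(len(layout_boxes))}
    for text, box in ocr_items:
        area = (box[2] - box[0]) * (box[3] - box[1])
        if area <= 0:
            continue  # skip degenerate OCR boxes
        best_idx, best_ratio = None, 0.0
        for i, layout_box in enumerate(layout_boxes):
            ratio = intersection_area(box, layout_box) / area
            if ratio > best_ratio:
                best_idx, best_ratio = i, ratio
        # Keep the match only if enough of the OCR box lies inside the layout box.
        if best_idx is not None and best_ratio >= min_overlap:
            matched[best_idx].append(text)
    return matched


layout_boxes = [(0, 0, 100, 50), (0, 60, 100, 200)]
ocr_items = [
    ("Title", (10, 10, 60, 40)),
    ("Body", (10, 70, 90, 100)),
    ("stray", (200, 200, 250, 220)),  # overlaps no layout box
]
matched = match_text_to_layout(layout_boxes, ocr_items)
```

Here the ratio is taken relative to the OCR box's own area, so a small word fully inside a large region scores 1.0; intersection-over-union would be an alternative choice.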

predict_multi_page

predict_multi_page(
    layouts: List[LayoutOutput],
    ocrs: Optional[List[OCROutput]] = None,
) -> List[ReadingOrderOutput]

Predict reading order for multiple pages.

PARAMETER DESCRIPTION
layouts

List of layout results, one per page

TYPE: List[LayoutOutput]

ocrs

Optional list of OCR results, one per page, in the same order as layouts. If omitted, every page is predicted without OCR.

TYPE: Optional[List[OCROutput]] DEFAULT: None

RETURNS DESCRIPTION
List[ReadingOrderOutput]

List of ReadingOrderOutput, one per page

Source code in omnidocs/tasks/reading_order/base.py
def predict_multi_page(
    self,
    layouts: List["LayoutOutput"],
    ocrs: Optional[List["OCROutput"]] = None,
) -> List[ReadingOrderOutput]:
    """
    Predict reading order for multiple pages.

    Args:
        layouts: List of layout results, one per page
        ocrs: Optional list of OCR results, one per page

    Returns:
        List of ReadingOrderOutput, one per page
    """
    results = []

    for i, layout in enumerate(layouts):
        ocr = ocrs[i] if ocrs else None
        result = self.predict(layout, ocr, page_no=i)
        results.append(result)

    return results
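
The loop above indexes `ocrs[i]` for every layout, so an `ocrs` list shorter than `layouts` raises `IndexError` partway through, and extra entries in a longer list are ignored. A caller may want to check alignment up front. The helper below is a minimal sketch; `check_page_alignment` is a hypothetical name and is not part of the library.

```python
from typing import Optional, Sequence


def check_page_alignment(layouts: Sequence, ocrs: Optional[Sequence] = None) -> None:
    """Raise ValueError if ocrs is given but does not have one entry per layout.

    An empty or None ocrs is accepted, mirroring the `if ocrs` check in
    predict_multi_page, which treats both as "no OCR".
    """
    if ocrs and len(ocrs) != len(layouts):
        raise ValueError(
            f"expected {len(layouts)} OCR results (one per page), got {len(ocrs)}"
        )
```

A typical call would be `check_page_alignment(layouts, ocrs)` immediately before `predictor.predict_multi_page(layouts, ocrs)`.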