🔢 Math Expression Extraction

This section documents the API for mathematical expression extraction: four extractors (Donut, Nougat, Surya, UniMERNet) that recognize math in documents and return it as LaTeX, plus the output container and base class they share.

Overview

Math expression extraction in OmniDocs focuses on converting mathematical formulas and equations found in documents (e.g., academic papers, textbooks) into a machine-readable format, typically LaTeX. This enables further processing, rendering, or indexing of mathematical content.

Available Extractors

DonutExtractor

NAVER CLOVA Donut model for math/LaTeX extraction.

omnidocs.tasks.math_expression_extraction.extractors.donut.DonutExtractor

DonutExtractor(device: Optional[str] = None, show_log: bool = False, model_name: str = 'naver-clova-ix/donut-base-finetuned-cord-v2', model_path: Optional[Union[str, Path]] = None, **kwargs)

Bases: BaseLatexExtractor

Donut (NAVER CLOVA) based expression extraction implementation.

Initialize Donut Extractor.

extract

extract(input_path: Union[str, Path, Image], **kwargs) -> LatexOutput

Extract LaTeX expressions using Donut.

Usage Example

from omnidocs.tasks.math_expression_extraction.extractors.donut import DonutExtractor

extractor = DonutExtractor()
result = extractor.extract("math_document.pdf")
print(f"Extracted {len(result.expressions)} expression(s): {result.expressions[:3]}")

NougatExtractor

Facebook's Nougat model for LaTeX extraction from academic documents.

omnidocs.tasks.math_expression_extraction.extractors.nougat.NougatExtractor

NougatExtractor(model_type: str = 'small', device: Optional[str] = None, show_log: bool = False, model_path: Optional[str] = None, **kwargs)

Bases: BaseLatexExtractor

Nougat (Neural Optical Understanding for Academic Documents) based expression extraction.

Initialize Nougat Extractor.

extract

extract(input_path: Union[str, Path, Image], **kwargs) -> LatexOutput

Extract LaTeX expressions using Nougat.

Usage Example

from omnidocs.tasks.math_expression_extraction.extractors.nougat import NougatExtractor

extractor = NougatExtractor()
result = extractor.extract("academic_paper.pdf")
print(f"Extracted {len(result.expressions)} expression(s): {result.expressions[:3]}")

SuryaMathExtractor

Surya-based mathematical expression extraction.

omnidocs.tasks.math_expression_extraction.extractors.surya_math.SuryaMathExtractor

SuryaMathExtractor(device: Optional[str] = None, show_log: bool = False, model_path: Optional[Union[str, Path]] = None, **kwargs)

Bases: BaseLatexExtractor

Surya-based mathematical expression extraction implementation.

Initialize Surya Math Extractor.

extract

extract(input_path: Union[str, Path, Image], **kwargs) -> LatexOutput

Extract LaTeX expressions using Surya.

Usage Example

from omnidocs.tasks.math_expression_extraction.extractors.surya_math import SuryaMathExtractor

extractor = SuryaMathExtractor()
result = extractor.extract("math_image.png")
print(f"Extracted {len(result.expressions)} expression(s): {result.expressions[:3]}")

UniMERNetExtractor

Universal Mathematical Expression Recognition Network.

omnidocs.tasks.math_expression_extraction.extractors.unimernet.UniMERNetExtractor

UniMERNetExtractor(model_path: Optional[str] = None, cfg_path: Optional[str] = None, device: Optional[str] = None, show_log: bool = False, **kwargs)

Bases: BaseLatexExtractor

UniMERNet (Universal Mathematical Expression Recognition Network) based expression extraction.

Initialize UniMERNet Extractor.

extract

extract(input_path: Union[str, Path, Image], **kwargs) -> LatexOutput

Extract LaTeX expressions using UniMERNet.

Usage Example

from omnidocs.tasks.math_expression_extraction.extractors.unimernet import UniMERNetExtractor

extractor = UniMERNetExtractor()
result = extractor.extract("math_equation.png")
print(f"Extracted {len(result.expressions)} expression(s): {result.expressions[:3]}")

LatexOutput

The standardized output format for mathematical expression extraction results.

omnidocs.tasks.math_expression_extraction.base.LatexOutput

Bases: BaseModel

Container for extracted LaTeX expressions.

Attributes:

  • expressions (List[str]): List of extracted LaTeX expressions.
  • confidences (Optional[List[float]]): Optional confidence scores for each expression.
  • bboxes (Optional[List[List[float]]]): Optional bounding boxes for each expression.
  • source_img_size (Optional[Tuple[int, int]]): Optional tuple of source image dimensions.
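Since `confidences` and `bboxes` are optional and run per expression, code that consumes the output usually has to guard against `None` before pairing them with `expressions`. A minimal sketch, assuming the lists are index-aligned when present; `iter_expressions` is a hypothetical helper (not part of OmniDocs), and the sample object is a stand-in that only borrows the documented attribute names:

```python
from types import SimpleNamespace
from typing import Iterator, List, Optional, Tuple


def iter_expressions(output) -> Iterator[Tuple[str, Optional[float], Optional[List[float]]]]:
    """Yield (latex, confidence, bbox) triples; a missing optional list yields None."""
    n = len(output.expressions)
    confidences = output.confidences if output.confidences is not None else [None] * n
    bboxes = output.bboxes if output.bboxes is not None else [None] * n
    yield from zip(output.expressions, confidences, bboxes)


# Stand-in with the documented attribute names (not a real LatexOutput instance).
sample = SimpleNamespace(
    expressions=[r"E = mc^2", r"\frac{a}{b}"],
    confidences=[0.98, 0.71],
    bboxes=None,
    source_img_size=(800, 600),
)

for latex, confidence, bbox in iter_expressions(sample):
    print(latex, confidence, bbox)
```

The helper only relies on attribute access, so it should work with any object that exposes these three fields.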

save_json

save_json(output_path: Union[str, Path]) -> None

Save output to JSON file.

to_dict

to_dict() -> Dict

Convert to dictionary representation.
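To illustrate how `to_dict` and `save_json` fit together, here is a dependency-free sketch. `LatexOutputSketch` is a hypothetical dataclass that mirrors the documented field names; the real class is a pydantic `BaseModel`, and the JSON layout shown here is an assumption rather than the library's confirmed format:

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union


@dataclass
class LatexOutputSketch:
    """Stand-in mirroring the documented LatexOutput fields (not the OmniDocs class)."""
    expressions: List[str]
    confidences: Optional[List[float]] = None
    bboxes: Optional[List[List[float]]] = None
    source_img_size: Optional[Tuple[int, int]] = None

    def to_dict(self) -> Dict:
        # Plain dictionary keyed by field name.
        return asdict(self)

    def save_json(self, output_path: Union[str, Path]) -> None:
        # Assumed layout: the to_dict() result serialized as one JSON object.
        Path(output_path).write_text(json.dumps(self.to_dict(), indent=2), encoding="utf-8")


output = LatexOutputSketch(expressions=[r"\int_0^1 x\,dx"], confidences=[0.93])

# Write to a temporary directory and read the file back to check the round trip.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "latex_output.json"
    output.save_json(target)
    reloaded = json.loads(target.read_text(encoding="utf-8"))

print(reloaded["expressions"])
```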

Key Properties

  • expressions (List[MathExpression]): List of detected mathematical expressions.
  • full_text (str): Combined LaTeX string of all expressions.
  • source_file (str): Path to the processed file.

Key Methods

  • save_json(output_path): Save results to a JSON file.

MathExpression Attributes

  • latex (str): The extracted LaTeX string.
  • bbox (List[float]): Bounding box coordinates [x1, y1, x2, y2].
  • confidence (Optional[float]): Confidence score of the extraction.
  • page_number (int): The page number where the expression is found.
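The per-expression attributes listed above lend themselves to simple post-processing, such as dropping low-confidence results or measuring a bounding box. A minimal sketch under stated assumptions: `MathExpressionSketch`, `bbox_area`, and `keep_confident` are hypothetical names introduced for illustration, and the 0.8 threshold is arbitrary:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MathExpressionSketch:
    """Stand-in using the attribute names listed above (not the OmniDocs class)."""
    latex: str
    bbox: List[float]  # [x1, y1, x2, y2]
    confidence: Optional[float]
    page_number: int


def bbox_area(bbox: List[float]) -> float:
    """Area of an [x1, y1, x2, y2] box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = bbox
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def keep_confident(items: List[MathExpressionSketch], threshold: float = 0.8) -> List[MathExpressionSketch]:
    """Keep expressions whose confidence is known and at least `threshold`."""
    return [e for e in items if e.confidence is not None and e.confidence >= threshold]


items = [
    MathExpressionSketch(r"a^2 + b^2 = c^2", [10.0, 20.0, 110.0, 50.0], 0.95, 1),
    MathExpressionSketch(r"\sum_i x_i", [15.0, 80.0, 75.0, 100.0], 0.42, 1),
    MathExpressionSketch(r"\nabla f", [30.0, 40.0, 90.0, 60.0], None, 2),
]

for expr in keep_confident(items):
    print(expr.page_number, expr.latex, bbox_area(expr.bbox))
```

Expressions with no confidence score are excluded here; whether to keep or drop them is a design choice that depends on the extractor in use.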

BaseLatexExtractor

The abstract base class for all mathematical expression extractors.

omnidocs.tasks.math_expression_extraction.base.BaseLatexExtractor

BaseLatexExtractor(device: Optional[str] = None, show_log: bool = False)

Bases: ABC

Base class for LaTeX expression extraction models.

Initialize the LaTeX extractor.

Parameters:

  • device (Optional[str], default: None): Device to run the model on ('cuda' or 'cpu').
  • show_log (bool, default: False): Whether to show detailed logs.

extract abstractmethod

extract(input_path: Union[str, Path, Image], **kwargs) -> LatexOutput

Extract LaTeX expressions from input image.

Parameters:

  • input_path (Union[str, Path, Image], required): Path to input image or image data.
  • **kwargs (default: {}): Additional model-specific parameters.

Returns:

  • LatexOutput: LatexOutput containing extracted expressions.
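Because `extract` is abstract, a concrete extractor has to subclass the base class and implement it. The sketch below shows that pattern with stand-ins so it runs without OmniDocs installed: `BaseExtractorSketch`, `OutputSketch`, and `EchoExtractor` are hypothetical names, and the CPU fallback for `device=None` is an assumption, not documented behavior:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Union


@dataclass
class OutputSketch:
    """Stand-in for the output container; only `expressions` is modeled."""
    expressions: List[str]


class BaseExtractorSketch(ABC):
    """Stand-in for the base class, with the same constructor arguments."""

    def __init__(self, device: Optional[str] = None, show_log: bool = False):
        self.device = device or "cpu"  # assumption: fall back to CPU when unset
        self.show_log = show_log

    @abstractmethod
    def extract(self, input_path: Union[str, Path], **kwargs) -> OutputSketch:
        """Subclasses run their model here and return the output container."""


class EchoExtractor(BaseExtractorSketch):
    """Toy subclass: wraps the file stem in a LaTeX text macro instead of running a model."""

    def extract(self, input_path: Union[str, Path], **kwargs) -> OutputSketch:
        stem = Path(input_path).stem
        if self.show_log:
            print(f"[EchoExtractor] device={self.device} input={input_path}")
        return OutputSketch(expressions=[rf"\text{{{stem}}}"])


result = EchoExtractor(show_log=True).extract("math_equation.png")
print(result.expressions)
```

Instantiating the base class directly raises `TypeError`, which is how Python's `abc` module enforces that every extractor provides its own `extract`.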

preprocess_input

preprocess_input(input_path: Union[str, Path, Image, ndarray]) -> List[Image.Image]

Convert input to list of PIL Images.

Parameters:

  • input_path (Union[str, Path, Image, ndarray], required): Input image path or image data.

Returns:

  • List[Image]: List of PIL Images.