MLX

MLX backend configuration for GLM-OCR text extraction.

GLMOCRMLXConfig

Bases: BaseModel

MLX backend configuration for GLM-OCR.

Uses mlx-vlm for Apple Silicon native inference.
GLM-OCR at 0.9B runs comfortably on any M-series Mac with 8GB+ unified memory.
Requires: mlx, mlx-vlm>=0.3.11

Note: Only works on Apple Silicon Macs. Do NOT use for Modal/cloud deployments.

Available models:
    mlx-community/GLM-OCR-bf16   (default — full precision, 2.21 GB)
    mlx-community/GLM-OCR-6bit   (quantized, smaller)

Example:

```python
config = GLMOCRMLXConfig()  # bf16, default
config = GLMOCRMLXConfig(model="mlx-community/GLM-OCR-6bit")  # quantized
```
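Since the class is a pydantic `BaseModel`, the two documented model choices and the bf16 default can be mirrored in a stdlib-only sketch. This is illustrative, not the package's actual implementation: the class name, field set, and validation behavior here are assumptions, and the real config may carry additional fields and pydantic validators.

```python
from dataclasses import dataclass

# Model ids taken from the "Available models" list above.
KNOWN_MODELS = {
    "mlx-community/GLM-OCR-bf16",  # full precision, 2.21 GB (default)
    "mlx-community/GLM-OCR-6bit",  # quantized, smaller
}


@dataclass
class GLMOCRMLXConfigSketch:
    """Hypothetical stand-in for GLMOCRMLXConfig (the real class is a pydantic BaseModel)."""

    model: str = "mlx-community/GLM-OCR-bf16"  # bf16 default, as documented

    def __post_init__(self) -> None:
        # Reject model ids not in the documented list; the real pydantic
        # class may or may not enforce this.
        if self.model not in KNOWN_MODELS:
            raise ValueError(f"unknown GLM-OCR MLX model: {self.model!r}")
```

Usage mirrors the example above: construct with no arguments for the bf16 default, or pass `model="mlx-community/GLM-OCR-6bit"` for the quantized variant.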