Henrik Jess Nielsen b4c07d3693
2026-06-01 23:40:55 +02:00

kreuzberg-paddle-ocr

PaddleOCR via ONNX Runtime for Kreuzberg - high-performance text detection and recognition using PaddlePaddle's OCR models.

Based on the original paddle-ocr-rs by mg-chao, this vendored version includes improvements for Kreuzberg integration:

  • Workspace Dependency Alignment: Uses Kreuzberg's workspace dependencies for consistency
  • Edition 2024: Updated to the Rust 2024 edition
  • ndarray Compatibility: Aligned with Kreuzberg's ndarray version requirements
  • Integration: Designed to work seamlessly with Kreuzberg's OCR backend system

Features

  • Text detection using DBNet (Differentiable Binarization)
  • Text recognition using CRNN (Convolutional Recurrent Neural Network)
  • Angle detection for rotated text
  • Support for multiple languages via PaddleOCR models
  • ONNX Runtime for efficient CPU inference
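
To make the recognition step concrete: CRNN-style recognizers emit one score vector per horizontal timestep and are commonly decoded with greedy CTC, which takes the best class per step, collapses consecutive repeats, and drops the blank symbol. The following is a minimal sketch of that general technique, not this crate's actual code. The function name `ctc_greedy_decode`, the toy character set, and the choice of index 0 as the blank are assumptions made for illustration.

```rust
/// Index of the CTC blank symbol (assumed to be 0 for this sketch).
const BLANK: usize = 0;

/// Greedy CTC decoding: pick the arg-max class at each timestep, collapse
/// consecutive repeats, then drop blanks. Class index `i` (for `i >= 1`)
/// maps to `charset[i - 1]`.
fn ctc_greedy_decode(scores: &[Vec<f32>], charset: &[char]) -> String {
    let mut out = String::new();
    let mut prev = BLANK;
    for step in scores {
        // Arg-max over the class scores of this timestep.
        let best = step
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i)
            .unwrap_or(BLANK);
        // Emit only when the class is not blank and differs from the previous step.
        if best != BLANK && best != prev {
            if let Some(&c) = charset.get(best - 1) {
                out.push(c);
            }
        }
        prev = best;
    }
    out
}

fn main() {
    // Hypothetical two-character alphabet; real models ship a dictionary file.
    let charset = ['h', 'i'];
    let scores = vec![
        vec![0.1, 0.8, 0.1],   // 'h'
        vec![0.1, 0.8, 0.1],   // 'h' again, collapsed
        vec![0.9, 0.05, 0.05], // blank
        vec![0.1, 0.2, 0.7],   // 'i'
    ];
    println!("{}", ctc_greedy_decode(&scores, &charset));
}
```

A blank between two identical classes is what allows doubled letters: the sequence `a, blank, a` decodes to `aa`, while `a, a` decodes to `a`.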

ONNX Runtime Requirement

This crate requires ONNX Runtime 1.24+ at runtime.

Install it with your platform's package manager or from the official ONNX Runtime release archives.

Usage

This crate is used internally by Kreuzberg when the paddle-ocr feature is enabled:

[dependencies]
kreuzberg = { version = "4.2", features = ["paddle-ocr"] }

Models

PaddleOCR models are automatically downloaded and cached on first use. Supported models include:

  • PP-OCRv5 server detection model
  • PP-OCRv5 per-family recognition models (11 script families)
  • PP-OCRv2 mobile angle classification model
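
The download-and-cache behaviour described above can be sketched with the standard library alone. This is an illustration of the general pattern under stated assumptions, not the crate's implementation: `ensure_cached`, the cache directory, and the file name `det.onnx` are hypothetical, and the `fetch` closure stands in for whatever download mechanism is actually used.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Return the path of `name` inside `cache_dir`, calling `fetch` only when
/// the file is not already present.
fn ensure_cached<F>(cache_dir: &Path, name: &str, fetch: F) -> io::Result<PathBuf>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    let target = cache_dir.join(name);
    if target.is_file() {
        // Cache hit: skip the download entirely.
        return Ok(target);
    }
    fs::create_dir_all(cache_dir)?;
    let bytes = fetch()?;
    // Write to a temporary file first, then rename, so an interrupted
    // download is never mistaken for a complete model.
    let tmp = cache_dir.join(format!("{name}.part"));
    fs::write(&tmp, &bytes)?;
    fs::rename(&tmp, &target)?;
    Ok(target)
}

fn main() -> io::Result<()> {
    // Hypothetical cache location and model name, for demonstration only.
    let dir = std::env::temp_dir().join("paddle-ocr-cache-demo");
    let path = ensure_cached(&dir, "det.onnx", || Ok(b"fake model bytes".to_vec()))?;
    println!("model available at {}", path.display());
    Ok(())
}
```

On the second call with the same name the closure is never invoked, which is why only the first use pays the download cost.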

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

This project is based on paddle-ocr-rs by mg-chao, originally licensed under Apache-2.0. We are grateful for the foundational work that made this integration possible.

The original paddle-ocr-rs provides Rust bindings for PaddlePaddle's OCR models via ONNX Runtime, enabling efficient text detection and recognition without Python dependencies.