| priority |
|---|
| high |
Feature Flag Policy
All features are defined in crates/kreuzberg/Cargo.toml.
ORT-Incompatible Targets (WASM, Android x86_64 emulator)
Only ORT-dependent paths are incompatible. The same paths block both WASM (no native ORT linkage at all) and the x86_64-linux-android emulator triple (no pyke prebuilt; aarch64-linux-android does ship a prebuilt and gets full ORT):
- paddle-ocr: ONNX Runtime + native C++ deps; not WASM-safe; no Android x86_64 prebuilt
- layout-detection: depends on ONNX Runtime layout models; not WASM-safe; no Android x86_64 prebuilt
- embeddings: depends on ONNX Runtime sentence-transformer models; not WASM-safe; no Android x86_64 prebuilt
- auto-rotate: depends on ONNX Runtime orientation classifier; not WASM-safe; no Android x86_64 prebuilt
Pure-Rust type-only companion features expose the public config/result types for the above without pulling in ORT:
- layout-types: LayoutDetectionConfig, TableModel, BBox, DetectionResult, LayoutClass, LayoutDetection, RecognizedTable. layout-detection implies layout-types.
- auto-rotate-types: OrientationResult. auto-rotate implies auto-rotate-types.
- embedding-presets: EmbeddingPreset (already existed; pure-Rust preset metadata).
WASM/Android-safe variants:
- ocr (native) → ocr-wasm (uses tesseract-wasm + safe image deps); Android keeps native ocr
- excel (native) → excel-wasm (drops tokio-runtime); Android keeps native excel
- tree-sitter (native dlopen) → tree-sitter-wasm (statically-linked grammar pack); Android keeps native tree-sitter
- liter-llm: works on WASM via the upstream wasm-http feature; included in no-ort-target
- stopwords: pure-Rust; included in no-ort-target
- keywords: pure-Rust YAKE/RAKE; included in no-ort-target
The no-ort-target aggregate is the shared no-ORT base used by both wasm-target and android-target. wasm-target = no-ort-target + excel-wasm + tree-sitter-wasm + ocr-wasm. android-target = no-ort-target + excel + tree-sitter + ocr + api + mcp.
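The composition above can be modelled as a small feature graph, which makes it easy to check that neither target aggregate pulls in an ORT-dependent feature. The following is a minimal sketch, not code from the crate: the `members`, `expand`, and `is_ort_free` helpers are hypothetical, and the `no-ort-target` member list is deliberately partial, containing only the three features this policy names explicitly.

```rust
use std::collections::BTreeSet;

/// Features this policy lists as ORT-dependent.
const ORT_FEATURES: [&str; 4] = ["paddle-ocr", "layout-detection", "embeddings", "auto-rotate"];

/// Direct members of each aggregate, as stated in the policy text.
/// ASSUMPTION: `no-ort-target` is partial here; only the explicitly named members appear.
fn members(feature: &str) -> &'static [&'static str] {
    match feature {
        "no-ort-target" => &["liter-llm", "stopwords", "keywords"],
        "wasm-target" => &["no-ort-target", "excel-wasm", "tree-sitter-wasm", "ocr-wasm"],
        "android-target" => &["no-ort-target", "excel", "tree-sitter", "ocr", "api", "mcp"],
        _ => &[], // anything else is treated as a leaf feature
    }
}

/// Expand an aggregate into the set of leaf features it enables.
fn expand(feature: &'static str) -> BTreeSet<&'static str> {
    let mut leaves = BTreeSet::new();
    let mut stack = vec![feature];
    while let Some(name) = stack.pop() {
        let children = members(name);
        if children.is_empty() {
            leaves.insert(name);
        } else {
            stack.extend_from_slice(children);
        }
    }
    leaves
}

/// True when the expanded feature set contains no ORT-dependent feature.
fn is_ort_free(feature: &'static str) -> bool {
    expand(feature).iter().all(|leaf| !ORT_FEATURES.contains(leaf))
}
```

Calling `expand("wasm-target")` yields the leaf features only, so the intermediate `no-ort-target` name never appears in the result; `is_ort_free` then reduces the policy rule to a single membership check.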
Experimental (NOT in full)
- pdf-oxide: pure-Rust PDF text extraction; opt-in only, excluded from both full and formats
ORT Variants (Mutually Exclusive)
- ort-bundled: downloads official Microsoft ORT binaries; the default when OCR/ML features are active
- ort-dynamic: loads ORT from the system; use only when a system ORT is guaranteed to be present
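Cargo features are additive, so mutual exclusion has to be enforced in source. One common way to do that is a `compile_error!` guard. The sketch below assumes such a guard would live in the crate root; whether the crate actually contains one is not stated here, and the `ort_variant` helper is hypothetical.

```rust
// Fail the build early if both ORT variants are enabled together.
// ASSUMPTION: this guard is illustrative; the policy only states that the
// two features are mutually exclusive, not how that is enforced.
#[cfg(all(feature = "ort-bundled", feature = "ort-dynamic"))]
compile_error!("features `ort-bundled` and `ort-dynamic` are mutually exclusive; enable only one");

/// Report which ORT variant this build was compiled with (hypothetical helper).
pub fn ort_variant() -> &'static str {
    if cfg!(feature = "ort-bundled") {
        "bundled"
    } else if cfg!(feature = "ort-dynamic") {
        "dynamic"
    } else {
        "none"
    }
}
```

The guard costs nothing at runtime: it is evaluated during compilation, so an invalid combination never produces a binary.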
Platform-Conditional
- kreuzberg-paddle-ocr, hf-hub, pprof: excluded on wasm32
- ureq: rustls on non-Windows; native-tls on Windows
- kreuzberg-ffi and kreuzberg-dart cargo dependencies are target-conditional: cfg(all(target_os = "android", target_arch = "x86_64")) selects android-target; all other targets (including arm64 Android phones) get the full ORT-enabled feature set.
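The target-conditional rule for kreuzberg-ffi and kreuzberg-dart can be restated as a small decision function. This is a sketch for illustration only: in the manifests the selection is made by Cargo through target-specific dependency tables, and the `FeatureSet` enum and both functions below are hypothetical names.

```rust
/// The two outcomes described in the policy (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum FeatureSet {
    /// `android-target`: the no-ORT set used for the x86_64 Android emulator.
    AndroidTarget,
    /// The full ORT-enabled feature set used everywhere else.
    FullOrt,
}

/// Decide the feature set from a target OS and architecture, using the same
/// values that `target_os` and `target_arch` take in `cfg` expressions.
fn feature_set_for(target_os: &str, target_arch: &str) -> FeatureSet {
    match (target_os, target_arch) {
        ("android", "x86_64") => FeatureSet::AndroidTarget,
        // Everything else, including aarch64 Android phones, gets full ORT.
        _ => FeatureSet::FullOrt,
    }
}

/// The same decision for the target currently being compiled,
/// resolved at compile time with the cfg predicate quoted in the policy.
fn feature_set_for_this_build() -> FeatureSet {
    if cfg!(all(target_os = "android", target_arch = "x86_64")) {
        FeatureSet::AndroidTarget
    } else {
        FeatureSet::FullOrt
    }
}
```

For example, `feature_set_for("android", "aarch64")` returns `FeatureSet::FullOrt`, matching the note that arm64 Android phones keep the ORT-enabled set.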
Aggregate Sets
| Feature | Description |
|---|---|
| formats | All document formats + api/mcp/otel/chunking; no OCR, no ML |
| full | formats + ocr + paddle-ocr + layout + embeddings + tree-sitter + liter-llm; excludes pdf-oxide |
| no-ort-target | Pure-Rust base: every capability that does not depend on ONNX Runtime |
| wasm-target | no-ort-target + excel-wasm + tree-sitter-wasm + ocr-wasm |
| android-target | no-ort-target + excel + tree-sitter + ocr + api + mcp (for x86_64-linux-android emulator) |
Build Profiles
- release: LTO thin, codegen-units=1, strip
- profiling: inherits release, retains debug info
- kreuzberg-wasm override: opt-level="z" (size-optimized)
- sevenz-rust2, zip override: opt-level=2 (prevents SIGBUS on macOS ARM64)