# Development Workflow

Everything you need to build, test, and debug Kreuzberg locally. This guide assumes you've already followed the [Contributing Guide](../contributing.md) to fork and clone the repository.

---

## The Task Runner

Kreuzberg uses [Task](https://taskfile.dev/) for all build and test workflows. One command bootstraps everything:

```bash title="Terminal"
task setup
```

That installs all toolchains and dependencies. It is idempotent, so it is safe to re-run at any time.

### The Pattern

Tasks follow `<language>:<action>`, optionally with a variant suffix such as `:dev` or `:ci`. Once you learn this pattern, the command for any task is predictable:

```bash title="Terminal"
task rust:build            # Build the Rust core
task rust:build:dev        # Debug build (faster compile, no optimizations)
task rust:build:release    # Release build (slow compile, fast binary)
task rust:test             # Run Rust tests
task rust:test:ci          # Same tests, with CI-level diagnostics

task python:build          # Build Python bindings via maturin
task python:test           # Run Python test suite

task node:build            # Build Node.js bindings via napi
task node:test             # Jest tests
```

The same pattern works for every language: `go:build`, `java:test`, `ruby:build`, `csharp:test`, and so on.

### Bulk Operations

```bash title="Terminal"
task build:all             # Build every binding
task test:all              # Test every binding (sequential)
task test:all:parallel     # Test every binding (parallel: faster, noisier output)
task check                 # Lint + format check across the whole repo
```

---

## Testing Locally

### Rust

The core lives in `crates/kreuzberg/`. Most changes start here.

```bash title="Terminal"
# Run the full Rust test suite
task rust:test

# Run a single test by name and show its output
cargo test -p kreuzberg test_pdf_extraction -- --nocapture

# Same, with debug logging enabled
RUST_LOG=debug cargo test -p kreuzberg test_name -- --nocapture
```

### Python

Python bindings are in `packages/python/`.
Build first, then test:

```bash title="Terminal"
task python:build:dev
task python:test

cd packages/python
uv run pytest tests/ -k "test_extract" -v
```

The `RUST_LOG` env var works here too, because the Rust core logs through Python's stderr:

```bash title="Terminal"
RUST_LOG=debug uv run pytest tests/ -v
```

### Node.js

TypeScript bindings are in `packages/typescript/`:

```bash title="Terminal"
task node:build:dev
task node:test

cd packages/typescript
pnpm test -- --testPathPattern="extract"
```

### Everything Else

Same pattern. Build, then test:

```bash title="Terminal"
task go:build && task go:test
task java:build && task java:test
task csharp:build && task csharp:test
task ruby:build && task ruby:test
task php:build && task php:test
task elixir:build && task elixir:test
task r:build && task r:test
task c:build && task c:test
task wasm:build && task wasm:test
```

### Testing the live browser demo

The demo at `docs/demo.html` loads `@kreuzberg/wasm` from a CDN. To test local changes against it, use:

```bash title="Terminal"
task demo:dev
```

This builds the Wasm binary and TypeScript dist, patches the demo with local URLs, and starts two servers:

| Server | URL                     | Role                               |
| ------ | ----------------------- | ---------------------------------- |
| Docs   | `http://localhost:8001` | Serves the patched `demo-dev.html` |
| Assets | `http://localhost:9000` | Serves the local Wasm package      |

Open **`http://localhost:8001/demo-dev.html`**; no manual edits are needed. The patched file (`docs/demo-dev.html`) is gitignored and regenerated on every run. The two different ports reproduce the cross-origin setup the CDN creates in production.

To skip the slow Rust build when you've only changed TypeScript:

```bash title="Terminal"
SKIP_WASM_BUILD=1 task demo:dev
```

---

## End-to-end Test Suites

End-to-end tests guarantee that every language binding produces identical results for the same document. They live in `e2e/` as shared fixtures: test inputs paired with expected outputs.
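
To make the shared-fixture idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the fixture shape (a dict with `input` and `expected` keys), the `check_fixture` helper, and the two stand-in extractors are illustrations only. The real suites under `e2e/` are generated and may be organized quite differently.

```python
import re
from typing import Callable

# Hypothetical fixture shape: one input paired with its expected output.
FIXTURE = {"input": "Hello,   world!\n", "expected": "Hello, world!"}


def extract_binding_a(text: str) -> str:
    """Stand-in for one language binding (not a real Kreuzberg call)."""
    return " ".join(text.split())


def extract_binding_b(text: str) -> str:
    """Stand-in for a second binding that must agree with the first."""
    return re.sub(r"\s+", " ", text).strip()


def check_fixture(
    fixture: dict[str, str],
    extractors: dict[str, Callable[[str], str]],
) -> dict[str, bool]:
    """Run every extractor on the same input and compare with the expected output."""
    return {
        name: extract(fixture["input"]) == fixture["expected"]
        for name, extract in extractors.items()
    }


results = check_fixture(FIXTURE, {"a": extract_binding_a, "b": extract_binding_b})
print(results)
```

Because every stand-in is checked against the same expected value, a binding that drifts from the others shows up as a `False` entry for that binding alone.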
### Run end-to-end Tests

| Language             | Directory         | Run with               |
| -------------------- | ----------------- | ---------------------- |
| Python               | `e2e/python/`     | `task python:e2e:test` |
| TypeScript / Node.js | `e2e/typescript/` | `task node:e2e:test`   |
| Rust                 | `e2e/rust/`       | `task rust:e2e:test`   |
| Go                   | `e2e/go/`         | `task go:e2e:test`     |
| Java                 | `e2e/java/`       | `task java:e2e:test`   |
| .NET                 | `e2e/csharp/`     | `task csharp:e2e:test` |
| Ruby                 | `e2e/ruby/`       | `task ruby:e2e:test`   |
| PHP                  | `e2e/php/`        | `task php:e2e:test`    |
| R                    | `e2e/r/`          | `task r:e2e:test`      |

### Regenerate end-to-end Tests

When you add a feature that changes extraction behavior, regenerate the affected end-to-end suites:

```bash title="Terminal"
task python:e2e:generate
task node:e2e:generate
task <language>:e2e:generate
```

To regenerate and test all suites at once:

```bash title="Terminal"
task e2e:generate:all
task e2e:test:all
```

---

## Benchmarking

Measure extraction performance with the benchmark harness in `tools/benchmark-harness/`. Use it to track regressions, compare against alternatives, and identify bottlenecks with flamegraphs.

### Quick Start

```bash title="Terminal"
task benchmark:run FRAMEWORK=kreuzberg MODE=single-file
task benchmark:run FRAMEWORK=kreuzberg MODE=batch
```

### Common Modes

| Mode          | What it measures                        |
| ------------- | --------------------------------------- |
| `single-file` | Latency: one file at a time             |
| `batch`       | Throughput: multiple files in parallel  |

### With Profiling

Generate flamegraphs to see where time is spent:

```bash title="Terminal"
task benchmark:profile FRAMEWORK=kreuzberg MODE=single-file
```

Results appear in the `flamegraphs/` directory as interactive SVGs.

View live benchmark results at .


---

## Linting and Pre-commit

```bash title="Terminal"
task check    # Full lint + format check (same as CI validate stage)
```

Language-specific:

```bash title="Terminal"
task rust:lint      # clippy + rustfmt
task python:lint    # ruff + mypy
task node:lint      # eslint + typecheck
```

The repository uses pre-commit hooks that enforce conventional commit messages, code formatting, and linter rules. If a commit is rejected, the hook output tells you exactly what to fix.

---

## Working with Documentation

### Building Locally

```bash title="Terminal"
uv sync --group doc
zensical build --clean
zensical serve
```

### How Snippets Work

Code examples in the docs aren't inline; they're pulled from `docs/snippets/` via the `--8<--` include directive. This keeps examples testable and reusable across pages.

```text
docs/snippets/
├── python/          # Python examples
│   ├── api/         # extract_file, batch_extract, etc.
│   ├── config/      # ExtractionConfig, OcrConfig, etc.
│   ├── ocr/         # OCR backends
│   ├── plugins/     # Plugin implementations
│   ├── mcp/         # MCP server and client
│   └── utils/       # Embeddings, chunking, errors
├── rust/            # Rust examples (same layout)
├── typescript/      # TypeScript examples
├── go/, java/, csharp/, ruby/, r/
├── docker/          # Docker commands
├── api_server/      # Server startup examples
└── cli/             # CLI usage
```

When you change a user-facing API, update the matching snippet. When you add a new feature, create a snippet and include it from the relevant doc page.

### Theme tokens (light mode)

Inline `code` and command-style monospace in light mode use the text token **`#26203A`**, defined in `docs/css/extra.css` as `--kb-text` (referenced as `var(--kb-text)`; brand backgrounds use the same value via `--kb-brand-ink`).


---

## Debugging

### Rust Panics

```bash title="Terminal"
RUST_BACKTRACE=1 cargo test -p kreuzberg test_name
RUST_BACKTRACE=full cargo test -p kreuzberg test_name
```

### Python FFI Problems

When something goes wrong in the Rust core during a Python call, the error introspection API gives you the details:

```python title="debug_ffi.py"
from kreuzberg import get_last_error_code, get_error_details, get_last_panic_context

# Details of the most recent error reported by the Rust core
details = get_error_details()
print(f"Error: {details['message']}")
print(f"Code: {details['error_code']}")

# Panic context, if any was recorded
context = get_last_panic_context()
if context:
    print(f"Panic context: {context}")
```

### Verbose Logging

Crank up the log level to see what the Rust core is doing:

```bash title="Terminal"
RUST_LOG=debug task python:test
RUST_LOG=trace task rust:test
```

---

## CI/CD

CI runs on every push and PR to `main` via `.github/workflows/ci.yaml`. The pipeline has four stages:

1. **Validate:** conventional commits, formatting, clippy
2. **Build:** FFI libraries, Python wheels, Node packages, all bindings
3. **Test:** per-language test suites on Linux, macOS, and Windows
4. **Integration:** Docker build, Docker smoke tests, CLI tests

### Smart Change Detection

CI doesn't rebuild everything on every PR. A `changes` job detects which paths were touched and only runs the relevant build/test jobs. Edit a Python file? Only Python builds and tests run. Touch the Rust core? Everything downstream rebuilds.
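
The path-based selection described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the workflow's actual logic: the `RULES` table, the job names, and the `jobs_for` helper are hypothetical, and the real filters live in `.github/workflows/ci.yaml`.

```python
# Hypothetical job names; the real workflow defines its own.
ALL_JOBS = {"rust", "python", "node"}

# Hypothetical mapping from a path prefix to the jobs a change there triggers.
RULES = {
    "crates/": ALL_JOBS,                   # core change: everything downstream rebuilds
    "packages/python/": {"python"},        # binding change: only that binding
    "packages/typescript/": {"node"},
}


def jobs_for(changed_files: list[str]) -> set[str]:
    """Return the union of jobs triggered by the given changed paths."""
    jobs: set[str] = set()
    for path in changed_files:
        for prefix, triggered in RULES.items():
            if path.startswith(prefix):
                jobs |= triggered
    return jobs


print(jobs_for(["packages/python/tests/test_extract.py"]))
print(jobs_for(["crates/kreuzberg/src/lib.rs"]))
```

A file that matches no prefix, such as a top-level README, triggers nothing in this sketch, which is what keeps unrelated PRs cheap.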
### Running CI Checks Locally

Before pushing, you can run the same checks CI runs:

```bash title="Terminal"
task check             # Matches the validate stage
task rust:test:ci      # Rust tests with CI diagnostics
task python:test:ci    # Python tests with CI diagnostics
task test:all:ci       # Everything
```

### Other Workflows

| Workflow              | When it runs                          | What it does                       |
| --------------------- | ------------------------------------- | ---------------------------------- |
| `ci.yaml`             | Every push/PR to `main`               | The main pipeline                  |
| `docs.yaml`           | Changes to `docs/` or `zensical.toml` | Builds and validates documentation |
| `benchmarks.yaml`     | Manual trigger                        | Runs the full benchmark suite      |
| `profiling.yaml`      | Manual trigger                        | Generates flamegraphs              |
| `publish.yaml`        | Release events                        | Publishes packages to registries   |
| `publish-docker.yaml` | Tags and releases                     | Builds and pushes Docker images    |

---

## Performance

Kreuzberg's core is written in Rust, which enables zero-copy memory handling, SIMD acceleration, and true multi-core parallelism, all in ahead-of-time compiled code with no garbage collector.
### Why Rust Matters

- **Native compilation:** LLVM optimizes code ahead of time (inlining, vectorization, dead code elimination)
- **Zero-copy strings:** Slicing uses borrowed references, not heap allocations
- **SIMD acceleration:** Whitespace detection and character classification run 15-37x faster than scalar operations
- **No GIL:** True multi-core parallelism across all CPU cores
- **Deterministic memory:** Drop semantics free memory instantly, no GC pauses

### Key Optimizations

- **Batch processing:** 6-10x faster than sequential extraction through a work-stealing scheduler
- **Caching:** 85%+ hit rates for repeated files (SQLite-backed, automatic invalidation)
- **Streaming:** Large files processed in 4KB chunks, constant memory regardless of file size
- **Lazy initialization:** Expensive subsystems (Tokio, plugins) initialized on first use only

### Benchmarking Your Workload

Measure with your actual files using the benchmark harness (see the [Benchmarking](#benchmarking) section for full instructions).

For detailed analysis and live benchmark results, visit .

---