# Kreuzberg
High-performance document intelligence for Go, backed by the Rust core that powers every Kreuzberg binding.

> **Version 5.0.0-rc.3**
> Report issues at [github.com/kreuzberg-dev/kreuzberg](https://github.com/kreuzberg-dev/kreuzberg/issues).

## What This Package Provides

- **Go module over the Rust core**: context-aware extraction with Go structs and errors.
- **Structured results**: text, tables, images, metadata, language detection, chunks, and warnings.
- **Static-link workflow**: build against `kreuzberg-ffi` and ship a self-contained Go binary.
- **Cross-binding parity**: output matches the Python, Node.js, Ruby, Java, .NET, PHP, Elixir, R, Dart, Swift, Zig, WASM, and C FFI packages.

## Install

Kreuzberg Go binaries are **statically linked**: once built, they are self-contained and require no runtime library dependencies. Only the static library is needed at build time.

### Quick Start (Monorepo Development)

For development in the Kreuzberg monorepo:

```bash
# Build the static FFI library
cargo build -p kreuzberg-ffi --release

# Go build will automatically link against the static library
# (from target/release/libkreuzberg_ffi.a)
cd packages/go/v5
go build -v

# Run your binary (no library path needed - it's statically linked)
./v4
```

That's it! The resulting binary is self-contained and has no runtime dependencies on Kreuzberg libraries.

### Using Go Modules

To use this package via `go get`:

```bash
# Get the latest release
go get github.com/kreuzberg-dev/kreuzberg/v5@latest

# Or a specific version
go get github.com/kreuzberg-dev/kreuzberg/v5@v5.0.0-rc.3
```

You'll need to provide the static library at build time. See [Building with Static Libraries](#building-with-static-libraries) below.

### Building with Static Libraries

When building outside the Kreuzberg monorepo, you need to provide the static library (`.a` file on Unix, `.lib` on Windows).
#### Option 1: Download Pre-built Static Library

Download the static library for your platform from [GitHub Releases](https://github.com/kreuzberg-dev/kreuzberg/releases):

```bash
# Example: Linux x86_64
curl -LO https://github.com/kreuzberg-dev/kreuzberg/releases/download/v5.0.0-rc.3/go-ffi-linux-x86_64.tar.gz
tar -xzf go-ffi-linux-x86_64.tar.gz

# Copy to a permanent location
mkdir -p ~/kreuzberg/lib
cp kreuzberg-ffi/lib/libkreuzberg_ffi.a ~/kreuzberg/lib/
```

Then build with `CGO_LDFLAGS`:

```bash
# Linux/macOS
CGO_LDFLAGS="-L$HOME/kreuzberg/lib -lkreuzberg_ffi" go build

# Windows (MSVC)
set CGO_LDFLAGS=-L%USERPROFILE%\kreuzberg\lib -lkreuzberg_ffi
go build
```

#### Option 2: Build Static Library Yourself

If pre-built libraries aren't available for your platform:

```bash
# Clone the repository
git clone https://github.com/kreuzberg-dev/kreuzberg.git
cd kreuzberg

# Build the static library
cargo build -p kreuzberg-ffi --release

# The static library is now at: target/release/libkreuzberg_ffi.a
# Copy it to a permanent location
mkdir -p ~/kreuzberg/lib
cp target/release/libkreuzberg_ffi.a ~/kreuzberg/lib/

# Now you can build Go projects
cd ~/my-go-project
CGO_LDFLAGS="-L$HOME/kreuzberg/lib -lkreuzberg_ffi" go build
```

### System Requirements

#### ONNX Runtime (for embeddings)

If using embeddings functionality, ONNX Runtime must be installed **at build time**:

```bash
# macOS
brew install onnxruntime

# Ubuntu/Debian
sudo apt install libonnxruntime libonnxruntime-dev

# Windows (MSVC)
scoop install onnxruntime
# OR download from https://github.com/microsoft/onnxruntime/releases
```

The resulting binary will have ONNX Runtime statically linked or dynamically linked depending on how the FFI library was built. Check the build configuration.

**Note:** Windows MinGW builds do not support embeddings (ONNX Runtime requires MSVC). Use Windows MSVC for embeddings support.
## Quickstart

```go
package main

import (
	"fmt"
	"log"

	// Explicit alias, so the identifier used below does not depend on the
	// package clause of the imported module.
	kreuzberg "github.com/kreuzberg-dev/kreuzberg/v5"
)

func main() {
	result, err := kreuzberg.ExtractFileSync("document.pdf", nil)
	if err != nil {
		log.Fatalf("extract failed: %v", err)
	}

	fmt.Println("MIME:", result.MimeType)

	// Slice defensively: Content[:200] panics on documents shorter than 200 bytes.
	preview := result.Content
	if len(preview) > 200 {
		preview = preview[:200]
	}
	fmt.Println("First 200 bytes:")
	fmt.Println(preview)
}
```

Build and run:

```bash
# Build (make sure you have the static library available - see Install)
CGO_LDFLAGS="-L$HOME/kreuzberg/lib -lkreuzberg_ffi" go build

# Run - no library paths needed!
./myapp
```

The binary is self-contained and can be distributed without any Kreuzberg library dependencies.

## Examples

The snippets below are fragments. They assume the `kreuzberg` import alias from the Quickstart and the standard-library imports each one uses (`os`, `log`, `fmt`, `context`, `time`).

### Extract bytes

```go
data, err := os.ReadFile("slides.pptx")
if err != nil {
	log.Fatal(err)
}

result, err := kreuzberg.ExtractBytesSync(data, "application/vnd.openxmlformats-officedocument.presentationml.presentation", nil)
if err != nil {
	log.Fatal(err)
}

fmt.Println(result.Metadata.FormatType())
```

### Use advanced configuration

```go
lang := "eng"
cfg := &kreuzberg.ExtractionConfig{
	UseCache:        true,
	ForceOCR:        false,
	ImageExtraction: &kreuzberg.ImageExtractionConfig{Enabled: true},
	OCR: &kreuzberg.OcrConfig{
		Backend:  "tesseract",
		Language: &lang,
	},
}

result, err := kreuzberg.ExtractFileSync("scanned.pdf", cfg)
```

### Async (context-aware) extraction

```go
// The call is abandoned if it has not finished within 30 seconds.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

result, err := kreuzberg.ExtractFile(ctx, "large.pdf", nil)
if err != nil {
	log.Fatal(err)
}

fmt.Println("Content length:", len(result.Content))
```

### Batch extract

```go
paths := []string{"doc1.pdf", "doc2.docx", "report.xlsx"}

results, err := kreuzberg.BatchExtractFilesSync(paths, nil)
if err != nil {
	log.Fatal(err)
}

for i, res := range results {
	// Skip entries with no result.
	if res == nil {
		continue
	}
	fmt.Printf("[%d] %s => %d bytes\n", i, res.MimeType, len(res.Content))
}
```

### Register a validator

```go
// Requires cgo: the file must import "C" for //export and the C.* types.

//export customValidator
func customValidator(resultJSON *C.char) *C.char {
	// Validate the JSON payload and return an error string (or NULL if ok).
	return nil
}

func init() {
	if err := kreuzberg.RegisterValidator("go-validator", 50, (C.ValidatorCallback)(C.customValidator)); err != nil {
		log.Fatalf("validator registration failed: %v", err)
	}
}
```

## API Reference

- **GoDoc**: [pkg.go.dev/github.com/kreuzberg-dev/kreuzberg/v5](