# CLI Usage Command-line access to all Kreuzberg extraction features. ## Installation === "Install Script (Linux/macOS)" --8<-- "snippets/cli/install_script.md" === "Homebrew (macOS/Linux)" --8<-- "snippets/cli/install_homebrew.md" === "Cargo (Cross-platform)" --8<-- "snippets/cli/install_cargo.md" === "Docker" --8<-- "snippets/cli/install_docker.md" === "Go (SDK)" --8<-- "snippets/cli/install_go_sdk.md" !!! Info "Feature Availability" **Homebrew Installation:** - ✅ Text extraction (PDF, Office, images, 90+ formats) - ✅ OCR with Tesseract - ✅ HTTP API server (`serve` command) - ✅ MCP protocol server (`mcp` command) - ✅ Chunking, quality scoring, language detection - ❌ **Embeddings** - Not available via CLI flags. Use config file or Docker image. **Docker Images:** - All features enabled including embeddings (ONNX Runtime included) ## Global Flags v4.5.2 ### Log Level v4.5.2 `--log-level` controls log verbosity and overrides `RUST_LOG`. ```bash title="Terminal" # Set log level to debug for troubleshooting kreuzberg --log-level debug extract document.pdf # Suppress all but error messages kreuzberg --log-level error batch documents/*.pdf # Trace-level logging for maximum detail kreuzberg --log-level trace extract document.pdf ``` Valid levels: `trace`, `debug`, `info` (default), `warn`, `error`. ### Colored Output Output is colored by default. Disable with `NO_COLOR`: ```bash title="Terminal" # Disable colored output NO_COLOR=1 kreuzberg extract document.pdf ``` ## Basic Usage ### Extract from Single File ```bash title="Terminal" # Extract text content to stdout kreuzberg extract document.pdf # Specify MIME type (auto-detected if not provided) kreuzberg extract document.pdf --mime-type application/pdf ``` ### Batch Extract Multiple Files ```bash title="Terminal" # Extract from multiple files kreuzberg batch doc1.pdf doc2.docx doc3.txt # Batch extract all PDFs in directory kreuzberg batch documents/*.pdf # Batch extract recursively kreuzberg batch documents/**/*.pdf ``` ### Extract Structured Data v4.8.0 `extract-structured` pulls typed JSON out of a document via an LLM, using a JSON schema as constraint. ```bash title="Terminal" # Extract invoice fields into JSON matching invoice_schema.json kreuzberg extract-structured invoice.pdf \ --schema invoice_schema.json \ --model openai/gpt-4o \ --strict ``` | Flag | Description | | ----------------------- | ---------------------------------------------------------------------------------------------------- | | `` (positional) | Document file path. Required. | | `--schema ` | Path to a JSON schema file describing the desired output. Required. | | `--model ` | LLM model identifier, for example `openai/gpt-4o` or `anthropic/claude-sonnet-4-20250514`. Required. | | `--api-key ` | LLM provider API key. Falls back to `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and so on. | | `--prompt