3.4 KiB
3.4 KiB
AI Coding Assistants v4.2.15
Kreuzberg ships with an Agent Skill that teaches AI coding assistants how to use the library correctly — covering extraction, configuration, OCR, chunking, embeddings, batch processing, error handling, and plugins across Python, Node.js/TypeScript, Rust, and CLI.
Supported Assistants
Works with any tool supporting the Agent Skills standard: Claude Code, Codex, Gemini CLI, Cursor, Visual Studio Code (with AI extensions), Amp, Goose, and Roo Code.
Installing
# Install into current project (recommended)
npx skills add kreuzberg-dev/kreuzberg
# Install globally
npx skills add kreuzberg-dev/kreuzberg -g
Or copy manually:
cp -r path/to/kreuzberg/skills/kreuzberg .claude/skills/kreuzberg
What the Skill Provides
When your AI coding assistant discovers the skill, it knows:
- All extraction functions and their correct signatures across languages
- Configuration field names (for example,
max_charsnotmax_charactersin Python) - Rust feature gates (for example,
tokio-runtimefor sync functions) - Language-specific conventions (snake_case in Python/Rust, camelCase in Node.js)
- Error handling patterns for each language
Skill Structure
skills/kreuzberg/
├── SKILL.md # Main skill (~400 lines)
└── references/
├── python-api.md # Complete Python API
├── nodejs-api.md # Complete Node.js API
├── rust-api.md # Complete Rust API
├── cli-reference.md # All CLI commands and flags
├── configuration.md # Config file formats and schema
├── supported-formats.md # All 90+ supported formats
├── advanced-features.md # Plugins, embeddings, MCP, security
└── other-bindings.md # Go, Ruby, Java, C#, PHP, Elixir
The main file stays under 500 lines for efficient AI consumption. Reference files load on demand.
Quick Examples
=== "Python"
```python
from kreuzberg import extract_file, extract_file_sync, ExtractionConfig, OcrConfig
result = await extract_file("document.pdf")
print(result.content)
config = ExtractionConfig(
ocr=OcrConfig(backend="tesseract", language="eng"),
output_format="markdown",
)
result = await extract_file("document.pdf", config=config)
```
=== "Node.js"
```typescript
import { extractFile, extractFileSync } from '@kreuzberg/node';
const result = await extractFile('document.pdf');
console.log(result.content);
```
=== "Rust"
```rust
use kreuzberg::{extract_file, ExtractionConfig};
let config = ExtractionConfig::default();
let result = extract_file("document.pdf", None, &config).await?;
```
=== "CLI"
```bash
kreuzberg extract document.pdf
kreuzberg extract document.pdf --format json --output-format markdown
```
Further Reading
- Agent Skills Standard — the open standard
- Extraction Basics — core extraction API
- Configuration — all configuration options
- Advanced Features — chunking, embeddings, language detection
- Plugin System — custom plugins