13
.ai-rulez/domains/plugin-system/DOMAIN.md
Normal file
@@ -0,0 +1,13 @@
---
description: Plugin trait system and Python FFI integration
---

- Core traits: Extractor, PostProcessor, MetadataExtractor; each has async extract/process methods returning Result
- Discovery: static registration (Rust plugins compiled in) + dynamic discovery (Python plugins via PyO3 FFI)
- Priority selection: plugins declare priority per MIME type, registry selects highest-priority match, fallback to next
- Registry: PluginRegistry holds all discovered plugins, provides lookup by MIME type, supports hot-reload for Python plugins
- Python FFI: Python plugins implement a Python class matching the trait interface, called via PyO3 with GIL management
- GIL management: acquire GIL only for Python calls, release immediately after, use py.allow_threads() for Rust-side work
- Plugin lifecycle: init → register → validate → ready. Plugins validate their dependencies (e.g., Tesseract binary, Python packages) at startup
- Error handling: plugin errors are wrapped in PluginError with source plugin name, converted to ExtractionError at boundary
- Testing: test plugins with real files (not mocks), test fallback chains, test Python plugin loading/unloading
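The error-handling rule above (wrap in PluginError with the source plugin name, convert to ExtractionError at the boundary) could look roughly like the sketch below. Only the two type names come from the rules; the fields, the `run_ocr` and `extract` functions, and the "tesseract" failure are assumptions made for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Plugin-level error. The field layout is an assumption, not the project's real definition.
#[derive(Debug)]
pub struct PluginError {
    pub plugin: String,
    pub source: Box<dyn Error + Send + Sync>,
}

impl fmt::Display for PluginError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "plugin '{}' failed: {}", self.plugin, self.source)
    }
}

impl Error for PluginError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Expose the underlying cause so callers can walk the error chain.
        let inner: &(dyn Error + 'static) = &*self.source;
        Some(inner)
    }
}

/// Public error type seen by callers of the extraction API (hypothetical shape).
#[derive(Debug)]
pub enum ExtractionError {
    Plugin { plugin: String, message: String },
}

impl From<PluginError> for ExtractionError {
    fn from(err: PluginError) -> Self {
        // Render the message before moving the plugin name out of `err`.
        let message = err.to_string();
        ExtractionError::Plugin { plugin: err.plugin, message }
    }
}

/// Hypothetical plugin call that fails because a dependency is missing.
fn run_ocr() -> Result<String, PluginError> {
    Err(PluginError {
        plugin: "tesseract".to_string(),
        source: "binary not found".into(),
    })
}

/// Boundary function: `?` converts PluginError into ExtractionError via `From`.
pub fn extract() -> Result<String, ExtractionError> {
    let text = run_ocr()?;
    Ok(text)
}
```

Putting the conversion in a `From` impl means the boundary code only needs `?`, and the plugin name stays attached to the error.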
16
.ai-rulez/domains/plugin-system/agents/plugin-engineer.md
Normal file
@@ -0,0 +1,16 @@
---
name: plugin-engineer
description: Plugin system architecture, registry management, and Python FFI
model: haiku
---

When working on the plugin system:

1. Key source paths: crates/kreuzberg/src/plugins/ (mod.rs, extractor.rs, ocr.rs, postprocessor.rs, validator.rs, registry.rs), crates/kreuzberg-py/src/plugins.rs
2. Plugin types: DocumentExtractor, OcrBackend, PostProcessor, Validator. All extend the base Plugin trait (Send + Sync required)
3. Priority system: 0-255, default 50, custom override > 50, fallback < 50. Registry selects highest priority for MIME type.
4. Registries use Arc<RwLock<>> with MIME type indexing for O(log n) lookup
5. Python plugins: validate protocol compliance, use py.allow_threads() for expensive Rust ops, tokio::task::spawn_blocking for async calls
6. For new plugin types: define trait extending Plugin, create typed registry, add registration functions, implement priority-based selection
7. GIL optimization: cache frequently-accessed Python data in Rust fields, measure GIL overhead
8. All plugins must handle errors gracefully: return Result, never panic
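Step 6 above (define a trait extending Plugin, create a typed registry, add registration, select by priority) can be sketched with the standard library alone. The method names, the `ValidatorRegistry` type, and the two sample validators are hypothetical. The real traits are described as async, and this sketch is synchronous to stay dependency-free.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Base trait; the name comes from the rules, the methods are assumptions.
pub trait Plugin: Send + Sync {
    fn name(&self) -> &str;
    /// 0-255, higher wins; 50 is the default tier.
    fn priority(&self) -> u8 {
        50
    }
}

/// A plugin type that extends the base trait.
pub trait Validator: Plugin {
    fn supported_mime_types(&self) -> Vec<String>;
    fn validate(&self, text: &str) -> Result<(), String>;
}

/// Typed registry indexed by MIME type (BTreeMap gives O(log n) lookup).
#[derive(Default)]
pub struct ValidatorRegistry {
    by_mime: BTreeMap<String, Vec<Arc<dyn Validator>>>,
}

impl ValidatorRegistry {
    pub fn register(&mut self, plugin: Arc<dyn Validator>) {
        for mime in plugin.supported_mime_types() {
            let slot = self.by_mime.entry(mime).or_default();
            slot.push(Arc::clone(&plugin));
            // Keep the highest priority first so selection is a simple `first()`.
            slot.sort_by(|a, b| b.priority().cmp(&a.priority()));
        }
    }

    pub fn select(&self, mime: &str) -> Option<Arc<dyn Validator>> {
        self.by_mime.get(mime).and_then(|list| list.first().cloned())
    }
}

/// Sample validator using the default priority (50).
struct NonEmpty;
impl Plugin for NonEmpty {
    fn name(&self) -> &str {
        "non-empty"
    }
}
impl Validator for NonEmpty {
    fn supported_mime_types(&self) -> Vec<String> {
        vec!["text/plain".to_string()]
    }
    fn validate(&self, text: &str) -> Result<(), String> {
        if text.is_empty() { Err("empty document".to_string()) } else { Ok(()) }
    }
}

/// Sample validator acting as a custom override (priority > 50).
struct Strict;
impl Plugin for Strict {
    fn name(&self) -> &str {
        "strict"
    }
    fn priority(&self) -> u8 {
        80
    }
}
impl Validator for Strict {
    fn supported_mime_types(&self) -> Vec<String> {
        vec!["text/plain".to_string()]
    }
    fn validate(&self, text: &str) -> Result<(), String> {
        if text.trim().len() < 3 { Err("too short".to_string()) } else { Ok(()) }
    }
}
```

Sorting at registration time keeps `select` cheap, which suits a registry that is read far more often than it is written.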
@@ -0,0 +1,10 @@
---
priority: medium
---

- API stability: plugin interfaces are versioned, breaking changes require major version bump
- Plugin discovery: support both static (compile-time) and dynamic (runtime) registration
- Plugin validation: check capabilities, supported formats, and version compatibility before registration
- Plugin chaining: post-processors can be composed in sequence
- Configuration: plugins accept typed configuration, validated at registration time
- Documentation: every plugin type must have a development guide with examples
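The plugin-chaining rule above can be illustrated as a fold over an ordered list of post-processors. This is a synchronous sketch under assumed signatures: `run_chain` and the three sample processors are hypothetical, and the real `PostProcessor` trait is described elsewhere as async.

```rust
/// Assumed, simplified post-processor interface.
pub trait PostProcessor: Send + Sync {
    fn process(&self, text: String) -> Result<String, String>;
}

struct TrimWhitespace;
impl PostProcessor for TrimWhitespace {
    fn process(&self, text: String) -> Result<String, String> {
        Ok(text.trim().to_string())
    }
}

struct RejectEmpty;
impl PostProcessor for RejectEmpty {
    fn process(&self, text: String) -> Result<String, String> {
        if text.is_empty() { Err("nothing to process".to_string()) } else { Ok(text) }
    }
}

struct CollapseSpaces;
impl PostProcessor for CollapseSpaces {
    fn process(&self, text: String) -> Result<String, String> {
        // Split on any whitespace run and rejoin with single spaces.
        Ok(text.split_whitespace().collect::<Vec<_>>().join(" "))
    }
}

/// Runs each step in order, feeding the output of one into the next.
/// The first error stops the chain and is returned to the caller.
pub fn run_chain(chain: &[Box<dyn PostProcessor>], input: String) -> Result<String, String> {
    chain.iter().try_fold(input, |text, step| step.process(text))
}
```

Because `try_fold` short-circuits, a failing step prevents later steps from seeing partial output.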
@@ -0,0 +1,10 @@
---
priority: critical
---

- All plugins must implement the base Plugin trait: Send + Sync + 'static required
- Plugin types: DocumentExtractor, OcrBackend, PostProcessor, Validator
- Async execution: use async trait methods for non-blocking operations
- Lifecycle: init() -> process() -> cleanup(). Init must validate all requirements.
- Never panic in plugin code: all errors must be returned as Result
- Consistent result format: all extractors return ExtractionResult with text, metadata, and confidence
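A minimal sketch of the lifecycle and no-panic rules above, assuming simplified synchronous signatures. The `ExtractionResult` field types, the `Utf8Extractor` plugin, and the `run_once` driver are hypothetical, and a real implementation would use async trait methods as the rules require.

```rust
use std::collections::HashMap;

/// Result shape named in the rules; the concrete field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ExtractionResult {
    pub text: String,
    pub metadata: HashMap<String, String>,
    pub confidence: f32,
}

/// Base trait with the three lifecycle stages.
pub trait Plugin: Send + Sync + 'static {
    fn init(&mut self) -> Result<(), String>;
    fn process(&self, input: &[u8]) -> Result<ExtractionResult, String>;
    fn cleanup(&mut self) -> Result<(), String>;
}

/// Hypothetical extractor that accepts only valid UTF-8 input.
pub struct Utf8Extractor {
    ready: bool,
}

impl Plugin for Utf8Extractor {
    fn init(&mut self) -> Result<(), String> {
        // A real plugin would check binaries, packages, or models here.
        self.ready = true;
        Ok(())
    }

    fn process(&self, input: &[u8]) -> Result<ExtractionResult, String> {
        if !self.ready {
            return Err("init() was not called".to_string());
        }
        // Invalid input becomes an Err value instead of a panic.
        let text = std::str::from_utf8(input).map_err(|e| e.to_string())?;
        Ok(ExtractionResult {
            text: text.to_string(),
            metadata: HashMap::new(),
            confidence: 1.0,
        })
    }

    fn cleanup(&mut self) -> Result<(), String> {
        self.ready = false;
        Ok(())
    }
}

/// Drives one full lifecycle; cleanup runs even when processing fails.
pub fn run_once<P: Plugin>(plugin: &mut P, input: &[u8]) -> Result<ExtractionResult, String> {
    plugin.init()?;
    let result = plugin.process(input);
    plugin.cleanup()?;
    result
}
```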
@@ -0,0 +1,12 @@
---
priority: critical
---

- Separate typed registry per plugin type (ExtractorRegistry, OcrRegistry, etc.)
- Thread safety: Arc<RwLock<>> for all registries
- Priority system: 0-255, default 50, custom > 50, fallback < 50
- Selection: highest priority plugin matching the MIME type wins
- MIME type indexing for O(log n) lookup
- Conflict resolution: if equal priority, prefer Rust-native over FFI plugins
- Dynamic registration: plugins can be added/removed at runtime
- Validate plugin before registration (check trait compliance, supported formats)
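The registry rules above (shared Arc<RwLock<>> state, MIME indexing, highest priority wins, Rust-native preferred on ties, runtime add/remove) might be combined as in the sketch below. `PluginEntry` is a hypothetical stand-in for a real plugin handle, and the method names are assumptions.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

/// Hypothetical registry record standing in for a real plugin handle.
#[derive(Debug, Clone, PartialEq)]
pub struct PluginEntry {
    pub name: String,
    pub priority: u8,
    /// true = Rust-native, false = FFI (for example a Python plugin).
    pub native: bool,
}

/// Cheaply cloneable handle to shared registry state.
#[derive(Clone, Default)]
pub struct ExtractorRegistry {
    inner: Arc<RwLock<BTreeMap<String, Vec<PluginEntry>>>>,
}

/// Turns a poisoned-lock error into a plain message instead of panicking.
fn poisoned<T>(_: T) -> String {
    "registry lock poisoned".to_string()
}

impl ExtractorRegistry {
    pub fn register(&self, mime: &str, entry: PluginEntry) -> Result<(), String> {
        let mut map = self.inner.write().map_err(poisoned)?;
        map.entry(mime.to_string()).or_default().push(entry);
        Ok(())
    }

    pub fn unregister(&self, mime: &str, name: &str) -> Result<(), String> {
        let mut map = self.inner.write().map_err(poisoned)?;
        if let Some(list) = map.get_mut(mime) {
            list.retain(|e| e.name != name);
        }
        Ok(())
    }

    /// Highest priority wins; on equal priority a native plugin beats an FFI one,
    /// because `true > false` in the tuple comparison.
    pub fn select(&self, mime: &str) -> Result<Option<PluginEntry>, String> {
        let map = self.inner.read().map_err(poisoned)?;
        let best = map
            .get(mime)
            .and_then(|list| list.iter().max_by_key(|e| (e.priority, e.native)).cloned());
        Ok(best)
    }
}
```

Readers take the shared lock only for the duration of `select`, so concurrent lookups do not block each other.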
10
.ai-rulez/domains/plugin-system/rules/plugin-testing.md
Normal file
@@ -0,0 +1,10 @@
---
priority: high
---

- Mock plugin testing: create test doubles for unit tests
- Real plugin testing: integration tests with actual backends
- Thread safety tests: run concurrent plugin operations to detect race conditions
- Performance baselines: measure and track plugin overhead vs direct calls
- Test all error paths: invalid input, backend failure, timeout, resource exhaustion
- Test plugin lifecycle: register, use, unregister, verify cleanup
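One possible shape for the thread-safety test mentioned above: several threads write to and read from a shared registry, and the final count is checked against the expected total. The `Registry` alias and `hammer_registry` helper are hypothetical. A real test would exercise the project's own registry types.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Stand-in for a real registry: plugin name mapped to priority.
type Registry = Arc<RwLock<BTreeMap<String, u8>>>;

/// Spawns `writers` threads that each register `per_writer` uniquely named
/// plugins while also reading, then returns the final number of entries.
pub fn hammer_registry(writers: usize, per_writer: usize) -> usize {
    let registry: Registry = Arc::new(RwLock::new(BTreeMap::new()));
    let mut handles = Vec::new();

    for w in 0..writers {
        let reg = Arc::clone(&registry);
        handles.push(thread::spawn(move || {
            for i in 0..per_writer {
                // Write lock is held only for the single insert.
                reg.write()
                    .expect("lock poisoned")
                    .insert(format!("plugin-{}-{}", w, i), 50);
                // Interleave reads with writes to exercise both lock modes.
                let _ = reg.read().expect("lock poisoned").len();
            }
        }));
    }

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    let total = registry.read().expect("lock poisoned").len();
    total
}
```

A lost update or a panicking worker would show up as a wrong total or a failed `join`, which is what the test asserts against.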
11
.ai-rulez/domains/plugin-system/rules/python-ffi-plugins.md
Normal file
@@ -0,0 +1,11 @@
---
priority: high
---

- GIL management: use py.allow_threads() for expensive Rust operations
- Cache frequently-accessed Python data in Rust fields to minimize GIL acquisitions
- Use tokio::task::spawn_blocking for async calls to Python backends
- Python exception translation: convert Python exceptions to Rust errors with full context
- Data type mapping: Python str <-> Rust String, Python bytes <-> Rust Vec<u8>, Python dict <-> Rust HashMap
- Validate Python plugin protocol compliance on registration
- Target GIL overhead: 5-55us per acquisition