@@ -0,0 +1,9 @@
---
priority: high
---
- Follow semantic versioning: breaking changes require a major version bump
- Document all public API changes in CHANGELOG.md
- Maintain backward compatibility for at least one minor version before removing deprecated APIs
- All public types must be FFI-friendly or have FFI-compatible equivalents
- Version in Cargo.toml is the single source of truth for all binding packages
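
The deprecation rule above can be sketched with Rust's built-in `#[deprecated]` attribute. The function names and the version string are hypothetical and chosen only for illustration; the idea is that the old entry point keeps working and forwards to its replacement until removal is permitted.

```rust
/// Current API (hypothetical name, not taken from the project).
pub fn extract_text(input: &str) -> String {
    input.trim().to_string()
}

/// Old entry point, kept so existing callers still compile.
/// Callers get a compiler warning that points them at the replacement.
#[deprecated(since = "1.4.0", note = "use `extract_text` instead")]
pub fn extract(input: &str) -> String {
    // Forward to the new function so both paths behave identically.
    extract_text(input)
}
```

Forwarding rather than duplicating the body means the two paths cannot drift apart while the deprecated name is still shipped.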
@@ -0,0 +1,9 @@
---
priority: high
---
- All extraction paths must be fully async using tokio
- Never block the async runtime — use spawn_blocking for CPU-intensive work
- All public types must be Send + Sync
- Use tokio::select! for timeout handling on extraction operations
- Cross-platform: test on Linux (amd64, arm64) and macOS at minimum
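
One way to enforce the `Send + Sync` rule above is a compile-time assertion, so a violation breaks the build instead of surfacing later in async code. This is a minimal sketch using only the standard library; `ExtractionResult` and `Extractor` are hypothetical stand-ins for the crate's public types, and the tokio-specific rules are not covered here.

```rust
use std::sync::Arc;

/// Hypothetical public result type.
#[derive(Debug, Clone)]
pub struct ExtractionResult {
    pub text: String,
    pub mime_type: String,
}

/// Hypothetical public extractor holding shared, immutable configuration.
pub struct Extractor {
    pub config: Arc<String>,
}

/// Compiles only when `T` is both `Send` and `Sync`.
pub const fn assert_send_sync<T: Send + Sync>() {}

// Evaluated at compile time: adding a field such as `Rc<T>` or `RefCell<T>`
// to either type makes this constant fail to compile.
const _: () = {
    assert_send_sync::<ExtractionResult>();
    assert_send_sync::<Extractor>();
};
```

Keeping one such block next to the public type definitions makes the rule self-checking without any runtime cost.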
@@ -0,0 +1,10 @@
---
priority: high
---
- Cache keys: content-hash based (hash of file bytes + config), not path-based
- Invalidate cache when extraction config changes (output format, OCR settings, etc.)
- Check cache before any extraction — cache hits should skip all processing
- Concurrent batch processing: use configurable worker pool, default to CPU count
- Stream large files instead of loading into memory — use AsyncRead where possible
- Monitor cache hit rates — target >80% for repeated extractions
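
The cache-key, cache-first, and hit-rate rules above might look roughly like the following. Everything here is an illustrative assumption: `ExtractionConfig`, `ExtractionCache`, and their fields are hypothetical, and `DefaultHasher` is used only to keep the sketch dependency-free. Its algorithm is not guaranteed to stay the same across Rust releases, so a persistent cache would need a stable hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical subset of the extraction settings that affect output.
#[derive(Hash)]
pub struct ExtractionConfig {
    pub output_format: String,
    pub ocr_enabled: bool,
}

/// Key derived from file bytes plus config, never from the path.
/// Changing any config field changes the key, which invalidates old entries.
pub fn cache_key(file_bytes: &[u8], config: &ExtractionConfig) -> u64 {
    let mut hasher = DefaultHasher::new();
    file_bytes.hash(&mut hasher);
    config.hash(&mut hasher);
    hasher.finish()
}

/// In-memory cache that also counts hits and misses for monitoring.
#[derive(Default)]
pub struct ExtractionCache {
    entries: HashMap<u64, String>,
    pub hits: u64,
    pub misses: u64,
}

impl ExtractionCache {
    /// Check the cache first; run `extract` only on a miss.
    pub fn get_or_extract<F>(&mut self, bytes: &[u8], config: &ExtractionConfig, extract: F) -> String
    where
        F: FnOnce(&[u8]) -> String,
    {
        let key = cache_key(bytes, config);
        if let Some(cached) = self.entries.get(&key).cloned() {
            self.hits += 1;
            return cached;
        }
        self.misses += 1;
        let text = extract(bytes);
        self.entries.insert(key, text.clone());
        text
    }

    /// Fraction of lookups served from the cache (0.0 when nothing was looked up).
    pub fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f64 / total as f64
        }
    }
}
```

Because the config is hashed into the key, no separate invalidation step is needed when settings change; stale entries simply stop being looked up.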
@@ -0,0 +1,10 @@
---
priority: high
---
- 95% test coverage on core extraction code, 80% on bindings
- Test all format categories: text, office, PDF, images, archives, markup
- Test corrupted/malformed documents — extraction must fail gracefully, never panic
- Benchmark extraction speeds per format — track regressions in CI
- Test both success and error paths for every extractor
- Use property-based testing for parsers with wide input ranges
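
The "fail gracefully, never panic" rule above can be illustrated with a toy parser and a truncation test. The format (a `DOC1` magic, a 4-byte little-endian length, then the payload) is invented for this sketch and is not one of the project's formats. The exhaustive prefix loop is a hand-rolled stand-in for a property-based test; a crate such as proptest or quickcheck would generate the inputs instead.

```rust
#[derive(Debug, PartialEq)]
pub enum ParseError {
    TooShort,
    BadMagic,
    LengthMismatch { declared: usize, actual: usize },
}

/// Parse the toy format and return the payload, or an error for malformed input.
pub fn parse_document(bytes: &[u8]) -> Result<&[u8], ParseError> {
    // `get` returns `None` on short input instead of panicking like `bytes[..4]`.
    let magic = bytes.get(..4).ok_or(ParseError::TooShort)?;
    if magic != &b"DOC1"[..] {
        return Err(ParseError::BadMagic);
    }
    let len = bytes.get(4..8).ok_or(ParseError::TooShort)?;
    let declared = u32::from_le_bytes([len[0], len[1], len[2], len[3]]) as usize;
    // Safe to index: the `get(4..8)` above proved there are at least 8 bytes.
    let payload = &bytes[8..];
    if payload.len() != declared {
        return Err(ParseError::LengthMismatch { declared, actual: payload.len() });
    }
    Ok(payload)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn every_truncation_is_an_error_not_a_panic() {
        let mut valid = b"DOC1".to_vec();
        valid.extend_from_slice(&5u32.to_le_bytes());
        valid.extend_from_slice(b"hello");
        assert_eq!(parse_document(&valid), Ok(&b"hello"[..]));
        // Every strict prefix of a valid document is malformed.
        for cut in 0..valid.len() {
            assert!(parse_document(&valid[..cut]).is_err());
        }
    }
}
```

The same pattern covers both the success path and the error paths of one extractor in a single test.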
@@ -0,0 +1,10 @@
---
priority: critical
---
- Always use `SecurityLimits` to cap archive size, compression ratio, file count, and nesting depth for user content. Use `ZipBombValidator` for archive extraction.
- Validate MIME type before extraction — never trust file extensions alone
- Implement fallback chains: if primary extractor fails, try next-priority extractor
- Preserve partial results on failure — return what was extracted with error context
- All errors must include: operation name, input description, root cause, and suggestion
- Never expose internal file paths or system details in error messages returned to users
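
The error-content and path-redaction rules above could be combined roughly as follows. This does not show the real `SecurityLimits` or `ZipBombValidator` API, which the rules name but do not define; `ExtractionError`, `describe_input`, and `check_compression_ratio` are hypothetical helpers written only to show the shape of a compliant error.

```rust
use std::error::Error;
use std::fmt;
use std::path::Path;

/// Hypothetical error carrying the four required pieces of context.
#[derive(Debug)]
pub struct ExtractionError {
    pub operation: String,
    pub input: String,
    pub cause: String,
    pub suggestion: String,
}

impl fmt::Display for ExtractionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{} failed for {}: {} (suggestion: {})",
            self.operation, self.input, self.cause, self.suggestion
        )
    }
}

impl Error for ExtractionError {}

/// Describe an input by file name only, so directory structure never leaks.
pub fn describe_input(path: &Path) -> String {
    path.file_name()
        .map(|name| name.to_string_lossy().into_owned())
        .unwrap_or_else(|| "<unnamed input>".to_string())
}

/// Reject archives whose declared expansion exceeds `max_ratio` times the compressed size.
pub fn check_compression_ratio(
    path: &Path,
    compressed: u64,
    uncompressed: u64,
    max_ratio: u64,
) -> Result<(), ExtractionError> {
    // `saturating_mul` avoids overflow on very large declared sizes.
    if uncompressed > compressed.saturating_mul(max_ratio) {
        return Err(ExtractionError {
            operation: "archive validation".to_string(),
            input: describe_input(path),
            cause: format!(
                "declared size of {} bytes exceeds {}x the compressed size of {} bytes",
                uncompressed, max_ratio, compressed
            ),
            suggestion: "re-create the archive or split it into smaller archives".to_string(),
        });
    }
    Ok(())
}
```

Building the user-facing message from `describe_input` rather than from the full path keeps the redaction in one place instead of relying on every call site to remember it.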