---
title: "Configuration Reference"
---

## Configuration Reference

This page documents all configuration types and their defaults across all languages.

### AccelerationConfig

Hardware acceleration configuration for ONNX Runtime models.

Controls which execution provider (CPU, CoreML, CUDA, TensorRT) is used for inference in layout detection and embedding generation.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `provider` | `ExecutionProviderType` | `ExecutionProviderType.AUTO` | Execution provider to use for ONNX inference. |
| `device_id` | `int` | — | GPU device ID (for CUDA/TensorRT). Ignored for CPU/CoreML/Auto. |

---

### ContentFilterConfig

Cross-extractor content filtering configuration.

Controls whether "furniture" content (headers, footers, page numbers, watermarks, repeating text) is included in or stripped from extraction results. Applies across all extractors (PDF, DOCX, RTF, ODT, HTML, etc.) with format-specific implementation.

When `None` on `ExtractionConfig`, each extractor uses its current default behavior unchanged.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `include_headers` | `bool` | `False` | Include running headers in extraction output. - PDF: Disables top-margin furniture stripping and prevents the layout model from treating `PageHeader`-classified regions as furniture. - DOCX: Includes document headers in text output. - RTF/ODT: Headers already included; this is a no-op when true. - HTML/EPUB: Keeps `
` element content. Default: `False` (headers are stripped or excluded). |
| `include_footers` | `bool` | `False` | Include running footers in extraction output. - PDF: Disables bottom-margin furniture stripping and prevents the layout model from treating `PageFooter`-classified regions as furniture. - DOCX: Includes document footers in text output. - RTF/ODT: Footers already included; this is a no-op when true. - HTML/EPUB: Keeps `