# Keyword Extraction Extract ranked keywords from document text using YAKE or RAKE algorithms. | Algorithm | Scoring | Best for | | --------- | ---------------------------------------- | --------------------------------------------- | | **YAKE** | Lower score = more relevant (0.0–1.0) | General documents, single terms, multilingual | | **RAKE** | Higher score = more relevant (unbounded) | Multi-word phrases, technical docs | ## Quick Start === "Python" --8<-- "snippets/python/utils/keyword_extraction_example.md" === "TypeScript" --8<-- "snippets/typescript/utils/keyword_extraction_example.md" === "Rust" --8<-- "snippets/rust/advanced/keyword_extraction_example.md" === "Go" --8<-- "snippets/go/utils/keyword_extraction_example.md" === "Java" --8<-- "snippets/java/utils/keyword_extraction_example.md" === "C#" --8<-- "snippets/csharp/advanced/keyword_extraction_example.md" === "Ruby" --8<-- "snippets/ruby/utils/keyword_extraction_example.md" Keywords are returned in `result.extracted_keywords` as objects with `text` and `score` fields. ## Configuration See [KeywordConfig reference](../reference/configuration.md#keywordconfig) for all configuration options. === "Python" --8<-- "snippets/python/config/keyword_extraction_config.md" === "TypeScript" --8<-- "snippets/typescript/config/keyword_extraction_config.md" === "Rust" --8<-- "snippets/rust/config/keyword_extraction_config.md" === "Go" --8<-- "snippets/go/config/keyword_extraction_config.md" === "Ruby" --8<-- "snippets/ruby/config/keyword_extraction_config.md" === "R" --8<-- "snippets/r/config/keyword_extraction_config.md" === "C#" --8<-- "snippets/csharp/advanced/keyword_extraction_config.md" ## YAKE Score Tuning Use `min_score` as upper bound. Lower YAKE scores = higher relevance: | `min_score` | Effect | | ----------- | ------------------- | | `0.5` | Keeps most keywords | | `0.3` | Main topics only | | `0.1` | Core concepts only | `yake_params.window_size` controls co-occurrence context: `1–2` for narrow domains, `2–3` for general (default: `2`), `3–4` for discussion-heavy content. ## RAKE Score Tuning Use `min_score` as lower bound. Higher RAKE scores = higher relevance: | `min_score` | Effect | | ----------- | ---------------------------- | | `0.1` | Keeps most keywords | | `5.0` | Main phrases only | | `20.0` | Only highly specific phrases | `rake_params.min_word_length` (default: `1`) and `rake_params.max_words_per_phrase` (default: `3`) control phrase boundaries. ## Troubleshooting - **Too few keywords** — Lower `min_score`, check `result.content` is non-empty, set `language` to match the document or `None` to disable stopword filtering - **Too many irrelevant keywords** — Raise `min_score`, set `language` for stopword filtering, reduce `ngram_range` upper bound - **Multi-word phrases missing (YAKE)** — Switch to RAKE or confirm `ngram_range` upper bound is >= 2 - **Keywords don't match content** — Verify text was extracted (`result.content`) and `language` matches the document See the [KeywordConfig reference](../reference/configuration.md#keywordconfig) for the full parameter list.