Quality and Evaluation
After content processing, additional quality steps ensure the most relevant content reaches the orchestrator LLM and that the final answer is checked for accuracy.
Chunk Relevancy Sorting
Uses an AI model to score content chunks by relevance to the user's original question and re-rank them accordingly. This keeps the most useful content in play when the token budget forces some chunks to be dropped.
| Setting | Type | Default | Description |
|---|---|---|---|
| | boolean | | Whether to use AI-based relevancy sorting. When disabled, chunks retain their original order. |
When enabled, each chunk is scored against the user's query and the chunks are reordered from most to least relevant before the token budget reduction step.
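The ordering described above can be sketched as follows. This is a minimal illustration, not the product's implementation: `Chunk`, `sort_and_trim`, `token_budget`, and the toy `keyword_overlap` scorer are all hypothetical names, the scorer stands in for the AI model, and the greedy trimming rule is an assumption about how the token budget reduction step might behave.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    text: str
    tokens: int  # hypothetical pre-computed token count


def sort_and_trim(
    chunks: List[Chunk],
    query: str,
    score: Callable[[str, str], float],
    token_budget: int,
    sorting_enabled: bool = True,
) -> List[Chunk]:
    """Optionally re-rank chunks by relevance, then keep those that fit the budget."""
    if sorting_enabled:
        # sorted() is stable, so equally scored chunks keep their original order.
        ordered = sorted(chunks, key=lambda c: score(query, c.text), reverse=True)
    else:
        ordered = list(chunks)  # sorting disabled: original order is retained

    kept: List[Chunk] = []
    used = 0
    for chunk in ordered:
        if used + chunk.tokens > token_budget:
            break  # assumed rule: drop this chunk and everything ranked below it
        kept.append(chunk)
        used += chunk.tokens
    return kept


def keyword_overlap(query: str, text: str) -> float:
    """Toy stand-in for the AI scorer: fraction of query words found in the text."""
    words = set(query.lower().split())
    return len(words & set(text.lower().split())) / max(len(words), 1)
```

Because sorting happens before trimming, the chunks that are dropped are the lowest-scoring ones. With sorting disabled, whichever chunks happen to come last are dropped instead.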
Answer Quality Checks (Evaluation)
After the orchestrator LLM generates its answer using the web search content, automated evaluation checks can be run to detect quality issues.
| Setting | Type | Default | Description |
|---|---|---|---|
| | list of evaluation metrics | | Which quality checks to run on the generated answer. |
Available Metrics
| Metric | Description |
|---|---|
| | Detects when the generated answer contains claims that are not supported by the retrieved web search content. |
The evaluation only runs when the tool returns content chunks. If the search returned no results, evaluation is skipped.
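The skip rule can be illustrated with a short sketch. Everything here is an assumption made for illustration: `run_evaluations`, the `Check` type, and `naive_support_check` are hypothetical names, and the verbatim-match check is only a toy stand-in for the real unsupported-claim detector, which is not described in this document.

```python
from typing import Callable, Dict, List, Optional

# A check takes the answer and the retrieved chunks and returns True when it passes.
Check = Callable[[str, List[str]], bool]


def run_evaluations(
    answer: str,
    chunks: List[str],
    checks: Dict[str, Check],
) -> Optional[Dict[str, bool]]:
    """Run the configured checks, or return None when there is nothing to check against."""
    if not chunks:
        return None  # the search returned no content, so evaluation is skipped
    return {name: check(answer, chunks) for name, check in checks.items()}


def naive_support_check(answer: str, chunks: List[str]) -> bool:
    """Toy check: every sentence of the answer must appear verbatim in the chunks."""
    corpus = " ".join(chunks).lower()
    sentences = [s.strip().lower() for s in answer.split(".") if s.strip()]
    return all(s in corpus for s in sentences)
```

Returning `None` rather than an empty result lets a caller tell "evaluation was skipped" apart from "evaluation ran with no checks configured".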
Source Citation Instructions
Controls how the AI model cites web search sources in its answers.
| Setting | Type | Default | Description |
|---|---|---|---|
| | string (textarea) | Built-in citation instructions | Instructions injected into the orchestrator LLM's system prompt that specify how to format source references. |
These instructions tell the AI how to:
- Reference specific web sources when making claims.
- Format inline citations.
- Attribute information to its original source.
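As a rough sketch of the injection, the snippet below appends citation instructions to a system prompt and falls back to a default when the setting is empty. The function name `build_system_prompt` and the default text are hypothetical. The actual built-in citation instructions are not shown in this document, and where they are placed in the prompt is also an assumption.

```python
from typing import Optional

# Hypothetical default; the real built-in citation instructions are not shown here.
DEFAULT_CITATION_INSTRUCTIONS = (
    "Cite the web source for each claim inline, using the source number in square brackets."
)


def build_system_prompt(base_prompt: str, citation_instructions: Optional[str] = None) -> str:
    """Append citation instructions to the system prompt, using the default when none are set."""
    # An empty or whitespace-only override falls back to the default text.
    instructions = (citation_instructions or "").strip() or DEFAULT_CITATION_INSTRUCTIONS
    return f"{base_prompt.rstrip()}\n\n{instructions}"
```

Treating a blank textarea the same as an unset value avoids sending the model a prompt with no citation guidance at all.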
<!-- TODO: Add screenshot of the quality and evaluation configuration panel in the Spaces UI -->