Search Engine Configuration
The search engine determines how web searches are performed. The available engines are controlled at the platform level via the ACTIVE_SEARCH_ENGINES environment variable (a JSON array of engine names). Only engines listed in that variable appear as options in the space configuration UI.
The search_engine_config field is a discriminated union -- its schema changes dynamically based on which engines are activated.
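As a rough illustration of how the platform-level variable gates the available options, the sketch below parses an ACTIVE_SEARCH_ENGINES value as a JSON array of names. The function name, the behavior for an unset variable, and the example engine names are assumptions made for illustration; they are not taken from the platform's code.

```python
import json
from typing import List, Optional


def parse_active_engines(raw: Optional[str]) -> List[str]:
    """Parse an ACTIVE_SEARCH_ENGINES value (a JSON array of engine names)."""
    if not raw:
        # Assumed behavior: an unset or empty variable activates no engines.
        return []
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("ACTIVE_SEARCH_ENGINES must be a JSON array of strings")
    return value


# The engine names below are placeholders, not confirmed platform values.
engines = parse_active_engines('["google", "bing"]')
```

Only the names returned by such a parser would be offered in the space configuration UI, which is also what determines the variants of the search_engine_config union.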
Common Settings
All search engines share the following setting:
| Setting | Type | Default | Description |
|---|---|---|---|
| | integer | | Number of search results to retrieve per query. |
Google Search Engine
Display name: Google Search Engine
Uses the Google Custom Search JSON API to retrieve search results (URLs, titles, snippets). Requires scraping -- a crawler must be used to fetch the actual page content.
Space-Level Config
| Setting | Type | Default | Description |
|---|---|---|---|
| | integer | | Number of results to fetch. Pagination is handled automatically (Google API returns max 10 per page). |
| | object | | Optional Google Custom Search API parameters (e.g., language, country, safe search). |
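The automatic pagination can be pictured as splitting the requested result count into pages of at most 10. The sketch below is one possible way to do that, not the tool's actual code; `page_requests` is a hypothetical helper, and the pairs it returns are meant to map onto the Custom Search JSON API's `start` (1-based offset) and `num` (page size) query parameters.

```python
from typing import List, Tuple


def page_requests(fetch_size: int, page_size: int = 10) -> List[Tuple[int, int]]:
    """Split fetch_size into (start, num) pairs, one pair per API page."""
    if fetch_size < 0:
        raise ValueError("fetch_size must be non-negative")
    pages = []
    start = 1  # the API's `start` parameter is 1-based
    remaining = fetch_size
    while remaining > 0:
        num = min(page_size, remaining)  # never ask for more than one page holds
        pages.append((start, num))
        start += num
        remaining -= num
    return pages


print(page_requests(25))  # [(1, 10), (11, 10), (21, 5)]
```

Requesting 25 results therefore takes three API calls, the last of which asks for only 5 results.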
Platform Setup
| Environment Variable | Required | Description |
|---|---|---|
| | Yes | API key for the Google Custom Search JSON API. |
| | Yes | The Custom Search Engine ID (cx parameter). |
| | Yes | API endpoint URL (typically |
Bing (Grounding with Bing)
Display name: Grounding with Bing
This is not a standard search engine integration. Instead of calling a search API directly, the tool delegates the entire search to an Azure AI Foundry Agent with a BingGroundingTool attached. The agent autonomously searches Bing, reads the sources, and produces a structured response containing detailed answers and key facts per source. A response parsing pipeline then converts the agent output into standard WebSearchResult objects that the rest of the Web Search tool can process.
How It Works
The tool authenticates with Azure using the configured identity credentials (see Platform -- Azure Authentication).
It connects to the Azure AI project via the AIProjectClient.
It either discovers/creates an agent automatically or uses a pre-configured one (see Operating Modes below).
A thread is created with the user's search query. The agent runs against it with Bing grounding and returns a text response.
The response is parsed into structured results via a multi-strategy pipeline (see Response Parsing).
Operating Modes
The Bing engine has two operating modes, determined by whether agent_id is provided:
Auto-Provisioned Mode (default)
When agent_id is left empty (the default), the tool manages the agent lifecycle automatically:
Lists existing agents in the Azure AI project and looks for one named UNIQUE_GROUNDING_WITH_BING_AGENT.
If not found, creates a new agent with that name using the model specified by AZURE_AI_BING_AGENT_MODEL (required -- no default).
Creates a thread and runs it with per-run overrides:
Model -- overridden to AZURE_AI_BING_AGENT_MODEL (platform env var).
Toolset -- overridden to a BingGroundingTool configured with AZURE_AI_BING_RESOURCE_CONNECTION_STRING and the space-level fetch_size.
Instructions -- overridden to generation_instructions (space config) + a JSON output format rule.
Because all behavior is controlled at execution time, the persisted agent in Azure is just a shell. Space admins control model, instructions, and fetch size from the Spaces UI without needing access to the Azure portal.
Pre-Configured Mode
When agent_id is set (via AZURE_AI_ASSISTANT_ID env var or space config), the tool uses that agent as-is:
No overrides are applied -- the agent's own model, tools, and instructions are used.
The tool only creates a thread with the user's query and processes the run.
The agent must already be configured in Azure with the correct Bing grounding tool, model, and instructions.
This mode is useful when platform engineers want full control over the agent configuration in Azure, or when the agent has been customized beyond what the space-level settings can express.
agent_id and endpoint Resolution
| Setting | Source Priority | Behavior when empty |
|---|---|---|
| | | Error -- at least one must be set. |
| | | Auto-provisioning (discover or create agent). |
Both settings follow an "env var takes precedence" pattern. The endpoint env var lets the platform lock the project endpoint while still allowing space admins to set engine-specific overrides. The AZURE_AI_ASSISTANT_ID env var lets the platform specify a pre-configured agent: when set, it overrides the space-level agent_id and forces pre-configured mode.
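The resolution rules can be sketched as a small function. This is an illustrative sketch only: `resolve_bing_settings` is a hypothetical helper, and because the name of the endpoint environment variable is not given in this section, `EXAMPLE_PROJECT_ENDPOINT` is a placeholder. AZURE_AI_ASSISTANT_ID and the space-level keys `endpoint` and `agent_id` come from the text above.

```python
from typing import Mapping, Optional, Tuple

AGENT_ID_ENV = "AZURE_AI_ASSISTANT_ID"
ENDPOINT_ENV = "EXAMPLE_PROJECT_ENDPOINT"  # hypothetical name, real one not shown here


def resolve_bing_settings(
    env: Mapping[str, str], space_config: Mapping[str, str]
) -> Tuple[str, Optional[str], str]:
    """Return (endpoint, agent_id, mode); env vars win over space config."""
    endpoint = env.get(ENDPOINT_ENV) or space_config.get("endpoint")
    if not endpoint:
        # Neither source provided an endpoint: this is the error case.
        raise ValueError("endpoint must be set via env var or space config")
    agent_id = env.get(AGENT_ID_ENV) or space_config.get("agent_id") or None
    # An empty agent_id selects auto-provisioning; any value selects pre-configured.
    mode = "pre-configured" if agent_id else "auto-provisioned"
    return endpoint, agent_id, mode
```

For example, calling it with an empty environment and a space config that sets only `endpoint` yields auto-provisioned mode, while setting AZURE_AI_ASSISTANT_ID in the environment yields pre-configured mode regardless of the space-level `agent_id`.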
Space-Level Config
| Setting | Type | Default | Applies To | Description |
|---|---|---|---|---|
| | integer | | Both modes | Number of search results to retrieve. In auto-provisioned mode, passed to the |
| | boolean | | Both modes | Whether to additionally crawl result URLs. Normally |
| | string | | Mode selector | The ID of a pre-configured Azure AI Foundry Agent. Leave empty for auto-provisioning. |
| | string | | Both modes | Azure AI project endpoint. Overridden by |
| | string (textarea) | Built-in prompt | Auto-provisioned only | Instructions for the agent on how to search and format results. Ignored in pre-configured mode. |
| | Language model identifier | | Both modes | Fallback LLM used to parse agent responses when the output is not valid JSON (see Response Parsing). |
Generation Instructions
The built-in default generation_instructions instruct the agent to:
Search broadly with varied keywords to cover every angle of the query.
Read every source thoroughly -- extract every relevant fact, figure, statistic, date, name, and quote.
Produce one result entry per source with a detailed_answer and a list of key_facts.
Preserve detail -- prefer verbosity over brevity.
At run time, a RESPONSE_RULE suffix is appended that instructs the agent to respond with a JSON object matching the GroundingWithBingResults schema. Space admins can customize the main instructions while the output format rule is always enforced.
Response Parsing
The agent returns a free-text response that must be converted into structured WebSearchResult objects. Two parsing strategies are tried in order:
| Strategy | How it works | When it succeeds |
|---|---|---|
| JSON extraction | Looks for a fenced | When the agent follows the output format rule and returns valid JSON. |
| LLM fallback | Sends the raw response to the | When the agent returns useful content but not in the expected JSON format. |
If both strategies fail, the search raises an error.
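The two-step pipeline can be sketched roughly as follows. This is a minimal illustration, not the tool's implementation: the function names are hypothetical, the LLM fallback is represented by a caller-supplied callable, and both the fenced-block pattern and the choice to try the whole text when no fence is found are assumptions.

```python
import json
import re
from typing import Any, Callable

# Built indirectly so the example contains no literal fence marker.
FENCE = "`" * 3
FENCED_JSON = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_json(text: str) -> Any:
    """Strategy 1: parse a fenced block if present, otherwise the whole text."""
    match = FENCED_JSON.search(text)
    candidate = match.group(1) if match else text
    return json.loads(candidate)  # raises json.JSONDecodeError on invalid JSON


def parse_agent_response(text: str, llm_fallback: Callable[[str], Any]) -> Any:
    """Try JSON extraction first, then the fallback; raise if both fail."""
    try:
        return extract_json(text)
    except json.JSONDecodeError:
        pass  # fall through to strategy 2
    try:
        return llm_fallback(text)
    except Exception as exc:
        raise RuntimeError("could not parse agent response") from exc
```

Trying the cheap, deterministic strategy first means the fallback model is only invoked when the agent ignores the output format rule.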
Platform Setup
| Environment Variable | Required | Description |
|---|---|---|
| | Yes | Azure AI project endpoint URL. Takes precedence over the space-level |
| | Yes (auto) | Bing resource connection string from Azure. Used to configure the |
| | Yes (auto) | The deployed model name used when creating/overriding the agent. No default -- must be explicitly set. |
| | Yes (pre-configured) | The ID of a pre-configured Foundry Agent. When set, auto-provisioning is skipped. Takes precedence over the space-level |
| | No | Azure credential mode: |
Custom API Search Engine
Display name: Customized API Search Engine
Sends search queries to a user-defined REST endpoint. The endpoint must return results matching the WebSearchResults schema (a list of objects with url, title, snippet, and optionally content fields).
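To make the expected response shape concrete, here is a small sketch that validates a list of result objects against the fields named above. The dataclass is a stand-in written for this example and may differ from the platform's real WebSearchResult type; `parse_results` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


@dataclass
class WebSearchResult:
    """Illustrative stand-in for one entry of the WebSearchResults schema."""
    url: str
    title: str
    snippet: str
    content: Optional[str] = None  # optional, per the description above


def parse_results(payload: List[Mapping[str, Any]]) -> List[WebSearchResult]:
    """Check that each item carries the required fields and wrap it."""
    results = []
    for index, item in enumerate(payload):
        missing = [key for key in ("url", "title", "snippet") if key not in item]
        if missing:
            raise ValueError(f"result {index} is missing fields: {missing}")
        results.append(
            WebSearchResult(item["url"], item["title"], item["snippet"], item.get("content"))
        )
    return results
```

A custom endpoint that returns, for instance, `[{"url": "https://example.com", "title": "Example", "snippet": "..."}]` would pass this check; an item without a `title` would be rejected.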
Space-Level Config
| Setting | Type | Default | Description |
|---|---|---|---|
| | integer | | Number of results to fetch. |
| | string | | URL of the custom search API. Hidden from UI if set via env var. |
| | | | HTTP method for the API request. Hidden from UI if set via env var. |
| | string (JSON) | | Request headers as a JSON string. Hidden from UI if set via env var. |
| | string (JSON) | | Additional query parameters as JSON. Hidden from UI if set via env var. |
| | string (JSON) | | Additional request body parameters as JSON. Hidden from UI if set via env var. |
| | boolean | | Whether to additionally crawl result URLs. |
| | integer | | Request timeout in seconds. |
For GET requests, the search query is added as a query parameter. For POST requests, it is included in the JSON body as {"query": "..."}.
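The difference between the two methods can be sketched with the standard library. This is an illustrative sketch rather than the tool's code: `build_search_request` is a hypothetical helper, and the GET parameter name `query` is an assumption (the text above only specifies the key used in the POST body).

```python
import json
from typing import Mapping, Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(
    endpoint: str,
    method: str,
    query: str,
    params: Optional[Mapping[str, str]] = None,
    body: Optional[Mapping[str, object]] = None,
    headers: Optional[Mapping[str, str]] = None,
) -> Request:
    """Build (but do not send) the HTTP request for a search query."""
    method = method.upper()
    request_headers = dict(headers or {})
    query_params = dict(params or {})
    data = None
    if method == "GET":
        query_params["query"] = query  # assumed parameter name
    elif method == "POST":
        payload = dict(body or {})
        payload["query"] = query  # matches the {"query": "..."} body shape
        data = json.dumps(payload).encode("utf-8")
        request_headers.setdefault("Content-Type", "application/json")
    else:
        raise ValueError(f"unsupported method: {method}")
    url = endpoint
    if query_params:
        separator = "&" if "?" in endpoint else "?"
        url = endpoint + separator + urlencode(query_params)
    return Request(url, data=data, headers=request_headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` with a `timeout` argument would apply the configured request timeout.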
Platform Setup
| Environment Variable | Required | Description |
|---|---|---|
| | No | Default API endpoint. When set, the field is hidden from space-level config. |
| | No | Default HTTP method ( |
| | No | Default headers (JSON string). When set, the field is hidden from space-level config. |
| | No | Default query params (JSON string). When set, the field is hidden from space-level config. |
| | No | Default body params (JSON string). When set, the field is hidden from space-level config. |
| | No | HTTP client configuration (JSON string). |
Scraping Behavior Summary
| Engine | Requires Scraping | Notes |
|---|---|---|
| Google | Yes | Returns URLs only; crawler must fetch page content. |
| Bing | Configurable (default: No) | Azure AI Agent returns content with detailed answers and key facts per source. |
| Custom API | Configurable (default: No) | Depends on what the custom API returns. |