This document explains how the Web Search tool fits into the platform architecture, how it communicates with external services, and how queries flow through the system in each search mode.

Architecture

Data Flow

What leaves the platform (outbound):

Search queries are sent to the configured search engine provider (e.g., Google, Bing).
API keys are transmitted via HTTP headers or request parameters to authenticate with external providers.
When crawling, HTTP requests are sent to target websites to retrieve page content.

What enters the platform (inbound):

Search results: URLs, page titles, and text snippets from the search engine.
Page content: Raw HTML or Markdown text from crawled web pages.
For engines that do not require scraping (Jina, Tavily, Firecrawl, VertexAI), the page content is included directly in the search response.

Security considerations:

API keys should be provisioned as secrets and injected as environment variables into the assistants-core pod at deployment time.
Proxy configuration is available for environments that require outbound traffic to route through a corporate proxy.
The optional privacy filtering feature (LLM Processor with sanitization) can redact personal data from crawled content before it reaches the orchestrator LLM.
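As a minimal sketch of the first two points, the snippet below reads a provider key and an optional proxy from environment variables. The variable name `WEB_SEARCH_API_KEY`, the helper names, and the bearer-token header are assumptions made for illustration; the actual variable names and authentication scheme depend on the deployment and the provider.

```python
import os
from typing import Mapping, Optional


def load_search_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the provider key and optional proxy from the environment.

    WEB_SEARCH_API_KEY is a hypothetical name; HTTPS_PROXY is the
    conventional proxy variable honored by many HTTP clients.
    """
    env = os.environ if env is None else env
    api_key = env.get("WEB_SEARCH_API_KEY")
    if not api_key:
        # Fail fast instead of sending unauthenticated requests.
        raise RuntimeError("WEB_SEARCH_API_KEY is not set")
    proxy = env.get("HTTPS_PROXY")
    return {
        "api_key": api_key,
        "proxies": {"https": proxy} if proxy else None,
    }


def auth_headers(api_key: str) -> dict:
    """Build an auth header (assumed bearer scheme; providers differ)."""
    return {"Authorization": f"Bearer {api_key}"}
```

Keeping the key out of code and configuration files means it only exists in the secret store and in the pod's environment.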

Execution Flow -- V1 Mode (Basic Search)

In V1 mode, the orchestrator LLM sends a simple query to the Web Search tool. The tool optionally refines the query, searches, crawls if needed, processes the content, and returns results.

V1 Query Generation

The query refinement mode determines how the orchestrator's raw query is transformed before reaching the search engine:

Mode	Behavior
Deactivated	The query is passed through unchanged. No LLM call is made.
Basic	The query is sent to the language model with instructions to produce a single optimized search query (max ~6 keywords, with optional advanced syntax like quotes, `site:`, `-word`).
Advanced (Beta)	The query is sent to the language model, which generates multiple targeted queries -- each focusing on a different facet of the original question. Capped at `max_queries` (default: 5).

After refinement, each query is executed sequentially against the search engine. If the engine requires scraping, the crawler fetches page content for all result URLs.
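The V1 flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's implementation: the function names, the `llm`, `search`, and `crawl` callables, the lowercase mode and engine strings, and the result dictionary keys are all hypothetical.

```python
from typing import Callable, Dict, List

# Engines that return page content directly, so no crawl is needed.
ENGINES_WITH_CONTENT = {"jina", "tavily", "firecrawl", "vertexai"}


def refine_query(raw_query: str, mode: str,
                 llm: Callable[[str, str], List[str]],
                 max_queries: int = 5) -> List[str]:
    """Turn the orchestrator's raw query into the list of queries to run."""
    if mode == "deactivated":
        return [raw_query]                      # passed through, no LLM call
    if mode == "basic":
        return llm(mode, raw_query)[:1]         # one optimized query
    if mode == "advanced":
        return llm(mode, raw_query)[:max_queries]  # several, capped
    raise ValueError(f"unknown refinement mode: {mode}")


def run_v1(raw_query: str, mode: str, engine: str,
           llm: Callable[[str, str], List[str]],
           search: Callable[[str], List[Dict[str, str]]],
           crawl: Callable[[str], str],
           max_queries: int = 5) -> List[Dict[str, str]]:
    """Run refined queries one after another, crawling only when needed."""
    results: List[Dict[str, str]] = []
    for query in refine_query(raw_query, mode, llm, max_queries):
        hits = search(query)
        if engine not in ENGINES_WITH_CONTENT:
            for hit in hits:
                hit["content"] = crawl(hit["url"])
        results.extend(hits)
    return results
```

Passing the LLM, search, and crawl steps in as callables keeps the sketch independent of any particular provider.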

Execution Flow -- V2 Mode (AI-Planned Research)

In V2 mode, the orchestrator LLM creates a structured research plan with multiple steps that execute in parallel.

V2 Query Generation

In V2, the orchestrator LLM itself creates the research plan. There is no separate query refinement step -- instead, the LLM produces a structured `WebSearchPlan` containing:

Field	Description
`objective`	What the research aims to accomplish.
`query_analysis`	Analysis of what information is needed and why.
`steps[]`	A list of research steps, each either a `SEARCH` step (run a search query) or a `READ_URL` step (crawl a specific URL).
`expected_outcome`	What the research is expected to find.

Before execution, the plan is capped at `max_steps` (default: 5) and any excess steps are dropped. The remaining steps then execute concurrently as parallel async tasks.
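To make the plan shape and its execution concrete, here is a minimal sketch using `asyncio`. The top-level field names follow the table above; the `Step` class, its `value` field, and the placeholder `run_step` coroutine are assumptions for illustration only.

```python
import asyncio
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class StepType(Enum):
    SEARCH = "SEARCH"
    READ_URL = "READ_URL"


@dataclass
class Step:
    type: StepType
    value: str  # hypothetical field: the query for SEARCH, the URL for READ_URL


@dataclass
class WebSearchPlan:
    objective: str
    query_analysis: str
    steps: List[Step] = field(default_factory=list)
    expected_outcome: str = ""


async def run_step(step: Step) -> str:
    """Placeholder for the real search or crawl call."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"{step.type.value}:{step.value}"


async def execute_plan(plan: WebSearchPlan, max_steps: int = 5) -> List[str]:
    """Drop steps beyond max_steps, then run the rest concurrently."""
    steps = plan.steps[:max_steps]
    # gather returns results in the same order as the steps were passed in.
    return await asyncio.gather(*(run_step(s) for s in steps))
```

A caller would run it with `asyncio.run(execute_plan(plan))`. Truncating before scheduling means dropped steps never start any network activity.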

Query Elicitation

Query Elicitation is an optional feature that gives users visibility and control over what the AI searches for.

How it works:

The tool generates search queries (via refinement in V1, or plan creation in V2).
A form appears in the chat UI with the proposed queries pre-populated.
The user reviews the proposed queries and can approve them as-is, edit them, add new ones, remove some, or decline the search entirely.
If the user does not respond within the configured timeout (default: 60 seconds), the search is cancelled.
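The review-with-timeout behavior in the steps above can be sketched with `asyncio.wait_for`. The `elicit_queries` function and the `ask_user` callback are hypothetical names; the sketch assumes the callback resolves to the reviewed query list, or to `None` when the user declines.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def elicit_queries(
    proposed: List[str],
    ask_user: Callable[[List[str]], Awaitable[Optional[List[str]]]],
    timeout_s: float = 60.0,
) -> Optional[List[str]]:
    """Return the reviewed queries, or None if the search is cancelled.

    ask_user stands in for the chat UI form (an assumption); it receives
    the pre-populated queries and resolves when the user submits.
    """
    try:
        return await asyncio.wait_for(ask_user(proposed), timeout=timeout_s)
    except asyncio.TimeoutError:
        # No response within the timeout: treat as cancellation.
        return None
```

Returning `None` for both a decline and a timeout lets the caller handle "do not search" in one place.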

V1 vs V2 differences:

In V1, all refined queries are presented for review.
In V2, only `SEARCH` step queries are presented. `READ_URL` steps pass through unchanged since they reference specific URLs that the LLM already identified.
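For V2, the split between reviewable and pass-through steps might look like the sketch below. The dictionary keys `type` and `value`, the helper names, and the ordering of the merged steps are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Step = Dict[str, str]  # hypothetical shape: {"type": ..., "value": ...}


def split_for_review(steps: List[Step]) -> Tuple[List[str], List[Step]]:
    """Separate SEARCH queries (shown to the user) from READ_URL steps."""
    search_queries = [s["value"] for s in steps if s["type"] == "SEARCH"]
    read_url_steps = [s for s in steps if s["type"] == "READ_URL"]
    return search_queries, read_url_steps


def merge_reviewed(reviewed_queries: List[str],
                   read_url_steps: List[Step]) -> List[Step]:
    """Rebuild the step list from the user's queries plus untouched URLs."""
    searches = [{"type": "SEARCH", "value": q} for q in reviewed_queries]
    return searches + read_url_steps
```

Because all steps run concurrently, placing the reviewed searches ahead of the `READ_URL` steps in this sketch does not change what gets executed.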

Requirements:

The `enable_elicitation` setting must be `true` in the space configuration.
The associated platform feature flag must be active.