Security Risk Assessment

Risk Summary

Risk Area	Rating	Summary
Infinite loop	Low	Hard-coded iteration caps at the orchestrator level (max 10); the tool is single-shot; the last iteration forces a text-only response.
Prompt injection	Low	Indirect injection via web content is possible but impact is bounded to text output only (no command execution); mitigated by multi-stage content processing, source dilution across many chunks, token budgets, and structured outputs.
Internal KB / ingestion trigger	Very Low	Complete architectural isolation; no code path connects Web Search to internal search or ingestion.
Recursive crawling	Low	No recursive link-following; one-hop only with blacklists, timeouts, and step caps; residual SSRF risk for direct URL reads mitigated at infrastructure level.

1. Infinite Loop Risk

Risk Rating: 2/10 - Low

What could go wrong

A concern sometimes raised is whether the Web Search tool could enter an infinite loop - for example, by repeatedly calling itself, generating unbounded queries, or causing the orchestrator to invoke it endlessly.

How the architecture prevents this

The following diagram illustrates how the orchestrator loop is bounded at every level:

Key bounds: Default max iterations = 5, absolute hard cap = 10. Per-iteration tool calls capped at 10 (hard cap: 50). Duplicates filtered before execution. WebSearch itself is single-shot and never re-invokes itself.

The Unique AI platform enforces loop prevention at multiple independent layers:

Orchestrator-level iteration cap. The agent orchestrator runs a bounded loop that controls how many times the language model can invoke tools before it must produce a final text answer. This loop has the following hard limits:

Default maximum iterations: 5. This means the model gets at most 5 rounds of tool usage before it must answer.
Absolute hard cap: 10 iterations. Even if the configuration is overridden, the system enforces a ceiling of 10 iterations that cannot be exceeded.
Last-iteration safeguard. On the final allowed iteration, tools are removed entirely from the language model request. This forces the model to produce a text response rather than requesting another tool call.
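
The three limits above can be sketched as a small bounded loop. This is a minimal illustration, not the platform's implementation: `run_turn`, `effective_max_iterations`, and the `call_model` callback are hypothetical names, and the reply format (a dict with either `text` or `tool_calls`) is an assumption made for the example.

```python
from typing import Callable, Dict, List, Optional

DEFAULT_MAX_ITERATIONS = 5   # default stated in the text
HARD_CAP_ITERATIONS = 10     # absolute ceiling stated in the text


def effective_max_iterations(configured: Optional[int]) -> int:
    """Clamp a configured value into the range [1, HARD_CAP_ITERATIONS]."""
    if configured is None:
        return DEFAULT_MAX_ITERATIONS
    return max(1, min(configured, HARD_CAP_ITERATIONS))


def run_turn(call_model: Callable[[List[str]], Dict],
             tools: List[str],
             configured_max: Optional[int] = None) -> str:
    """Run one conversation turn with a bounded number of model calls.

    call_model is a hypothetical stand-in for the LLM request. It receives
    the list of tools offered and returns {"text": ...} or {"tool_calls": ...}.
    """
    limit = effective_max_iterations(configured_max)
    for iteration in range(1, limit + 1):
        is_last = iteration == limit
        # Last-iteration safeguard: offer no tools, so only text can come back.
        offered = [] if is_last else tools
        reply = call_model(offered)
        if "text" in reply:
            return reply["text"]
        # Tool calls would be executed here and their results fed back.
    # Defensive fallback; not reached if the model answers when no tools are offered.
    return "No answer produced within the iteration limit."
```

Even with an oversized override such as `configured_max=999`, the loop in this sketch makes at most 10 model calls, and the last one is issued without tools.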

Per-iteration tool call limit. Within a single iteration, the model may request multiple tool calls. These are capped at 10 by default (absolute cap: 50). Duplicate calls (same tool, same arguments) are automatically filtered out before execution.
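
As a rough sketch of this step, the function below drops duplicate calls and then applies the cap. The name `select_tool_calls`, the dict shape of a tool call, and the choice to deduplicate before capping are assumptions for illustration; the text only states that both controls exist.

```python
import json
from typing import Any, Dict, List

DEFAULT_MAX_TOOL_CALLS = 10   # default stated in the text
HARD_CAP_TOOL_CALLS = 50      # absolute cap stated in the text


def select_tool_calls(requested: List[Dict[str, Any]],
                      configured_max: int = DEFAULT_MAX_TOOL_CALLS) -> List[Dict[str, Any]]:
    """Return the tool calls that would actually be executed this iteration."""
    limit = max(0, min(configured_max, HARD_CAP_TOOL_CALLS))
    seen = set()
    unique = []
    for call in requested:
        # Same tool + same arguments counts as a duplicate, whatever the key order.
        key = (call["name"], json.dumps(call["arguments"], sort_keys=True))
        if key in seen:
            continue
        seen.add(key)
        unique.append(call)
    return unique[:limit]
```

Serializing the arguments with `sort_keys=True` makes `{"q": "a", "n": 1}` and `{"n": 1, "q": "a"}` compare as the same call.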

Tool-internal step limits. Within a single Web Search execution, the number of operations is bounded:

In V2/V3 search modes, the research plan is limited to a configurable maximum number of steps (default: 5). Any steps beyond this limit are truncated before execution.
In V1 Advanced mode, query refinement is capped at a configurable maximum number of queries (default: 5).
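
Truncation of an over-long plan can be pictured as a simple slice applied before anything runs. `truncate_plan` is a hypothetical helper; only the default of 5 comes from the text.

```python
from typing import List


def truncate_plan(steps: List[str], max_steps: int = 5) -> List[str]:
    """Keep only the first max_steps steps; the remainder is never executed."""
    return steps[:max(0, max_steps)]
```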

Single-shot execution model. The Web Search tool does not invoke itself. When called, it executes its search plan, processes the results, and returns content to the orchestrator. There is no self-referencing, recursion, or re-invocation mechanism within the tool.

Residual risk

The language model could theoretically choose to call Web Search on every iteration in which tools are available, resulting in at most 10 sequential rounds of invocation within a single conversation turn, each round subject to the per-iteration tool call cap. This is bounded, not infinite, but it could still lead to higher-than-expected latency and resource consumption. In practice, the model is instructed via its system prompt about iteration limits and typically produces answers well before reaching them.

2. Prompt Injection

Risk Rating: 3/10 - Low

What could go wrong

Prompt injection in the context of Web Search refers specifically to indirect prompt injection: a malicious web page could embed hidden instructions in its content that, when processed by the language model, attempt to alter the model's behavior - for example, overriding system instructions, leaking conversation context, or steering the response in unintended ways.

How the architecture mitigates this

The following diagram traces the path of web content through the system, highlighting where untrusted content enters (vulnerability points) and where it is reduced or filtered (mitigation points):

A single injected source must survive every mitigation stage in this pipeline. Even if it does, the orchestrator LLM can only produce text - it cannot execute commands, modify data, or trigger side effects.

The Web Search tool employs several layers of defense that collectively reduce the attack surface for indirect prompt injection:

Content processing pipeline. All fetched web content passes through a multi-stage processing pipeline before reaching the orchestrator language model:

Cleaning. Raw HTML is converted to Markdown, with regex-based line removal and link cleanup to strip navigation elements, scripts, and other non-content markup.
Truncation. Content is truncated to a configurable length, limiting how much raw text from any single page can enter the system.
Optional LLM summarization. When enabled, an intermediate language model call summarizes each page's content, focusing it on the search query. This step uses structured output schemas that constrain the model's response format, making it harder for injected instructions to propagate.
Chunking. Processed content is split into discrete chunks with metadata (domain, title, snippet), further fragmenting any injected instructions.
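
The cleaning, truncation, and chunking stages can be sketched as follows. This is an illustration under stated assumptions: it starts from text that has already been converted to Markdown, the two regular expressions are invented examples of "line removal" and "link cleanup", and the `Chunk` fields, function names, and size limits are hypothetical. The optional summarization stage is omitted because it requires a model call.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical patterns; the real cleaning rules are not given in the text.
NOISE_LINE = re.compile(r"^\s*(cookie|subscribe|sign in|share)\b", re.IGNORECASE)
MD_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")


@dataclass
class Chunk:
    domain: str
    title: str
    snippet: str
    text: str


def clean(markdown: str) -> str:
    """Drop noisy lines, then replace [label](url) links with just the label."""
    kept = [line for line in markdown.splitlines() if not NOISE_LINE.match(line)]
    return MD_LINK.sub(r"\1", "\n".join(kept))


def truncate(text: str, max_chars: int) -> str:
    """Limit how much text from a single page can continue down the pipeline."""
    return text[:max_chars]


def chunk(text: str, domain: str, title: str, snippet: str,
          chunk_chars: int) -> List[Chunk]:
    """Split text into fixed-size pieces, each carrying the page metadata."""
    return [Chunk(domain, title, snippet, text[i:i + chunk_chars])
            for i in range(0, len(text), chunk_chars)]


def process_page(markdown: str, domain: str, title: str, snippet: str = "",
                 max_chars: int = 2000, chunk_chars: int = 500) -> List[Chunk]:
    return chunk(truncate(clean(markdown), max_chars),
                 domain, title, snippet, chunk_chars)
```

An instruction hidden in a long page is either cut off by `truncate` or split across chunk boundaries; neither removes it reliably, which is why these stages are described as reducing the attack surface rather than eliminating it.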

Token budget enforcement. The total amount of web content that reaches the orchestrator language model is constrained by a configurable token budget (default: 40% of the model's input capacity). This limits the volume of potentially malicious content that can enter the main conversation.
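
A worked example of the budget arithmetic: with a model input capacity of 100,000 tokens, a 40% share leaves at most 40,000 tokens for web content. The sketch below applies such a budget to an ordered list of chunks. The 4-characters-per-token estimate, the function names, and the stop-at-first-overflow policy are assumptions; a real system would use the model's tokenizer.

```python
from typing import List

WEB_CONTENT_SHARE_PERCENT = 40  # default share stated in the text


def estimate_tokens(text: str) -> int:
    """Crude estimate assuming roughly 4 characters per token."""
    return max(1, len(text) // 4)


def fit_to_budget(chunks: List[str], model_input_tokens: int,
                  share_percent: int = WEB_CONTENT_SHARE_PERCENT) -> List[str]:
    """Keep chunks in order until the web-content token budget is used up."""
    budget = model_input_tokens * share_percent // 100
    kept: List[str] = []
    used = 0
    for text in chunks:
        cost = estimate_tokens(text)
        if used + cost > budget:
            break
        kept.append(text)
        used += cost
    return kept
```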

Strict tool argument schemas. Tool arguments are validated against strict Pydantic schemas with extra="forbid", preventing attackers from smuggling additional parameters into tool invocations via manipulated content.
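
A minimal sketch of what `extra="forbid"` does in practice. The `WebSearchArgs` model and its fields are hypothetical and do not reflect the tool's actual schema; the class-keyword form of the setting is used here because it is accepted by both Pydantic v1 (1.8+) and v2.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class WebSearchArgs(BaseModel, extra="forbid"):
    """Hypothetical argument schema; field names are illustrative only."""
    query: str
    max_results: int = 5


def parse_args(raw: dict) -> Optional[WebSearchArgs]:
    """Return validated arguments, or None if the payload is rejected."""
    try:
        return WebSearchArgs(**raw)
    except ValidationError:
        return None
```

With this schema, `parse_args({"query": "x", "callback_url": "http://attacker.example"})` returns `None`: the unknown `callback_url` key causes validation to fail instead of being silently accepted or ignored.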

Structured output constraints. Internal LLM processing steps (GDPR compliance checking, summarization, snippet judging) use structured output schemas that constrain the model to produce responses in a fixed format. This makes it significantly harder for injected instructions to alter the behavior of these intermediate processing steps.

Citation boundary enforcement. The orchestrator's system prompt instructs the model to only cite sources from the current tool response, preventing cross-contamination between search results and prior conversation context.

GDPR privacy filtering. An optional privacy filter can be enabled that uses a dedicated language model to detect and redact sensitive personal data (GDPR Article 9 categories) from crawled content before it reaches the orchestrator. While designed for privacy compliance, this processing step also serves as an additional content transformation layer that can disrupt injected instructions.

Snippet-level pre-filtering (V3 mode). In V3 search mode, a snippet judge evaluates search results based only on their titles and short snippets - not full page content - before deciding which pages to crawl. This means the full-page injection surface is only exposed for pages that pass this relevance filter.

Source dilution through multi-source aggregation. The processing pipeline produces many independent chunks from multiple web pages. A single injected source is only one chunk among many in the orchestrator's context window. The model weighs all chunks together when formulating its response, so a single poisoned source has limited influence compared to the collective weight of legitimate sources. The more sources retrieved, the less influence any one source has.

LLM content judge (configurable). The relevancy sorting step and the snippet judge can be configured to filter out sources that appear suspicious or off-topic. This adds a layer that can catch injection attempts disguised as relevant content, deprioritizing or removing them before they reach the orchestrator.

Binary and document file blocking. The crawler rejects non-text content types (PDFs, Office documents, images, video, audio) and blocks URLs matching configurable patterns (e.g. *.pdf). This eliminates an entire class of injection vectors based on embedded content in binary file formats.

Residual risk

The residual risk from prompt injection remains low due to three reinforcing factors:

Bounded impact. The worst-case outcome of a successful prompt injection via Web Search is a degraded or misleading text answer. The Web Search tool has no ability to execute system commands, modify data, access internal APIs, or trigger any side effects. The attack surface is strictly limited to influencing the text of the model's response - it cannot escalate beyond that boundary.

Dilution by design. Because web content is cleaned, chunked, and merged with content from multiple independent sources, a single injected page represents only a fraction of the total context. The more sources the tool retrieves, the less influence any one source has. An attacker would need to compromise multiple high-relevance pages simultaneously to have meaningful impact on the final answer.

Industry context. Indirect prompt injection is an inherent challenge for all retrieval-augmented generation (RAG) systems across the industry. The multi-layered mitigations described above - cleaning, truncation, summarization, chunking, relevancy filtering, token budgets, structured outputs, and source dilution - place this risk at the lower end of the industry spectrum. No dedicated prompt injection firewall exists in the pipeline, but the cumulative effect of these controls significantly limits both the likelihood and the impact of a successful attack.

3. Call Surface - Internal Knowledge Base and Ingestion

Risk Rating: 1/10 - Very Low

What could go wrong

A concern is whether the Web Search tool could be used to trigger unintended actions within the platform - specifically, whether it could access or modify the internal knowledge base, trigger document ingestion, or interact with internal search indices.

Why this is not possible

The following diagram illustrates the architectural isolation between Web Search and the internal knowledge base. The two tools operate in completely separate domains:

Web Search results exist only in the conversation's in-memory context. They are never written to the knowledge base, stored on disk, or ingested into any data store.

The Web Search tool is architecturally isolated from all internal platform services:

Complete code-level separation. The Web Search tool is implemented as a standalone package with no imports, references, or dependencies on internal search, knowledge base, or ingestion services. It has no awareness of these systems and no mechanism to interact with them.

Independent tool registration. Web Search and Internal Search are separate tools registered independently in the platform's tool registry.

Outbound-only network profile. The Web Search tool makes exclusively outbound calls:

Search engine API requests (Google, Bing, Brave, Jina, Tavily, Firecrawl, VertexAI, or a configured custom API endpoint)
HTTP requests to external websites (for page crawling)
Language model API calls (for internal processing steps like query refinement and content summarization)

It does not make any calls to platform-internal APIs, databases, or services.

Binary and document file blocking. The crawler rejects non-text content types (PDFs, Office documents, images, video, audio) and blocks URLs matching configurable patterns. No binary files are downloaded, stored, or processed by the tool.

Read-only return path - no persistence. The tool's output is a set of content chunks that are added to the conversation's in-memory context for the language model to reference. Retrieved web content exists only in memory for the duration of that single request: it is never persisted to the knowledge base, written to disk, stored in any database, or ingested into any data store, and it triggers no write operations, ingestion workflows, or modifications to the knowledge base. When the conversation turn completes, the web content is not retained.

Orchestrator-level isolation. While the orchestrator language model can call both Web Search and Internal Search within the same conversation turn, each tool executes in complete isolation. Results from Web Search are returned as conversation context - they are never written to or merged with the internal knowledge base.

Residual risk

The only theoretical vector would be if a future code change introduced a dependency between Web Search and internal services. The current architecture provides no such path.

4. Web Crawling and Recursive Fetching

Risk Rating: 3/10 - Low

What could go wrong

Web crawlers that recursively follow links can be exploited for denial-of-service, data exfiltration via SSRF (Server-Side Request Forgery), or uncontrolled resource consumption. The question is whether the Web Search tool's crawling capabilities could be abused in these ways.

How crawling works in the Web Search tool

The following diagram shows the two crawling paths and highlights that neither path performs recursive link-following:

Both paths are strictly one-hop: the crawler fetches the provided URLs and returns their content. It never parses fetched pages for links, builds a crawl frontier, or follows discovered URLs.

The Web Search tool fetches web page content in two controlled scenarios:

Search-triggered crawling. When the configured search engine returns only URLs and snippets (without full page content), the crawler fetches the pages at those URLs to retrieve their content. Critically, the URLs that are crawled are only those returned by the search engine API - not URLs extracted from page content. This is a single-hop fetch with no recursive link-following.

Direct URL reads (V2/V3 modes). The language model can include read_url steps in its research plan, specifying a particular URL to fetch. The crawler fetches exactly that URL. Again, no links are extracted from the fetched content and no recursive crawling occurs.

Why recursive crawling is not possible

The Web Search tool's crawlers operate on a flat list of URLs provided to them. They do not:

Parse fetched HTML for hyperlinks
Build a link graph or crawl frontier
Follow redirects beyond standard HTTP redirect handling
Schedule follow-up fetches based on discovered content

Each crawl operation is a one-time, one-hop HTTP request. There is no depth parameter, no link extraction, and no recursion.

Safeguards in place

URL pattern blacklist. A configurable blacklist (default: blocks *.pdf URLs) prevents the crawler from fetching URLs matching known problematic patterns.

Content-type blocklist. The crawler rejects responses with content types that are not useful for text extraction, including PDFs, Office documents, images, video, and audio files.
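
The two blocklists above can be sketched as a pre-request check on the URL and a post-response check on the `Content-Type` header. Only the `*.pdf` default comes from the text; the function names, the decision to match patterns against the lowercased URL path, and the exact list of media-type prefixes are assumptions for illustration.

```python
from fnmatch import fnmatchcase
from typing import Sequence
from urllib.parse import urlparse

URL_BLACKLIST = ["*.pdf"]  # default pattern stated in the text

# Illustrative prefixes for PDFs, Office documents, images, video, and audio.
BLOCKED_CONTENT_TYPE_PREFIXES = (
    "application/pdf",
    "application/msword",
    "application/vnd.ms-",
    "application/vnd.openxmlformats-officedocument",
    "image/",
    "video/",
    "audio/",
)


def url_allowed(url: str, patterns: Sequence[str] = URL_BLACKLIST) -> bool:
    """Check the URL path against the blacklist before any request is made."""
    path = urlparse(url).path.lower()
    return not any(fnmatchcase(path, pattern) for pattern in patterns)


def content_type_allowed(header_value: str) -> bool:
    """Check the response Content-Type, ignoring parameters such as charset."""
    media_type = header_value.split(";", 1)[0].strip().lower()
    return not media_type.startswith(BLOCKED_CONTENT_TYPE_PREFIXES)
```

The two checks complement each other: a PDF served from a URL without a `.pdf` suffix passes the URL check but is still rejected once its content type is known.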

Concurrency control. A semaphore limits concurrent requests (default: 10) to prevent resource exhaustion from parallel crawling.

Request timeouts. All HTTP requests and HTML-to-Markdown conversion operations have configurable timeouts. Requests that exceed the timeout are terminated.
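
Concurrency control and request timeouts combine naturally in asynchronous code. The sketch below uses only the standard library and takes the actual fetch function as a parameter, so it makes no claim about which HTTP client the crawler uses. `fetch_all`, the `fetch` callback, and the 10-second timeout are hypothetical; the concurrency default of 10 comes from the text.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def fetch_all(urls: List[str],
                    fetch: Callable[[str], Awaitable[str]],
                    max_concurrency: int = 10,
                    timeout_s: float = 10.0) -> List[Optional[str]]:
    """Fetch a flat list of URLs once each; no links are followed."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> Optional[str]:
        async with semaphore:  # at most max_concurrency fetches in flight
            try:
                return await asyncio.wait_for(fetch(url), timeout=timeout_s)
            except asyncio.TimeoutError:
                return None  # the slow request is cancelled and skipped

    # gather preserves input order, so results line up with urls.
    return list(await asyncio.gather(*(fetch_one(u) for u in urls)))
```

A caller would run this with `asyncio.run(fetch_all(urls, my_fetch))`, where `my_fetch` is whatever coroutine performs the HTTP request.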

Step caps. The total number of crawl operations per tool invocation is bounded by the max_steps configuration (default: 5 for V2/V3 modes) and max_queries (default: 5 for V1 Advanced mode).

Pre-crawl relevance filtering (V3 mode). In V3 search mode, a snippet judge evaluates search results by title and snippet before any pages are crawled. Only URLs deemed relevant are fetched, reducing unnecessary outbound requests and limiting exposure to potentially malicious pages.

Corporate proxy support. All outbound HTTP traffic from the crawler can be routed through a configurable corporate proxy with mTLS support, providing network-level visibility and control over what endpoints the crawler can reach.

Residual risk

The primary residual risk is SSRF (Server-Side Request Forgery) via the read_url step type, where the language model can specify arbitrary URLs for the crawler to fetch.

This risk can be addressed at two independent levels:

Application-level mitigation (URL pattern blacklist). The crawler's configurable URL pattern blacklist - the same mechanism that blocks *.pdf URLs by default - can be extended to block internal network ranges. By adding patterns for localhost, 127.x.x.x, 10.x.x.x, 172.16-31.x.x, 192.168.x.x, and 169.254.x.x (link-local / metadata endpoints), organizations can prevent the crawler from making requests to internal infrastructure at the application level, before any HTTP request is issued. This is a configuration change that requires no code modification.

Infrastructure-level mitigation. As a defense-in-depth measure, the following infrastructure controls provide an additional layer of protection:

Kubernetes network policies restricting pod egress
Corporate proxy configurations that block internal ranges
Firewall rules at the cluster or VPC level

With both layers in place, the SSRF risk is effectively neutralized. Organizations deploying the platform should configure the URL pattern blacklist for their environment and verify that infrastructure-level egress controls are active.
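
To make the application-level check concrete, the sketch below tests a URL's host against the internal ranges listed above. It swaps in parsed-address checks from Python's `ipaddress` module in place of the glob-style patterns the blacklist uses, and `is_blocked_target` is a hypothetical helper, not part of the tool.

```python
import ipaddress
from urllib.parse import urlparse

# The IPv4 ranges named in the text, in CIDR notation.
BLOCKED_NETWORKS = [ipaddress.ip_network(cidr) for cidr in (
    "127.0.0.0/8",
    "10.0.0.0/8",
    "172.16.0.0/12",
    "192.168.0.0/16",
    "169.254.0.0/16",  # link-local, including cloud metadata endpoints
)]
BLOCKED_HOSTNAMES = {"localhost"}


def is_blocked_target(url: str) -> bool:
    """Return True if the URL's host is a blocked name or a blocked IP literal."""
    host = (urlparse(url).hostname or "").lower()
    if not host or host in BLOCKED_HOSTNAMES:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name rather than an IP literal; this check cannot classify it.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

A check of this kind only inspects the URL as written. It does not cover DNS names that resolve to internal addresses, IPv6 literals, or redirects to internal targets, which is one reason the infrastructure-level egress controls remain necessary alongside it.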
