Code Reflector Infrastructure

These pages are auto-generated from the monorepo docs.

Experimental Disclaimer

Reflector is experimental software.

No SLA/SSLA: No service level agreements or support level agreements apply
No Support: No guaranteed support, response times, or issue resolution
Breaking Changes: APIs, configurations, and behavior may change without notice between versions
No Stability Guarantees: Features may be incomplete, modified, or removed at any time
Data Loss Risk: Bugs or changes may result in data loss or corruption
Use at Your Own Risk: This software is provided "as-is" without warranties of any kind

Experimental software is intended for evaluation and testing purposes only. Do not rely on this software for production workloads without understanding these limitations.

Maturity

Rated as experimental as of 2026.08.

Overview

Reflector renders dynamic HTML content generated by AI assistants. When an AI generates interactive content (charts, games, visualizations), the chat frontend stores the HTML in Reflector's in-memory store and retrieves it via a one-time-use nonce inside an iframe.

All data is in-memory only. Restarts clear the store. The nonce is an unguessable, single-use identifier — it acts as both the retrieval key and the access control. This design is necessary because iframes can only load a URL; they cannot set request headers (like Authorization: Bearer ...) or a request body.
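The one-time retrieval behavior described above can be sketched as a small in-memory store. This is a minimal illustration in Python, not Reflector's actual implementation: the class name `OneTimeStore`, the nonce length, and the injectable clock are all assumptions made for the example. The 60-second default mirrors the TTL listed under Security.

```python
import secrets
import time


class OneTimeStore:
    """Hypothetical in-memory store: each entry can be read once, before its TTL."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}     # nonce -> (html, expires_at)

    def put(self, html):
        # token_urlsafe(32) draws 32 random bytes from the OS CSPRNG,
        # which makes the nonce impractical to guess.
        nonce = secrets.token_urlsafe(32)
        self._items[nonce] = (html, self._clock() + self._ttl)
        return nonce

    def take(self, nonce):
        # pop() removes the entry, so a second call with the same nonce misses.
        entry = self._items.pop(nonce, None)
        if entry is None:
            return None
        html, expires_at = entry
        if self._clock() >= expires_at:
            return None  # expired before it was retrieved
        return html
```

Because the nonce is the only credential on the retrieval path, the sketch removes the entry on the first read, even when that read finds it expired.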

The store path uses a chunked upload protocol: every individual HTTP request stays under the WAF's body inspection limit (default 64 KB per chunk). The frontend opens an upload session, streams chunks one at a time, and finalizes to receive the nonce. The legacy single-POST /html-store endpoint is still accepted for backward compatibility but is deprecated.
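As a rough illustration of the client side of this protocol, the following Python sketch splits a payload into chunks and computes a SHA-256 digest for each one. The function name `split_into_chunks` is hypothetical, and the example interprets 64 KB as 64 * 1024 bytes, which is an assumption; a real client would use the `chunkSize` value returned by the init call.

```python
import hashlib


def split_into_chunks(payload: bytes, chunk_size: int):
    """Return a list of (index, chunk_bytes, sha256_hex) tuples, 0-based."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for index, start in enumerate(range(0, len(payload), chunk_size)):
        chunk = payload[start:start + chunk_size]
        # The hex digest is what a client could send as X-Chunk-Sha256.
        chunks.append((index, chunk, hashlib.sha256(chunk).hexdigest()))
    return chunks


html = ("<p>" + "x" * 200000 + "</p>").encode("utf-8")
chunks = split_into_chunks(html, 64 * 1024)  # assumed chunk size for the example
```

Only the last chunk may be shorter than `chunk_size`; concatenating the chunks in index order reproduces the original bytes.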

Architecture

Chunked upload endpoints

Endpoint	Auth	Body	Purpose
`POST /html-store/init`	JWT (USER)	`{ "totalBytes": <int> }` (JSON)	Open an upload session. Returns `{ sessionId, chunkSize, expiresInMs }`. The server-returned `chunkSize` is authoritative — clients must respect it so each chunk stays under the WAF body inspection limit.
`POST /html-store/:sessionId/chunk`	JWT (USER)	raw chunk bytes (`application/octet-stream`), `X-Chunk-Index: <0-based int>`, `X-Chunk-Sha256: <hex>` (optional but recommended)	Append a chunk. Chunks may arrive in any order. Re-POSTing the same `(index, X-Chunk-Sha256)` is idempotent and returns `204` without storing twice — safe to retry after a transient `502/503` from the WAF. Re-POSTing the same index with a different hash returns `409 Conflict`.
`POST /html-store/:sessionId/finalize`	JWT (USER)	empty	Reassemble the chunks in index order, decode UTF-8, and store. Returns `{ id }`. Idempotent for `HTML_STORE_FINALIZED_TTL_MS` so a lost final response can be retried and returns the same nonce.
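To make the request shapes in the table concrete, here is a hedged Python sketch that builds (but does not send) the three requests using the standard library. The helper names, the `base_url` and `token` parameters, and the choice to always send `X-Chunk-Sha256` are assumptions for illustration; the paths, headers, and body formats are taken from the table above.

```python
import hashlib
import json
import urllib.request


def build_init_request(base_url, token, total_bytes):
    body = json.dumps({"totalBytes": total_bytes}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/html-store/init",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def build_chunk_request(base_url, token, session_id, index, chunk):
    return urllib.request.Request(
        f"{base_url}/html-store/{session_id}/chunk",
        data=chunk,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
            "X-Chunk-Index": str(index),  # 0-based
            "X-Chunk-Sha256": hashlib.sha256(chunk).hexdigest(),
        },
    )


def build_finalize_request(base_url, token, session_id):
    # Finalize takes an empty body; the session id is in the path.
    return urllib.request.Request(
        f"{base_url}/html-store/{session_id}/finalize",
        data=b"",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Each request object can be passed to `urllib.request.urlopen` to send it. Because chunk re-POSTs with the same index and hash are idempotent, a client can resend the same chunk request after a transient gateway error.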

Sessions are owner-bound: the JWT identity that calls init is the only identity that can append chunks or finalize. Other authenticated users get 403 Forbidden.
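The session rules above (owner binding, idempotent retries, conflict on a changed hash, reassembly in index order) can be modeled with a toy class. This is a sketch under stated assumptions, not the server's code: the class name `UploadSession` is hypothetical, the model always computes the hash itself even though `X-Chunk-Sha256` is optional in the real protocol, the status codes for a first-time chunk (204) and a successful finalize (200) are assumed, and missing-chunk detection, size caps, and session expiry are left out.

```python
import hashlib


class UploadSession:
    """Toy model of one upload session; methods return HTTP-like status codes."""

    def __init__(self, owner):
        self.owner = owner
        self._chunks = {}  # index -> (sha256_hex, chunk_bytes)

    def append(self, caller, index, chunk):
        if caller != self.owner:
            return 403  # only the identity that opened the session may write
        digest = hashlib.sha256(chunk).hexdigest()
        existing = self._chunks.get(index)
        if existing is not None:
            # Same index and hash: idempotent retry. Different hash: conflict.
            return 204 if existing[0] == digest else 409
        self._chunks[index] = (digest, chunk)
        return 204  # assumed status for a newly stored chunk

    def finalize(self, caller):
        if caller != self.owner:
            return 403, None
        # Chunks may have arrived in any order; reassemble by index.
        ordered = [self._chunks[i][1] for i in sorted(self._chunks)]
        return 200, b"".join(ordered).decode("utf-8")
```

Keying stored chunks by index is what makes out-of-order arrival and safe retries straightforward in this model.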

Legacy single-shot endpoint (deprecated)

Endpoint	Auth	Body	Purpose
`POST /html-store`	JWT (USER)	full HTML body (`text/html`)	Retained for backward compatibility while clients migrate. The body is read as a Node stream and capped by `HTML_STORE_MAX_SIZE_BYTES`. Tracked by the `reflector_html_store_legacy_uploads_total` counter so we can verify zero traffic before deletion.

Security

Domain Isolation Required

Reflector MUST be hosted on a different subdomain than Unique Chat. The browser's same-origin policy treats different subdomains as different origins. A different top-level domain is not required — a distinct subdomain under the same domain is sufficient and avoids the overhead of managing separate domains and certificates.

✅ Correct: reflector.example.app and chat.example.app (different origin)

❌ Wrong: Both on chat.example.app (same origin — rendered HTML can access cookies)
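A deployment script could check this requirement mechanically. The sketch below, in Python with hypothetical helper names `origin` and `same_origin`, compares two URLs by the (scheme, host, port) tuple that browsers use to define an origin. It is an illustration of the rule, not part of Reflector.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that defines a web origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Fill in the scheme's default port when the URL does not name one.
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(a, b):
    return origin(a) == origin(b)
```

With the hostnames from the examples above, `same_origin("https://reflector.example.app", "https://chat.example.app")` is false, which is the desired outcome for a Reflector deployment.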

The DNS A record for the reflector subdomain must point to the existing ingress gateway. This can be managed via external-dns, Terraform, or manual DNS provisioning.

Control	Description
Domain isolation	Reflector runs on a separate subdomain to prevent cookie access from rendered HTML
JWT on store	All `POST /html-store/*` endpoints require a valid Bearer token
Owner-bound sessions	Upload sessions are bound to the JWT identity that opened them; chunk and finalize from a different user return 403
Per-chunk size cap	Each chunk is limited to `HTML_STORE_CHUNK_SIZE_BYTES` (default 64 KB) to clear WAF body inspection limits
Total-size cap	Assembled total is capped at `HTML_STORE_MAX_SIZE_BYTES` (default 7 MB)
Nonce on retrieval	`GET /scoped/html-store/:nonce` is protected by the nonce in the URL path (see Overview for why JWT is not used here)
One-time retrieval	Content is deleted immediately after first retrieval; the nonce cannot be reused
TTL expiration	Unretrieved content expires after 60s (configurable)
Slot limits	Bounded in-memory slots prevent resource exhaustion (default: 50)
Session limits	Bounded concurrent in-flight upload sessions, default 200 (configurable via `HTML_STORE_MAX_SESSIONS`)
No persistence	All data is in-memory; restarts clear the store
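The two size caps in the table interact in a simple way, which the following Python sketch works through. It assumes binary units (64 KB = 64 * 1024 bytes, 7 MB = 7 * 1024 * 1024 bytes); the document does not state which unit convention the server uses, and the function name `check_upload` is hypothetical.

```python
CHUNK_SIZE_BYTES = 64 * 1024      # assumed reading of the 64 KB default
MAX_SIZE_BYTES = 7 * 1024 * 1024  # assumed reading of the 7 MB default


def check_upload(total_bytes, chunk_size=CHUNK_SIZE_BYTES, max_size=MAX_SIZE_BYTES):
    """Return the number of chunks needed, or raise if the total cap is exceeded."""
    if total_bytes > max_size:
        raise ValueError(f"{total_bytes} bytes exceeds the {max_size}-byte total cap")
    # Integer ceiling division: a partial final chunk still counts as one chunk.
    return (total_bytes + chunk_size - 1) // chunk_size
```

Under these assumptions, a payload at the total cap needs 7 * 1024 * 1024 / (64 * 1024) = 112 chunks, so a full-size upload involves 112 chunk requests plus the init and finalize calls.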

Deployment

Reflector can be deployed with the backend-service Helm chart (version 10.2.0+, which supports route configuration). Container image: uniqueapp.azurecr.io/unique/reflector.

helm repo add unique https://unique-ag.github.io/helm-charts
helm install reflector unique/backend-service --version 10.2.0 -n chat -f reflector-values.yaml

Routes

The chart needs route configuration to expose two endpoints: one authenticated, one public. The scoped route is restricted to GET and allows only the /html-store path prefix. The hostname must be on a different subdomain than the chat frontend (see Security).

yaml

routes:
  hostname: reflector.<cluster-domain>
  pathPrefix: '/'
  paths:
    default:
      extraAnnotations:
        external-dns.alpha.kubernetes.io/enabled: 'true'
    probe:
      enabled: true
    scoped:
      enabled: true
      allowList:
        - # matches GET /scoped/html-store/{nonce}
          path: /html-store
          type: prefix
          methods:
            - GET
      extraAnnotations:
        external-dns.alpha.kubernetes.io/enabled: 'true'
        konghq.com/methods: GET

Route	Path	Auth	Methods	Purpose
default	`/`	JWT required	POST	Store endpoints (`POST /html-store/*`, plus the deprecated `POST /html-store`)
scoped	`/scoped/*`	nonce in URL	GET only	Retrieval endpoint (`GET /scoped/html-store/:nonce`)
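The effect of the scoped allow-list can be modeled as a small predicate. This Python sketch only illustrates the rule described above; it is not how the Helm chart or the ingress gateway evaluates routes. The function name is hypothetical, and segment-wise prefix matching is an assumption.

```python
def scoped_route_allows(method, path, prefix="/html-store", methods=("GET",)):
    """Model of the scoped allow-list: listed methods only, path under /scoped + prefix."""
    scoped_prefix = "/scoped" + prefix
    # Match the prefix itself or anything below it, one path segment at a time.
    path_ok = path == scoped_prefix or path.startswith(scoped_prefix + "/")
    return method.upper() in methods and path_ok
```

Under this model a GET to /scoped/html-store/abc123 is allowed, while a POST to the same path or a GET to any other path under /scoped is rejected.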

Integration with Other Services

next-chat (frontend) — tells the browser where to reach Reflector:

Variable	Value
`REFLECTOR_BACKEND_API_URL`	`https://reflector.<cluster-domain>`
`FEATURE_FLAG_ENABLE_ADVANCED_FORMATTING_UN_9758`	`true`
`FEATURE_FLAG_ENABLE_HTML_RENDERING_UN_15131`	`true`

Deployment Checklist

Deploy Reflector on a different subdomain than the chat frontend (not functionally required but security-critical)
Configure the ingress to route / (authenticated with JWT) and /scoped/* (GET-only, protected by nonce not JWT) to Reflector
Ensure that both are cluster-external reachable addresses, as the frontend can't use Kubernetes cluster-local routes (for example, http://release-name.svc:port won't work)
Set REFLECTOR_BACKEND_API_URL in next-chat
Enable feature flags FEATURE_FLAG_ENABLE_ADVANCED_FORMATTING_UN_9758 and FEATURE_FLAG_ENABLE_HTML_RENDERING_UN_15131 in next-chat
Verify the reflector DNS A record points to the ingress gateway
Test: POST /html-store with a valid JWT returns a nonce
Test: GET /scoped/html-store/:nonce returns the HTML and the nonce is invalidated

Space Configuration

The Code Reflector is triggered when the Code Execution Tool creates an HTML file. Code Execution is activated by default in spaces that use OpenAI models on the Responses API. See the Code Execution documentation for configuration details.

Suggested prompt ideas / Tests

Snake — "Write me a quick snake game and render it as HTML"
Chart — "Write me a Plotly chart showing Microsoft stock with sample data"
Simulation — "Make a simulation of colliding particles"

Troubleshooting

HTML rendering does not trigger

Ensure that Code Execution is activated for the space. Code Execution is activated by default in spaces that use OpenAI models on the Responses API. See Code Execution for configuration details.

Content not loading in iframe

Verify the reflector hostname resolves and the A record points to the ingress gateway.
Check that the scoped route (/scoped/*) is reachable without authentication. Open https://reflector.<domain>/scoped/html-store/nonexistent in a browser — you should get a 404, not a 401/403.
Both the store and retrieval URLs must be cluster-external. The frontend runs in the user's browser and cannot reach cluster-internal addresses.
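The browser check described above can also be scripted. The following Python sketch requests a nonexistent nonce and interprets the status code along the lines of the steps above. The function names and message strings are hypothetical, and `base_url` is whatever external Reflector URL your deployment uses.

```python
import urllib.error
import urllib.request


def classify_scoped_status(status):
    if status == 404:
        return "ok: scoped route reachable without authentication"
    if status in (401, 403):
        return "misconfigured: scoped route is asking for authentication"
    return f"unexpected status {status}: check DNS and ingress routing"


def probe_scoped_route(base_url, timeout=5.0):
    url = f"{base_url}/scoped/html-store/nonexistent"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still useful.
        status = err.code
    # DNS or connection failures raise urllib.error.URLError and are not caught
    # here, so an unresolvable hostname surfaces as an exception.
    return classify_scoped_status(status)
```

Run the probe from outside the cluster, since the iframe loads this URL from the user's browser.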

Nonce expired or not found (404)

Content expires after 60 seconds by default. If the iframe loads slowly or the user's network is slow, the nonce may have expired before the iframe fetches it. Content is also deleted after the first retrieval — a second request for the same nonce returns 404.