Code Reflector Infrastructure


Info: These pages are auto-generated from the monorepo docs.

Warning: Experimental Disclaimer

Reflector is experimental software.

  • No SLA/SSLA: No service level agreements or support level agreements apply

  • No Support: No guaranteed support, response times, or issue resolution

  • Breaking Changes: APIs, configurations, and behavior may change without notice between versions

  • No Stability Guarantees: Features may be incomplete, modified, or removed at any time

  • Data Loss Risk: Bugs or changes may result in data loss or corruption

  • Use at Your Own Risk: This software is provided "as-is" without warranties of any kind

Experimental software is intended for evaluation and testing purposes only. Do not rely on this software for production workloads without understanding these limitations.

Maturity

Rated as experimental as of 2026.08.

Overview

Reflector renders dynamic HTML content generated by AI assistants. When an AI generates interactive content (charts, games, visualizations), the chat frontend stores the HTML in Reflector's in-memory store and retrieves it via a one-time-use nonce inside an iframe.

All data is in-memory only. Restarts clear the store. The nonce is an unguessable, single-use identifier — it acts as both the retrieval key and the access control. This design is necessary because iframes can only load a URL; they cannot set request headers (like Authorization: Bearer ...) or a request body.

The store path is a chunked upload protocol: every individual HTTP request stays under the WAF's body inspection limit (default 64 KB per chunk). The frontend opens an upload session, streams chunks one at a time, and finalizes to receive the nonce. The legacy single-POST /html-store endpoint is still accepted for backward compatibility but is deprecated.
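The client side of this flow can be sketched as follows. This is a minimal illustration under stated assumptions, not the frontend's actual implementation: the helper names `plan_chunks` and `upload_html` and the injected `post` transport are hypothetical, and treating any 2xx status as success is an assumption. The endpoint paths, header names, and the `sessionId`, `chunkSize`, and `id` fields are the ones listed under "Chunked upload endpoints".

```python
import hashlib
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical transport: post(path, headers, body) -> (status, response_body).
# A real client would back this with an HTTP library and add the Bearer token.
Post = Callable[[str, Dict[str, str], bytes], Tuple[int, bytes]]


def plan_chunks(data: bytes, chunk_size: int) -> List[Tuple[int, bytes, str]]:
    """Split data into (index, chunk, sha256_hex) tuples of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    plan = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        plan.append((offset // chunk_size, chunk, hashlib.sha256(chunk).hexdigest()))
    return plan


def _expect_success(status: int, step: str) -> None:
    # Assumption: any 2xx status counts as success.
    if not 200 <= status < 300:
        raise RuntimeError(f"{step} failed with HTTP {status}")


def upload_html(html: str, post: Post) -> str:
    """Run init, one POST per chunk, then finalize; return the nonce."""
    data = html.encode("utf-8")
    status, body = post(
        "/html-store/init",
        {"Content-Type": "application/json"},
        json.dumps({"totalBytes": len(data)}).encode("utf-8"),
    )
    _expect_success(status, "init")
    session = json.loads(body)
    base = f"/html-store/{session['sessionId']}"
    # Use the server-returned chunkSize, not a client-side constant.
    for index, chunk, digest in plan_chunks(data, session["chunkSize"]):
        status, _ = post(
            f"{base}/chunk",
            {
                "Content-Type": "application/octet-stream",
                "X-Chunk-Index": str(index),
                "X-Chunk-Sha256": digest,
            },
            chunk,
        )
        _expect_success(status, f"chunk {index}")
    status, body = post(f"{base}/finalize", {}, b"")
    _expect_success(status, "finalize")
    return json.loads(body)["id"]
```

Sending the hash with every chunk is what makes a retry after a transient 502/503 safe, since the server can recognise a repeated (index, hash) pair. A production client would wrap each `post` call in such a retry loop.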

Architecture

[Figure: two architecture diagrams]

Chunked upload endpoints

| Endpoint | Auth | Body | Purpose |
| --- | --- | --- | --- |
| POST /html-store/init | JWT (USER) | { "totalBytes": <int> } (JSON) | Open an upload session. Returns { sessionId, chunkSize, expiresInMs }. The server-returned chunkSize is authoritative: clients must respect it so each chunk stays under the WAF body inspection limit. |
| POST /html-store/:sessionId/chunk | JWT (USER) | raw chunk bytes (application/octet-stream), X-Chunk-Index: <0-based int>, X-Chunk-Sha256: <hex> (optional but recommended) | Append a chunk. Chunks may arrive in any order. Re-POSTing the same (index, X-Chunk-Sha256) is idempotent and returns 204 without storing twice, so it is safe to retry after a transient 502/503 from the WAF. Re-POSTing the same index with a different hash returns 409 Conflict. |
| POST /html-store/:sessionId/finalize | JWT (USER) | empty | Reassemble the chunks in index order, decode UTF-8, and store. Returns { id }. Idempotent for HTML_STORE_FINALIZED_TTL_MS so a lost final response can be retried and returns the same nonce. |

Sessions are owner-bound: the JWT identity that calls init is the only identity that can append chunks or finalize. Other authenticated users get 403 Forbidden.
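The chunk and ownership rules above can be illustrated with a small in-memory model. This is a sketch of the documented semantics only, not Reflector's server code: the `UploadSession` class is hypothetical, the status returned for a first-time chunk (204) and for finalize (200) is an assumption, and hashing the body when `X-Chunk-Sha256` is omitted is also an assumption, since the behavior without the header is not documented here. The real finalize returns `{ id }`; the sketch returns the assembled HTML so the reassembly step is visible.

```python
import hashlib
from typing import Dict, Optional, Tuple


class UploadSession:
    """Hypothetical model of one owner-bound upload session."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        # index -> (sha256_hex, chunk bytes)
        self.chunks: Dict[int, Tuple[str, bytes]] = {}

    def append_chunk(self, caller: str, index: int, body: bytes,
                     sha256_hex: Optional[str] = None) -> int:
        """Return the HTTP status the documented rules imply."""
        if caller != self.owner:
            return 403  # only the identity that called init may append
        # Assumption: fall back to hashing the body if no header was sent.
        digest = sha256_hex or hashlib.sha256(body).hexdigest()
        existing = self.chunks.get(index)
        if existing is not None:
            # Same (index, hash): idempotent retry. Different hash: conflict.
            return 204 if existing[0] == digest else 409
        self.chunks[index] = (digest, body)
        return 204  # assumption: first-time appends also return 204

    def finalize(self, caller: str) -> Tuple[int, Optional[str]]:
        """Reassemble chunks in index order and decode as UTF-8."""
        if caller != self.owner:
            return 403, None
        data = b"".join(self.chunks[i][1] for i in sorted(self.chunks))
        return 200, data.decode("utf-8")
```

Because chunks are keyed by index and joined in sorted order, arrival order does not matter, which is what allows out-of-order delivery and retries.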

Legacy single-shot endpoint (deprecated)

| Endpoint | Auth | Body | Purpose |
| --- | --- | --- | --- |
| POST /html-store | JWT (USER) | full HTML body (text/html) | Retained for backward compatibility while clients migrate. The body is read as a Node stream and capped by HTML_STORE_MAX_SIZE_BYTES. Tracked by the reflector_html_store_legacy_uploads_total counter so we can verify zero traffic before deletion. |

Security

Warning: Domain Isolation Required

Reflector MUST be hosted on a different subdomain than Unique Chat. The browser's same-origin policy treats different subdomains as different origins. A different top-level domain is not required — a distinct subdomain under the same domain is sufficient and avoids the overhead of managing separate domains and certificates.

✅ Correct: reflector.example.app and chat.example.app (different origin)

❌ Wrong: Both on chat.example.app (same origin — rendered HTML can access cookies)

The DNS A record for the reflector subdomain must point to the existing ingress gateway. This can be managed via external-dns, Terraform, or manual DNS provisioning.
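A deployment script could check the isolation requirement before rollout. The following is a minimal sketch: the function names `origin` and `is_isolated` are hypothetical, and the check compares hostnames because the requirement above is a distinct subdomain.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) tuple the same-origin policy compares."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def is_isolated(reflector_url: str, chat_url: str) -> bool:
    """True if Reflector is served from a different hostname than the chat frontend."""
    return origin(reflector_url)[1] != origin(chat_url)[1]
```

With the example hosts from this section, `is_isolated("https://reflector.example.app", "https://chat.example.app")` is true, while two URLs that both use `chat.example.app` fail the check regardless of path.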

| Control | Description |
| --- | --- |
| Domain isolation | Reflector runs on a separate subdomain to prevent cookie access from rendered HTML |
| JWT on store | All POST /html-store/* endpoints require a valid Bearer token |
| Owner-bound sessions | Upload sessions are bound to the JWT identity that opened them; chunk and finalize from a different user return 403 |
| Per-chunk size cap | Each chunk is limited to HTML_STORE_CHUNK_SIZE_BYTES (default 64 KB) to clear WAF body inspection limits |
| Total-size cap | Assembled total is capped at HTML_STORE_MAX_SIZE_BYTES (default 7 MB) |
| Nonce on retrieval | GET /scoped/html-store/:nonce is protected by the nonce in the URL path (see Overview for why JWT is not used here) |
| One-time retrieval | Content is deleted immediately after first retrieval; the nonce cannot be reused |
| TTL expiration | Unretrieved content expires after 60s (configurable) |
| Slot limits | Bounded in-memory slots prevent resource exhaustion (default: 50) |
| Session limits | Bounded concurrent in-flight upload sessions, default 200 (configurable via HTML_STORE_MAX_SESSIONS) |
| No persistence | All data is in-memory; restarts clear the store |
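The retrieval-side controls (unguessable nonce, one-time retrieval, TTL expiration, slot limits) can be sketched together as a small in-memory store. This is an illustration, not Reflector's implementation: the `OneTimeStore` class and its method names are hypothetical, and rejecting new content when all slots are full is an assumption, since the behavior of a full store is not documented here.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class OneTimeStore:
    """Hypothetical in-memory store with single-use nonces, a TTL, and a slot cap."""

    def __init__(self, ttl_seconds: float = 60.0, max_slots: int = 50,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max_slots = max_slots
        self._clock = clock
        # nonce -> (expiry time, html)
        self._slots: Dict[str, Tuple[float, str]] = {}

    def _evict_expired(self) -> None:
        now = self._clock()
        expired = [n for n, (expires, _) in self._slots.items() if expires <= now]
        for nonce in expired:
            del self._slots[nonce]

    def put(self, html: str) -> Optional[str]:
        """Store html and return a fresh nonce, or None if every slot is taken."""
        self._evict_expired()
        if len(self._slots) >= self._max_slots:
            return None  # assumption: a full store rejects new content
        nonce = secrets.token_urlsafe(32)  # unguessable, URL-safe identifier
        self._slots[nonce] = (self._clock() + self._ttl, html)
        return nonce

    def take(self, nonce: str) -> Optional[str]:
        """Return the html once; the entry is removed even if it has expired."""
        entry = self._slots.pop(nonce, None)
        if entry is None:
            return None
        expires, html = entry
        return html if expires > self._clock() else None
```

A `None` from `take` corresponds to the 404 a browser sees for an unknown, already-used, or expired nonce. The injected `clock` exists only so the TTL can be exercised in tests without sleeping.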

Deployment

Reflector can be deployed with the backend-service Helm chart (version 10.2.0+, which supports route configuration). Container image: uniqueapp.azurecr.io/unique/reflector.

```shell
helm repo add unique https://unique-ag.github.io/helm-charts
helm install reflector unique/backend-service --version 10.2.0 -n chat -f reflector-values.yaml
```

Routes

The chart needs route configuration to expose two routes: one authenticated and one public. The scoped route is restricted to GET requests and only allows the /html-store path prefix. The hostname must be on a different subdomain than the chat frontend (see Security).

```yaml
routes:
  hostname: reflector.<cluster-domain>
  pathPrefix: '/'
  paths:
    default:
      extraAnnotations:
        external-dns.alpha.kubernetes.io/enabled: 'true'
    probe:
      enabled: true
    scoped:
      enabled: true
      allowList:
        - # matches GET /scoped/html-store/{nonce}
          path: /html-store
          type: prefix
          methods:
            - GET
      extraAnnotations:
        external-dns.alpha.kubernetes.io/enabled: 'true'
        konghq.com/methods: GET
```

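| Route | Path | Auth | Methods | Purpose |
| --- | --- | --- | --- | --- |
| default | / | JWT required | POST | Store endpoints (POST /html-store/*, plus the deprecated POST /html-store) |
| scoped | /scoped/* | nonce in URL | GET only | Retrieval endpoint (GET /scoped/html-store/:nonce) |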

Integration with Other Services

next-chat (frontend) — tells the browser where to reach Reflector:

| Variable | Value |
| --- | --- |
| REFLECTOR_BACKEND_API_URL | https://reflector.<cluster-domain> |
| FEATURE_FLAG_ENABLE_ADVANCED_FORMATTING_UN_9758 | true |
| FEATURE_FLAG_ENABLE_HTML_RENDERING_UN_15131 | true |

Deployment Checklist

  • Deploy Reflector on a different subdomain than the chat frontend (not functionally required but security-critical)

  • Configure the ingress to route / (authenticated with JWT) and /scoped/* (GET-only, protected by nonce not JWT) to Reflector

  • Ensure that both routes are reachable from outside the cluster. The frontend runs in the user's browser and cannot use Kubernetes cluster-local addresses (for example, http://release-name.svc:port will not work)

  • Set REFLECTOR_BACKEND_API_URL in next-chat

  • Enable feature flags FEATURE_FLAG_ENABLE_ADVANCED_FORMATTING_UN_9758 and FEATURE_FLAG_ENABLE_HTML_RENDERING_UN_15131 in next-chat

  • Verify the reflector DNS A record points to the ingress gateway

  • Test: POST /html-store with a valid JWT returns a nonce

  • Test: GET /scoped/html-store/:nonce returns the HTML and the nonce is invalidated

Space Configuration

The Code Reflector is triggered when the Code Execution Tool creates an HTML file. Code Execution is activated by default in spaces that use OpenAI models on the Responses API. See Code Execution for configuration details.

Suggested prompt ideas / Tests

  • Snake — "Write me a quick snake game and render it as HTML"

  • Chart — "Write me a Plotly chart showing Microsoft stock with sample data"

  • Simulation — "Make a simulation of colliding particles"

Troubleshooting

HTML rendering does not trigger

Ensure that Code Execution is activated for the space. Code Execution is activated by default in spaces that use OpenAI models on the Responses API. See Code Execution for configuration details.

Content not loading in iframe

  • Verify the reflector hostname resolves and the A record points to the ingress gateway.

  • Check that the scoped route (/scoped/*) is reachable without authentication. Open https://reflector.<domain>/scoped/html-store/nonexistent in a browser — you should get a 404, not a 401/403.

  • Both the store and retrieval URLs must be cluster-external. The frontend runs in the user's browser and cannot reach cluster-internal addresses.
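The scoped-route check in this list can be scripted. The sketch below is a hedged example: the function names `probe_status` and `diagnose_scoped_probe` are hypothetical, and the interpretation of each status follows the guidance above (404 is expected for a nonexistent nonce, 401/403 means the route is still behind authentication). Run it from outside the cluster against your own reflector hostname.

```python
import urllib.error
import urllib.request


def probe_status(url: str, timeout: float = 5.0) -> int:
    """GET the URL and return the HTTP status, including 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still what we want.
        return err.code
    # DNS or connection failures raise urllib.error.URLError and propagate,
    # which points at the hostname or A record rather than the route.


def diagnose_scoped_probe(status: int) -> str:
    """Interpret the status returned for /scoped/html-store/nonexistent."""
    if status == 404:
        return "ok: scoped route is reachable without authentication"
    if status in (401, 403):
        return "auth: scoped route is behind authentication; check the route config"
    return f"unexpected: HTTP {status}; check ingress and service health"
```

For example, `diagnose_scoped_probe(probe_status("https://reflector.<domain>/scoped/html-store/nonexistent"))` with your own domain substituted in.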

Nonce expired or not found (404)

Content expires after 60 seconds by default. If the iframe loads slowly or the user's network is slow, the nonce may have expired before the iframe fetches it. Content is also deleted after the first retrieval — a second request for the same nonce returns 404.
