LiteLLM - AI Gateway
LiteLLM is the AI Gateway used by Unique to connect the Unique platform to approved AI model providers. It can make additional models available, for example models hosted by Google Cloud Vertex AI or other enterprise AI providers.
This page explains what this means for clients from a security, compliance, and responsibility perspective.
Short version: For Unique SaaS clients, Unique performs the provider setup and compliance controls. The client accepts the relevant sub-processors, regions, and contractual terms. For self-hosted clients, the client is responsible for the full setup and all compliance decisions.
Who This Page Is For
| Audience | Applies to | Main client responsibilities |
|---|---|---|
| Unique SaaS clients | Multi-Tenant and Single-Tenant deployments operated by Unique | Review and accept the relevant sub-processors, regions, and contractual documentation. |
| Self-hosters | Clients operating Unique in their own environment | Perform provider onboarding, security review, compliance setup, operations, and ongoing control checks. |
About Provider Documentation Links
Provider documentation is the authoritative source for provider-specific terms, controls, regions, and availability. Unique makes reasonable efforts to keep this page current, but provider terms and documentation may change without notice. Readers should always verify the latest provider documentation before making compliance, contractual, or operational decisions.
What LiteLLM Changes
LiteLLM acts as a controlled gateway between Unique and AI model providers. It does not remove the need for provider review, data protection checks, regional checks, or customer approval. It simply provides one managed way to route model requests to the providers that are approved for a given deployment.
Depending on the selected model, data may be processed by providers outside the standard Azure OpenAI setup. This is why sub-processor approval, data residency, and contractual review matter before activation.
Unique SaaS Clients
This section applies to clients using Unique as a SaaS service, including both Multi-Tenant and Single-Tenant clients operated by Unique.
How LiteLLM Is Used
Unique operates only the open-source edition of LiteLLM and uses it solely as an outbound gateway. LiteLLM's other features, some of which are paid enterprise features, are not used.
What Unique Handles
For SaaS deployments, Unique performs the operational and compliance setup before LiteLLM-backed models are made available. This includes:
Reviewing and approving AI providers before use.
Setting up provider accounts, credentials, and secure access.
Checking provider regions against client requirements.
Configuring available models only after the required internal approvals.
Handling provider-specific opt-outs where available, such as training, abuse monitoring, or prompt logging opt-outs.
Operating, monitoring, and securing the LiteLLM gateway.
What the Client Needs to Do
The client does not need to configure LiteLLM or AI providers directly. The client must review and accept the relevant contractual and compliance documents before activation.
Confirm that the use of additional AI providers is allowed under the client contract.
Accept the applicable sub-processors and processing regions.
Confirm that data residency requirements are compatible with the selected models and regions.
Confirm any additional requirements for regulated or confidential data, such as banking secrecy or strict data classification rules.
Acknowledge that model usage may count toward the agreed AI usage or cost allowance, depending on the commercial setup.
Important Notes for SaaS Clients
Unique enables LiteLLM-backed models only after the necessary checks are complete.
Provider accounts and keys are managed by Unique for SaaS deployments.
Clients do not bring their own provider keys into Unique SaaS.
On Multi-Tenant deployments, enablement can be controlled per client company.
On Single-Tenant SaaS deployments, Unique still operates the service and performs the setup.
Self-Hosting Clients
This section applies to clients who operate Unique themselves in their own cloud, Kubernetes cluster, or hosting environment.
For self-hosted deployments, Unique provides product guidance, but the client is responsible for the full compliance and operational setup. This includes every provider, every model, every region, and every control required by the client's own policies and regulations.
What Self-Hosters Must Handle
Contracting with each AI provider and ensuring a valid Data Processing Agreement is in place.
Reviewing each provider as a sub-processor or technology supplier.
Selecting allowed processing regions and verifying that each model is available in those regions.
Configuring provider opt-outs for model training, abuse monitoring, prompt logging, or similar features where available.
Choosing and documenting content filtering settings according to internal policy.
Managing provider credentials, secrets, access control, backups, monitoring, logging, and incident response.
Running any required DPIA, TPRM, DORA, banking secrecy, or internal risk review before production use.
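The responsibilities listed above can be tracked as a simple pre-activation gate. The following is a minimal sketch, assuming the self-hoster keeps one record per control with a completion flag and an evidence reference; the `Control` class, the `open_items` helper, and the example control names are hypothetical and not part of Unique or LiteLLM.

```python
from dataclasses import dataclass


@dataclass
class Control:
    """One pre-activation control from the self-hoster checklist (hypothetical)."""
    name: str
    done: bool = False
    evidence: str = ""  # e.g. a document reference; empty means not evidenced


def open_items(controls):
    """Return the names of controls that are not completed or lack evidence."""
    return [c.name for c in controls if not (c.done and c.evidence)]
```

A deployment pipeline could refuse to enable a model while `open_items` returns a non-empty list, which keeps the documented evidence and the technical activation in step.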
Vertex AI and Google Cloud
Some models can be used through Google Cloud Vertex AI. In that case, the contractual and compliance relationship is typically with Google Cloud for the Vertex AI service. Self-hosters should retain their signed Google Cloud data processing terms and verify that their selected models, regions, and monitoring settings meet their requirements.
Where supported, Vertex AI can use identity-based authentication instead of static API keys. This can reduce key-management risk, but it still needs to be configured, reviewed, and operated by the self-hosting client.
Google Cloud provides public information on its data processing terms here: Cloud Data Processing Addendum.
Compliance Topics to Review
| Topic | Why it matters | SaaS responsibility | Self-hosted responsibility |
|---|---|---|---|
| Sub-processors | Additional model providers may process client data. | Unique maintains and communicates the applicable sub-processors. Client accepts them. | Client reviews, contracts, and documents each provider. |
| Data residency | Models are only compliant if used in approved regions. | Unique checks and configures allowed regions. | Client selects and verifies regions per model and provider. |
| Model training | Client data must not be used to train provider models unless explicitly allowed. | Unique checks the provider terms and configures opt-outs where available. | Client reviews provider terms and configures opt-outs. |
| Abuse monitoring and prompt logging | Some providers monitor prompts by default for safety or abuse detection. | Unique requests or configures opt-outs where available and required. | Client decides whether the default is acceptable and requests or configures opt-outs. |
| Content filtering | Safety filters are separate from training and abuse monitoring settings. | Unique applies the approved setup for the SaaS service. | Client defines and documents the policy. |
| Regulated data | Banking secrecy, confidential data, or personal data may require additional controls. | Unique supports the review process, but client approval is required. | Client performs the full regulatory and risk review. |
Vertex AI
For self-hosted deployments, use the detailed setup guide here: Vertex AI Setup. The sections below summarize the compliance controls that must be checked before activation.
At a high level, three points apply. Model training is contractually restricted by default for managed Vertex AI models unless the customer gives permission or instruction. Abuse monitoring and prompt logging may apply for some Google Cloud customers, but Google provides an exception request process. Safety filtering is separate and cannot simply be switched off in full, because some filters are configurable while others are non-configurable.
Sources: Gemini Enterprise Agent Platform and zero data retention, Abuse monitoring, Safety and content filters
European Data Residency
Vertex AI can support EU data residency when requests use a European region or Google’s EU multi-region endpoint. The global endpoint should not be used for strict residency requirements because Google states the processing region cannot be controlled there.
Sources: Deployments and endpoints, Gemini Enterprise Agent Platform partner models for MaaS
Opt Out of Model Training
For managed models on Vertex AI, Google states it does not use customer data to train or fine-tune AI/ML models without prior permission or instruction. For Claude on Vertex AI, Anthropic’s Vertex terms state Anthropic may not train models on Customer Content.
Sources: Gemini Enterprise Agent Platform and zero data retention, Anthropic on Vertex commercial terms
Opt Out of Abuse Monitoring / Prompt Logging
Google may log prompts only when automated abuse classifiers detect suspicious activity, and only to investigate potential policy violations; Google states this data is not used for training and is stored for up to 90 days in the selected region or multi-region. Eligible customers can request an exception from abuse logging.
For regulated or high-confidentiality use cases, this setting should be treated as a pre-activation control. If prompts may contain banking data, personal data, or confidential client information, the abuse-logging exception should be requested and documented before production use.
Source: Abuse monitoring
Opt Out of Safety Filtering
Safety filtering is separate from abuse monitoring and model training. Some filters are configurable, but Google also documents non-configurable filters, for example for prohibited content such as CSAM and sensitive personal data. Therefore this is not a simple “opt out” control; clients can adjust supported filter thresholds, but cannot assume all safety filters can be disabled.
In short: abuse monitoring can be reviewed as a prompt-logging and retention control, while safety filtering is a runtime protection control. Opting out of abuse logging does not mean safety filters are disabled.
Sources: Safety and content filters, Abuse monitoring
Content Filtering Policy Options
Vertex AI supports configurable content filters for Gemini models, including harm-category thresholds that determine how often prompts or responses are blocked. Clients should define these settings based on their use case, risk appetite, user group, and regulatory duties, while remembering that provider-level non-configurable filters may still apply.
Source: Safety and content filters
FAQ
Does LiteLLM mean my data is sent to every connected provider?
No. LiteLLM routes a request to the model name selected in the request and configured in the LiteLLM model_list; its own docs explain that model_name is the user-facing model and litellm_params.model is the actual backend model used for that request. If several deployments share the same model name, LiteLLM can load-balance between those deployments, but it does not broadcast prompts to every configured provider. In practice, the compliance-relevant control is the approved model configuration: only enabled models and providers can receive traffic.
Source: LiteLLM configuration docs
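The routing behaviour described in this answer can be sketched in plain Python. This is a minimal illustration that mimics the shape of a LiteLLM `model_list` (entries with `model_name` and `litellm_params.model`); it does not call LiteLLM itself, and the model names, backend identifiers, and the `candidate_backends` helper are hypothetical.

```python
# Hypothetical model_list mirroring the shape described in the LiteLLM docs:
# model_name is what callers request, litellm_params.model is the backend.
MODEL_LIST = [
    {"model_name": "claude-eu", "litellm_params": {"model": "vertex_ai/example-claude-model"}},
    {"model_name": "claude-eu", "litellm_params": {"model": "vertex_ai/example-claude-replica"}},
    {"model_name": "gpt-default", "litellm_params": {"model": "azure/example-gpt-deployment"}},
]


def candidate_backends(model_list, requested_model):
    """Return the backends that may serve a request for requested_model."""
    return [
        entry["litellm_params"]["model"]
        for entry in model_list
        if entry["model_name"] == requested_model
    ]
```

In this sketch a request for `claude-eu` can only reach one of the two deployments registered under that name, and a request for a model that is not configured has no backend at all. That is why the approved model configuration is the control to review.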
Does Claude on Vertex AI guarantee EU data residency?
EU processing on Vertex AI depends on using the correct endpoint and confirming model availability. Google documents that multi-region endpoints can keep machine-learning processing of Customer Data within a jurisdictional boundary such as the EU, using location eu and endpoint https://aiplatform.eu.rep.googleapis.com; it also warns that the global endpoint should not be used where processing location must be controlled. For single regions, the deployment must use a specific European Vertex AI region such as Belgium, Netherlands, Zurich, Frankfurt, or another listed Europe region, and the exact Claude model version must be checked in Google’s partner model location tables before activation.
Sources: Deployments and endpoints, Gemini Enterprise Agent Platform partner models for MaaS
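A configuration review along these lines can be supported by a simple automated check. The sketch below assumes each configured model carries a Vertex AI location value; the allow-list, the region IDs in it, and the `residency_violations` helper are illustrative assumptions, and the approved locations for a given model version still have to be confirmed in Google's location tables.

```python
# Illustrative allow-list: "eu" is the multi-region location named above; the
# single-region IDs are assumptions to be verified per model version.
ALLOWED_EU_LOCATIONS = {"eu", "europe-west1", "europe-west3", "europe-west4", "europe-west6"}


def residency_violations(configured_locations):
    """Return sorted (model, location) pairs that fall outside the allow-list.

    The "global" location is never on the allow-list, so it is always flagged.
    """
    return sorted(
        (model, location)
        for model, location in configured_locations.items()
        if location not in ALLOWED_EU_LOCATIONS
    )
```

Running such a check before activation and after every configuration change gives a repeatable record that no model was pointed at the global endpoint by mistake.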
Are prompts for Claude on Vertex AI sent to Anthropic?
Requests are sent to Google Cloud Vertex AI endpoints from the client’s Google Cloud project, and Google Cloud handles the project, endpoint, access, and billing flow. Anthropic’s own “Anthropic on Vertex” commercial terms state that the service is made available through Vertex AI, hosted and managed by Google, and that Anthropic’s technology provided to Google does not give Anthropic access to the client’s GCP instance, including prompts and outputs; Anthropic may receive usage data, but that usage data excludes customer content. Clients should still retain both the Google Cloud agreement/DPA and the Anthropic-on-Vertex terms accepted for the model.
Sources: Request predictions with Claude models, Anthropic on Vertex commercial terms
Can Google use our data to train Gemini?
For managed models on Vertex AI, Google states that it will not use customer data to train or fine-tune AI/ML models without the customer’s prior permission or instruction, and that this applies to all managed Vertex AI models, including GA and pre-GA models. This commitment is separate from limited retention cases such as abuse monitoring, grounding with Google Search or Maps, or Gemini Live session resumption, which must be reviewed separately if used. Clients should keep the signed Google Cloud agreement and Cloud Data Processing Addendum as evidence.
Sources: Gemini Enterprise Agent Platform and zero data retention, Cloud Data Processing Addendum
Is there a written commitment we can show a regulator?
Yes. Google’s public Vertex AI data governance documentation states that Google will not use customer data to train or fine-tune AI/ML models without prior permission or instruction, and Google’s Cloud Data Processing Addendum describes Google’s processing and security obligations for Customer Data. For audit or regulatory evidence, clients should retain the signed Google Cloud agreement, the Cloud Data Processing Addendum, and the selected model and region configuration used for the deployment.
Sources: Gemini Enterprise Agent Platform and zero data retention, Cloud Data Processing Addendum
Is abuse monitoring the same as model training?
No. Google describes abuse monitoring as a safety process to detect potential violations of Google’s Acceptable Use Policy and Prohibited Use Policy, not as model training. If automated classifiers detect suspicious activity, Google may log prompts solely to investigate abuse, stores that data for up to 90 days in the selected region or multi-region, and says it will not use that data to train or fine-tune models. Where applicable, customers can request an exception from abuse logging.
The form to request an exception from abuse logging can be found in Abuse monitoring.
For Unique SaaS clients, Unique requests and confirms the opt-out before activation.
Self-hosting clients must perform these steps themselves.
Source: Abuse monitoring
Can provider employees read prompts?
This depends on the provider and activation path, so it must be reviewed per model. For Google Vertex AI abuse monitoring, Google says authorized Google employees may assess flagged prompts if automated classifiers detect suspicious activity; if the customer is approved for an abuse-logging exception, Google says it will not store prompts for the approved Cloud account. For Claude on Vertex AI, Anthropic’s commercial terms say Anthropic does not get access to the client’s GCP instance, prompts, or outputs, and the usage data Anthropic may receive excludes customer content.
Sources: Abuse monitoring, Anthropic on Vertex commercial terms
Does default abuse monitoring conflict with banking secrecy?
It can, depending on the jurisdiction, the type of data in prompts, and the bank’s interpretation of secrecy obligations. Google states that, if abuse monitoring is active and automated classifiers detect suspicious activity, prompts may be logged and authorized Google employees may assess flagged prompts; this is the risk the abuse-logging exception is meant to address. For banking or high-confidentiality use cases, clients should avoid sensitive raw inputs, pseudonymise prompts before processing, or ensure the abuse-logging exception is requested and documented before activation.
Source: Abuse monitoring
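Pseudonymising prompts before processing can be sketched as follows. This is a minimal illustration only: the two regular expressions are simplified assumptions that will miss many real formats, and the `pseudonymise` helper and placeholder scheme are hypothetical. A production setup needs a reviewed detection approach.

```python
import re

# Simplified patterns for illustration; not a complete detector.
IBAN_PATTERN = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def pseudonymise(prompt):
    """Replace matches with stable placeholders and return (text, mapping)."""
    mapping = {}

    def substitute(kind):
        def replace(match):
            value = match.group(0)
            # Reuse the same placeholder when a value appears more than once.
            if value not in mapping:
                mapping[value] = f"<{kind}_{len(mapping) + 1}>"
            return mapping[value]
        return replace

    text = IBAN_PATTERN.sub(substitute("IBAN"), prompt)
    text = EMAIL_PATTERN.sub(substitute("EMAIL"), text)
    return text, mapping
```

Only the returned text would be sent to the model. The mapping stays inside the client's own environment so that placeholders in the response can be resolved afterwards.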
Does opting out of abuse monitoring disable safety filters?
No, these are separate controls. Google’s safety filter documentation explains that safety and content filters block unsafe prompts or responses and can include configurable thresholds for harm categories, as well as non-configurable filters for categories such as CSAM and sensitive personal data. Abuse monitoring is a separate process about whether prompts may be logged for investigation after suspicious activity is detected. Opting out of abuse logging should therefore be reviewed as a privacy and retention control, not as a decision to disable safety filtering.
Sources: Safety and content filters, Abuse monitoring
Is a DPIA required?
A DPIA is not automatically required for every AI use case, but under GDPR Article 35 it is required where processing is likely to result in a high risk to individuals’ rights and freedoms, especially when using new technologies or processing sensitive data at scale. In practice, clients should expect a DPIA or equivalent risk assessment when prompts may contain personal data, confidential business data, banking data, regulated financial data, or data subject to strict residency or secrecy obligations. For SaaS, clients follow their internal vendor and data protection process; for self-hosting, the client must perform and evidence the assessment itself.
Source: GDPR Article 35
Who is the contractual counterparty for Claude on Vertex AI?
There is no single counterparty and no single document to review. The Google Cloud relationship covers the Google Cloud project, Vertex AI service, billing, endpoint, regional configuration, and the Google Cloud DPA; at the same time, Anthropic’s Vertex commercial terms govern use of the Claude model made available through Vertex AI. Anthropic states that Vertex is hosted and managed by Google, that Anthropic is not responsible for Vertex or other Google services, and that the customer must comply with applicable Google policies and agreements. Compliance teams should therefore retain both sets of terms.
Sources: Request predictions with Claude models, Anthropic on Vertex commercial terms, Cloud Data Processing Addendum
How should these providers be classified in TPRM or DORA processes?
For regulated clients, AI model providers and cloud AI services should normally be assessed as ICT third-party providers or sub-processors, with the final classification depending on the use case, data type, criticality, substitutability, and impact on business continuity. DORA and EBA outsourcing guidance focus on ICT third-party risk, critical or important functions, due diligence, contractual controls, audit/access rights, and exit planning; AI providers used for regulated workflows should therefore not be treated as ordinary software features. Clients should document whether the model use supports a critical or important function and classify Google Cloud, Anthropic, Mistral, or other providers accordingly.
Sources: EBA DORA oversight, EBA Guidelines on outsourcing arrangements
Summary
LiteLLM gives access to approved AI models through a controlled gateway. For Unique SaaS, Unique operates the gateway and performs the provider setup and compliance controls; the client reviews and accepts the relevant sub-processors, regions, and contractual terms. For self-hosted deployments, the client is responsible for the full setup, approval, and ongoing operation.