Model Usage and Cost Management
This feature is EXPERIMENTAL and under active development. It may change significantly, be discontinued, or have breaking changes without notice. Documentation may be incomplete or outdated, and the feature is NOT recommended for production use. Use at your own risk. Please refer to our Upgrade and Release Process for more information.
Overview
The Model Usage and Cost Management feature gives organisations full visibility into how their users interact with AI models — tracking token consumption and estimated spend per user, assistant, and application in real time.
Once enabled, the feature:
Records every LLM call (input tokens + completion tokens + estimated cost in USD)
Shows an Admin dashboard with usage reporting and model pricing
Shows a per-user spend badge in the Chat interface
Supports CSV export for offline analysis
Enabling the Feature
Model Usage tracking and the Admin dashboard are controlled by two feature flags; a third flag enables Limits. All flags are disabled by default and must be activated by Unique.
| Feature Flag | What it enables | Feature Status |
|---|---|---|
| | Starts recording usage data to the database. Must be on for anything to work. | Experimental |
| | Shows the Cost Management section in the Admin UI and the spend badge in the Chat user menu. | Experimental |
| | Allows setting limits for model usage. | Experimental |
To enable these flags for a client, please contact your Unique Representative or raise a request via the Enterprise mailbox.
Admin UI — Cost Management
The Cost Management section appears under Settings → Cost Management (left sidebar) once the dashboard flag is enabled.
Viewing it requires the role CHAT_FEEDBACK_READ or CHAT_DATA_ADMIN.
Model Pricing
A read-only table showing the active pricing for every supported model:
| Column | Description |
|---|---|
| Model | Internal model identifier |
| Input cost / 1M tokens | Cost in USD per million prompt tokens |
| Completion cost / 1M tokens | Cost in USD per million completion (output) tokens |
| Currency | USD by default |
Usage Reporting
An aggregated view of token consumption and estimated spend across the organisation.
Filters available:
Date range — This month, last month, this week, or custom range (with previous/next period navigation)
View by — Model, User, or Assistant
Drill-down — Click any row to see the breakdown within that dimension
Text filter — Search by model name, user, or assistant
Pagination — Configurable page size
Limits
The Limits tab inside the Cost Management dashboard lets administrators cap how much each user spends on AI model calls per day. All limits are denominated in USD and reset automatically at midnight in the user's local timezone — there is no manual reset.
Navigate to Settings → Cost Management → Limits to configure limits. Requires the CHAT_DATA_ADMIN role.
Feature flag: Limits are controlled by FEATURE_FLAG_MODEL_USAGE_LIMITS_UN_18889 and must be activated by Unique.
Limit levels
Three levels of limits can be configured:
| Level | Description |
|---|---|
| Default (Company) | Applies to every user in the organisation who has no user or group limit. Acts as the global fallback. |
| Group limit | Applies to all members of a specific group. Overrides the company default. |
| User limit | Applies to a specific user. Takes priority over everything else. |
Priority and resolution
When a user makes a model call, the system resolves which limit applies in this order:
User limit — if a direct limit has been set for that user, it is used. No group or company limit is evaluated.
Group limit — if the user belongs to one or more groups that have limits, the highest group limit applies (the most permissive one wins).
Company default — if no user or group limit applies, the company-wide default is used.
No limit — if no limit is configured at any level, there is no spending cap.
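The resolution order above can be sketched as a small function. This is an illustrative sketch only, not the product's implementation: the function name `resolve_daily_limit` and its parameters are hypothetical, and it assumes a company default of 0 is stored as "no default", in line with the "Setting a limit" section.

```python
from typing import Iterable, Optional


def resolve_daily_limit(
    user_limit: Optional[float],
    group_limits: Iterable[float],
    company_default: Optional[float],
) -> Optional[float]:
    """Return the daily USD cap that applies to a user, or None for no cap."""
    # 1. A direct user limit takes priority over everything else.
    if user_limit is not None:
        return user_limit
    # 2. Among the user's groups, the highest (most permissive) limit applies.
    limits = list(group_limits)
    if limits:
        return max(limits)
    # 3. Fall back to the company default (assumption: 0 or None means "not set").
    if company_default:
        return company_default
    # 4. Nothing configured at any level: no spending cap.
    return None


print(resolve_daily_limit(None, [10.0, 20.0], 2.0))
```

In this example the user has no direct limit and belongs to two groups, so the more permissive group limit is selected and the company default is ignored.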
Daily reset
Limits are daily and reset at midnight in the user's timezone. The timezone is locked at the start of each calendar day and does not shift mid-day if the user changes their timezone settings.
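As a rough illustration of the daily window, the sketch below computes the start of the current day and the next reset for a given timezone. The function name `daily_window` is hypothetical, and the sketch does not model how the timezone is locked for the day; it assumes the already-locked timezone is passed in.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def daily_window(now: datetime, user_tz: tzinfo) -> tuple[datetime, datetime]:
    """Return (start_of_day, next_reset) for the user's local calendar day.

    `now` must be timezone-aware; `user_tz` is the timezone locked for the day.
    """
    local_now = now.astimezone(user_tz)
    start = datetime.combine(local_now.date(), time.min, tzinfo=user_tz)
    next_reset = datetime.combine(
        local_now.date() + timedelta(days=1), time.min, tzinfo=user_tz
    )
    return start, next_reset


# Example: 23:30 UTC is already 01:30 on the next day for a user at UTC+2.
now = datetime(2026, 4, 10, 23, 30, tzinfo=timezone.utc)
start, next_reset = daily_window(now, timezone(timedelta(hours=2)))
print(start.isoformat(), next_reset.isoformat())
```

Spend accumulated between `start` and `next_reset` would count toward that day's limit under this reading.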
Enforcement
Limits are checked before each model call. When a user's accumulated spend for the day reaches their limit, the next model call is blocked and the user sees a message indicating how much they have used and what their daily limit is.
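The pre-call check described above might look like the following sketch. The function name `check_model_call` and the message wording are hypothetical; the only behaviour taken from this page is that a call is blocked once accumulated spend reaches the limit and that the user is told their usage and their limit.

```python
from typing import Optional


def check_model_call(
    spent_today: float, daily_limit: Optional[float]
) -> tuple[bool, str]:
    """Return (allowed, message) for the next model call."""
    # No limit configured at any level: the call is always allowed.
    if daily_limit is None:
        return True, ""
    # Spend has reached (or passed) the cap: block and explain why.
    if spent_today >= daily_limit:
        message = (
            f"Daily limit reached: used ${spent_today:.2f} "
            f"of ${daily_limit:.2f}."
        )
        return False, message
    return True, ""


allowed, message = check_model_call(5.00, 5.00)
print(allowed, message)
```

Because the check runs before the call, the call that pushes a user past the cap still completes; only the following call is blocked.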
Setting a limit
Company default: enter a USD amount and save. Set to 0 to remove the default (no cap).
Group limits: select a group, enter a USD amount, and save. Remove the row to lift the cap for that group.
User limits: search for a user, enter a USD amount, and save. Remove the row to lift the cap for that user.
How Costs Are Calculated
Cost is computed per LLM call using the formula:
spent = (inputTokens × inputCostPer1M + completionTokens × completionCostPer1M) / 1,000,000
All costs are expressed in USD.
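As a worked example of the formula, the sketch below computes the cost of a single call. The function name `estimate_cost` and the token counts are illustrative assumptions; the prices are the sample AZURE_GPT_4o_2024_1120 rates listed under Sample Default Prices.

```python
from decimal import Decimal


def estimate_cost(
    input_tokens: int,
    completion_tokens: int,
    input_cost_per_1m: Decimal,
    completion_cost_per_1m: Decimal,
) -> Decimal:
    """Apply the per-call formula; prices are USD per one million tokens."""
    total = (
        input_tokens * input_cost_per_1m
        + completion_tokens * completion_cost_per_1m
    )
    return total / Decimal(1_000_000)


# 1,200 prompt tokens and 300 completion tokens (hypothetical call):
# (1200 * 2.50 + 300 * 10.00) / 1,000,000 = 6000 / 1,000,000 = 0.006 USD
cost = estimate_cost(1200, 300, Decimal("2.50"), Decimal("10.00"))
print(cost)
```

`Decimal` is used here only to keep the arithmetic exact in the example; how the product stores amounts is not specified on this page.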
Default Pricing Sources
Default prices are maintained by the Unique engineering team and sourced from:
Azure OpenAI models — Azure OpenAI Service Pricing (Sweden Region): https://azure.microsoft.com/en-us/pricing/details/azure-openai/
LiteLLM-proxied models (Anthropic Claude, Gemini, Mistral, OpenAI via LiteLLM) — LiteLLM Providers and Models: https://models.litellm.ai/
Sample Default Prices (April 2026)
| Model | Input (USD / 1M tokens) | Completion (USD / 1M tokens) |
|---|---|---|
| AZURE_GPT_4o_2024_1120 | $2.50 | $10.00 |
| AZURE_GPT_4o_MINI_2024_0718 | $0.15 | $0.60 |
| AZURE_GPT_41_2025_0414 | $2.00 | $8.00 |
| AZURE_o3_2025_0416 | $2.00 | $8.00 |
| AZURE_o4_MINI_2025_0416 | $1.10 | $4.40 |
| litellm:anthropic-claude-sonnet-4-5 | $3.00 | $15.00 |
| litellm:anthropic-claude-opus-4-5 | $5.00 | $25.00 |
| litellm:gemini-2-5-pro | $1.25 | $10.00 |
Per-Client Price Override
Prices can be customised per client. If a client's Azure contract or LiteLLM agreement carries different rates, the Unique engineering team can supply a custom pricing configuration for that deployment. Contact your CS to arrange this.
CSV Export
Usage data can be exported as a CSV file via the analytics export pipeline. User privacy settings (pseudonymisation or anonymisation) are respected per organisation configuration.
CSV columns:
| Column | Description |
|---|---|
| S/N | Row sequence number |
| User ID | User identifier (may be pseudonymised/anonymised per org settings) |
| Assistant ID | The assistant used (N/A if not applicable) |
| Chat ID | The conversation session |
| App ID | The Unique application |
| Language Model | Model identifier |
| Spent | Estimated cost in USD |
| Input Tokens | Number of prompt tokens |
| Completion Tokens | Number of output tokens |
| Timestamp | UTC timestamp of the LLM call |
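For offline analysis, an exported file could be summarised along these lines. This is a sketch under assumptions: it presumes the export has a header row using exactly the column names above, and the sample rows, IDs, and timestamp format are made up for illustration. The function name `spend_by_model` is hypothetical.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def spend_by_model(csv_file) -> dict[str, Decimal]:
    """Sum the "Spent" column per "Language Model" from an export file object."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        totals[row["Language Model"]] += Decimal(row["Spent"])
    return dict(totals)


# Illustrative data only; a real export would be opened with open(path, newline="").
sample = io.StringIO(
    "S/N,User ID,Assistant ID,Chat ID,App ID,Language Model,Spent,"
    "Input Tokens,Completion Tokens,Timestamp\n"
    "1,user_a,N/A,chat_1,app_1,AZURE_GPT_4o_2024_1120,0.006,1200,300,2026-04-01T09:00:00Z\n"
    "2,user_b,asst_1,chat_2,app_1,AZURE_GPT_4o_MINI_2024_0718,0.0003,1000,250,2026-04-01T09:05:00Z\n"
    "3,user_a,N/A,chat_1,app_1,AZURE_GPT_4o_2024_1120,0.003,600,150,2026-04-01T09:10:00Z\n"
)
totals = spend_by_model(sample)
for model, spent in totals.items():
    print(model, spent)
```

The same pattern works for grouping by "User ID" or "Assistant ID" by changing the key column.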
Required User Roles
| Role | What it grants |
|---|---|
| | View usage reporting and model pricing in Admin |
| | Full access to model usage data in Admin |
Related Documentation
API Model Usage Tracking — Technical reference for developers integrating via the API