Ingestion Worker Configuration
The Ingestion Worker (IW) extracts text from every document uploaded to the Unique platform, using Microsoft Document Intelligence (MDI) or a custom API (Docling, Unique Agentic Ingestion (UAI), etc.), and splits that text into distinct chunks. The IW serves two use cases, and a separate container is deployed for each:
Upload-to-Chat (UtC): node-ingestion-worker-chat
Upload-to-Knowledge-Base (UtKB): node-ingestion-worker
To ensure a smooth ingestion experience, configure both workers according to the size of your company and how frequently UtC and UtKB are used.

Recommended Configuration
CPU / Memory per worker
Default Worker (Knowledge Base) - Optimal resource requirements
resources:
  requests:
    cpu: 3
    memory: 3000Mi
  limits:
    cpu: 3
    memory: 3050Mi

Chat Worker - Optimal resource requirements
resources:
  limits:
    cpu: 3.5
    memory: 8050Mi
  requests:
    cpu: 3.5
    memory: 8000Mi

Auto-scaling Configuration
To ensure an optimal user experience in the UtC use case, it is essential to avoid any latency caused by container bootstrapping, as this would directly impact users. Typically, no documents are uploaded to the chat after business hours. Therefore, we suggest implementing auto-scaling using a cron-based schedule to match expected usage patterns.
In contrast, for the UtKB use case, some latency is acceptable since the user experience is not directly affected. Nevertheless, documents may be uploaded or ingested overnight, so the Knowledge Base worker is configured with a minimum replica count rather than a cron schedule.
Please note that the configuration below should be considered a starting point and will likely need to be further adapted and fine-tuned to align with your specific company environment.
Small Usage (< 50 monthly active users, MAU) recommendations
20 - 100 average documents per day
3 - 8 peak concurrent users
# Knowledge Base Worker
eventBasedAutoscaling:
  minReplicaCount: 1
  maxReplicaCount: 3
# Chat Worker
eventBasedAutoscaling:
  maxReplicaCount: 2
  cron:
    start: 0 7 * * 1-5 # Scale up during business hours
    end: 0 19 * * 1-5 # Scale down after hours
    desiredReplicas: "1"

Rationale: Low concurrent document processing, occasional chat usage, mostly single-user sessions.
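To make the chat worker schedule above concrete, the following is a minimal sketch of how the Small Usage cron window can be read: on weekdays between 07:00 and 19:00 the schedule asks for one replica. The helper `scheduled_replicas` is hypothetical and not part of the platform. It assumes timestamps are already in the scheduler's time zone and that the schedule contributes no replicas outside the window. What the worker actually scales to outside the window depends on the autoscaler's other settings, which this sketch does not model.

```python
from datetime import datetime

# Values copied from the Small Usage chat worker schedule:
#   start: 0 7 * * 1-5, end: 0 19 * * 1-5, desiredReplicas: "1"
START_HOUR = 7
END_HOUR = 19
WEEKDAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday=0 ... Friday=4 (cron 1-5)
DESIRED_REPLICAS = 1


def scheduled_replicas(now: datetime) -> int:
    """Return the replicas the cron window asks for at `now` (hypothetical helper).

    Assumption: the window opens at `start` and closes at `end` on the same
    day, and the schedule contributes nothing outside of it.
    """
    in_window = now.weekday() in WEEKDAYS and START_HOUR <= now.hour < END_HOUR
    return DESIRED_REPLICAS if in_window else 0
```

A Monday at 09:00 falls inside the window, while 19:00 on the same day or any time on a Saturday falls outside it. Users uploading outside the window may therefore experience container start-up latency.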
Medium Usage (50-200 MAU) recommendations
100 - 500 average documents per day
8 - 25 peak concurrent users
# Knowledge Base Worker
eventBasedAutoscaling:
  minReplicaCount: 2
  maxReplicaCount: 6
# Chat Worker
eventBasedAutoscaling:
  maxReplicaCount: 4
  cron:
    start: 0 7 * * 1-5 # Scale up during business hours
    end: 0 19 * * 1-5 # Scale down after hours
    desiredReplicas: "2"

Rationale: Moderate concurrent usage, overlapping business hours, predictable usage patterns.
High Usage (200-800 MAU) recommendations
500 - 2,000 average document uploads per day
25 - 80 peak concurrent users
# Knowledge Base Worker (default worker)
eventBasedAutoscaling:
  minReplicaCount: 3
  maxReplicaCount: 10
# Chat Worker
eventBasedAutoscaling:
  maxReplicaCount: 8
  cron:
    start: 0 6 * * 1-5
    end: 0 20 * * 1-5
    desiredReplicas: "4"

Rationale: High concurrent usage, extended business hours, multiple time zones, heavy document processing.
Enterprise Usage (800+ MAU) recommendations
2,000+ average document uploads per day
80+ peak concurrent users
# Knowledge Base Worker (default worker)
eventBasedAutoscaling:
  minReplicaCount: 4
  maxReplicaCount: 15
# Chat Worker
eventBasedAutoscaling:
  maxReplicaCount: 12
  cron:
    start: 0 5 * * 1-7 # Scale up every day, including weekends
    end: 0 23 * * 1-7 # Scale down late in the evening
    desiredReplicas: "6"

Rationale: Continuous operations, global usage patterns, high document volume, business-critical workloads.
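For capacity planning, it can help to translate a tier into the cluster resources it may request at peak. The sketch below picks a tier from a MAU figure and multiplies each worker's maximum replica count by its per-worker resource requests from the "CPU / Memory per worker" section. The names `Tier`, `pick_tier`, and `peak_requests` are hypothetical. Because the documented MAU ranges share their boundary values (200 and 800), the sketch assumes upper bounds are exclusive, so 200 MAU maps to High and 800 MAU maps to Enterprise.

```python
from dataclasses import dataclass

# Per-worker resource requests from the "CPU / Memory per worker" section.
KB_CPU, KB_MEM_MI = 3.0, 3000      # Knowledge Base (default) worker
CHAT_CPU, CHAT_MEM_MI = 3.5, 8000  # Chat worker


@dataclass(frozen=True)
class Tier:
    name: str
    max_mau: float          # exclusive upper bound (assumption, see lead-in)
    kb_max_replicas: int    # maxReplicaCount of the Knowledge Base worker
    chat_max_replicas: int  # maxReplicaCount of the Chat worker


TIERS = [
    Tier("Small", 50, 3, 2),
    Tier("Medium", 200, 6, 4),
    Tier("High", 800, 10, 8),
    Tier("Enterprise", float("inf"), 15, 12),
]


def pick_tier(mau: int) -> Tier:
    """Return the first tier whose upper MAU bound exceeds `mau`."""
    return next(tier for tier in TIERS if mau < tier.max_mau)


def peak_requests(tier: Tier) -> tuple[float, int]:
    """Return (CPU cores, memory in Mi) requested when both workers are at max replicas."""
    cpu = tier.kb_max_replicas * KB_CPU + tier.chat_max_replicas * CHAT_CPU
    mem = tier.kb_max_replicas * KB_MEM_MI + tier.chat_max_replicas * CHAT_MEM_MI
    return cpu, mem
```

For example, `peak_requests(pick_tier(120))` selects the Medium tier and returns `(32.0, 50000)`, that is 32 CPU cores and 50000Mi of memory if both workers scale to their maximum at the same time. This is an upper bound for sizing node pools, not an expected steady-state load.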