2025.50 Infrastructure Changes

Changes to the Application Environment

| Change | Name | Application Default Value (if env variable unset) | Example | Required | Applications | Short Description |
| --- | --- | --- | --- | --- | --- | --- |
| Added | FEATURE_FLAG_ENABLE_MODULE_BASED_EVENT_DISTRIBUTION_UN_15312 | false | true or companyid1,companyid2,… | No | backend-service-chat, backend-service-app-repository, backend-service-webhook-scheduler | Enables module-based event distribution for the configured external apps |
| Added | FEATURE_FLAG_ENABLE_ENDPOINT_MODULES_UN_15310 | false | true or companyid1,companyid2 | No | web-app-admin | Enables AI Module selection for endpoint creation |
| Added | FEATURE_FLAG_AGENTIC_TABLE_FEEDBACK_FORM_UN_13996 | false | | No | web-app-chat | Enables the Agentic Table cell feedback functionality |
| Removed | FEATURE_FLAG_AGENTIC_TABLE_PAGINATION_UN_12469 | | | | web-app-chat | Pagination for Agentic Table is now GA |
| Removed | FEATURE_FLAG_DELETE_AGENTIC_TABLE_ROW_UN_12016 | | | | web-app-chat | Row deletion for Agentic Table is now GA |
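The flags above could be set roughly as follows. This is a minimal sketch, not a confirmed values layout: it assumes each application accepts an env map of the same shape as the extraCronJobs example later in this note, and the top-level keys named after the applications are hypothetical. The company IDs are the placeholders from the Example column, and setting the feedback form flag to "true" is an assumption, since the table lists no example value for it.

```yaml
# Hypothetical values layout: the top-level keys and the `env` map shape are
# assumptions modelled on the extraCronJobs example in this release note.
backend-service-chat:
  env:
    # Example value from the table: enable via "true"
    FEATURE_FLAG_ENABLE_MODULE_BASED_EVENT_DISTRIBUTION_UN_15312: "true"
web-app-admin:
  env:
    # Example value from the table: comma-separated company IDs (placeholders)
    FEATURE_FLAG_ENABLE_ENDPOINT_MODULES_UN_15310: "companyid1,companyid2"
web-app-chat:
  env:
    # Assumed value; the table only documents the default (false)
    FEATURE_FLAG_AGENTIC_TABLE_FEEDBACK_FORM_UN_13996: "true"
```

The module-based event distribution flag is listed for three applications (backend-service-chat, backend-service-app-repository, backend-service-webhook-scheduler), so the same entry would presumably be repeated for each of them. All three flags are optional and default to false when unset.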

CronJob node-ingestion-maintenance

The following environment variables should be set on the node-ingestion-maintenance CronJob, which is defined via the extraCronJobs key on backend-service-ingestion. See the description and example configuration below.

| Change | Name | Application Default Value (if env variable unset) | Example | Required | Applications | Short Description |
| --- | --- | --- | --- | --- | --- | --- |
| Added | MAINTENANCE_ORCHESTRATOR_CRON_SCHEDULE | 0 */2 * * * (every 2 hours) | * * * * * (every minute) | No | backend-service-ingestion | Cron schedule of the Kubernetes CronJob that triggers the maintenance orchestrator. Must match the CronJob's schedule value. |
| Added | MAINTENANCE_CONTENT_TIMEOUT_CRON_SCHEDULE | 0 */2 * * * (every 2 hours) | * * * * * (every minute) | No | backend-service-ingestion | Cron schedule for the ingestion timeout cleanup job. |
| Added | MAINTENANCE_CONTENT_TIMEOUT_MINUTES | 120 (minutes) | 120 | No | backend-service-ingestion | Number of minutes after which a file stuck in a processing state is considered timed out and marked as FAILED_TIMEOUT. |
| Added | MAINTENANCE_CONTENT_TIMEOUT_COMPANY_CONCURRENCY | 1 | 1 | No | backend-service-ingestion | Number of companies to process concurrently during ingestion timeout cleanup. |

Changes to Infrastructure

1. Ingestion Timeout Cleanup - Maintenance Job

Application: backend-service-ingestion

An automated maintenance job system that detects and handles timed-out ingestion processes. It is configured as an extraCronJobs entry on backend-service-ingestion.

Key Features

  1. Maintenance Job Orchestrator

    • Central service that manages and executes maintenance jobs based on cron schedules

    • Runs every 2 hours in dedicated maintenance mode and executes each configured maintenance service (currently only the content timeout cleanup, described below) whenever that service's cron schedule matches the interval.

  2. Ingestion Timeout Cleanup Job

    • Processes all companies in the platform. The number of companies processed concurrently can be configured via MAINTENANCE_CONTENT_TIMEOUT_COMPANY_CONCURRENCY (default: 1).

    • For each company, the service detects content that has been stuck in any of the following processing states for longer than the timeout, which is configured via MAINTENANCE_CONTENT_TIMEOUT_MINUTES (default: 120):

      • INGESTION_READING

      • INGESTION_CHUNKING

      • INGESTION_EMBEDDING

      • MALWARE_SCANNING

      • METADATA_VALIDATION

      • RE_EMBEDDING

      • RE_INGESTING

      • RECREATING_VECETORDB_INDEX

      • CHECKING_INTEGRITY

    • Transitions timed-out content to the new FAILED_TIMEOUT state

    • Runs every 2 hours by default (MAINTENANCE_CONTENT_TIMEOUT_CRON_SCHEDULE), with a configurable timeout threshold and company concurrency

  3. New Ingestion State

    • Added FAILED_TIMEOUT ingestion state to distinguish timeout failures

Application: backend-service-ingestion

```yaml
extraCronJobs:
  node-ingestion-maintenance:
    # Run every 2 hours to check which jobs are due based on their individual schedules
    schedule: "0 */2 * * *"
    restartPolicy: Never
    concurrencyPolicy: Forbid
    env:
      RUNNING_MODE: maintenance
      # Maintenance orchestrator configuration - must match the CronJob schedule above
      MAINTENANCE_ORCHESTRATOR_CRON_SCHEDULE: "0 */2 * * *"
      # Ingestion timeout cleanup job
      MAINTENANCE_CONTENT_TIMEOUT_MINUTES: "120"
      MAINTENANCE_CONTENT_TIMEOUT_CRON_SCHEDULE: "0 */2 * * *"
    successfulJobsHistoryLimit: 1
    failedJobsHistoryLimit: 2
    startingDeadlineSeconds: 10
    timeZone: Europe/Zurich
```

New Environment Variables:

  • MAINTENANCE_ORCHESTRATOR_CRON_SCHEDULE - Tells the maintenance orchestrator at which interval the CronJob is triggered. Must match the CronJob's schedule value. Default: 0 */2 * * * (every 2 hours)

  • MAINTENANCE_CONTENT_TIMEOUT_MINUTES - Timeout threshold in minutes. Default: 120

  • MAINTENANCE_CONTENT_TIMEOUT_CRON_SCHEDULE - Execution schedule of the timeout cleanup service. Must align with the CronJob schedule and MAINTENANCE_ORCHESTRATOR_CRON_SCHEDULE; it cannot be more frequent than the orchestrator interval. Default: 0 */2 * * * (every 2 hours)

  • MAINTENANCE_CONTENT_TIMEOUT_COMPANY_CONCURRENCY - Number of companies to process in parallel. Default: 1
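To illustrate the alignment rule between the three schedule values, here is a hedged variant of the configuration. The values are illustrative, not recommendations: it assumes the orchestrator is triggered hourly and the timeout cleanup stays on a 2-hour schedule, so the cleanup would be due on every second orchestrator run. The exact schedule-matching behaviour of the orchestrator is not specified in this note, so treat this as a sketch.

```yaml
extraCronJobs:
  node-ingestion-maintenance:
    # Illustrative: trigger the orchestrator hourly instead of every 2 hours
    schedule: "0 * * * *"
    restartPolicy: Never
    concurrencyPolicy: Forbid
    env:
      RUNNING_MODE: maintenance
      # Must match `schedule` above
      MAINTENANCE_ORCHESTRATOR_CRON_SCHEDULE: "0 * * * *"
      # Not more frequent than the orchestrator schedule
      MAINTENANCE_CONTENT_TIMEOUT_CRON_SCHEDULE: "0 */2 * * *"
      MAINTENANCE_CONTENT_TIMEOUT_MINUTES: "120"
      # Illustrative value; the default is 1
      MAINTENANCE_CONTENT_TIMEOUT_COMPANY_CONCURRENCY: "4"
```

A configuration in the other direction (for example, a cleanup schedule of every minute under a 2-hourly orchestrator) would violate the rule that the cleanup interval cannot be more frequent than the orchestrator interval.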

Deployment:

  • New Kubernetes CronJob in backend-service-ingestion: node-ingestion-maintenance

  • Schedule: Every 2 hours

  • Concurrency: Forbid (prevents overlapping executions)

Author

Solution Engineering