SharePoint Connector - Configuration
Configuration Overview
The SharePoint Connector uses a YAML-based tenant configuration file for all settings. The configuration file path is specified via the TENANT_CONFIG_PATH_PATTERN environment variable.
Environment Variables
The following environment variables control application-level behavior. They are set outside the tenant configuration YAML (typically in Helm connector.env).
Variable | Default | Description |
|---|---|---|
|
| Environment mode ( |
|
| HTTP port the application binds to |
|
| Log verbosity: |
|
| Controls whether sensitive data (site names, file names) is logged in full ( |
|
| JSON object controlling when configuration is logged. Set |
| — (required) | Glob pattern to tenant configuration YAML files (e.g., |
| — | OpenTelemetry metrics exporter (e.g., |
| — | Prometheus exporter bind host |
| — | Prometheus exporter bind port |
| — | Path to a PEM file containing additional CA certificates for TLS verification, needed when the pod's trust store does not include them |
|
| Number of recent sync runs kept in the sliding window for health evaluation |
|
| Per-site failure ratio (0–1) across the window that marks the service unhealthy when exceeded |
|
| Timeout in milliseconds for each reachability ping used by the health endpoint |
The following environment variables are loaded from Kubernetes secrets:
Variable | Description |
|---|---|
| Password for an encrypted certificate private key (optional, only if key is password-protected) |
| Zitadel client secret (required when |
| Proxy password (required when proxy |
Configuration Sources
Sites can be configured in two ways:
Source | Description | Use Case |
|---|---|---|
| Static YAML configuration | Simple deployments, fixed site list |
| Dynamic configuration from SharePoint list | Self-service, frequent changes |
Tenant Configuration File
Static Sites Configuration (config_file)
sharepoint:
  # ... auth and base configuration ...
  # Deployment-wide defaults applied to every site below; per-site values
  # win when set. See "Site Defaults" further down for full semantics.
  siteDefaults:
    syncColumnName: FinanceGPTKnowledge
    storeInternally: enabled
    syncStatus: active
    syncMode: content_only
    permissionsInheritanceMode: inherit_scopes_and_files
  sitesSource: config_file
  sites:
    # Overrides syncMode for this site; everything else inherits from siteDefaults.
    - siteId: 12345678-1234-1234-1234-123456789abc
      ingestionMode: recursive
      scopeId: scope_bu4gokr0atzj0kfiuaaaaaaa
      maxFilesToIngest: 1000
      syncMode: content_and_permissions
    # Overrides syncColumnName for this site; everything else inherits from siteDefaults.
    - siteId: 87654321-4321-4321-4321-cba987654321
      syncColumnName: HRKnowledge
      ingestionMode: flat
      scopeId: scope_bu4gokr0atzj0kfiubbbbbb
Dynamic Sites Configuration (sharepoint_list)
Configure sites dynamically via a SharePoint list:
sharepoint:
  # ... auth and base configuration ...
  sitesSource: sharepoint_list
  sharepointList:
    siteId: your-config-site-id-here
    listId: 00000000-0000-0000-0000-000000000000
You can use the CSV import template when populating the SharePoint list for sharepoint_list-based configuration.
SharePoint Base Configuration
The sharepoint section of the tenant YAML contains authentication and base settings that apply to all sites:
sharepoint:
  tenantId: 12345678-1234-1234-1234-123456789012
  baseUrl: https://acme.sharepoint.com
  graphApiRateLimitPerMinuteThousands: 780
  auth:
    mode: certificate
    clientId: 00000000-0000-0000-0000-000000000000
    privateKeyPath: /app/key.pem
    thumbprintSha1: AB12CD34EF56...
Option | Required | Default | Description |
|---|---|---|---|
| Yes | — | Azure AD tenant ID |
| Yes | — | Company SharePoint URL (e.g., |
| No |
| Microsoft Graph API rate limit in thousands of requests per minute |
| Yes | — | Authentication configuration (see Authentication) |
| No |
| Deployment-level fallbacks applied to every per-site config (see Site Defaults) |
Authentication
The connector uses certificate-based authentication (auth.mode: certificate):
Option | Required | Description |
|---|---|---|
| Yes |
|
| Yes | Azure AD application client ID |
| Yes | Path to the private key file in PEM format |
| One of SHA1/SHA256 required | SHA-1 thumbprint of the certificate |
| One of SHA1/SHA256 required | SHA-256 thumbprint of the certificate |
| No | Injected from |
Unique Platform Configuration
The unique section configures how the connector communicates with the Unique platform:
unique:
  serviceAuthMode: cluster_local
  ingestionServiceBaseUrl: http://node-ingestion.finance-gpt:8091
  scopeManagementServiceBaseUrl: http://node-scope-management.finance-gpt:8094
  apiRateLimitPerMinute: 100
  serviceExtraHeaders:
    x-company-id: "company-id"
    x-user-id: "service-user-id"
Option | Required | Default | Description |
|---|---|---|---|
| Yes | — |
|
| Yes | — | Base URL for the Unique ingestion service |
| Yes | — | Base URL for the Unique scope management service |
| No |
| Rate limit for Unique API requests per minute |
| No | — | Optional object passed when submitting files for ingestion (e.g., |
cluster_local mode (in-cluster communication):
Option | Required | Description |
|---|---|---|
| Yes | Must contain |
external mode (authenticates via Zitadel):
Option | Required | Description |
|---|---|---|
| Yes | Zitadel OAuth token URL |
| Yes | Zitadel project ID |
| Yes | Zitadel client ID |
| Yes | Injected from |
Proxy Configuration
The connector supports HTTP/HTTPS proxy for environments where internet access is only available through a proxy. Proxy settings are configured via environment variables (managed by the Helm chart's proxyConfig section).
Mode | Description |
|---|---|
| Proxy disabled (default) |
| Proxy enabled without authentication |
| Basic authentication proxy |
| TLS client certificate proxy |
Common options (required for no_auth, username_password, and ssl_tls modes):
Variable | Description |
|---|---|
| Proxy server hostname |
| Proxy server port |
|
|
| (Optional) Path to CA bundle for verifying proxy TLS certificate |
| (Optional) JSON string of custom headers for CONNECT request |
username_password mode adds:
Variable | Description |
|---|---|
| Proxy username |
| Proxy password (loaded from secret) |
ssl_tls mode adds:
Variable | Description |
|---|---|
| Path to TLS client certificate |
| Path to TLS client key |
SharePoint List Configuration
When using sharepoint_list as the sites source, create a SharePoint list with the following columns. Only siteId is strictly required as a column on the list — any other column whose value is set via Site Defaults can be omitted from the list entirely, and rows will inherit the deployment-wide value.
Column Display Name | Type | Description |
|---|---|---|
| Single line text | SharePoint site ID (UUID or compound format: |
| Single line text | Column that marks files for sync |
| Choice |
|
| Single line text | Unique scope ID. Either |
| Number | Maximum new + updated files per sync cycle; sync fails for the site if exceeded |
| Choice |
|
| Choice |
|
| Choice |
|
| Choice | Optional inheritance mode |
| Choice |
|
Benefits of SharePoint List Configuration
Self-service: Site owners can request sync without IT involvement
No redeployment: Add/modify sites without restarting the connector
Audit trail: SharePoint tracks changes to the configuration list
Approval workflows: Use SharePoint approval flows for governance
Per-Site Configuration Options
Important: The connector is a singleton — each SharePoint site must be configured in at most one connector process per Unique instance. Configuring the same site in multiple processes leads to conflicting state and unexpected behavior of the connector.
Option | Values | Default | Description |
|---|---|---|---|
| UUID or compound ID | — (required) | SharePoint site ID. Subsites use compound format: |
| String |
| Display name or internal name of the sync flag column (display name takes priority) |
|
| — (required) | Flat ingests all to one scope; recursive maintains hierarchy |
|
| — (required) | Where to mount this site's content (see Choosing between fixed scope and in_parent: auto-resolve) |
| Number | — (unlimited) | Maximum new + updated files per sync cycle; sync fails for the site if exceeded |
|
|
| Whether to store content in Unique |
|
|
| Control sync behavior |
|
| — (required) | What to sync |
|
|
| Inheritance settings for content_only mode (see Permissions Inheritance Modes) |
|
|
| Recursively discover and sync content from subsites |
Choosing between fixed scope and in_parent: auto-resolve
Fixed (scope_<id>): each site needs its own scope, pre-created by the operator before the site is configured. Best when scopes are managed centrally and named or permissioned individually, since the operator stays in full control of the scope's identity, ACLs, and lifecycle.
Auto (in_parent:scope_<parentId>): the connector finds or creates a child scope under the parent on every sync, named after the SharePoint site's URL slug. Removing a site (via syncStatus: deleted) removes the auto-created scope. If a sibling scope under the parent already has the same name and is not claimed by the connector, the connector aborts the sync with a typed error rather than guessing. This errs on the side of safety and avoids syncing a site into a user folder.
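The two scopeId forms described above can be told apart by their prefix. The following is a minimal sketch of that classification, not the connector's actual parser; the names ScopeTarget and parse_scope_id are hypothetical, and the only validation assumed is that the value ends in a non-empty scope_<id> reference.

```python
from dataclasses import dataclass

IN_PARENT_PREFIX = "in_parent:"
SCOPE_PREFIX = "scope_"


@dataclass(frozen=True)
class ScopeTarget:
    mode: str      # "fixed" or "auto"
    scope_id: str  # the scope itself (fixed) or the parent scope (auto)


def parse_scope_id(value: str) -> ScopeTarget:
    """Classify a configured scopeId as fixed or auto-resolved (hypothetical helper)."""
    raw = value.strip()
    if raw.startswith(IN_PARENT_PREFIX):
        mode, scope = "auto", raw[len(IN_PARENT_PREFIX):]
    else:
        mode, scope = "fixed", raw
    # Assumption: both forms must end in a non-empty scope_<id> reference.
    if not scope.startswith(SCOPE_PREFIX) or scope == SCOPE_PREFIX:
        raise ValueError(f"not a scope reference: {value!r}")
    return ScopeTarget(mode, scope)
```

A value such as in_parent:scope_bu4gokr0atzj0kfiucccccc yields mode "auto" with the parent scope ID, while a plain scope_... value yields mode "fixed".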
Permissions Inheritance Modes
Only used when syncMode is content_only. It controls whether newly created scopes / files inherit permissions from their parent. If scopes / files are configured not to inherit permissions, newly created scopes / files are visible only to the service user, not to platform users. To grant access to them, an admin has to use the API on behalf of the service user.
Mode | Scopes Inherit | Files Inherit |
|---|---|---|
| Yes | Yes |
| Yes | No |
| No | Yes |
| No | No |
Site Defaults
sharepoint.siteDefaults lets you set deployment-level fallbacks for any per-site option except siteId. Each site (whether sourced from config_file or from a sharepoint_list row) is merged with the defaults: if the per-site value is set, it wins; otherwise the default is used. This keeps individual site entries terse and makes it easy to change a policy across an entire deployment in one place.
With config_file
sharepoint:
  # ... auth and base configuration ...
  siteDefaults:
    syncColumnName: FinanceGPTKnowledge
    ingestionMode: recursive
    storeInternally: enabled
    syncStatus: active
    syncMode: content_only
    permissionsInheritanceMode: inherit_scopes_and_files
    subsitesScan: disabled
  sitesSource: config_file
  sites:
    # Inherits everything from siteDefaults except scopeId / maxFilesToIngest
    - siteId: 12345678-1234-1234-1234-123456789abc
      scopeId: scope_bu4gokr0atzj0kfiuaaaaaaa
      maxFilesToIngest: 1000
    # Overrides syncColumnName and ingestionMode for this site
    - siteId: 87654321-4321-4321-4321-cba987654321
      syncColumnName: HRKnowledge
      ingestionMode: flat
      scopeId: scope_bu4gokr0atzj0kfiubbbbbb
With sharepoint_list
siteDefaults works identically for sharepoint_list: any column whose value is set on a row wins; any column that is blank (or whose mapped field is undefined because the column is not present on the list at all) falls back to the default. This means columns covered by siteDefaults can be omitted from the SharePoint list entirely — only the columns you want to vary per row need to exist. At minimum, the list must carry siteId; everything else can live in siteDefaults.
sharepoint:
  # ... auth and base configuration ...
  siteDefaults:
    syncColumnName: FinanceGPTKnowledge
    ingestionMode: recursive
    storeInternally: enabled
    syncStatus: active
    syncMode: content_only
    permissionsInheritanceMode: inherit_scopes_and_files
    subsitesScan: disabled
    # Common pattern: every site auto-creates a child under one shared parent scope,
    # so the list does not need a `uniqueScopeId` column at all.
    scopeId: in_parent:scope_bu4gokr0atzj0kfiucccccc
  sitesSource: sharepoint_list
  sharepointList:
    siteId: your-config-site-id-here
    listId: 00000000-0000-0000-0000-000000000000
With the example above, the SharePoint list can be reduced to a single siteId column: every other per-site field is supplied by siteDefaults. Add columns back to the list only when you need per-site overrides for those fields.
Merge Rules
Per-site value wins when "set". For string-typed fields (including siteId, scopeId, syncColumnName), "set" means non-undefined and non-empty after trim, so a blank cell in a SharePoint list row falls back to the default. For numeric/enum fields, any non-undefined value counts as set.
Required-after-merge. ingestionMode, scopeId, and syncMode are required on the final merged config. If a per-site entry omits them and siteDefaults does not supply them either, the merger throws. Because sites are merged eagerly at the start of every sync cycle (for both config_file and sharepoint_list), a single unmergeable row aborts the entire sync cycle, not just that site. The service stays running and retries on the next scheduled cycle; the failure is recorded as a full-sync failure with step SitesConfigLoading.
siteId cannot be defaulted. It must always be set per site.
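The merge rules can be illustrated with a short sketch. This is an approximation under stated assumptions, not the connector's merger: the function names are hypothetical, None stands in for an undefined value, and only the three string fields named in the rules are treated as string-typed.

```python
from typing import Any, Dict, Mapping

# Assumption: only the string fields named in the merge rules are listed here.
STRING_FIELDS = {"siteId", "scopeId", "syncColumnName"}
REQUIRED_AFTER_MERGE = ("ingestionMode", "scopeId", "syncMode")


def is_set(field: str, value: Any) -> bool:
    """Return True when a value counts as "set" for merging purposes."""
    if value is None:  # None plays the role of "undefined"
        return False
    if field in STRING_FIELDS:
        return isinstance(value, str) and value.strip() != ""
    return True  # numeric/enum fields: any defined value counts, including 0


def merge_site(defaults: Mapping[str, Any], site: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge one per-site entry over the deployment-wide defaults."""
    if not is_set("siteId", site.get("siteId")):
        raise ValueError("siteId cannot be defaulted and must be set per site")
    merged = {k: v for k, v in defaults.items() if k != "siteId" and is_set(k, v)}
    for field, value in site.items():
        if is_set(field, value):
            merged[field] = value  # per-site value wins when set
    missing = [f for f in REQUIRED_AFTER_MERGE if f not in merged]
    if missing:
        raise ValueError(f"site {site['siteId']!r} is missing {missing} after merge")
    return merged
```

With this sketch, a list row whose syncColumnName cell contains only spaces falls back to the default, while maxFilesToIngest: 0 on the row is kept because it is a defined numeric value.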
Schema Defaults
The fields below have schema-level defaults applied even when you do not provide a siteDefaults block. Listing them in siteDefaults is still allowed if you want to make them explicit, but it is optional.
Field | Schema Default |
|---|---|
|
|
|
|
|
|
|
|
|
|
The remaining defaultable fields (ingestionMode, scopeId, maxFilesToIngest, syncMode) have no schema default — they take effect only if you set them under siteDefaults or per site.
SharePoint Site Configuration
Finding SharePoint Site IDs
Site IDs are required to configure which SharePoint sites the connector scans. The connector supports both /sites/ and /teams/ managed paths.
Via Browser:
Navigate to: https://{tenant}.sharepoint.com/sites/your-site/_api/site/id (or /teams/your-team/_api/site/id for team sites)
The response will be XML containing the site ID:
<d:Id>12345678-1234-1234-1234-123456789012</d:Id>
Via Microsoft Graph Explorer:
GET https://graph.microsoft.com/v1.0/sites/{tenant}.sharepoint.com:/sites/{site}
For team sites, use /teams/ instead of /sites/:
GET https://graph.microsoft.com/v1.0/sites/{tenant}.sharepoint.com:/teams/{team}
Look for the id field in the response.
Via PowerShell:
Connect-PnPOnline -Url "https://{tenant}.sharepoint.com/sites/your-site"
Get-PnPSite -Includes Id | Select-Object Id
For team sites, replace /sites/ with /teams/ in the URL.
Finding Subsite Compound IDs
Subsites use a compound site ID format (hostname,siteCollectionId,webId) instead of a plain UUID. To find a subsite's compound ID:
Via Microsoft Graph Explorer:
GET https://graph.microsoft.com/v1.0/sites/{tenant}.sharepoint.com:/sites/{parentSite}/{subsiteName}
The id field in the response is the compound ID:
{
  "id": "contoso.sharepoint.com,a1b2c3d4-...,e5f6a7b8-..."
}
Use this full value as the siteId in your site configuration.
Via PowerShell:
Connect-PnPOnline -Url "https://{tenant}.sharepoint.com/sites/{parentSite}/{subsiteName}"
Get-PnPSite -Includes Id, Url | Select-Object Id, Url
Via Browser (manual construction, discouraged):
You can technically construct the compound ID by navigating to the subsite and calling two REST endpoints:
https://{tenant}.sharepoint.com/sites/{parentSite}/{subsiteName}/_api/site/id → siteCollectionId
https://{tenant}.sharepoint.com/sites/{parentSite}/{subsiteName}/_api/web/id → webId
Then combine as {tenant}.sharepoint.com,{siteCollectionId},{webId}. This approach is error-prone and discouraged — prefer Microsoft Graph Explorer or PowerShell instead.
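Because a mistyped compound ID is easy to miss, it can help to sanity-check a siteId value before placing it in configuration. The sketch below is one possible check, assuming a siteId is either a plain UUID or the three-part hostname,siteCollectionId,webId form described above; SiteId and parse_site_id are hypothetical names, not part of the connector.

```python
import uuid
from typing import NamedTuple, Optional


class SiteId(NamedTuple):
    hostname: Optional[str]     # None for a plain UUID site ID
    site_collection_id: str
    web_id: Optional[str]       # None for a plain UUID site ID


def _guid(text: str) -> str:
    # uuid.UUID raises ValueError for malformed input; str() gives canonical lowercase.
    return str(uuid.UUID(text.strip()))


def parse_site_id(value: str) -> SiteId:
    """Accept a plain UUID or a compound hostname,siteCollectionId,webId value."""
    parts = value.split(",")
    if len(parts) == 1:
        return SiteId(None, _guid(parts[0]), None)
    if len(parts) == 3 and parts[0].strip():
        return SiteId(parts[0].strip().lower(), _guid(parts[1]), _guid(parts[2]))
    raise ValueError(f"expected a UUID or hostname,siteCollectionId,webId: {value!r}")
```

A value with two parts, or with a non-UUID segment, raises ValueError instead of reaching the tenant configuration.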
For nested subsites, extend the path:
https://graph.microsoft.com/v1.0/sites/{tenant}.sharepoint.com:/sites/{parentSite}/{subsite}/{nestedSubsite}
Subsites Scanning
Overview
When subsitesScan is set to enabled for a site, the connector recursively discovers all subsites under that site and syncs their content alongside the parent site's content. This means you only need to configure the top-level site — all nested subsites are discovered and included automatically.
How It Works
Discovery: During each sync cycle, the connector calls the Graph API (GET /sites/{siteId}/sites) to list direct child subsites, then recurses into each child to discover the full subsite tree.
Content fetching: For each discovered subsite, the connector fetches document libraries and site pages using the same syncColumnName as the parent site.
Scope hierarchy: Subsite content is ingested under the parent site's scope tree. Each subsite appears as a folder at its relative path within the hierarchy (e.g., /RootScope/SubsiteA/Documents/file.pdf).
File diff: Subsite items are keyed under the parent site's ID in the file-diff mechanism. If a subsite is later removed or reconfigured, its files are detected as deleted and cleaned up.
Deduplication with Standalone Sites
If a subsite is also configured as a standalone site (using its compound site ID), it is excluded from the parent's recursive discovery to avoid double-syncing. The connector compares compound IDs across all configured sites and skips any match during discovery, including any further subsites.
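The discovery and deduplication behavior can be sketched as a tree walk that prunes any subsite configured as a standalone site. This is an illustration only: list_children is a hypothetical callback standing in for the GET /sites/{siteId}/sites call, and IDs are compared as exact strings, which is an assumption.

```python
from typing import Callable, Iterable, List, Set


def discover_subsites(
    site_id: str,
    list_children: Callable[[str], Iterable[str]],
    standalone_ids: Set[str],
) -> List[str]:
    """Return all subsites under site_id, skipping standalone sites and their subtrees."""
    found: List[str] = []
    seen = {site_id}
    stack = [site_id]
    while stack:
        current = stack.pop()
        for child in list_children(current):
            if child in standalone_ids or child in seen:
                # A standalone site is never pushed, so its own subsites
                # are not visited through this parent either.
                continue
            seen.add(child)
            found.append(child)
            stack.append(child)
    return found
```

Because a standalone subsite is never pushed onto the stack, its nested subsites are excluded from the parent's discovery as well, matching the behavior described above.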
Limitations
The syncColumnName is shared between the parent site and all its subsites. You cannot use a different sync column per subsite.
Subsites are only addressable via compound site IDs (hostname,siteCollectionId,webId) in the Graph API. A plain UUID cannot identify a subsite.
Configuring Document Libraries for Sync
Adding the Sync Column
Navigate to your SharePoint document library
Click Add column → Yes/No
Name the column (code default if unset: FinanceGPTKnowledge)
Set default value to No
Click Save
Column Settings
Column name: Must match the syncColumnName configured for the site in the tenant configuration YAML or SharePoint configuration list
Type: Yes/No (Boolean)
Default value: No (recommended)
Require this column: No
Column name resolution: SharePoint distinguishes between a column's internal name (set at creation time and immutable) and its display name (which can be changed later in the UI). The connector accepts either name in the syncColumnName configuration and resolves it per drive/list as follows:
If a column's display name matches syncColumnName, the connector uses that column's internal name for filtering.
Otherwise, if a column's internal name matches syncColumnName, that name is used directly.
If neither matches, the drive or SitePages list is skipped entirely: the connector logs a warning and moves on to the next drive without scanning any items.
This means you can configure syncColumnName using the human-readable display name shown in the SharePoint UI (e.g., Sync to Unique) even if the underlying internal name is different (e.g., Sync_Unique). When a display name is resolved to a different internal name, the connector logs the mapping for transparency.
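The resolution order can be sketched as two passes over a drive's column definitions. This is a simplified illustration rather than the connector's code: resolve_sync_column is a hypothetical name, the name / displayName keys are assumed to hold the internal and display names, and matching is assumed to be exact and case-sensitive.

```python
from typing import Iterable, Mapping, Optional


def resolve_sync_column(
    columns: Iterable[Mapping[str, str]], sync_column_name: str
) -> Optional[str]:
    """Return the internal column name to filter on, or None to skip the drive."""
    cols = list(columns)
    # Pass 1: a display-name match takes priority.
    for col in cols:
        if col["displayName"] == sync_column_name:
            return col["name"]
    # Pass 2: fall back to an internal-name match.
    for col in cols:
        if col["name"] == sync_column_name:
            return col["name"]
    # Neither matched: the caller skips this drive or list and logs a warning.
    return None
```

For a column created as UniqueAI and later renamed to SyncToUnique, both configured values resolve to the internal name UniqueAI.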
Note on column renaming: If a column was created as UniqueAI and later renamed to SyncToUnique in the UI, only the display name changes — the internal name remains UniqueAI. You can configure syncColumnName as either SyncToUnique (the current display name) or UniqueAI (the internal name). Using the display name is recommended as it is easier to verify in the SharePoint UI, but please be aware of this behavior in case of conflicts.
Drive and List Skipping
The connector checks each document library (drive) and the SitePages list for the presence of the configured sync column before scanning. If the column is not found on a drive, the entire drive is skipped — no items are fetched. This avoids unnecessary API calls for libraries that were never set up for sync. A warning is logged for each skipped drive so operators can verify the configuration.
User Workflow
Users mark documents for sync by:
Selecting a document in the library
Clicking the sync column
Setting value to Yes
The connector picks up flagged files on the next scan cycle.
Processing Configuration
The processing section of the tenant configuration file controls file processing behavior:
processing:
  stepTimeoutSeconds: 30
  concurrency: 1
  maxFileSizeToIngestBytes: 209715200
  allowedMimeTypes:
    - application/pdf
    - application/vnd.openxmlformats-officedocument.wordprocessingml.document
    - application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
    - application/vnd.openxmlformats-officedocument.presentationml.presentation
    - application/x-asp
    - text/plain
    - text/html
    - text/csv
  mimeTypeOverridesByExtension:
    .csv: text/csv
  scanIntervalCron: "*/15 * * * *"
Option | Default | Description |
|---|---|---|
|
| Time limit (in seconds) for a single file processing step before the file is skipped |
|
| Number of files to ingest into Unique concurrently |
|
| Maximum file size in bytes. Files larger than this are skipped with a warning in the logs |
| (none — must be configured) | List of MIME types the connector will process. The Helm chart ships sensible defaults; see Supported File Types |
|
| Map of file extension suffix to canonical MIME type. See MIME Type Overrides by Extension |
|
| Cron expression for the scheduled sync interval |
Supported File Types
Configure allowed types via the allowedMimeTypes processing option. There is no schema-level default — operators must explicitly configure this field. The Helm chart ships the following defaults:
Extension | MIME Type | Helm Default |
|---|---|---|
|
| Yes |
|
| Yes |
|
| Yes |
|
| Yes |
|
| Yes |
|
| Yes |
|
| Yes |
|
| Yes |
Note: .aspx SharePoint pages bypass the MIME type filter and are always eligible for ingestion regardless of allowedMimeTypes.
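The eligibility rules (size limit, allow-list, and the .aspx bypass) can be combined into one decision function. The sketch below is illustrative only: ingest_decision is a hypothetical name, and the order of the checks, including whether the size limit also applies to .aspx pages, is an assumption that the text above does not specify.

```python
from typing import Collection


def ingest_decision(
    file_name: str,
    mime_type: str,
    size_bytes: int,
    allowed_mime_types: Collection[str],
    max_size_bytes: int,
) -> str:
    """Return "ingest" or a "skip: ..." reason for a single file."""
    # Assumption: the size limit is checked first and applies to every file.
    if size_bytes > max_size_bytes:
        return "skip: too large"
    # .aspx pages bypass the MIME type filter.
    if file_name.lower().endswith(".aspx"):
        return "ingest"
    if mime_type in allowed_mime_types:
        return "ingest"
    return "skip: mime type not allowed"
```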
MIME Type Overrides by Extension
SharePoint occasionally reports the wrong MIME type for a file (notably .csv files come back as application/vnd.ms-excel). This causes the ingestion service to reject the file even when the operator has whitelisted the correct type. The mimeTypeOverridesByExtension map rewrites the SharePoint-reported MIME type by file extension before the allow-list check runs, so both the filter and the registered content carry the canonical value.
processing:
  mimeTypeOverridesByExtension:
    .csv: text/csv
Defaults and merging: The default value is { .csv: text/csv }. A user-supplied value replaces the default wholesale; there is no merge. To keep the CSV fix while adding your own overrides, include .csv: text/csv explicitly in your map.
Suffix matching: Keys are matched against the lowercased file name with endsWith. Both keys and file names are lowercased, so .CSV and Foo.Csv match a .csv key. Multi-segment suffixes are supported (e.g. .tar.gz); when multiple keys could match (e.g. .tar.gz and .gz are both configured), the longest match wins — so archive.tar.gz resolves via .tar.gz, not .gz.
Validation: Keys must match ^(\.[a-z0-9]+)+$ after lowercase normalization (one or more .alphanumeric segments). Empty MIME values are rejected. Invalid configuration fails fast at startup.
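The suffix matching and validation rules can be sketched as follows. This is a minimal illustration under the rules stated above, not the connector's implementation; normalize_overrides and resolve_mime_type are hypothetical names, and the MIME strings used for .gz and .tar.gz in the usage note are placeholders.

```python
import re
from typing import Dict, Mapping

KEY_PATTERN = re.compile(r"^(\.[a-z0-9]+)+$")


def normalize_overrides(overrides: Mapping[str, str]) -> Dict[str, str]:
    """Lowercase keys and reject invalid keys or empty MIME values."""
    normalized: Dict[str, str] = {}
    for key, mime in overrides.items():
        lowered = key.lower()
        if not KEY_PATTERN.match(lowered):
            raise ValueError(f"invalid extension key: {key!r}")
        if not mime or not mime.strip():
            raise ValueError(f"empty MIME type for key: {key!r}")
        normalized[lowered] = mime
    return normalized


def resolve_mime_type(file_name: str, reported: str, overrides: Mapping[str, str]) -> str:
    """Apply the longest matching suffix override, else keep the reported type."""
    lowered = file_name.lower()
    matches = [key for key in overrides if lowered.endswith(key)]
    if not matches:
        return reported
    return overrides[max(matches, key=len)]
```

With keys .csv, .gz, and .tar.gz configured, Foo.Csv resolves through .csv and archive.tar.gz resolves through .tar.gz rather than .gz, while a file with no matching suffix keeps the MIME type SharePoint reported.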
Scheduler Configuration
Sync Interval
The connector runs sync cycles at regular intervals, controlled by scanIntervalCron in the processing section of the tenant configuration file:
processing:
  scanIntervalCron: "*/15 * * * *" # Default: every 15 minutes
Considerations:
Lower intervals increase API usage and may hit rate limits
Higher intervals delay sync of new content
Recommended range: between once per hour and once per night
Logging
Application Logs
The connector produces structured JSON logs:
{
  "timestamp": "2024-01-15T10:30:00.000Z",
  "level": "info",
  "message": "Sync cycle started",
  "traceId": "abc123",
  "siteId": "xxx-xxx-xxx"
}
Log Levels
Level | Description |
|---|---|
| Detailed debugging information |
| General operational information |
| Warning conditions |
| Error conditions |
Audit Logs
Audit events are logged for compliance:
Sync cycle start/end
Files processed (create, update, delete)
Permission changes
Authentication events
Configuration changes
Metrics
System Telemetry
Standard Kubernetes metrics are exposed:
CPU usage
Memory usage
Pod restarts
Network I/O
Application Telemetry
All custom metrics use the spc_ prefix (SharePoint Connector).
Sync Cycle Metrics
Metric | Type | Labels | Description |
|---|---|---|---|
| Histogram |
| Duration of synchronization cycles (per site and full sync) |
Histogram buckets: 10s, 30s, 60s, 5m, 10m, 30m, 1h
File Processing Metrics
Metric | Type | Labels | Description |
|---|---|---|---|
| Counter |
| Files processed by ingestion pipeline steps |
| Counter |
| File change detection events ( |
| Counter |
| File move operations in Unique |
| Counter |
| File deletion operations in Unique |
Microsoft Graph API Metrics
Metric | Type | Labels | Description |
|---|---|---|---|
| Histogram |
| Request latency for Microsoft Graph API calls |
| Counter |
| Microsoft Graph API throttling (429) events |
| Counter |
| Slow Microsoft Graph API requests |
Request duration histogram buckets: 100ms, 500ms, 1s, 2s, 5s, 10s, 20s
Slow request duration_bucket values: >1s, >2s, >5s, >10s
Unique API Metrics
Metric | Type | Labels | Description |
|---|---|---|---|
| Histogram |
| Request latency for Unique GraphQL API calls |
| Counter |
| Slow Unique GraphQL API calls |
| Histogram |
| Request latency for Unique REST API calls |
| Counter |
| Slow Unique REST API calls |
Request duration histogram buckets: 100ms, 500ms, 1s, 2s, 5s, 10s, 20s
Permissions Sync Metrics
Metric | Type | Labels | Description |
|---|---|---|---|
| Histogram |
| Duration of the permissions synchronization phase for a site |
| Counter |
| Operations performed on SharePoint groups during permissions sync |
| Counter |
| Folder (scope) permission changes synced ( |
| Counter |
| File permission changes synced ( |
Permissions sync duration histogram buckets: 5s, 10s, 30s, 60s, 2m, 5m, 10m, 30m
Grafana Dashboard
A Grafana dashboard template is available in the Helm chart:
grafana:
  dashboard:
    enabled: true
    folder: connectors
Alerts
Default Alert Categories
The Helm chart organizes alerts into three categories, each independently toggleable:
Category | Alert Name | Description |
|---|---|---|
|
| GraphQL API error rate alert |
|
| Unique REST API error rate alert |
|
| Sync cycle failure alert |
Each category supports enabled, disabled (per-alert), and customRules (to override for, severity, threshold).
Custom Alerts
alerts:
  enabled: true
  rules:
    - alert: LongSyncCycle
      expr: histogram_quantile(0.95, rate(spc_sync_duration_seconds_bucket[5m])) > 3600
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Sync cycle taking too long"
Health Endpoint
The connector exposes a GET /health endpoint that reports operational health. It is separate from the existing GET /probe endpoint used for K8s liveness/readiness probes — GET /health is intended for external monitoring and SRE tooling.
The endpoint returns HTTP 200 when all checks pass and HTTP 503 when any check fails. The response follows the @nestjs/terminus format with status, info, error, and details fields.
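For external monitoring, the response can be reduced to a pass/fail flag plus the names of the failing checks. The sketch below assumes only the status and error fields described above; summarize_health is a hypothetical helper, and fetching the response over HTTP is left to the caller.

```python
import json
from typing import Any, Dict


def summarize_health(http_status: int, body: str) -> Dict[str, Any]:
    """Summarize a GET /health response into a flag and a list of failing checks."""
    payload = json.loads(body)
    # Failing checks are reported as keys of the "error" object.
    failing = sorted((payload.get("error") or {}).keys())
    healthy = http_status == 200 and payload.get("status") == "ok"
    return {"healthy": healthy, "failing": failing}
```

A 503 response whose error object contains a sync entry is summarized as unhealthy with sync listed as the failing check.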
Health Checks
The endpoint runs three checks on every request:
Sync — Evaluates sync history from a sliding window of the last N runs (configurable via HEALTH_SYNC_HISTORY_SIZE). Each site's failure ratio is computed independently: failures / appearances. If any site exceeds HEALTH_SYNC_SITE_FAILURE_THRESHOLD, the check is down. When no sync has completed yet (e.g. shortly after startup), this check is omitted from the response. A single transient per-site failure is absorbed by the window and does not trigger an alert.
Connectivity — Performs unauthenticated HTTP requests to Microsoft Graph (https://graph.microsoft.com/v1.0/) and the configured SharePoint base URL. Any HTTP response (including 401/403) proves the endpoint is reachable — only transport-level failures (DNS, TLS, timeout, connection refused) are treated as unhealthy.
Unique API — Sends a minimal { __typename } GraphQL query to both Unique API endpoints (ingestion and scope management). Unlike the connectivity check, non-2xx responses (401/403/500) are treated as unhealthy because they indicate the API is not functioning correctly. These requests bypass the internal rate limiter to avoid queuing behind sync traffic.
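The sliding-window evaluation and the two reachability rules can be sketched as below. This is an approximation, not the connector's code: SyncHistory, connectivity_up, and unique_api_up are hypothetical names, each run is modeled as a mapping from site ID to success, and "exceeds" is read as strictly greater than the threshold.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional


class SyncHistory:
    """Keeps the last `size` sync runs and evaluates per-site failure ratios."""

    def __init__(self, size: int, threshold: float) -> None:
        self.runs: Deque[Dict[str, bool]] = deque(maxlen=size)
        self.threshold = threshold

    def record(self, run: Dict[str, bool]) -> None:
        # run maps site ID -> True when that site's sync succeeded
        self.runs.append(dict(run))

    def evaluate(self) -> Optional[Dict[str, Any]]:
        if not self.runs:
            return None  # no completed sync yet: the check is omitted
        sites: Dict[str, Dict[str, int]] = {}
        for run in self.runs:
            for site, succeeded in run.items():
                entry = sites.setdefault(site, {"failures": 0, "total": 0})
                entry["total"] += 1
                entry["failures"] += 0 if succeeded else 1
        failing = sorted(
            site for site, e in sites.items()
            if e["failures"] / e["total"] > self.threshold
        )
        return {"status": "down" if failing else "up", "failingSites": failing, "sites": sites}


def connectivity_up(http_status: Optional[int], transport_error: Optional[str]) -> bool:
    # Any HTTP response, including 401/403, proves reachability.
    return transport_error is None and http_status is not None


def unique_api_up(http_status: Optional[int], transport_error: Optional[str]) -> bool:
    # Non-2xx responses count as unhealthy for the Unique API check.
    return transport_error is None and http_status is not None and 200 <= http_status < 300
```

With a window of 5 and a threshold of 0.5, a site that failed in 4 of the last 5 runs has a ratio of 0.8 and marks the check as down, while a site with 1 failure in 5 runs (0.2) does not.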
Response Examples
Healthy (200):
{
  "status": "ok",
  "info": {
    "sync": {
      "status": "up",
      "lastSyncAt": "2026-03-18T10:15:00.000Z",
      "recentSyncs": 5,
      "sites": {
        "site-aaa": { "failures": 0, "total": 5 },
        "site-bbb": { "failures": 1, "total": 5 }
      }
    },
    "connectivity": {
      "status": "up",
      "graph": "reachable",
      "sharepoint": [
        { "tenant": "default", "status": "reachable" }
      ]
    },
    "uniqueApi": {
      "status": "up",
      "ingestion": "reachable",
      "scopeManagement": "reachable"
    }
  },
  "error": {},
  "details": { "...same as info when healthy..." }
}
Unhealthy (503), site exceeds sync failure threshold:
{
  "status": "error",
  "info": {
    "connectivity": { "status": "up", "..." : "..." },
    "uniqueApi": { "status": "up", "..." : "..." }
  },
  "error": {
    "sync": {
      "status": "down",
      "lastSyncAt": "2026-03-18T10:15:00.000Z",
      "threshold": 0.5,
      "failingSites": ["site-bbb"],
      "sites": {
        "site-aaa": { "failures": 0, "total": 5 },
        "site-bbb": { "failures": 4, "total": 5 }
      }
    }
  },
  "details": { "...all checks combined..." }
}
Configuration
Variable | Default | Description |
|---|---|---|
|
| Number of recent sync runs in the sliding window |
|
| Per-site failure ratio (0–1) that triggers unhealthy when exceeded |
|
| Timeout in milliseconds for each reachability ping (connectivity checks) |
Complete Re-ingestion
To perform a complete re-ingestion of all synced SharePoint content:
Prerequisites
Access to Unique API or admin interface
Ability to pause the connector
Step 1: Pause the SharePoint Connector
Scale down the deployment:
kubectl scale deployment sharepoint-connector --replicas=0 -n sharepoint-connector
Step 2: Delete Root Scope and All Content
Use the Unique Public API to delete content recursively starting from the root scope. This removes all synced content.
Warning: This operation is irreversible. Ensure you have backups if needed.
Step 3: Re-enable the Connector
Scale up the deployment:
kubectl scale deployment sharepoint-connector --replicas=1 -n sharepoint-connector
The connector will perform a full sync on the next cycle, re-ingesting all flagged content.
Further Guidance
A dedicated re-ingestion runbook with extended prerequisites, API request examples, and operational caveats will be linked here in a later documentation update.