SharePoint Connector - FAQ
General
What type of connector is this?
Answer: The SharePoint Connector is a pull-based synchronization service that periodically scans SharePoint sites and syncs flagged documents to the Unique knowledge base.
Key characteristics:
Runs on a schedule (default: every 15 minutes)
Pulls content from SharePoint (vs. push-based Power Automate v1)
Requires explicit flagging of documents via a custom column
Operates as a background service without user interaction
How does this differ from the Power Automate connector (v1)?
Answer:
| Aspect | v1 (Power Automate) | v2 (SharePoint Connector) |
|---|---|---|
| Architecture | Push-based | Pull-based |
| Trigger | Power Automate flow | Scheduled scan |
| Dependencies | Power Automate license | None (standalone) |
| Deployment | Power Automate cloud | Kubernetes container |
| Control | Limited | Full control |
Permissions
Why Sites.Selected / Lists.SelectedOperations.Selected instead of Sites.Read.All?
Answer: Sites.Selected and Lists.SelectedOperations.Selected follow the principle of least privilege:
Sites.Read.All: Grants access to ALL sites in the tenant
Sites.Selected: Only grants access to explicitly approved sites
Lists.SelectedOperations.Selected: Only grants access to explicitly approved document libraries
Benefits:
Administrators control exactly which sites or libraries are accessible
Each grant is auditable and revocable
Meets enterprise security requirements
Aligns with zero-trust principles
Why do I need GroupMember.Read.All for permission sync?
Answer: SharePoint permissions often reference Entra ID (Azure AD) groups. To sync these permissions to Unique, the connector must:
Read the permission entry (group ID)
Expand the group to get member list
Map members to Unique users
Without GroupMember.Read.All, group-based permissions cannot be synchronized.
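The three steps above can be sketched in a few lines. This is an illustrative outline only, not the connector's implementation: `PermissionEntry`, `resolve_unique_users`, and the two lookup dictionaries are hypothetical names, and the group membership data stands in for whatever a GroupMember.Read.All-backed lookup would return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermissionEntry:
    """One SharePoint permission entry (hypothetical shape for illustration)."""
    principal_id: str
    is_group: bool


def resolve_unique_users(entries, group_members, unique_users_by_entra_id):
    """Expand group entries to members, then map members to Unique user IDs.

    group_members: dict of group ID -> list of Entra user IDs.
    unique_users_by_entra_id: dict of Entra user ID -> Unique user ID.
    """
    entra_user_ids = []
    for entry in entries:
        if entry.is_group:
            # Step 2: expand the group to its member list.
            entra_user_ids.extend(group_members.get(entry.principal_id, []))
        else:
            entra_user_ids.append(entry.principal_id)

    # Step 3: map members to Unique users, dropping unknown users and duplicates.
    resolved = []
    for entra_id in entra_user_ids:
        unique_id = unique_users_by_entra_id.get(entra_id)
        if unique_id is not None and unique_id not in resolved:
            resolved.append(unique_id)
    return resolved
```

If the group lookup is unavailable, `group_members` is effectively empty and every group entry resolves to no users, which is why group-based permissions cannot be synchronized without the permission.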
Why can't I read SharePoint site group members?
Answer: SharePoint site groups have a visibility setting: "Who can view the membership of the group?"
If this is not set to "Everyone", the connector cannot read group members.
Solutions:
Set group visibility to "Everyone"
Add the app principal as a group member/owner
Grant Full Control to the app principal
How do public and private SharePoint sites affect Everyone permissions?
Answer: Private and public sites can behave differently for tenant-wide visibility:
Private site: Access is typically limited to explicit members/owners/visitors.
Public site (org-visible): SharePoint can include tenant-wide principals such as Everyone except external users for read visibility.
The connector intentionally does not expand tenant-wide principals (Everyone, Everyone except external users) during permission sync. This avoids broad permission replication into Unique and can create a visible difference between SharePoint and Unique access behavior.
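As a rough illustration of that behavior, the filter below drops tenant-wide principals before any expansion happens. Matching on display names is an assumption made for readability; the connector may identify these principals differently (for example by claim or login name), and `drop_tenant_wide` is a hypothetical helper.

```python
# Display names are assumed here; the real identifiers may differ.
TENANT_WIDE_PRINCIPALS = {"Everyone", "Everyone except external users"}


def drop_tenant_wide(principal_names):
    """Return the principals that remain eligible for permission sync."""
    return [name for name in principal_names if name not in TENANT_WIDE_PRINCIPALS]
```

A site whose only read grant is one of these principals therefore ends up with no synced read access in Unique.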
Configuration
What are the two ways to configure sites?
Answer: The connector supports two configuration sources:
| Source | Description | Use Case |
|---|---|---|
| config_file | Static YAML configuration | Simple deployments, fixed site list |
| sharepoint_list | Dynamic configuration from SharePoint list | Self-service, frequent changes |
Static (YAML file):
sharepoint:
  sitesSource: config_file
  sites:
    - siteId: "xxx-xxx-xxx"
      syncColumnName: UniqueAI
      ingestionMode: recursive
      scopeId: scope_xxx
      syncMode: content_only
Dynamic (SharePoint list):
sharepoint:
  sitesSource: sharepoint_list
  sharepointList:
    siteId: "config-site-id"
    listId: "00000000-0000-0000-0000-000000000000"
What columns are needed for the SharePoint configuration list?
Answer: When using sharepoint_list as the sites source, create a list with these columns. Only siteId must be present as a column on the list — every other column marked Yes* can instead be supplied via sharepoint.siteDefaults in the tenant config, in which case the column can be omitted from the list entirely. See Site Defaults.
| Column Display Name | Type | Required | Description |
|---|---|---|---|
| siteId | Single line text | Yes | SharePoint site ID (UUID) |
| syncColumnName | Single line text | Yes* | Column marking files for sync |
|  | Choice | Yes* |  |
| scopeId | Single line text | Yes* | Unique scope ID |
|  | Choice | Yes* |  |
|  | Choice | Yes* |  |
|  | Number | No | Optional limit |
|  | Choice | No |  |
|  | Choice | No | Inheritance settings |
|  | Choice | No |  |
Are subsites automatically included?
Answer: Only if subsitesScan is set to enabled for a site. When enabled, the connector recursively discovers all subsites under the configured site and syncs their content using the parent site's syncColumnName. See Subsites Scanning for details.
How do I find SharePoint Site IDs?
Answer: Several methods are available:
Method 1: Graph Explorer
GET https://graph.microsoft.com/v1.0/sites/{hostname}:/{site-path}
Example:
GET https://graph.microsoft.com/v1.0/sites/contoso.sharepoint.com:/sites/marketing
Method 2: PowerShell
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/marketing" -Interactive
Get-PnPSite | Select-Object Id
Method 3: SharePoint URL Pattern
The site ID follows the format: {hostname},{site-collection-id},{web-id}
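For scripting around these methods, a small helper can build the Method 1 lookup URL and split a composite ID into its three parts. The function and class names below are hypothetical and not part of the connector; the sketch only does string handling and makes no network calls.

```python
from typing import NamedTuple
from urllib.parse import quote


class CompositeSiteId(NamedTuple):
    hostname: str
    site_collection_id: str
    web_id: str


def parse_composite_site_id(value: str) -> CompositeSiteId:
    """Split '{hostname},{site-collection-id},{web-id}' into named parts."""
    parts = value.split(",")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected 'hostname,site-collection-id,web-id', got {value!r}")
    return CompositeSiteId(*parts)


def site_lookup_url(hostname: str, site_path: str) -> str:
    """Build the Graph URL from Method 1 for a server-relative site path."""
    # quote() keeps '/' intact and escapes characters such as spaces.
    return f"https://graph.microsoft.com/v1.0/sites/{hostname}:/{quote(site_path.strip('/'))}"
```

Calling `site_lookup_url("contoso.sharepoint.com", "/sites/marketing")` reproduces the example request URL shown under Method 1.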
I renamed my sync column in SharePoint but the connector stopped picking up files. Why?
Answer: SharePoint columns have an internal name (set at creation, immutable) and a display name (changeable). Renaming a column in the SharePoint UI only changes the display name — the Microsoft Graph API still uses the original internal name. The syncColumnName in the connector configuration must match the internal name, not the display name.
For example, if you created a column as UniqueAI and later renamed it to SyncToUnique, the connector configuration must still use UniqueAI.
To avoid confusion later, we recommend creating a new column with the desired name and deleting the old one instead of renaming the existing column.
Sync Behavior
What safety guards does the connector have?
Answer: The connector includes safeguards to prevent accidental data loss:
Full-deletion protection: If the file diff would delete all files stored in Unique for a site, the sync cycle for that site is aborted. This prevents accidental full deletion due to misconfiguration or transient issues. To intentionally remove all content for a site, set the site's syncStatus to deleted in the configuration.
Duplicate scope ID detection: The connector runs two dedup passes: by scopeId (skipped for in_parent: rows since multiple sites can legitimately share the same parent) and by siteId (a SharePoint site can only be configured once, regardless of variant). First-occurrence wins; later duplicates are logged and skipped.
Scope ownership validation: Each root scope is tagged with the site that owns it. If a scope was already claimed by a different site, the sync for that site fails immediately, preventing two sites from accidentally writing into the same scope.
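The first two guards can be illustrated with a short sketch. It is a simplified model rather than the connector's code: sites are plain dictionaries, the helper names are hypothetical, and running the scopeId pass before the siteId pass is an assumption.

```python
def would_delete_everything(stored_keys, keys_to_delete):
    """True when a non-empty set of stored files would be removed entirely."""
    stored = set(stored_keys)
    return bool(stored) and stored <= set(keys_to_delete)


def _keep_first(sites, key, exempt=lambda site: False):
    """Keep the first site per value of `key`; exempt sites are always kept."""
    seen, kept, skipped = set(), [], []
    for site in sites:
        if exempt(site):
            kept.append(site)
        elif site[key] in seen:
            skipped.append(site)
        else:
            seen.add(site[key])
            kept.append(site)
    return kept, skipped


def dedupe_sites(sites):
    """Run both dedup passes and return (kept, skipped)."""
    # Pass 1: by scopeId, skipping in_parent: rows that may share a parent.
    kept, dup_scope = _keep_first(
        sites, "scopeId", exempt=lambda s: s["scopeId"].startswith("in_parent:")
    )
    # Pass 2: by siteId, with no exemptions.
    kept, dup_site = _keep_first(kept, "siteId")
    return kept, dup_scope + dup_site
```

In this model an empty store never trips the full-deletion guard, so a brand-new site with nothing ingested yet can still complete its first cycle.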
What happens when a file is deleted from SharePoint?
Answer: The file is automatically removed from the Unique knowledge base on the next sync cycle. The file diff mechanism detects:
Files deleted from SharePoint
Files with sync flag changed to "No"
Both are treated as deletions in Unique.
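Conceptually, the diff reduces to a set difference between what Unique stores and what SharePoint still flags. The sketch below is a local illustration under that assumption; the actual diff runs server-side, and `files_to_delete` and its input shapes are hypothetical.

```python
def files_to_delete(stored_keys, sharepoint_files):
    """Return stored file keys that should be removed from Unique.

    stored_keys: keys of files currently stored in Unique for the site.
    sharepoint_files: dict of file key -> True if the sync flag is "Yes".
    """
    still_flagged = {key for key, flagged in sharepoint_files.items() if flagged}
    # Anything stored but no longer flagged (or no longer present) is a deletion.
    return sorted(set(stored_keys) - still_flagged)
```

A file that disappeared from SharePoint and a file whose flag changed to "No" both fall out of `still_flagged`, so they are handled identically.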
What happens if I change a site's scopeId?
Answer: The connector detects that the root scope has changed and automatically migrates all child scopes from the old root to the new root. After migration the old root scope is deleted. If migration fails for any child scope, the error is logged and the sync continues.
The same migration path covers transitions between the two scopeId variants:
scope_<X> → scope_<Y> - existing behaviour, no change.
scope_<X> → in_parent:scope_<P> - the existing scope is reused (claim by externalId still hits) and moved under <P> if it isn't already there.
in_parent:scope_<P> → scope_<Y> - the existing root-scope-migration logic moves children to <Y> and deletes the auto-created scope.
in_parent:scope_<X> → in_parent:scope_<Y> - the existing scope is moved under <Y> (same global externalId lookup).
When should I use in_parent: instead of a fixed scope ID?
Answer: Use in_parent:scope_<parentId> when you want the connector to find-or-create a per-site root scope under a shared parent automatically, instead of pre-creating one scope per site. It's useful when many sites need to be onboarded quickly and you don't want operators to materialise a scope before each ingestion request. The auto-created scope is named after the SharePoint site's URL slug. Removing the site (via syncStatus: deleted) removes the auto-created scope.
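The two scopeId variants differ only by the in_parent: prefix, so telling them apart is a simple string check. The parser below is a hypothetical sketch of that distinction; `ScopeTarget` and `parse_scope_id` are illustrative names, not connector APIs.

```python
from typing import NamedTuple

IN_PARENT_PREFIX = "in_parent:"


class ScopeTarget(NamedTuple):
    scope_id: str                   # the fixed scope, or the shared parent scope
    auto_create_under_parent: bool  # True for the in_parent: variant


def parse_scope_id(value: str) -> ScopeTarget:
    """Classify a configured scopeId as fixed or find-or-create under a parent."""
    if value.startswith(IN_PARENT_PREFIX):
        return ScopeTarget(value[len(IN_PARENT_PREFIX):], True)
    return ScopeTarget(value, False)
```

For example, `parse_scope_id("in_parent:scope_abc")` yields the parent `scope_abc` with auto-creation enabled, while `parse_scope_id("scope_abc")` refers to that scope directly.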
What happens if I unflag a document?
Answer: Setting the sync column to "No" is treated as a deletion request. On the next sync cycle:
Connector detects the flag change via the server-side file diff
File is removed from the Unique knowledge base
Are subfolders synced?
Answer: It depends on the ingestionMode setting for each site:
| Mode | Behavior |
|---|---|
| recursive | Scans all subfolders, maintains folder hierarchy in Unique |
|  | All flagged files go to a single root scope |
The sync column must be set on individual files (not folders).
What file types are supported?
Answer: The Helm chart ships the following default MIME types:
PDF (.pdf)
Word (.docx)
Excel (.xlsx)
PowerPoint (.pptx)
Text (.txt)
HTML (.html)
ASP/ASPX (.asp, .aspx)
CSV (.csv)
SharePoint pages (.aspx) bypass the MIME type filter and are always eligible regardless of configuration. Additional or fewer types can be configured via allowedMimeTypes in the processing configuration. Note: there is no schema-level default — allowedMimeTypes must be explicitly configured.
About CSV files: SharePoint reports .csv files with the MIME type application/vnd.ms-excel (the legacy Excel type), which the ingestion service rejects. The connector ships a default mimeTypeOverridesByExtension mapping of .csv → text/csv that rewrites the reported MIME type before the allow-list check. If you previously added application/vnd.ms-excel to allowedMimeTypes as a workaround, you should now use text/csv instead — and remove application/vnd.ms-excel unless you actually want to ingest legacy .xls files. If you already configure mimeTypeOverridesByExtension, note that user-supplied maps replace the default wholesale; include .csv: text/csv explicitly in your map to retain the fix. See MIME Type Overrides by Extension.
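The order of operations described above (rewrite the MIME type by extension, then check the allow-list) can be sketched as follows. The function names are hypothetical, and lower-casing the extension before lookup is an assumption of this sketch rather than documented behavior.

```python
import os

# Default mapping described above; a user-supplied map replaces it wholesale.
DEFAULT_OVERRIDES = {".csv": "text/csv"}


def effective_mime_type(file_name, reported_mime_type, overrides=None):
    """Apply the extension override, falling back to the reported MIME type."""
    mapping = DEFAULT_OVERRIDES if overrides is None else overrides
    extension = os.path.splitext(file_name)[1].lower()
    return mapping.get(extension, reported_mime_type)


def is_allowed(file_name, reported_mime_type, allowed_mime_types, overrides=None):
    """Check the allow-list against the rewritten MIME type."""
    return effective_mime_type(file_name, reported_mime_type, overrides) in allowed_mime_types
```

Passing a custom `overrides` map without a `.csv` entry makes a CSV fall back to `application/vnd.ms-excel`, which mirrors why the `.csv` mapping has to be repeated in user-supplied maps.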
What is the maximum file size?
Answer: Default is 200 MB, configurable via maxFileSizeToIngestBytes in the processing configuration. Larger files are skipped with a warning in the logs.
Troubleshooting
Why aren't my documents syncing?
Checklist:
Is the sync column set to "Yes" for the document?
Is the site configured (in YAML or SharePoint list)?
Is the site's syncStatus set to active?
Is the file type in allowedMimeTypes?
Is the file under maxFileSizeToIngestBytes?
Check connector logs for errors
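When triaging many documents, the checklist can be expressed as a function that returns every reason a file would be skipped. This is a diagnostic sketch with hypothetical field names (`flagged`, `mimeType`, `sizeBytes`); it assumes a file exactly at the size limit is still accepted, which the connector may or may not do.

```python
def sync_blockers(doc, site, allowed_mime_types, max_file_size_bytes):
    """Return human-readable reasons a document would not sync (empty = eligible)."""
    blockers = []
    if not doc["flagged"]:
        blockers.append("sync column is not set to Yes")
    if site is None:
        blockers.append("site is not configured")
    elif site["syncStatus"] != "active":
        blockers.append("site syncStatus is not active")
    if doc["mimeType"] not in allowed_mime_types:
        blockers.append("MIME type is not in allowedMimeTypes")
    if doc["sizeBytes"] > max_file_size_bytes:
        blockers.append("file exceeds maxFileSizeToIngestBytes")
    return blockers
```

An empty result means none of the checklist items apply, at which point the connector logs are the next place to look.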
How does the connector behave when errors occur?
Answer: The connector uses scenario-based handling to keep sync cycles running:
Transient API/network issues are retried with backoff
Non-retryable item errors are logged and skipped
Configuration/authentication problems require operator action and can fail a cycle early
Detailed behavior by scenario is documented in Flows.
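The three scenarios can be modeled with two exception types and a retry helper. This is a generic sketch of the pattern, not the connector's error handling: the exception classes, attempt count, and delays are hypothetical, and treating an exhausted retry as a skipped item is an assumption.

```python
import time


class TransientError(Exception):
    """Retryable failure such as throttling or a network timeout."""


class ItemError(Exception):
    """Non-retryable failure tied to a single item."""


def call_with_backoff(operation, max_attempts=4, base_delay_seconds=1.0, sleep=time.sleep):
    """Retry `operation` on TransientError, doubling the delay each attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay_seconds * 2 ** (attempt - 1))


def process_items(items, handle, sleep=time.sleep):
    """Process every item; return (item, message) pairs for the skipped ones."""
    skipped = []
    for item in items:
        try:
            call_with_backoff(lambda: handle(item), sleep=sleep)
        except (TransientError, ItemError) as error:
            skipped.append((item, str(error)))
    # Any other exception (e.g. configuration or auth) propagates and ends the cycle.
    return skipped
```

Injecting `sleep` keeps the helper testable without real waiting; production code would leave the default `time.sleep`.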
Why do I see "Site not found" errors?
Causes:
Incorrect site ID
Site-specific permission not granted
Site deleted or renamed
Resolution:
Verify site ID using Graph Explorer
Re-grant site access via PowerShell
Check site exists in SharePoint
Why do I see "Access denied" errors?
Causes:
Sites.Selected or Lists.SelectedOperations.Selected not granted for the site/library
Admin consent not completed
Certificate/credential issues
Permission scope mismatch: Library-level grants (issued with the -List parameter) require Lists.SelectedOperations.Selected on the app registration. If the app only has Sites.Selected, those library-level grants are invisible to it and requests return 403.
Resolution:
Check whether the admin issued site-level or library-level grants, and ensure the app registration has the matching permission (Sites.Selected for sites, Lists.SelectedOperations.Selected for libraries, or both)
Complete admin consent in Azure Portal
Verify certificate configuration
Why is sync taking too long?
Possible causes:
Too many files to process
Large file sizes
API rate limiting
Network latency
Solutions:
Increase concurrency in the processing configuration
Review and reduce flagged files
Check for rate limit warnings in logs
Verify network connectivity
Why is a public SharePoint site accessible, but its content is not visible in Unique for all users?
Answer: This usually occurs when SharePoint visibility is granted through tenant-wide groups such as Everyone or Everyone except external users.
SharePoint may allow broad read access through those principals.
The connector does not sync those tenant-wide principals to Unique permissions.
Resolution:
Grant access through explicit users/groups that are supported by connector permission sync.
Re-run permission sync after updating site permissions.
See Permissions and Flows for supported resolution behavior.
Why do I see secretOrPrivateKey must be an asymmetric key when using RS256?
Answer: The connector received private key material in an unsupported format.
Common causes:
Key is provided as plain text that is not valid PEM/asymmetric key content
Key and certificate do not match
KeyVault-backed secret value does not contain the expected key file content
Resolution:
Provide private key content in a valid file-based PEM/asymmetric format.
Verify key/certificate pair consistency.
Reapply connector secret/config values and restart the pod.
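Before restarting the pod, a quick local check can catch the most common cause: a secret value that is not PEM-formatted private key content. The heuristic below only inspects the text framing and does not validate the key itself; `looks_like_pem_private_key` is a hypothetical helper, not a connector utility.

```python
import re

# Matches a PEM private key block with real line breaks around the body.
_PEM_PRIVATE_KEY = re.compile(
    r"-----BEGIN (?P<label>(?:RSA |EC |ENCRYPTED )?PRIVATE KEY)-----\s"
    r".+?\s-----END (?P=label)-----",
    re.DOTALL,
)


def looks_like_pem_private_key(text: str) -> bool:
    """Heuristic: does the value contain a well-framed PEM private key block?"""
    return _PEM_PRIVATE_KEY.search(text.strip()) is not None
```

A certificate pasted in place of the key, or a key stored on one line with literal `\n` sequences instead of line breaks, fails this check.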
Multi-Tenant
Can one connector serve multiple SharePoint tenants?
Answer: Not currently. Each SharePoint tenant requires a separate connector deployment. Multi-tenant support is planned for a future release.
Workaround: Deploy multiple connector instances, each configured for a different tenant.
Can I sync from multiple SharePoint sites?
Answer: Yes, configure multiple sites in the tenant configuration:
Static configuration:
sharepoint:
  sitesSource: config_file
  sites:
    - siteId: "site-id-1"
      # ... other settings
    - siteId: "site-id-2"
      # ... other settings
Dynamic configuration: Add multiple rows to the SharePoint configuration list.
Each site must have Sites.Selected or library-specific Lists.SelectedOperations.Selected permission granted separately.
Performance
What are the resource requirements?
Answer:
Memory: ~2 GB
CPU: 1 core
Storage: Minimal (streaming, no local storage)
What are the API rate limits?
Answer: Microsoft Graph limits:
~10,000 requests per 10 minutes per app
4 concurrent requests per resource type
The connector respects these limits via configurable rate limiting and exponential backoff.
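One common way to stay under a requests-per-window limit is a sliding-window limiter. The sketch below shows the general technique only; it is not the connector's rate limiter, and the class name and parameters are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` interval."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def seconds_until_allowed(self, now):
        """Return 0.0 and record the request if allowed, else the wait time."""
        # Drop timestamps that have left the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return 0.0
        return self.window_seconds - (now - self._timestamps[0])
```

A caller would pass `time.monotonic()` as `now`, sleep for any non-zero result, and then ask again before sending the request.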
Certificates
What certificate formats are supported, and do I need the thumbprint?
Answer: Generate certificates with OpenSSL or PowerShell and keep the deployment on connector-compatible asymmetric key/certificate material.
OpenSSL output can be .key + .crt and optionally .pfx
PowerShell commonly produces .cer / .pfx
If needed, convert formats to the recommended PEM key/cert files before configuring the connector
After uploading the certificate to Entra App Registration, capture the Thumbprint (SHA) and add it to connector configuration where thumbprint is required by your deployment setup.
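To cross-check the value shown in the portal, the thumbprint can be computed locally from the PEM certificate. This sketch assumes the thumbprint is the SHA-1 digest of the DER-encoded certificate, rendered as uppercase hex; `certificate_thumbprint` is a hypothetical helper name.

```python
import hashlib
import ssl


def certificate_thumbprint(pem_certificate: str) -> str:
    """Return the SHA-1 thumbprint (uppercase hex) of a PEM certificate."""
    # Strip the PEM framing and base64-decode to the DER bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_certificate)
    return hashlib.sha1(der_bytes).hexdigest().upper()
```

Reading the `.crt` file as text and passing its contents to this function should yield the same 40-character value that the app registration displays, provided the file holds a single certificate.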
Related Documentation
Operator Guide - Deployment and operations
Authentication - Auth setup details
Configuration - Environment variables
Permissions - API permissions
Standard References
Microsoft Graph API - Graph documentation
SharePoint REST API - SharePoint REST
Sites.Selected - Sites.Selected permission
Lists.SelectedOperations.Selected - Library-specific permission