SharePoint Connector - FAQ

General

What type of connector is this?

Answer: The SharePoint Connector is a pull-based synchronization service that periodically scans SharePoint sites and syncs flagged documents to the Unique knowledge base.

Key characteristics:

  • Runs on a schedule (default: every 15 minutes)

  • Pulls content from SharePoint (vs. push-based Power Automate v1)

  • Requires explicit flagging of documents via a custom column

  • Operates as a background service without user interaction

How does this differ from the Power Automate connector (v1)?

Answer:

| Aspect | v1 (Power Automate) | v2 (SharePoint Connector) |
| --- | --- | --- |
| Architecture | Push-based | Pull-based |
| Trigger | Power Automate flow | Scheduled scan |
| Dependencies | Power Automate license | None (standalone) |
| Deployment | Power Automate cloud | Kubernetes container |
| Control | Limited | Full control |

Permissions

Why Sites.Selected / Lists.SelectedOperations.Selected instead of Sites.Read.All?

Answer: Sites.Selected and Lists.SelectedOperations.Selected follow the principle of least privilege:

  • Sites.Read.All: Grants access to ALL sites in the tenant

  • Sites.Selected: Only grants access to explicitly approved sites

  • Lists.SelectedOperations.Selected: Only grants access to explicitly approved document libraries

Benefits:

  • Administrators control exactly which sites or libraries are accessible

  • Each grant is auditable and revocable

  • Meets enterprise security requirements

  • Aligns with zero-trust principles

Why do I need GroupMember.Read.All for permission sync?

Answer: SharePoint permissions often reference Entra ID (Azure AD) groups. To sync these permissions to Unique, the connector must:

  1. Read the permission entry (group ID)

  2. Expand the group to get its member list

  3. Map members to Unique users

Without GroupMember.Read.All, group-based permissions cannot be synchronized.
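The three steps above can be sketched in a few lines. This is only an illustration: the in-memory dictionaries stand in for Microsoft Graph and the Unique user directory, and matching members to Unique users by e-mail address is an assumption, not a documented connector behavior.

```python
from typing import Dict, List, Set


def resolve_group_permissions(
    permission_group_ids: List[str],
    group_members: Dict[str, List[str]],
    unique_users_by_email: Dict[str, str],
) -> Set[str]:
    """Return the Unique user IDs that a set of group-based permissions resolves to."""
    resolved: Set[str] = set()
    for group_id in permission_group_ids:  # step 1: read the permission entry
        for email in group_members.get(group_id, []):  # step 2: expand the group
            # step 3: map the member to a Unique user (hypothetical e-mail match)
            user_id = unique_users_by_email.get(email.lower())
            if user_id is not None:
                resolved.add(user_id)
    return resolved
```

Members without a matching Unique user are simply dropped in this sketch; groups that cannot be expanded contribute nothing, which mirrors what happens when group membership cannot be read.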

Why can't I read SharePoint site group members?

Answer: SharePoint site groups have a visibility setting: "Who can view the membership of the group?"

If this is not set to "Everyone", the connector cannot read group members.

Solutions:

  1. Set group visibility to "Everyone"

  2. Add the app principal as a group member/owner

  3. Grant Full Control to the app principal

How do public and private SharePoint sites affect Everyone permissions?

Answer: Private and public sites can behave differently for tenant-wide visibility:

  • Private site: Access is typically limited to explicit members/owners/visitors.

  • Public site (org-visible): SharePoint can include tenant-wide principals such as Everyone except external users for read visibility.

The connector intentionally does not expand tenant-wide principals (Everyone, Everyone except external users) during permission sync. This avoids broad permission replication into Unique and can create a visible difference between SharePoint and Unique access behavior.
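A minimal sketch of that filtering step is shown here. Matching principals by display name is an assumption made for illustration; the connector may identify tenant-wide principals differently.

```python
from typing import Iterable, List

# Principals named in this FAQ as intentionally not expanded.
TENANT_WIDE_PRINCIPALS = {"everyone", "everyone except external users"}


def expandable_principals(principal_names: Iterable[str]) -> List[str]:
    """Drop tenant-wide principals and keep everything else in order."""
    return [
        name
        for name in principal_names
        if name.strip().lower() not in TENANT_WIDE_PRINCIPALS
    ]
```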

Configuration

What are the two ways to configure sites?

Answer: The connector supports two configuration sources:

| Source | Description | Use Case |
| --- | --- | --- |
| config_file | Static YAML configuration | Simple deployments, fixed site list |
| sharepoint_list | Dynamic configuration from SharePoint list | Self-service, frequent changes |

Static (YAML file):

yaml
sharepoint:
  sitesSource: config_file
  sites:
    - siteId: "xxx-xxx-xxx"
      syncColumnName: UniqueAI
      ingestionMode: recursive
      scopeId: scope_xxx
      syncMode: content_only

Dynamic (SharePoint list):

yaml
sharepoint:
  sitesSource: sharepoint_list
  sharepointList:
    siteId: "config-site-id"
    listId: "00000000-0000-0000-0000-000000000000"

What columns are needed for the SharePoint configuration list?

Answer: When using sharepoint_list as the sites source, create a list with these columns. Only siteId must be present as a column on the list — every other column marked Yes* can instead be supplied via sharepoint.siteDefaults in the tenant config, in which case the column can be omitted from the list entirely. See Site Defaults.

| Column Display Name | Type | Required | Description |
| --- | --- | --- | --- |
| siteId | Single line text | Yes | SharePoint site ID (UUID) |
| syncColumnName | Single line text | Yes* | Column marking files for sync |
| ingestionMode | Choice | Yes* | flat or recursive |
| uniqueScopeId | Single line text | Yes* | Unique scope ID |
| syncStatus | Choice | Yes* | active, inactive, or deleted |
| syncMode | Choice | Yes* | content_only or content_and_permissions |
| maxFilesToIngest | Number | No | Optional limit |
| storeInternally | Choice | No | enabled or disabled |
| permissionsInheritanceMode | Choice | No | Inheritance settings |
| subsitesScan | Choice | No | enabled or disabled (default: disabled) |
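The interplay between list columns and sharepoint.siteDefaults can be pictured as a merge followed by a completeness check. The sketch below assumes that a value on the list row takes precedence over the default and that empty cells count as missing; neither detail is stated in this FAQ, so treat both as assumptions.

```python
from typing import Any, Dict

# Columns marked Yes* in the table: required, but may come from siteDefaults.
REQUIRED_WITH_DEFAULTS = (
    "syncColumnName",
    "ingestionMode",
    "uniqueScopeId",
    "syncStatus",
    "syncMode",
)


def resolve_site_config(row: Dict[str, Any], site_defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Merge one list row with tenant-level defaults (hypothetical precedence: row wins)."""
    if not row.get("siteId"):
        raise ValueError("siteId must be present on the list row")
    merged = dict(site_defaults)
    # Empty cells do not override a default in this sketch.
    merged.update({key: value for key, value in row.items() if value not in (None, "")})
    missing = [key for key in REQUIRED_WITH_DEFAULTS if not merged.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return merged
```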

Are subsites automatically included?

Answer: Only if subsitesScan is set to enabled for a site. When enabled, the connector recursively discovers all subsites under the configured site and syncs their content using the parent site's syncColumnName. See Subsites Scanning for details.

How do I find SharePoint Site IDs?

Answer: Several methods are available:

Method 1: Graph Explorer

none
GET https://graph.microsoft.com/v1.0/sites/{hostname}:/{site-path}

Example:

none
GET https://graph.microsoft.com/v1.0/sites/contoso.sharepoint.com:/sites/marketing

Method 2: PowerShell

powershell
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/marketing" -Interactive
Get-PnPSite | Select-Object Id

Method 3: SharePoint URL Pattern

The site ID follows the format: {hostname},{site-collection-id},{web-id}
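As a small illustration, the helpers below build the Graph lookup URL from Method 1 out of a regular site URL and split a composite site ID into the three parts named above. The function and field names are hypothetical; only the URL pattern and the ID format come from this page.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class SiteId(NamedTuple):
    hostname: str
    site_collection_id: str
    web_id: str


def graph_site_lookup_url(site_url: str) -> str:
    """Build GET https://graph.microsoft.com/v1.0/sites/{hostname}:/{site-path}."""
    parsed = urlparse(site_url)
    path = parsed.path.rstrip("/")
    return f"https://graph.microsoft.com/v1.0/sites/{parsed.netloc}:{path}"


def parse_site_id(composite: str) -> SiteId:
    """Split '{hostname},{site-collection-id},{web-id}' into its parts."""
    parts = composite.split(",")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three comma-separated parts, got: {composite!r}")
    return SiteId(*parts)
```

For example, `graph_site_lookup_url("https://contoso.sharepoint.com/sites/marketing")` yields the same URL as the Graph Explorer example above.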

I renamed my sync column in SharePoint but the connector stopped picking up files. Why?

Answer: SharePoint columns have an internal name (set at creation, immutable) and a display name (changeable). Renaming a column in the SharePoint UI only changes the display name — the Microsoft Graph API still uses the original internal name. The syncColumnName in the connector configuration must match the internal name, not the display name.

For example, if you created a column as UniqueAI and later renamed it to SyncToUnique, the connector configuration must still use UniqueAI.

To avoid confusion later, create a new column with the desired name and delete the old one instead of renaming it.
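One way to find the internal name is to list the library's columns through Microsoft Graph, where each column definition carries both a name (internal) and a displayName. The sketch below only processes such a response; the sample payload in the usage note is hand-written for illustration and fetching the response is left out.

```python
from typing import Any, Dict, Optional


def internal_column_name(columns_response: Dict[str, Any], display_name: str) -> Optional[str]:
    """Return the internal name of the column whose display name matches, if any."""
    for column in columns_response.get("value", []):
        if column.get("displayName") == display_name:
            return column.get("name")
    return None
```

With a response containing `{"name": "UniqueAI", "displayName": "SyncToUnique"}`, looking up `"SyncToUnique"` returns `"UniqueAI"`, which is the value to use for syncColumnName.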

Sync Behavior

What safety guards does the connector have?

Answer: The connector includes safeguards to prevent accidental data loss:

  • Full-deletion protection: If the file diff would delete all files stored in Unique for a site, the sync cycle for that site is aborted. This prevents accidental full deletion due to misconfiguration or transient issues. To intentionally remove all content for a site, set the site's syncStatus to deleted in the configuration.

  • Duplicate scope ID and site ID detection: The connector runs two dedup passes: one by scopeId (skipped for in_parent: rows, since multiple sites can legitimately share the same parent) and one by siteId (a SharePoint site can only be configured once, regardless of variant). The first occurrence wins; later duplicates are logged and skipped.

  • Scope ownership validation: Each root scope is tagged with the site that owns it. If a scope was already claimed by a different site, the sync for that site fails immediately, preventing two sites from accidentally writing into the same scope.
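The first two safeguards can be sketched as follows. This is an illustrative model, not the connector's code: rows are plain dictionaries with siteId and scopeId keys, and running the scopeId pass before the siteId pass is an assumption.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Row = Dict[str, str]


def _first_wins(
    rows: Iterable[Row],
    key: Callable[[Row], str],
    exempt: Callable[[Row], bool] = lambda row: False,
) -> Tuple[List[Row], List[Row]]:
    """Keep the first row per key; exempt rows are always kept."""
    seen: Set[str] = set()
    kept: List[Row] = []
    skipped: List[Row] = []
    for row in rows:
        if exempt(row):
            kept.append(row)
        elif key(row) in seen:
            skipped.append(row)  # a real service would also log this
        else:
            seen.add(key(row))
            kept.append(row)
    return kept, skipped


def dedupe_sites(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Pass 1 by scopeId (in_parent: rows exempt), pass 2 by siteId."""
    by_scope, dup_scope = _first_wins(
        rows,
        key=lambda r: r["scopeId"],
        exempt=lambda r: r["scopeId"].startswith("in_parent:"),
    )
    by_site, dup_site = _first_wins(by_scope, key=lambda r: r["siteId"])
    return by_site, dup_scope + dup_site


def would_delete_everything(stored_keys: Set[str], keys_to_delete: Set[str]) -> bool:
    """True when the diff would remove every file stored for the site."""
    return bool(stored_keys) and set(stored_keys) <= set(keys_to_delete)
```

A sync loop built on this would abort the cycle for a site whenever `would_delete_everything` returns True, unless the site's syncStatus is deleted.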

What happens when a file is deleted from SharePoint?

Answer: The file is automatically removed from the Unique knowledge base on the next sync cycle. The file diff mechanism detects:

  • Files deleted from SharePoint

  • Files with sync flag changed to "No"

Both are treated as deletions in Unique.
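Conceptually, the diff reduces to a set comparison. The sketch below is a local illustration with hypothetical inputs: a set of file keys currently stored in Unique and a mapping from file key to whether the file is still flagged in SharePoint. It does not reflect where or how the connector actually computes the diff.

```python
from typing import Dict, Set


def files_to_delete(stored_keys: Set[str], sharepoint_flags: Dict[str, bool]) -> Set[str]:
    """Stored files that are gone from SharePoint or no longer flagged for sync."""
    return {key for key in stored_keys if not sharepoint_flags.get(key, False)}
```

A file missing from `sharepoint_flags` models a deletion in SharePoint; a file mapped to False models a sync flag changed to "No".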

What happens if I change a site's scopeId?

Answer: The connector detects that the root scope has changed and automatically migrates all child scopes from the old root to the new root. After migration the old root scope is deleted. If migration fails for any child scope, the error is logged and the sync continues.

The same migration path covers transitions between the two scopeId variants:

  • scope_<X> → scope_<Y>: existing behaviour, no change.

  • scope_<X> → in_parent:scope_<P>: the existing scope is reused (claim by externalId still hits) and moved under <P> if it isn't already there.

  • in_parent:scope_<P> → scope_<Y>: the existing root-scope-migration logic moves children to <Y> and deletes the auto-created scope.

  • in_parent:scope_<X> → in_parent:scope_<Y>: the existing scope is moved under <Y> (same global externalId lookup).

When should I use in_parent: instead of a fixed scope ID?

Answer: Use in_parent:scope_<parentId> when you want the connector to find-or-create a per-site root scope under a shared parent automatically, instead of pre-creating one scope per site. It's useful when many sites need to be onboarded quickly and you don't want operators to materialise a scope before each ingestion request. The auto-created scope is named after the SharePoint site's URL slug. Removing the site (via syncStatus: deleted) removes the auto-created scope.
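To make the two scopeId variants concrete, here is a minimal parser that distinguishes a fixed scope ID from the in_parent: form. The type and function names are hypothetical, and the validation rules are assumptions; only the in_parent: prefix comes from this page.

```python
from typing import NamedTuple

IN_PARENT_PREFIX = "in_parent:"


class ScopeTarget(NamedTuple):
    mode: str  # "fixed" or "in_parent"
    scope_id: str  # the fixed scope, or the shared parent scope


def parse_scope_id(value: str) -> ScopeTarget:
    """Classify a configured scopeId as a fixed scope or a shared parent."""
    if value.startswith(IN_PARENT_PREFIX):
        parent = value[len(IN_PARENT_PREFIX):]
        if not parent:
            raise ValueError("in_parent: must be followed by a parent scope ID")
        return ScopeTarget("in_parent", parent)
    if not value:
        raise ValueError("scopeId must not be empty")
    return ScopeTarget("fixed", value)
```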

What happens if I unflag a document?

Answer: Setting the sync column to "No" is treated as a deletion request. On the next sync cycle:

  1. Connector detects the flag change via the server-side file diff

  2. File is removed from the Unique knowledge base

Are subfolders synced?

Answer: It depends on the ingestionMode setting for each site:

| Mode | Behavior |
| --- | --- |
| recursive | Scans all subfolders, maintains folder hierarchy in Unique |
| flat | All flagged files go to a single root scope |

The sync column must be set on individual files (not folders).
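The difference between the two modes can be illustrated by computing where a flagged file would land. Representing scopes as slash-separated paths is purely an assumption for this sketch; the real scope model in Unique may look different.

```python
from pathlib import PurePosixPath


def target_scope_path(root_scope: str, file_path: str, ingestion_mode: str) -> str:
    """Return a hypothetical scope path for a file under the given ingestion mode."""
    if ingestion_mode == "flat":
        return root_scope  # everything goes to the single root scope
    if ingestion_mode == "recursive":
        folder = PurePosixPath(file_path).parent
        parts = [part for part in folder.parts if part != "/"]
        return "/".join([root_scope, *parts])  # folder hierarchy is preserved
    raise ValueError(f"unknown ingestionMode: {ingestion_mode!r}")
```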

What file types are supported?

Answer: The Helm chart ships the following default MIME types:

  • PDF (.pdf)

  • Word (.docx)

  • Excel (.xlsx)

  • PowerPoint (.pptx)

  • Text (.txt)

  • HTML (.html)

  • ASP/ASPX (.asp, .aspx)

  • CSV (.csv)

SharePoint pages (.aspx) bypass the MIME type filter and are always eligible regardless of configuration. Additional or fewer types can be configured via allowedMimeTypes in the processing configuration. Note: there is no schema-level default — allowedMimeTypes must be explicitly configured.

About CSV files: SharePoint reports .csv files with the MIME type application/vnd.ms-excel (the legacy Excel type), which the ingestion service rejects. The connector ships a default mimeTypeOverridesByExtension mapping of .csv → text/csv that rewrites the reported MIME type before the allow-list check. If you previously added application/vnd.ms-excel to allowedMimeTypes as a workaround, use text/csv instead, and remove application/vnd.ms-excel unless you actually want to ingest legacy .xls files. If you already configure mimeTypeOverridesByExtension, note that a user-supplied map replaces the default wholesale; include .csv: text/csv explicitly in your map to retain the fix. See MIME Type Overrides by Extension.
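The order of operations described above (override first, allow-list second, with a user map replacing the default entirely) can be sketched like this. Detecting SharePoint pages by the .aspx extension is an assumption for illustration, and the function names are hypothetical.

```python
import os
from typing import Dict, Optional, Set

DEFAULT_OVERRIDES: Dict[str, str] = {".csv": "text/csv"}


def effective_mime_type(
    file_name: str, reported: str, overrides: Optional[Dict[str, str]] = None
) -> str:
    """Rewrite the reported MIME type by extension before the allow-list check."""
    # A user-supplied map replaces the default wholesale; it is not merged.
    mapping = DEFAULT_OVERRIDES if overrides is None else overrides
    extension = os.path.splitext(file_name)[1].lower()
    return mapping.get(extension, reported)


def is_allowed(
    file_name: str,
    reported: str,
    allowed_mime_types: Set[str],
    overrides: Optional[Dict[str, str]] = None,
) -> bool:
    """Apply the allow-list; pages (.aspx, by assumption) bypass the filter."""
    if file_name.lower().endswith(".aspx"):
        return True
    return effective_mime_type(file_name, reported, overrides) in allowed_mime_types
```

Passing a custom map without a .csv entry reproduces the pitfall described above: the CSV file keeps its reported application/vnd.ms-excel type.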

What is the maximum file size?

Answer: Default is 200 MB, configurable via maxFileSizeToIngestBytes in the processing configuration. Larger files are skipped with a warning in the logs.

Troubleshooting

Why aren't my documents syncing?

Checklist:

  1. Is the sync column set to "Yes" for the document?

  2. Is the site configured (in YAML or SharePoint list)?

  3. Is the site's syncStatus set to active?

  4. Is the file type in allowedMimeTypes?

  5. Is the file under maxFileSizeToIngestBytes?

  6. Check connector logs for errors
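The first five checks can be expressed as a small diagnostic helper that reports the first failing condition. All field names on the document and site dictionaries (syncFlag, mimeType, sizeBytes) are hypothetical; only syncStatus, allowedMimeTypes, and maxFileSizeToIngestBytes are names used on this page.

```python
from typing import Any, Dict, Optional, Set


def why_not_synced(
    doc: Dict[str, Any],
    site: Optional[Dict[str, Any]],
    allowed_mime_types: Set[str],
    max_file_size_bytes: int,
) -> Optional[str]:
    """Return the first checklist item that fails, or None if all pass."""
    if doc.get("syncFlag") != "Yes":
        return "sync column is not set to Yes"
    if site is None:
        return "site is not configured"
    if site.get("syncStatus") != "active":
        return "site syncStatus is not active"
    if doc.get("mimeType") not in allowed_mime_types:
        return "file type is not in allowedMimeTypes"
    if doc.get("sizeBytes", 0) > max_file_size_bytes:
        return "file exceeds maxFileSizeToIngestBytes"
    return None
```

If the helper returns None and the document still does not appear, the connector logs are the next place to look.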

How does the connector behave when errors occur?

Answer: The connector uses scenario-based handling to keep sync cycles running:

  • Transient API/network issues are retried with backoff.

  • Non-retryable item errors are logged and skipped.

  • Configuration/authentication problems require operator action and can fail a cycle early.
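Retrying with backoff generally looks like the sketch below. The exception type, attempt count, and delays are hypothetical values chosen for illustration, and exponential doubling is one common backoff strategy rather than a statement about the connector's exact timing.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable API or network failure."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures, doubling the delay after each failed attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted: surface the error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Any other exception type propagates immediately, which corresponds to the non-retryable case where the item is logged and skipped by the caller.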

Detailed behavior by scenario is documented in Flows.

Why do I see "Site not found" errors?

Causes:

  • Incorrect site ID

  • Site-specific permission not granted

  • Site deleted or renamed

Resolution:

  1. Verify site ID using Graph Explorer

  2. Re-grant site access via PowerShell

  3. Check site exists in SharePoint

Why do I see "Access denied" errors?

Causes:

  • Sites.Selected or Lists.SelectedOperations.Selected not granted for the site/library

  • Admin consent not completed

  • Certificate/credential issues

  • Permission scope mismatch: Library-level grants (issued with the -List parameter) require Lists.SelectedOperations.Selected on the app registration. If the app only has Sites.Selected, those library-level grants are invisible to it and requests return 403.

Resolution:

  1. Check whether the admin issued site-level or library-level grants, and ensure the app registration has the matching permission (Sites.Selected for sites, Lists.SelectedOperations.Selected for libraries, or both)

  2. Complete admin consent in Azure Portal

  3. Verify certificate configuration

Why is sync taking too long?

Possible causes:

  • Too many files to process

  • Large file sizes

  • API rate limiting

  • Network latency

Solutions:

  1. Increase concurrency in the processing configuration

  2. Review and reduce flagged files

  3. Check for rate limit warnings in logs

  4. Verify network connectivity

Why is a public SharePoint site accessible, but its content is not visible in Unique for all users?

Answer: This usually occurs when SharePoint visibility is granted through tenant-wide groups such as Everyone or Everyone except external users.

  • SharePoint may allow broad read access through those principals.

  • The connector does not sync those tenant-wide principals to Unique permissions.

Resolution:

  1. Grant access through explicit users/groups that are supported by connector permission sync.

  2. Re-run permission sync after updating site permissions.

  3. See Permissions and Flows for supported resolution behavior.

Why do I see secretOrPrivateKey must be an asymmetric key when using RS256?

Answer: The connector received private key material in an unsupported format.

Common causes:

  • Key is provided as plain text that is not valid PEM/asymmetric key content

  • Key and certificate do not match

  • KeyVault-backed secret value does not contain the expected key file content

Resolution:

  1. Provide private key content in a valid file-based PEM/asymmetric format.

  2. Verify key/certificate pair consistency.

  3. Reapply connector secret/config values and restart the pod.
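Before redeploying, a quick structural check on the secret value can rule out the most common cause. The sketch below only tests whether the text is shaped like a PEM private key with real line breaks; it does not verify that the key is valid or that it matches the certificate.

```python
import re

# BEGIN/END lines must carry the same label and be separated by real whitespace.
PEM_PRIVATE_KEY = re.compile(
    r"-----BEGIN ((?:RSA |EC |ENCRYPTED )?PRIVATE KEY)-----\s.+?\s-----END \1-----",
    re.DOTALL,
)


def looks_like_pem_private_key(value: str) -> bool:
    """Return True if the text has the outer shape of a PEM private key."""
    return bool(PEM_PRIVATE_KEY.search(value.strip()))
```

A value whose line breaks were stored as the two literal characters backslash and n fails this check, which is a typical symptom of a secret that was pasted as a single line.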

Multi-Tenant

Can one connector serve multiple SharePoint tenants?

Answer: Not currently. Each SharePoint tenant requires a separate connector deployment. Multi-tenant support is planned for a future release.

Workaround: Deploy multiple connector instances, each configured for a different tenant.

Can I sync from multiple SharePoint sites?

Answer: Yes, configure multiple sites in the tenant configuration:

Static configuration:

yaml
sharepoint:
  sitesSource: config_file
  sites:
    - siteId: "site-id-1"
      # ... other settings
    - siteId: "site-id-2"
      # ... other settings

Dynamic configuration: Add multiple rows to the SharePoint configuration list.

Each site must have Sites.Selected or library-specific Lists.SelectedOperations.Selected permission granted separately.

Performance

What are the resource requirements?

Answer:

  • Memory: ~2 GB

  • CPU: 1 core

  • Storage: Minimal (streaming, no local storage)

What are the API rate limits?

Answer: Microsoft Graph limits:

  • ~10,000 requests per 10 minutes per app

  • 4 concurrent requests per resource type

The connector respects these limits via configurable rate limiting and exponential backoff.
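A request budget of the form "N requests per window" can be enforced with a sliding-window limiter like the one sketched here. The class is hypothetical and is not the connector's implementation; the limit values would come from configuration.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds interval."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a request if the budget allows it; otherwise return False."""
        now = self._clock()
        # Forget requests that have left the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._max:
            self._stamps.append(now)
            return True
        return False
```

For the request budget quoted above, this would be instantiated as `SlidingWindowLimiter(10_000, 600)`; a caller that receives False waits and tries again, ideally combined with backoff.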

Certificates

What certificate formats are supported, and do I need the thumbprint?

Answer: Generate the certificate with OpenSSL or PowerShell. The connector needs an asymmetric private key and its matching certificate.

  • OpenSSL output can be .key + .crt and optionally .pfx

  • PowerShell commonly produces .cer/.pfx

  • If needed, convert formats to the recommended PEM key/cert files before configuring the connector

After uploading the certificate to Entra App Registration, capture the Thumbprint (SHA) and add it to connector configuration where thumbprint is required by your deployment setup.
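If the portal is not at hand, the thumbprint can also be computed locally: it is the SHA-1 hash of the DER-encoded certificate, shown as uppercase hexadecimal. The sketch below assumes a PEM-encoded certificate as input and uses only the Python standard library.

```python
import hashlib
import ssl


def certificate_thumbprint(pem_certificate: str) -> str:
    """Return the SHA-1 thumbprint of a PEM certificate as uppercase hex."""
    # Strip the PEM armor and base64-decode to the DER bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_certificate)
    return hashlib.sha1(der_bytes).hexdigest().upper()
```

The result should match the value shown next to the uploaded certificate in the Entra App Registration; a mismatch suggests that a different certificate was uploaded.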
