Sharepoint Connector

This guide helps customers resolve issues when using Fivetran to sync SharePoint data (such as documents or spreadsheets) into a destination warehouse for Snowfire AI, an Adaptive Decision Intelligence platform that leverages AI for data synthesis and analysis. A common issue is the failure to recursively sync nested SharePoint folders due to configuration errors. This document outlines the problem, its causes, solutions, and best practices, designed to fit seamlessly into a text-based document without relying on tables. Customize with specific Snowfire or SharePoint configurations, and include screenshots of Fivetran's setup forms or Snowfire query outputs where relevant.

Introduction

This troubleshooting guide addresses syncing issues between Fivetran's SharePoint connector and Snowfire AI, particularly when nested folder contents are not captured. It covers setup verification, the specific issue of non-recursive folder syncing, related problems, and preventive measures. The guide assumes you are using SharePoint Online and have a basic Fivetran setup. For initial configuration, refer to Fivetran's SharePoint setup guide at https://fivetran.com/docs/connectors/files/share-point/setup-guide. Before proceeding, ensure you have access to Fivetran's dashboard, Snowfire's data ingestion logs, and SharePoint site permissions.

Connecting SharePoint to Fivetran: Setup Guide

Prerequisites

Before connecting SharePoint to Fivetran, ensure you have:

  1. A SharePoint account.
  2. A SharePoint folder containing files with supported file types and encodings.
  3. The ability to grant Fivetran permission to read from this account.

Setup Instructions

Step 1: Choose SharePoint Access Type

Decide how Fivetran will access your SharePoint site based on your permission levels within Microsoft. In the Fivetran connection setup form, select one of the following Access Types:

1.1 Quick Setup (All Sites via Fivetran App)

  • Description: Access is delegated to Fivetran’s pre-registered Azure app. This is the simplest method and most commonly used if you have the necessary permissions.
  • Requirements:
    • Microsoft Entra Administrator (Global Admin) with “managePermissionGrantsForAll” permission.
    • SharePoint non-admin user, if allowed by a Global Administrator. See Microsoft Entra built-in roles for details.
  • Permissions Needed by Fivetran:
    • files.read
    • offline_access
    • Sites.Read.All
  • Action: Select Quick Setup in the connection setup form and proceed to Step 4: Select Sync Strategy.

1.2 Connection Setup by Non-Admin User

  • Description: A non-admin can complete the setup after a Global Administrator configures permissions. This step is performed only once.
  • Sub-Steps:
    1. In the Azure Portal, go to Microsoft Entra ID.
    2. Search for Enterprise applications in the search bar and click it.
    3. Under Security in the left navigation menu, click Consent and permissions.
    4. Under User consent settings, select Allow user consent for apps from verified publishers, for selected permissions (Recommended) and click Save.
    5. Go to Permission classifications and click Add permissions.
    6. Select Microsoft Graph and add the following permissions:
      • Sites.Read.All
      • Files.Read
      • offline_access
      • profile
      • openid
      • email
    7. (Alternative) A Global Administrator can enable users to request admin consent by navigating to Microsoft Entra ID > Enterprise Applications > Consent and Permissions > Admin Consent Settings. This prompts users to request admin approval during connection authorization.
  • Action: After updating consent settings, proceed to Step 3: Find Folder URL.

1.3 Advanced Setup (Granular Access Using Your Own Azure App)

  • Description: Create and configure your own Azure app in Microsoft Entra ID for authentication with Fivetran. This method allows reuse across multiple Fivetran connections and offers granular access control.
  • Sub-Steps:
    • 1.3.1 Register Your App (Perform only once per SharePoint tenant):
      1. In the Azure Portal, go to Microsoft Entra ID > App registrations > New registration.
      2. Enter a Name for your application (needed for Fivetran configuration).
      3. Select Accounts in this organizational directory only and click Register.
      4. On the app’s Overview page, note the Application (client) ID and Directory (tenant) ID.
      5. Go to Certificates & secrets > New client secret and note the secret value.
      6. Navigate to API Permissions > Add a permission > Microsoft Graph.
      7. Select Application permissions, choose Sites.Selected, and click Add permissions.
      8. Click Grant admin consent to apply the permissions.
    • 1.3.2 Grant Access for a Selected Site (Perform only once per SharePoint site):
      1. Navigate to the SharePoint site you want to connect to Fivetran and append /_api/site/id to the site’s URL in your browser.
      2. Note the site ID from the JSON response.
      3. Sign in to Microsoft Graph Explorer with a SharePoint Administrator (or higher) account.
      4. In Graph Explorer, click your avatar, select Consent to permissions, and choose Consent for Sites.FullControl.All.
      5. Make a POST request to https://graph.microsoft.com/v1.0/sites/{site_id}/permissions with:
        • Request Header: Content-Type: application/json
        • Request Body:{
             "roles": ["read"],
             "grantedToIdentities": [
                 {
                     "application": {
                         "id": "<application_client_id>",
                         "displayName": "<application_name>"
                     }
                 }
             ]
          }
        • Replace {site_id}, <application_client_id>, and <application_name> with values from previous steps.
      6. Allow up to a minute for changes to reflect in the SharePoint app.
  • Action: Proceed to Step 3: Find Folder URL.

1.4 Limited Sites (Legacy)

  • Description: Use this option only if selected in the connection setup form. Requires SharePoint admin permissions.
  • Sub-Steps:
    • 1.4.1 Find Tenant ID:
      1. In the Azure Portal, go to Microsoft Entra ID.
      2. Go to the Overview tab and note the Tenant ID.
    • 1.4.2 Find SharePoint Site ID:
      ECC1. Log in to SharePoint and navigate to the site you want to connect to Fivetran.
      1. Go to Home and copy the site URL.
      2. Append /_api/site/id to the URL and enter it in the browser.
      3. Note the Site ID from the JSON response. Ensure you are signed in to SharePoint.
    • 1.4.3 Grant Fivetran Access to Your Site:
      • Option 1: Using Microsoft Graph Explorer (Recommended):
        1. Log in to Microsoft Graph Explorer.
        2. Click the profile avatar, select Consent to permissions > Consent for Sites.FullControl.All. This grants permission to Graph Explorer, not Fivetran (revokable after connection creation).
        3. Run the following query, replacing {site_id} with the Site ID:{
             "roles": ["read"],
             "grantedToIdentities": [
                 {
                     "application": {
                         "id": "fc84eb7d-8983-4279-858a-45a9dafd02b0",
                         "displayName": "Fivetran App for Limited Access"
                     }
                 }
             ]
          }
        4. Verify a 201 Created response.
      • Option 2: Using an App in Microsoft Entra ID:
        1. In the Azure Portal, go to Microsoft Entra ID > App Registrations > New Registration.
        2. Enter a Name and click Register.
        3. Note the Application (client) ID from the Overview section.
        4. Navigate to API permissions > Add a permission > Microsoft Graph > Application permissions > Sites.FullControl.All.
        5. Click Grant Admin Consent for {your tenant name} > Yes.
        6. Go to Certificates & secrets > Add New client secret and note the value.
        7. In Postman, import this curl request:curl -X POST -H "Content-Type: application/x-www-form-urlencoded" -d 'client_id=<client_id>&scope=https%3A%2F%2Fgraph.microsoft.com%2F.default&client_secret=<client_secret>&grant_type=client_credentials' 'https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token'
        8. Replace <client_id>, <client_secret>, and <tenant_id> with noted values.
        9. Send the request and note the Access Token.
        10. In Postman, import this curl request:curl --location -g --request POST "https://graph.microsoft.com/v1.0/sites/<site_id>/permissions" \
          --header 'Content-Type: application/json' \
          --header 'Authorization: Bearer <access_token>' \
          --data-raw '{
             "roles": ["read"],
             "grantedToIdentities": [{
                 "application": {
                     "id": "fc84eb7d-8983-4279-858a-45a9dafd02b0",
                     "displayName": "Fivetran App for Limited Access"
                 }
             }]
          }'
        11. Replace <access_token> and <site_id> with noted values.
        12. Send the request and verify a 201 Created response.
  • Action: Proceed to Step 3: Find Folder URL.

Step 2: Re-Authorize Connection (If Needed)

  • If connection setup tests fail after granting permissions:
    1. In the connection setup form, click Re-Authorize Connection.

Step 3: Find Folder URL

  1. Log in to your SharePoint account.
  2. Navigate to the folder you want to sync.
  3. Find the folder’s URL, click Copy link, then Copy to note it for Fivetran configuration.

Step 4: Select Sync Strategy

Ensure you (or the authorizing user) are a member of the SharePoint site to sync.

  1. In the connection setup form, select a Sync Strategy:
    • Magic Folder
    • Merge Mode
  2. Enter a Destination schema name of your choice.
  3. For Merge Mode, enter a Table group name. This combines with the destination schema to form the Fivetran connection name <destination_schema>.<table_group_name>. It’s used only in Fivetran and does not appear in your destination.
  4. Based on your access type, complete the following:
    • Quick Setup:
      1. Click Authorize. You’ll be redirected to SharePoint to authorize Fivetran’s access, then back to Fivetran.
    • Advanced Setup:
      1. In the Connect section, enter the Tenant ID, Client ID, and Client Secret noted earlier.
      2. In the Base folder URL field, enter the folder URL from Step 3.
      3. (Optional) Click Run connection test to validate credentials. You can skip this, but it will be tested after configuration.
    • Limited Sites (Legacy):
      1. Enter the Microsoft Entra ID Tenant ID from Step 1.4.1.
      2. Click Authorize. You’ll be redirected to SharePoint to authorize Fivetran’s access, then back to Fivetran.

Step 5: Configure Connection (Merge Mode)

For Merge Mode, configure options to sync specific folders, file types, or subsets of files. Multiple connections with different options can target the same file system.

5.1 Configure Files

  • File Handling:
    • Extract structured data into destination tables: Parses supported file types into destination tables. Recommended for most use cases.
    • Replicate unstructured files (Beta): Copies files in their original format (e.g., PDFs, images). Compressed files are not synced. See Fivetran’s documentation.
    • (Optional, Advanced Setup only) Enable Sync file access permissions to sync user-specific permissions. Requires Azure App scope changes:
      • GroupMember.Read.All: To retrieve group members.
      • User.ReadBasic.All: To sync user email addresses (otherwise, only user IDs are synced).
  • Base Folder Path:
    • Enter the folder URL from Step 3. Fivetran examines files in this folder and its subfolders.
    • Available for Quick Setup or Admin-Approved Site Authentication - Controlled Access.
  • Files:
    1. Click + Add files to map file name patterns to destination tables.
    2. Enter a unique Table name across all SharePoint connections in the same destination schema.
    3. (Optional) File Pattern: Use a regular expression to select files under the prefix. Leave blank to sync all files. Examples:
      • data/.*: All files in the data folder, including subfolders.
      • data/.*json: All JSON files in the data folder, including subfolders.
      • data/subFolder2/report_.*\.json: JSON files in subFolder2 starting with report_.
      • report_\d{2}/\d{2}/\d{4}\.json: JSON files with date format like report_03/12/2050.json.
      • Test regex before proceeding.
      • (Optional) Click Preview Files to validate. You can skip this, but it will be tested after configuration.
    4. (Optional) Archive File Pattern: Use a regular expression to sync files from archived folders (e.g., .*json for JSON files). Configure per table to route file types to specific tables.
    5. Click Save.

5.2 Format

  • File Type: All files are processed as the selected type. Use File Pattern to specify extensions.
    • CSV or TSV:
      • Specify delimiter, escape character, and other settings as needed.
    • JSON or JSONL:
      • JSON Delivery Mode:
        • Packed: Loads JSON data into the _data column without flattening.
        • Unpacked: Flattens one level of columns and infers data types.
    • XLS/XLSX/XLSM:
      • By default, Fivetran analyzes spreadsheets to identify cell references. Alternatively, enable Manually provide cell reference.
      • Analyze Sheet:
        1. In the Spreadsheet to find data to be synced field, enter the path to an Excel file.
        2. Click Analyze sheet.
        3. Select the cell reference from the Cell reference for syncs drop-down.
      • Manually Provide Cell Reference:
        1. Enable the toggle.
        2. Enter the cell reference in the format '<sheetName>'!<startColumnName><startRowName> (e.g., 'Data2'!C3).
      • See Fivetran’s documentation for Excel syncing details.
  • Primary Key Used for File Process and Load:
    • Upsert file using file name and line number: Uses _file and _line as surrogate keys. Unique file names are synced as new data.
    • Append file using file modified time: Uses _file, _line, and _modified. Tracks full file history.
    • Upsert file using custom primary key: Keeps the latest record version. Choose keys after saving and testing (modifiable only if initial sync fails). See Fivetran’s documentation.

5.3 Additional Options

  • Compression:
    • For compressed files without extensions (e.g., .gz, .zip), select a compression algorithm (e.g., GZIP). Use infer if files have correct extensions, or uncompressed to sync without decompression.
  • Error Handling:
    • Skip: Ignores improperly formatted data, syncing only valid data.
    • Fail: Fails the sync if any malformed data is found. Recommended unless malformed data is expected.
    • Errors trigger notifications on the Fivetran dashboard.

Step 6: Finish Fivetran Configuration

  1. Click Save & Test.
  2. Fivetran will sync your SharePoint data and validate the connection.
  3. On successful completion of setup tests, your data will sync to your destination.

Step 7: Understand Setup Tests

Fivetran performs the following tests based on your sync strategy:

  1. (Magic Folder Mode) Validating Shared URL from SharePoint: Verifies access to the specified folder URL.
  2. (Merge Mode) Connecting to API: Confirms access to SharePoint folders (if configured).
  3. (Merge Mode) Finding Tables: Ensures at least one table is specified in the files field.
  4. (Merge Mode) Validating Regex File Pattern: Validates all file pattern regexes.
  5. (Merge Mode) Validating Archive Pattern: Validates archive pattern regex (if specified).
  6. (Merge Mode) Validating EscapeChar: Checks escape character length (one character) for CSV files (if specified).
  7. Validating Infer FileType: Ensures infer is valid for API-created connections.
  8. (Merge Mode) Multi-Character Delimiter Support: Validates delimiter length (within 15 characters) for CSV files (if specified).
  9. (Merge Mode) Finding Matching Files: Verifies retrieval of 1–10 sample files based on configuration.

Related Issues

  • Incomplete Data in Snowfire Queries:
    If nested folders are not synced, Snowfire queries will miss critical data. Ensure Merge Mode is enabled and a full historical sync is completed to capture all files.
  • Performance with Deep Nesting:
    Deep folder structures can slow syncs or cause timeouts. To mitigate, limit folder depth in SharePoint or schedule syncs during off-peak hours. Monitor performance via Fivetran alerts.
  • File Type/Format Errors:
    Nested Excel files may fail to sync if in unsupported formats (e.g., older .xls files). In Merge Mode, configure cell ranges to ensure proper parsing.

Best Practices

  • Always choose Merge Mode for SharePoint libraries with nested folders, unless the structure is confirmed to be flat.
  • Leverage the file_path column in the destination to enable Snowfire's semantic search across folder hierarchies.
  • Align Fivetran sync frequency (e.g., hourly) with Snowfire's data refresh needs to ensure AI queries reflect the latest data.
  • Prevent recurrence by documenting the Sync Strategy choice in setup guides and testing with sample nested folders.

Resources

September 29, 2025