SharePoint Connector
This guide helps customers resolve issues when using Fivetran to sync SharePoint data (such as documents or spreadsheets) into a destination warehouse for Snowfire AI, an Adaptive Decision Intelligence platform that uses AI for data synthesis and analysis. A common issue is the failure to recursively sync nested SharePoint folders due to configuration errors. This document outlines the problem, its causes, solutions, and best practices.
Introduction
This troubleshooting guide addresses syncing issues between Fivetran's SharePoint connector and Snowfire AI, particularly when nested folder contents are not captured. It covers setup verification, the specific issue of non-recursive folder syncing, related problems, and preventive measures. The guide assumes you are using SharePoint Online and have a basic Fivetran setup. For initial configuration, refer to Fivetran's SharePoint setup guide at https://fivetran.com/docs/connectors/files/share-point/setup-guide. Before proceeding, ensure you have access to Fivetran's dashboard, Snowfire's data ingestion logs, and SharePoint site permissions.
Before You Start
- Confirm the connector type is set to Files (SharePoint) with either Magic Folder or Merge Mode.
- Check Snowfire's dashboard for signs of incomplete data (e.g., queries missing expected documents).
- Review Fivetran sync history for errors like "No new files detected" or unexpectedly low row counts.
- Enable debug logging in Fivetran to capture detailed error messages.
Common Issue: Nested SharePoint Folders Not Synced
Symptoms
When syncing SharePoint data, only files in the top-level folder appear in the destination warehouse. Files in subfolders are missing, leading to incomplete datasets in Snowfire AI queries. For example, a document search in Snowfire might return results only from the root folder, omitting critical files nested deeper. Fivetran logs may show a successful sync but with fewer rows than expected, without explicit error messages like "403 Forbidden" or "Timeout."
Root Cause
The issue typically arises because Fivetran's SharePoint connector is configured in Magic Folder Mode, which syncs only files directly in the specified root folder and ignores all subfolders. In contrast, Merge Mode recursively syncs files from the root folder and all its subfolders into a single destination table. Customers often select Magic Folder Mode during setup, unaware that it does not support recursive folder traversal. Additionally, deep folder nesting (e.g., 10+ levels) or inconsistent subfolder permissions can exacerbate the issue, though these are secondary to mode selection.
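The difference between the two behaviors described above can be illustrated with a small in-memory model. This is only a sketch of top-level versus recursive listing. It does not call Fivetran or SharePoint, and the folder tree and function names are hypothetical.

```python
# Hypothetical folder tree: dict values are subfolders, None marks a file.
TREE = {
    "budget.xlsx": None,
    "Finance": {
        "q1.csv": None,
        "Archive": {"2022.csv": None},
    },
}


def list_top_level(tree):
    """Return only the files directly inside the root folder."""
    return sorted(name for name, child in tree.items() if child is None)


def list_recursive(tree, prefix=""):
    """Return every file in the root folder and in all nested subfolders."""
    paths = []
    for name, child in tree.items():
        path = f"{prefix}{name}"
        if child is None:
            paths.append(path)
        else:
            # Descend into the subfolder, carrying the path built so far.
            paths.extend(list_recursive(child, prefix=f"{path}/"))
    return sorted(paths)
```

In this model, the top-level listing returns one file while the recursive listing returns three, which mirrors the lower-than-expected row counts described under Symptoms.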
Solutions
To resolve this, follow these steps to ensure recursive syncing of nested folders:
- Switch to Merge Mode:
In the Fivetran dashboard, navigate to Connectors, select your SharePoint connector, and click Edit Connection. Under Sync Strategy, change from Magic Folder to Merge Mode. Specify the destination schema and table name (e.g., sharepoint_documents). For Excel or CSV files, define a cell range (e.g., A1:Z1000) to avoid syncing entire sheets, which can inflate row counts. After updating, trigger a full historical sync to capture all files, including those in subfolders. This may take time for large libraries. Verify by querying the destination table for files with metadata like folder_path or file_path, ensuring subfolder contents appear in Snowfire queries.
- Verify SharePoint Permissions:
Ensure the SharePoint app or user account used by Fivetran has "Read" access to all subfolders. Inconsistent permissions can cause partial syncs, where some subfolders are skipped. Check Microsoft Entra ID or SharePoint site settings to grant site-wide read access. If logs show 403 Forbidden errors for specific subpaths, re-authorize the connector after updating permissions.
- Optimize Folder Structure:
If recursive syncing is slow due to thousands of files or deep nesting, consider flattening the SharePoint folder structure before syncing (e.g., move files to fewer subfolders). Alternatively, create multiple Fivetran connectors, each targeting a specific subfolder. In Snowfire, use post-sync dbt models to reconstruct folder hierarchies using path metadata for AI-friendly querying.
- Enable Incremental Syncs:
Merge Mode supports incremental updates based on file last-modified timestamps. After the initial full sync, enable incremental syncs to reduce runtime. If new subfolder files are not detected, manually trigger a sync and verify that SharePoint metadata (e.g., timestamps) is updating correctly.
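The Optimize Folder Structure step mentions reconstructing folder hierarchies from path metadata. The sketch below shows that derivation in Python rather than in a dbt model, purely to illustrate the logic. It assumes the file_path value is a forward-slash-separated path relative to the synced root folder. The function name and the output keys are hypothetical.

```python
from pathlib import PurePosixPath


def describe_path(file_path):
    """Split a synced file path into folder, file name, and nesting depth.

    Assumption: file_path uses "/" separators and is relative to the
    root folder configured in the connector.
    """
    path = PurePosixPath(file_path.strip("/"))
    parent = str(path.parent)
    return {
        "folder_path": "" if parent == "." else parent,
        "file_name": path.name,
        "depth": len(path.parts) - 1,  # 0 means the file sits in the root folder
    }
```

For example, describe_path("Finance/Archive/2022.csv") yields a folder_path of "Finance/Archive" and a depth of 2. The same split could be expressed in SQL inside a dbt model if that fits the existing pipeline better.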
Verification Steps
To confirm the fix, run a test sync on a small SharePoint folder with known nested files. In the destination warehouse, query the total row count and compare it to the expected number of files across all folder levels. In Snowfire, execute a sample query (e.g., document search across a department's files) to ensure subfolder contents are included. Check the file_path column in the destination to confirm nested paths are present.
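The comparison described above can be sketched as a small helper. It assumes you have already exported two lists: the file paths you expect from the SharePoint test folder, and the file_path values returned by a query against the destination table. The function name and result keys are hypothetical, and the paths are assumed to use "/" separators relative to the synced root.

```python
def check_sync(expected_paths, synced_paths):
    """Compare expected SharePoint files with the paths found in the destination."""
    expected = {p.strip("/") for p in expected_paths}
    synced = {p.strip("/") for p in synced_paths}
    return {
        "expected": len(expected),
        "synced": len(expected & synced),
        # Files that exist in SharePoint but never reached the destination.
        "missing": sorted(expected - synced),
        # True if at least one synced path includes a subfolder component.
        "nested_present": any("/" in p for p in synced),
    }
```

If nested_present is false and every entry in missing contains a subfolder, the result matches the symptom this guide describes rather than a permissions or file-format problem.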
Related Issues
- Incomplete Data in Snowfire Queries:
If nested folders are not synced, Snowfire queries will miss critical data. Ensure Merge Mode is enabled and a full historical sync is completed to capture all files.
- Performance with Deep Nesting:
Deep folder structures can slow syncs or cause timeouts. To mitigate, limit folder depth in SharePoint or schedule syncs during off-peak hours. Monitor performance via Fivetran alerts.
- File Type/Format Errors:
Nested Excel files may fail to sync if in unsupported formats (e.g., older .xls files). In Merge Mode, configure cell ranges to ensure proper parsing.
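To spot format problems before a sync, a file listing can be scanned for legacy extensions. This is a minimal sketch: the set of extensions is an assumption based on the .xls example above, not a list of what the connector actually rejects, and the function name is hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: extensions to flag; adjust to match what your connector supports.
LEGACY_EXTENSIONS = {".xls"}


def find_legacy_files(file_paths, extensions=LEGACY_EXTENSIONS):
    """Return the paths whose extension (case-insensitive) is in the flagged set."""
    return sorted(
        p for p in file_paths if PurePosixPath(p).suffix.lower() in extensions
    )
```

Flagged files can then be converted to a newer format in SharePoint or excluded from the synced folder.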
Best Practices
- Always choose Merge Mode for SharePoint libraries with nested folders, unless the structure is confirmed to be flat.
- Leverage the file_path column in the destination to enable Snowfire's semantic search across folder hierarchies.
- Align Fivetran sync frequency (e.g., hourly) with Snowfire's data refresh needs to ensure AI queries reflect the latest data.
- Prevent recurrence by documenting the Sync Strategy choice in setup guides and testing with sample nested folders.
Resources