Building a CSV import system for SitecoreAI

Learn how to build a self-contained Next.js application for importing CSV data into SitecoreAI using the Authoring GraphQL API.

Portrait photo of John Flores, article author

Dealing with hundreds of data items

Many organizations manage hundreds of content items annually. Teams often receive this data as CSV files from various sources, but getting it into SitecoreAI has been a manual, time-consuming process involving:

  • Manual content entry for each item (hundreds per batch)
  • Copy-paste errors and inconsistencies
  • No validation before import
  • Duplicate entries when running imports multiple times
  • Missing folder structure requiring manual setup

You end up with hours of manual work that is prone to human error and frustrating for content editors.

A self-contained import web app

Sitecore has provided workarounds that let you write your own scripts to deal with these issues. We decided to take it a step further and prioritize user experience by building a dedicated web application that content editors use directly.

How it works

The import process follows this workflow:

  1. Upload CSV: User uploads CSV file via web interface
  2. Validate: System validates required fields and business rules
  3. Preview: User reviews items to be imported
  4. Import: System creates/updates items in Sitecore via GraphQL
  5. Report: User sees success/error report with details
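The Validate step can be sketched as a pure function that runs before any API call. This is a minimal illustration, not our exact rule set: the `CsvRow` shape, the `validateRows` name, and the date rule (a `startDateTime` column in the `2026-07-01T11:00:00` format used later in this article) are assumptions.

```typescript
// Hypothetical row shape: each CSV row parsed into column name -> value.
type CsvRow = Record<string, string>;

interface RowError {
  row: number; // 1-based data row number (header excluded)
  field: string;
  message: string;
}

// Assumed format for illustration: 2026-07-01T11:00:00
const DATE_TIME_PATTERN = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}$/;

function validateRows(rows: CsvRow[], requiredFields: string[]): RowError[] {
  const errors: RowError[] = [];
  rows.forEach((row, index) => {
    // Required fields must be present and not just whitespace
    for (const field of requiredFields) {
      const value = row[field];
      if (value === undefined || value.trim() === '') {
        errors.push({ row: index + 1, field, message: 'Required value is missing' });
      }
    }
    // Example business rule: a start date, when given, must match the expected format
    const start = row['startDateTime'];
    if (start !== undefined && start !== '' && !DATE_TIME_PATTERN.test(start)) {
      errors.push({ row: index + 1, field: 'startDateTime', message: 'Expected YYYY-MM-DDTHH:mm:ss' });
    }
  });
  return errors;
}
```

An empty result means the batch can move on to the Preview step. Otherwise the errors can be shown next to the offending rows before anything is written to Sitecore.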

GraphQL implementation

The system uses Sitecore's Authoring GraphQL API to create and update items. The process is split into two steps for reliability.

Step 1: Creating items

mutation CreateItem(
  $name: String!
  $templateId: ID!
  $parent: ID!
  $language: String!
) {
  createItem(
    input: {
      name: $name
      templateId: $templateId
      parent: $parent
      language: $language
    }
  ) {
    item {
      itemId
      name
      path
    }
  }
}

Important Notes:

  • parent parameter requires a GUID (not a path) - query for parent ID first
  • templateId uses your custom template GUID
  • Don't try to include field values here - update them separately
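To catch the "path instead of GUID" mistake before it reaches the API, the request body can be built by a small helper. The sketch below is illustrative: `buildCreateItemBody` and `GUID_PATTERN` are hypothetical names, and the pattern assumes GUIDs are 32 hex digits with optional dashes and braces.

```typescript
// The CreateItem mutation from Step 1, sent with typed variables.
const CREATE_ITEM_MUTATION = `
mutation CreateItem($name: String!, $templateId: ID!, $parent: ID!, $language: String!) {
  createItem(input: { name: $name, templateId: $templateId, parent: $parent, language: $language }) {
    item { itemId name path }
  }
}`;

interface CreateItemVariables {
  name: string;
  templateId: string;
  parent: string;
  language: string;
}

// Assumption: GUIDs are 32 hex digits, with or without dashes and braces.
const GUID_PATTERN = /^\{?[0-9a-f]{8}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{4}-?[0-9a-f]{12}\}?$/i;

// Returns the JSON body for a GraphQL-over-HTTP POST, or throws if an ID is not a GUID.
function buildCreateItemBody(variables: CreateItemVariables): string {
  if (!GUID_PATTERN.test(variables.parent)) {
    throw new Error(`parent must be a GUID, got "${variables.parent}"`);
  }
  if (!GUID_PATTERN.test(variables.templateId)) {
    throw new Error(`templateId must be a GUID, got "${variables.templateId}"`);
  }
  return JSON.stringify({ query: CREATE_ITEM_MUTATION, variables });
}
```

The returned string would be sent as the body of a POST to your Authoring endpoint with a bearer token. If the parent is only known by path, resolve it to a GUID first.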

Step 2: Updating fields

mutation {
  updateItem(
    input: {
      itemId: "{GUID}"
      database: "master"
      language: "en"
      version: 1
      fields: [
        { name: "title", value: "Example Title", reset: false }
        { name: "startDateTime", value: "2026-07-01T11:00:00", reset: false }
      ]
    }
  ) {
    item {
      itemId
      name
    }
  }
}

Important Notes:

  • Always include database: "master" parameter
  • Set reset: false for each field to preserve existing data
  • In our experience, inline mutation values were more reliable than typed variables; escape quotes and backslashes in any value you inline
  • Empty field values should be filtered out before calling this mutation
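The last two notes can be combined in one helper that drops empty values and builds the inline mutation. This is a sketch under assumptions: `FieldValue`, `dropEmptyFields`, and `buildUpdateMutation` are illustrative names. It relies on `JSON.stringify` for quoting, since a JSON string literal is also a valid GraphQL string literal.

```typescript
interface FieldValue {
  name: string;
  value: string;
}

// Remove fields whose value is empty or only whitespace.
function dropEmptyFields(fields: FieldValue[]): FieldValue[] {
  return fields.filter((field) => field.value.trim() !== '');
}

// Build the Step 2 mutation with inline values.
// JSON.stringify escapes quotes, backslashes, and newlines inside each value.
function buildUpdateMutation(
  itemId: string,
  fields: FieldValue[],
  language = 'en',
  version = 1
): string {
  const fieldLines = dropEmptyFields(fields).map(
    (field) =>
      `{ name: ${JSON.stringify(field.name)}, value: ${JSON.stringify(field.value)}, reset: false }`
  );
  return `mutation {
  updateItem(
    input: {
      itemId: ${JSON.stringify(itemId)}
      database: "master"
      language: ${JSON.stringify(language)}
      version: ${version}
      fields: [
        ${fieldLines.join('\n        ')}
      ]
    }
  ) {
    item {
      itemId
      name
    }
  }
}`;
}
```

If every field in a row is empty, the `fields` list comes out empty, so a caller may prefer to skip the update call for that row.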

Implementation patterns

Here are some patterns we noted that play a big role in preventing side effects and keeping the flow of the application seamless.

1. Duplicate prevention

Running the same import twice shouldn't create duplicate items. This three-layer system prevents that:

// Check cache first
let existingId = itemIdCache.get(fullItemPath);
if (!existingId) {
  // Not in cache, check Sitecore directly
  try {
    existingId = await getItemIdByPath(fullItemPath, accessToken);
    if (existingId) {
      itemIdCache.set(fullItemPath, existingId); // Cache it
    }
  } catch (error) {
    // Item doesn't exist - we'll create it
  }
}
// If exists, update instead of create
if (existingId) {
  await updateItemFields(existingId, fieldsToUpdate, accessToken);
  return { id: existingId, path: fullItemPath, name: itemName };
}

This approach:

  • Checks in-memory cache first (fast)
  • Falls back to Sitecore query if not cached
  • Updates existing items instead of creating duplicates
  • Caches results for future lookups

2. Auto-creating folder structure

Items often need to be organized in folder hierarchies. This recursive function creates the entire path automatically:

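async function getOrCreateParentId(itemPath: string, accessToken: string): Promise<string> {
  // Try to get the existing item first
  try {
    return await getItemIdByPath(itemPath, accessToken);
  } catch (error) {
    // Lookup failed - treat the item as missing and create it
    const pathParts = itemPath.split('/').filter(Boolean);
    if (pathParts.length <= 1) {
      // Stop at the tree root instead of recursing forever
      throw new Error(`Root path not found: ${itemPath}`);
    }
    const folderName = pathParts[pathParts.length - 1];
    const parentPath = '/' + pathParts.slice(0, -1).join('/');
    // Recursively ensure the parent exists and get its GUID
    const parentId = await getOrCreateParentId(parentPath, accessToken);
    // Create this folder under the parent. createItem needs the parent GUID, not its path
    // (assumes createFolder takes the parent ID first and returns the new folder's ID)
    return await createFolder(parentId, folderName, accessToken);
  }
}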

Example: Creating an item at /sitecore/content/YourSite/Events/2026/July/event-name will automatically create Events/, 2026/, and July/ folders if they don't exist.
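To make that example concrete, here is a small sketch that lists the folders an import would need to check, from shallowest to deepest. The `foldersToEnsure` name and the idea of passing a known-good site root are assumptions for illustration. An iterative walk like this is an alternative to recursion, not a description of our exact code.

```typescript
// Returns every folder path between rootPath (exclusive) and the item (exclusive),
// ordered from shallowest to deepest.
function foldersToEnsure(itemPath: string, rootPath: string): string[] {
  if (!itemPath.startsWith(rootPath + '/')) {
    throw new Error(`${itemPath} is not under ${rootPath}`);
  }
  const relativeParts = itemPath
    .slice(rootPath.length + 1)
    .split('/')
    .filter(Boolean);

  const folders: string[] = [];
  let current = rootPath;
  // Skip the last segment: it is the item itself, not a folder
  for (const part of relativeParts.slice(0, -1)) {
    current = `${current}/${part}`;
    folders.push(current);
  }
  return folders;
}
```

For `/sitecore/content/YourSite/Events/2026/July/event-name` with root `/sitecore/content/YourSite`, this yields the `Events`, `Events/2026`, and `Events/2026/July` paths in that order, so each folder is created only after its parent.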

3. Unique item names

Multiple items with the same title need unique item names in Sitecore. This function generates URL-safe names with identifiers:

function generateItemName(title: string, uniqueIdentifier?: string): string {
  let itemName = title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')
    .replace(/\s+/g, '-')
    .replace(/-+/g, '-')
    .replace(/^-|-$/g, '');
  if (uniqueIdentifier) {
    // Add unique identifier (could be date, ID, etc.)
    itemName += `-${uniqueIdentifier}`;
  }
  return itemName;
}

Example: "Summer Festival" on July 1, 2026 becomes summer-festival-2026-07-01. This ensures:

  • URL-safe item names (no special characters)
  • Unique names even with duplicate titles
  • Meaningful, readable names in content tree
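As a usage sketch, the identifier can be derived from a date column in the CSV row. The block repeats `generateItemName` from above so it runs on its own. `itemNameForRow` is a hypothetical helper, and it assumes `startDateTime` is an ISO-style string such as `2026-07-01T11:00:00`.

```typescript
function generateItemName(title: string, uniqueIdentifier?: string): string {
  let itemName = title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')
    .replace(/\s+/g, '-')
    .replace(/-+/g, '-')
    .replace(/^-|-$/g, '');
  if (uniqueIdentifier) {
    itemName += `-${uniqueIdentifier}`;
  }
  return itemName;
}

// Hypothetical helper: use the date part (YYYY-MM-DD) of the row's start time as the identifier.
function itemNameForRow(title: string, startDateTime: string): string {
  const datePart = startDateTime.slice(0, 10);
  return generateItemName(title, datePart);
}

const festivalName = itemNameForRow('Summer Festival', '2026-07-01T11:00:00');
// festivalName === 'summer-festival-2026-07-01'
```

Two rows with the same title and the same date would still collide, so a row ID or similar column may be a safer identifier when the source data has one.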

Setup requirements

To build an implementation like this, your environment will need to be configured to talk to the Sitecore Authoring GraphQL endpoint.

Configuration checklist:

  • Authentication: Your app needs access to your CLIENT_ID, CLIENT_SECRET, and the specific AUTHORING_ENDPOINT for your instance.
  • OAuth Permissions: The client must be granted xmcloud.item.create, xmcloud.item.update, and xmcloud.item.read scopes.
  • Template Prep: Ensure your custom templates in Sitecore have field names that match your CSV mapping.
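A small startup check helps surface a missing setting before the first import runs. The sketch below assumes the three values are exposed as environment variables with the names used in the checklist. `ImportConfig` and `loadConfig` are illustrative names.

```typescript
interface ImportConfig {
  clientId: string;
  clientSecret: string;
  authoringEndpoint: string;
}

// Reads the three required settings and fails fast with a clear message if any is missing.
function loadConfig(env: Record<string, string | undefined>): ImportConfig {
  const required = ['CLIENT_ID', 'CLIENT_SECRET', 'AUTHORING_ENDPOINT'];
  const missing = required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === '';
  });
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    clientId: env['CLIENT_ID'] as string,
    clientSecret: env['CLIENT_SECRET'] as string,
    authoringEndpoint: env['AUTHORING_ENDPOINT'] as string,
  };
}
```

In a Next.js app this would typically be called with `process.env` from server-side code only, so the client secret never reaches the browser.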

Best practices for a CSV import system

When you begin developing your own implementation, we recommend following these core practices. We found these to be vital for maintaining a stable and user-friendly experience:

  1. Always use a two-step process: Create items first, then update fields separately
  2. Cache aggressively: Store item IDs and folder IDs to minimize API calls
  3. Validate early: Check CSV data before starting the import
  4. Handle duplicates gracefully: Check for existing items and update instead of creating
  5. Generate unique names: Include dates, IDs, or other identifiers in item names
  6. Create folders recursively: Don't assume parent folders exist
  7. Filter empty values: Remove empty fields before GraphQL calls
  8. Use GUIDs for parents: Always query for parent folder GUIDs rather than using paths

CSV import use cases

If you're considering implementing this pattern in a future project, here are several scenarios where a CSV import system is particularly effective:

  • Products & SKUs: Import product catalogs with specifications
  • Blog Posts & Articles: Bulk content migration from other systems
  • Events & Calendar Items: Season schedules, event listings
  • Media Library Metadata: Organize and tag media assets
  • Marketing Campaigns: Campaign data, landing pages, promotions

Where do we go from here?

This standalone application effectively handles bulk CSV imports, but it does live outside the core Sitecore environment. As a next step, we are looking toward the Sitecore Marketplace. Integrating this methodology directly into the Marketplace setup would keep the tool secure and accessible right within the Sitecore UI, further reducing friction for content teams.
