March 20, 2026 · 5 min read

Building a CSV import system for SitecoreAI

Learn how to build a self-contained Next.js application for importing CSV data into SitecoreAI using the Authoring GraphQL API.

John Flores

Senior Front-End Developer, 2x Sitecore Technology MVP


John blends design and code to craft websites that look great and work flawlessly, backed by a Computer Science background and cross-industry experience.

Dealing with hundreds of data items

Many organizations manage hundreds of content items each year. Teams often receive this data as CSV files from various sources, but getting it into SitecoreAI has been a manual, time-consuming process involving:

Manual content entry for each item (hundreds per batch)
Copy-paste errors and inconsistencies
No validation before import
Duplicate entries when running imports multiple times
Missing folder structure requiring manual setup

The result is hours of manual work that is prone to human error and frustrating for content editors.

A self-contained import web app

Sitecore offers workarounds that let you write your own scripts to deal with these issues. We decided to take it a step further and prioritize user experience by building a dedicated web application that content editors use directly.

How it works

The import process follows this workflow:

Upload CSV: User uploads CSV file via web interface
Validate: System validates required fields and business rules
Preview: User reviews items to be imported
Import: System creates/updates items in Sitecore via GraphQL
Report: User sees success/error report with details
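
The validate step in the workflow above can be sketched as a pure function that runs before any API call. This is a minimal illustration, not the app's actual validator: the CsvRow shape, the validateRows name, and the required columns (title and startDateTime, borrowed from the field names used later in this post) are assumptions.

```typescript
// Hypothetical row shape: one record per CSV line, keyed by column header.
type CsvRow = Record<string, string>;

interface RowError {
  row: number; // 1-based data row number, not counting the header line
  field: string;
  message: string;
}

// Assumed required columns; replace with the fields your template needs.
const REQUIRED_FIELDS = ["title", "startDateTime"];

function validateRows(rows: CsvRow[]): RowError[] {
  const errors: RowError[] = [];
  rows.forEach((row, index) => {
    for (const field of REQUIRED_FIELDS) {
      const value = row[field] ?? "";
      if (value.trim() === "") {
        errors.push({ row: index + 1, field, message: "Required field is empty" });
      }
    }
    // Example business rule: a non-empty start date must parse as a date.
    const start = (row["startDateTime"] ?? "").trim();
    if (start !== "" && Number.isNaN(Date.parse(start))) {
      errors.push({ row: index + 1, field: "startDateTime", message: "Not a valid date" });
    }
  });
  return errors;
}
```

An empty result means the batch can move on to the preview step; otherwise the errors can be shown to the editor next to the offending rows.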

GraphQL implementation

The system uses Sitecore's Authoring GraphQL API to create and update items. The process is split into two steps for reliability.

Step 1: Creating items

mutation CreateItem(
  $name: String!
  $templateId: ID!
  $parent: ID!
  $language: String!
) {
  createItem(
    input: {
      name: $name
      templateId: $templateId
      parent: $parent
      language: $language
    }
  ) {
    item {
      itemId
      name
      path
    }
  }
}

Important Notes:

parent parameter requires a GUID (not a path) - query for parent ID first
templateId uses your custom template GUID
Don't try to include field values here - update them separately
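
As a rough sketch of how the mutation above might be sent from the Next.js app, the helper below builds the options for a fetch() call. It assumes the Authoring endpoint accepts a standard GraphQL-over-HTTP POST with a bearer token; the buildCreateItemRequest name and the HttpRequestInit type are hypothetical and only serve this example.

```typescript
// The CreateItem mutation from above, kept as a constant.
const CREATE_ITEM_MUTATION = `
  mutation CreateItem($name: String!, $templateId: ID!, $parent: ID!, $language: String!) {
    createItem(
      input: { name: $name, templateId: $templateId, parent: $parent, language: $language }
    ) {
      item { itemId name path }
    }
  }
`;

interface CreateItemVariables {
  name: string;
  templateId: string; // template GUID
  parent: string; // parent item GUID, not a path
  language: string;
}

// Minimal subset of fetch() options used here (hypothetical type name).
interface HttpRequestInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: packages the mutation and its variables as a JSON POST.
function buildCreateItemRequest(
  variables: CreateItemVariables,
  accessToken: string
): HttpRequestInit {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({ query: CREATE_ITEM_MUTATION, variables }),
  };
}
```

The result can be passed as the second argument to fetch(), with your AUTHORING_ENDPOINT as the first. GraphQL servers commonly return HTTP 200 even when a mutation fails, so it is worth checking the errors array in the response body as well as the status code.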

Step 2: Updating fields

mutation {
  updateItem(
    input: {
      itemId: "{GUID}"
      database: "master"
      language: "en"
      version: 1
      fields: [
        { name: "title", value: "Example Title", reset: false }
        { name: "startDateTime", value: "2026-07-01T11:00:00", reset: false }
      ]
    }
  ) {
    item {
      itemId
      name
    }
  }
}

Important Notes:

Always include database: "master" parameter
Set reset: false for each field to preserve existing data
In our experience, inline mutation values were more reliable than typed variables
Empty field values should be filtered out before calling this mutation
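
When values are inlined, they need to be escaped so that a quote or line break in the CSV cannot break the mutation. The sketch below is one way to do that, not the app's actual code: buildUpdateMutation and FieldValue are hypothetical names. It relies on JSON.stringify producing a double-quoted string whose escape sequences are also valid in GraphQL string literals, and it filters out empty values as noted above.

```typescript
interface FieldValue {
  name: string;
  value: string;
}

// Hypothetical helper: returns the inline mutation text, or null when
// every field value is empty and there is nothing to update.
function buildUpdateMutation(
  itemId: string,
  fields: FieldValue[],
  language: string = "en",
  version: number = 1
): string | null {
  // Drop empty values so existing field data is not overwritten with blanks.
  const nonEmpty = fields.filter((f) => f.value.trim() !== "");
  if (nonEmpty.length === 0) {
    return null;
  }
  // JSON.stringify quotes and escapes each string for safe inlining.
  const fieldList = nonEmpty
    .map(
      (f) =>
        `{ name: ${JSON.stringify(f.name)}, value: ${JSON.stringify(f.value)}, reset: false }`
    )
    .join("\n        ");
  return `mutation {
  updateItem(
    input: {
      itemId: ${JSON.stringify(itemId)}
      database: "master"
      language: ${JSON.stringify(language)}
      version: ${version}
      fields: [
        ${fieldList}
      ]
    }
  ) {
    item { itemId name }
  }
}`;
}
```

A null result signals the caller to skip the update call for that row entirely.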

Implementation patterns

Here are a few patterns we noted that play a big role in preventing side effects and keeping the flow of the application seamless.

1. Duplicate prevention

Running the same import twice shouldn't create duplicate items. This three-layer system prevents that:

// Check cache first
let existingId = itemIdCache.get(fullItemPath);
if (!existingId) {
  // Not in cache, check Sitecore directly
  try {
    existingId = await getItemIdByPath(fullItemPath, accessToken);
    if (existingId) {
      itemIdCache.set(fullItemPath, existingId); // Cache it
    }
  } catch (error) {
    // Item doesn't exist - we'll create it
  }
}
// If exists, update instead of create
if (existingId) {
  await updateItemFields(existingId, fieldsToUpdate, accessToken);
  return { id: existingId, path: fullItemPath, name: itemName };
}

This approach:

Checks in-memory cache first (fast)
Falls back to Sitecore query if not cached
Updates existing items instead of creating duplicates
Caches results for future lookups
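
To show why this makes imports safe to repeat, here is a self-contained sketch of the same cache, lookup, then create-or-update flow with the Sitecore calls hidden behind an interface. The ItemStore interface and the upsertItem name are assumptions made for this example; in the real app they would wrap helpers like getItemIdByPath and updateItemFields.

```typescript
// Hypothetical dependency interface so the flow can be exercised
// without a Sitecore instance.
interface ItemStore {
  findIdByPath(path: string): Promise<string | null>;
  create(path: string): Promise<string>;
  update(id: string, fields: Record<string, string>): Promise<void>;
}

interface UpsertResult {
  id: string;
  created: boolean;
}

async function upsertItem(
  store: ItemStore,
  cache: Map<string, string>,
  path: string,
  fields: Record<string, string>
): Promise<UpsertResult> {
  // Layer 1: in-memory cache. Layer 2: ask the store.
  let id: string | null = cache.get(path) ?? (await store.findIdByPath(path));
  let created = false;
  if (!id) {
    // Layer 3: only create when both lookups came back empty.
    id = await store.create(path);
    created = true;
  }
  cache.set(path, id);
  // New and existing items both go through the same field update.
  await store.update(id, fields);
  return { id, created };
}
```

Calling upsertItem twice with the same path creates the item once and updates it the second time, and the second call skips the lookup because the ID is already cached.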

2. Auto-creating folder structure

Items often need to be organized in folder hierarchies. This recursive function creates the entire path automatically:

async function getOrCreateParentId(itemPath: string, accessToken: string) {
  // Try to get existing item
  try {
    return await getItemIdByPath(itemPath, accessToken);
  } catch (error) {
    // Doesn't exist - split the path into its parent path and folder name
    const pathParts = itemPath.split('/');
    const folderName = pathParts[pathParts.length - 1];
    const parentPath = pathParts.slice(0, -1).join('/');
    // Stop at the root so a failed lookup cannot recurse forever
    if (!folderName || !parentPath) {
      throw error;
    }
    // Recursively ensure parent exists
    await getOrCreateParentId(parentPath, accessToken);
    // Create this folder
    return await createFolder(parentPath, folderName, accessToken);
  }
}

Example: Creating an item at /sitecore/content/YourSite/Events/2026/July/event-name will automatically create Events/, 2026/, and July/ folders if they don't exist.
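
An iterative alternative to the recursion is to list the folders top-down and ensure each one in order, which also makes it easy to cache folder IDs along the way. The sketch below only covers the path arithmetic; foldersToEnsure is a hypothetical helper, and the root path is assumed to be a folder that already exists.

```typescript
// Hypothetical helper: lists the folder paths between an existing root
// and the item, shallowest first. The item itself is not included.
function foldersToEnsure(itemPath: string, rootPath: string): string[] {
  if (!itemPath.startsWith(rootPath + "/")) {
    throw new Error(`${itemPath} is not under ${rootPath}`);
  }
  const segments = itemPath
    .slice(rootPath.length + 1)
    .split("/")
    .filter((segment) => segment !== "");
  const folders: string[] = [];
  let current = rootPath;
  // Skip the last segment: it is the item, not a folder.
  for (const segment of segments.slice(0, -1)) {
    current = `${current}/${segment}`;
    folders.push(current);
  }
  return folders;
}

const folders = foldersToEnsure(
  "/sitecore/content/YourSite/Events/2026/July/event-name",
  "/sitecore/content/YourSite"
);
// folders:
//   /sitecore/content/YourSite/Events
//   /sitecore/content/YourSite/Events/2026
//   /sitecore/content/YourSite/Events/2026/July
```

Each returned path can then be looked up and created if missing, in order, so a parent always exists before its child is created.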

3. Unique item names

Multiple items with the same title need unique item names in Sitecore. This function generates URL-safe names with identifiers:

function generateItemName(title: string, uniqueIdentifier?: string): string {
  let itemName = title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')
    .replace(/\s+/g, '-')
    .replace(/-+/g, '-')
    .replace(/^-|-$/g, '');
  if (uniqueIdentifier) {
    // Add unique identifier (could be date, ID, etc.)
    itemName += `-${uniqueIdentifier}`;
  }
  return itemName;
}

Example: "Summer Festival" on July 1, 2026 becomes summer-festival-2026-07-01

This ensures:

URL-safe item names (no special characters)
Unique names even with duplicate titles
Meaningful, readable names in content tree
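
Names are only unique if the identifier is. Two rows with the same title and the same date would still generate the same name, so it may be worth checking a batch for collisions during validation. The sketch below is an optional addition, not part of the original app; findNameCollisions is a hypothetical helper and assumes all names in the list target the same parent folder.

```typescript
// Hypothetical helper: returns every generated name that appears more
// than once in the batch, each reported once.
function findNameCollisions(names: string[]): string[] {
  const seen = new Set<string>();
  const collisions = new Set<string>();
  for (const name of names) {
    if (seen.has(name)) {
      collisions.add(name);
    }
    seen.add(name);
  }
  return Array.from(collisions);
}
```

Any name it returns points to rows that need a more specific identifier, such as a time or a source ID, before the import runs.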

Setup requirements

To build an implementation like this, your environment will need to be configured to talk to the Sitecore Authoring GraphQL endpoint.

Configuration checklist:

Authentication: Your app needs access to your CLIENT_ID, CLIENT_SECRET, and the specific AUTHORING_ENDPOINT for your instance.
OAuth Permissions: The client must be granted xmcloud.item.create, xmcloud.item.update, and xmcloud.item.read scopes.
Template Prep: Ensure your custom templates in Sitecore have field names that match your CSV mapping.
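
A small guard at startup can catch a missing setting before the first import fails halfway through. This is a minimal sketch assuming the three values are supplied as environment variables with the names listed above; loadConfig and ImportConfig are hypothetical names.

```typescript
interface ImportConfig {
  clientId: string;
  clientSecret: string;
  authoringEndpoint: string;
}

// Hypothetical helper: reads the required settings and fails fast,
// naming every variable that is missing or empty.
function loadConfig(env: Record<string, string | undefined>): ImportConfig {
  const required = ["CLIENT_ID", "CLIENT_SECRET", "AUTHORING_ENDPOINT"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    clientId: env["CLIENT_ID"] as string,
    clientSecret: env["CLIENT_SECRET"] as string,
    authoringEndpoint: env["AUTHORING_ENDPOINT"] as string,
  };
}
```

In a Next.js app this would be called as loadConfig(process.env) from server-side code only, such as a route handler, so the client secret never reaches the browser.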

Best practices for a CSV import system

When you begin developing your own implementation, we recommend following these core practices. We found these to be vital for maintaining a stable and user-friendly experience:

Always use a two-step process: Create items first, then update fields separately
Cache aggressively: Store item IDs and folder IDs to minimize API calls
Validate early: Check CSV data before starting the import
Handle duplicates gracefully: Check for existing items and update instead of creating
Generate unique names: Include dates, IDs, or other identifiers in item names
Create folders recursively: Don't assume parent folders exist
Filter empty values: Remove empty fields before GraphQL calls
Use GUIDs for parents: Always query for parent folder GUIDs rather than using paths

CSV import use cases

If you’re considering implementing this pattern in a future project, here are several scenarios where a CSV import system is particularly effective:

Products & SKUs: Import product catalogs with specifications
Blog Posts & Articles: Bulk content migration from other systems
Events & Calendar Items: Season schedules, event listings
Media Library Metadata: Organize and tag media assets
Marketing Campaigns: Campaign data, landing pages, promotions

Where do we go from here?

This standalone application effectively handles bulk CSV imports, but it does live outside the core Sitecore environment. As a next step, we are looking toward the Sitecore Marketplace. Integrating this methodology directly into the Marketplace setup would keep the tool secure and accessible right within the Sitecore UI, further reducing friction for content teams.

Additional resources

Sitecore Authoring GraphQL API Documentation
