Published on January 22, 2024. Reading time: 8 min read.

Customizing a Sitemap in Sitecore XM Cloud for a Multidomain Solution

Customize the sitemap.ts file in the Next.js solution to accurately reflect the corresponding domain in a Sitecore XM Cloud multidomain environment.

Multi-Domain Sitemap Strategies in Sitecore XM Cloud

So far in this series on supporting multiple language based domains in Sitecore XM Cloud, we have discussed:

  • Supporting Multiple Language Based Domains in Headless Sitecore XM Cloud
  • Resolving Translated URLs in a Multi Domain Solution in Sitecore XM Cloud on Vercel

In today's blog, the last part of my series on configuring language-based multidomain setups, we're focusing on a crucial aspect of our multidomain solution: modifying the sitemap so that it reflects the domain from which it is requested. We will modify the existing sitemap.ts file in the Next.js solution to achieve this.

The Problem

In our multidomain Sitecore solution, we only have one instance of Sitecore XM Cloud behind both Vercel instances. This means there is only one shared Site Grouping item and that drives what values we see in the Sitemap. Without any customization, we will see the same Sitemap regardless of what domain we are requesting it from.

Below is what we'd see regardless of which domain we request it from (www.trainingwebsite.ca/sitemap.xml or www.sitedeformation.ca/sitemap.xml). Note that the French URL uses language embedding (/fr) on the English hostname, and that each page appears twice, once with each language version in <loc>.

<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <loc>https://www.trainingwebsite.ca/education</loc>
  <lastmod>2023-11-22</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
  <xhtml:link xmlns:xhtml="http://www.w3.org/1999/xhtml" rel="alternate" hreflang="x-default" href="https://www.trainingwebsite.ca/education" />
  <xhtml:link xmlns:xhtml="http://www.w3.org/1999/xhtml" rel="alternate" hreflang="fr" href="https://www.trainingwebsite.ca/fr/education" />
</url>
<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <loc>https://www.trainingwebsite.ca/fr/education</loc>
  <lastmod>2023-11-22</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
  <xhtml:link xmlns:xhtml="http://www.w3.org/1999/xhtml" rel="alternate" hreflang="x-default" href="https://www.trainingwebsite.ca/education" />
  <xhtml:link xmlns:xhtml="http://www.w3.org/1999/xhtml" rel="alternate" hreflang="fr" href="https://www.trainingwebsite.ca/fr/education" />
</url>

This is what we actually want to see for each sitemap entry. <loc> should reflect the domain the sitemap is requested from. The alternate <xhtml:link> nodes should also point to the correct domains, rather than to a language-embedded URL on the hostname we added to the Site Grouping item in Sitecore. The duplicate entry is also removed.

<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <loc>https://www.trainingwebsite.ca/education</loc>
  <lastmod>2023-11-22</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority>
  <xhtml:link xmlns:xhtml="http://www.w3.org/1999/xhtml" rel="alternate" hreflang="x-default" href="https://www.trainingwebsite.ca/education" />
  <xhtml:link xmlns:xhtml="http://www.w3.org/1999/xhtml" rel="alternate" hreflang="fr" href="https://www.sitedeformation.ca/formation" />
</url>

The Fix

Here is where the bulk of our customization happens. We take the result object from the existing code, remove the entries that do not belong to the current domain, update the <xhtml:link> nodes for the French pages, and update the <loc> node for the French pages. We do not need to update the English URLs because they are already correct (this is the value coming from the Site Grouping item).

// Use the result object with existing code to update the loc property
if (lang === 'en') {
  result.urlset.url = result.urlset.url.filter(filterUrlsEN);
  result.urlset.url = result.urlset.url.map(updateFrenchXhtmlURLs);
} else if (lang === 'fr') {
  result.urlset.url = result.urlset.url.filter(filterUrlsFR);
  result.urlset.url = result.urlset.url.map(updateLoc);
  result.urlset.url = result.urlset.url.map(updateFrenchXhtmlURLs);
}
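
The lang value checked above comes from comparing the request's Host header with the two hostname environment variables. As a minimal sketch, that check can be pulled into a small pure function, which makes it easy to exercise outside the API route. The function name detectLanguage, the SiteLanguage type, and the hard-coded sample hostnames are assumptions for illustration; the full listing reads the hostnames from process.env.PUBLIC_FR_HOSTNAME and process.env.PUBLIC_EN_HOSTNAME.

```typescript
// Hypothetical helper (not part of the original solution): resolves the
// sitemap language from the request host and the two configured hostnames.
type SiteLanguage = 'en' | 'fr' | 'localhost';

const detectLanguage = (
  hostName: string,
  frHostname: string | undefined,
  enHostname: string | undefined
): SiteLanguage => {
  if (frHostname && hostName.includes(frHostname)) {
    return 'fr';
  }
  if (enHostname && hostName.includes(enHostname)) {
    return 'en';
  }
  // Any other host (for example local development) gets the unmodified sitemap.
  return 'localhost';
};

// Sample values taken from the example domains used in this post.
const FR_HOST = 'www.sitedeformation.ca';
const EN_HOST = 'www.trainingwebsite.ca';

const langForFrench = detectLanguage('www.sitedeformation.ca', FR_HOST, EN_HOST);
const langForEnglish = detectLanguage('www.trainingwebsite.ca', FR_HOST, EN_HOST);
const langForLocal = detectLanguage('localhost', FR_HOST, EN_HOST);

console.log(langForFrench, langForEnglish, langForLocal); // fr en localhost
```

Keeping the fallback value as 'localhost' mirrors the full listing, where any unrecognized host simply receives the sitemap as Sitecore produced it.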

Here are the two functions we use for filtering. Both check whether the <loc> URL contains the English hostname followed by the embedded /fr language prefix. The first returns false for such URLs, so French pages are excluded from the English sitemap. The second returns true for them, so only French pages are kept in the French sitemap.

// Function to filter <url> nodes based on <loc> subnode content
const filterUrlsEN = (url: Url) => {
  const loc = url.loc[0];
  return !loc.includes(FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX);
};

const filterUrlsFR = (url: Url) => {
  const loc = url.loc[0];
  return loc.includes(FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX);
};
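
To see the two filters in action, here is a small self-contained sketch that runs the same logic over three sample entries. The prefix is a hard-coded sample value (the real code builds it from an environment variable), and the simplified SitemapUrl type is an assumption. Because includes matches anywhere in the string, an English path that merely begins with "fr" (such as /france) would also match. The isFrenchEmbedded function is a hypothetical stricter check, not part of the original solution, that requires the prefix to be followed by / or by the end of the URL.

```typescript
// Simplified stand-in for the Url type: only <loc> matters for filtering.
type SitemapUrl = { loc: string[] };

// Sample value; the real solution derives this from PUBLIC_EN_HOSTNAME plus '/fr'.
const FR_EMBEDDED_PREFIX = 'www.trainingwebsite.ca/fr';

// Same logic as filterUrlsEN and filterUrlsFR above.
const keepEnglish = (url: SitemapUrl): boolean => !url.loc[0].includes(FR_EMBEDDED_PREFIX);
const keepFrench = (url: SitemapUrl): boolean => url.loc[0].includes(FR_EMBEDDED_PREFIX);

// Hypothetical stricter check: the prefix must be followed by '/' or end the URL,
// so an English page such as /france is not mistaken for a French one.
const isFrenchEmbedded = (loc: string): boolean => {
  const index = loc.indexOf(FR_EMBEDDED_PREFIX);
  if (index === -1) {
    return false;
  }
  const next = loc.charAt(index + FR_EMBEDDED_PREFIX.length);
  return next === '' || next === '/';
};

const sampleUrls: SitemapUrl[] = [
  { loc: ['https://www.trainingwebsite.ca/education'] },
  { loc: ['https://www.trainingwebsite.ca/fr/education'] },
  { loc: ['https://www.trainingwebsite.ca/france'] },
];

const englishLocs = sampleUrls.filter(keepEnglish).map((url) => url.loc[0]);
const frenchLocs = sampleUrls.filter(keepFrench).map((url) => url.loc[0]);
const strictFrenchLocs = sampleUrls
  .filter((url) => isFrenchEmbedded(url.loc[0]))
  .map((url) => url.loc[0]);

console.log(englishLocs); // only the /education entry
console.log(frenchLocs); // the /fr/education and /france entries
console.log(strictFrenchLocs); // only the /fr/education entry
```

Whether the stricter check is worth adopting depends on the site's content tree; if no English item name starts with "fr", the simpler includes test behaves the same.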

This function rewrites the <loc> value of the url object passed to it, replacing the English hostname and /fr prefix with the French hostname. In our case, we only need this for French (remember, we do not need to update the English URLs).

const updateLoc = (url: Url) => {
  if (url.loc && url.loc[0]) {
    url.loc[0] = url.loc[0].replace(
      FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX,
      FRENCH_URL_DESIRED_AUTHORITY
    );
  }
  return url;
};
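
As a quick check of what this replacement produces, the sketch below applies the same String.replace call as updateLoc to a sample French entry. The two constants are hard-coded sample values standing in for the environment-driven ones, and withFrenchLoc is a hypothetical variant that returns a copy instead of mutating its argument, which is a design choice the original does not make. Note that only the authority and the /fr prefix change; the rest of the path is carried over as it was.

```typescript
// Simplified stand-in for the Url type used in the post.
type SitemapUrl = { loc: string[] };

// Sample values; the real solution reads these from environment variables.
const INVALID_PREFIX = 'www.trainingwebsite.ca/fr';
const DESIRED_AUTHORITY = 'www.sitedeformation.ca';

// Hypothetical non-mutating variant of updateLoc.
const withFrenchLoc = (url: SitemapUrl): SitemapUrl => {
  const first = url.loc[0];
  if (!first) {
    return url;
  }
  // String.replace with a string pattern swaps only the first occurrence.
  const rewritten = first.replace(INVALID_PREFIX, DESIRED_AUTHORITY);
  return { ...url, loc: [rewritten, ...url.loc.slice(1)] };
};

const originalEntry: SitemapUrl = { loc: ['https://www.trainingwebsite.ca/fr/education'] };
const updatedEntry = withFrenchLoc(originalEntry);

console.log(updatedEntry.loc[0]); // https://www.sitedeformation.ca/education
console.log(originalEntry.loc[0]); // https://www.trainingwebsite.ca/fr/education
```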

We call this function to update the French <xhtml:link> nodes, since they arrive with the English domain and the embedded /fr language prefix.

const updateFrenchXhtmlURLs = (url: Url) => {
  if (url['xhtml:link']) {
    url['xhtml:link'].forEach((link) => {
      if (link.$.hreflang === 'fr') {
        link.$.href = link.$.href.replace(
          FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX,
          FRENCH_URL_DESIRED_AUTHORITY
        ); // Update the href value
      }
    });
  }
  return url;
};
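
The link.$ access can look odd at first: by default, xml2js stores an element's attributes under a $ key, so link.$.href is the href attribute of an <xhtml:link> node. The sketch below runs the same rewrite against a hand-built object in that shape. The types are simplified, the constants are hard-coded sample values, and rewriteFrenchAlternates is an illustrative name rather than the author's function.

```typescript
// Attributes live under '$', matching the default xml2js output shape.
type AlternateLink = { $: { rel: string; hreflang: string; href: string } };
type SitemapUrl = { loc: string[]; 'xhtml:link'?: AlternateLink[] };

// Sample values; the real solution reads these from environment variables.
const INVALID_PREFIX = 'www.trainingwebsite.ca/fr';
const DESIRED_AUTHORITY = 'www.sitedeformation.ca';

// Same idea as updateFrenchXhtmlURLs: only hreflang="fr" links are rewritten.
const rewriteFrenchAlternates = (url: SitemapUrl): SitemapUrl => {
  (url['xhtml:link'] ?? []).forEach((link) => {
    if (link.$.hreflang === 'fr') {
      link.$.href = link.$.href.replace(INVALID_PREFIX, DESIRED_AUTHORITY);
    }
  });
  return url;
};

const entry: SitemapUrl = {
  loc: ['https://www.trainingwebsite.ca/education'],
  'xhtml:link': [
    {
      $: {
        rel: 'alternate',
        hreflang: 'x-default',
        href: 'https://www.trainingwebsite.ca/education',
      },
    },
    {
      $: {
        rel: 'alternate',
        hreflang: 'fr',
        href: 'https://www.trainingwebsite.ca/fr/education',
      },
    },
  ],
};

const hrefs = (rewriteFrenchAlternates(entry)['xhtml:link'] ?? []).map((link) => link.$.href);

console.log(hrefs[0]); // https://www.trainingwebsite.ca/education (x-default untouched)
console.log(hrefs[1]); // https://www.sitedeformation.ca/education
```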

Here is the sitemap.xml solution (the customized sitemap.ts file) in its entirety.

import type { NextApiRequest, NextApiResponse } from 'next';
import {
  AxiosDataFetcher,
  GraphQLSitemapXmlService,
  AxiosResponse,
} from '@sitecore-jss/sitecore-jss-nextjs';
import { siteResolver } from 'lib/site-resolver';
import config from 'temp/config';
import { getPublicUrl } from '../../utils/publicUrlUtil';
import { Builder, parseString } from 'xml2js';

const ABSOLUTE_URL_REGEXP = '^(?:[a-z]+:)?//';
const FRENCH_URL_DESIRED_AUTHORITY = process.env.PUBLIC_FR_HOSTNAME || '';
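// Guard before concatenating: `undefined + '/fr'` would yield the string 'undefined/fr'
const FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX = process.env.PUBLIC_EN_HOSTNAME
  ? `${process.env.PUBLIC_EN_HOSTNAME}/fr`
  : '';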

type Url = {
  loc: string[];
  lastmod?: string[];
  changefreq?: string[];
  priority?: string[];
  'xhtml:link': {
    $: {
      xmlns: string;
      rel: string;
      hreflang: string;
      href: string;
    };
  }[];
};

// Function to filter <url> nodes based on <loc> subnode content
const filterUrlsEN = (url: Url) => {
  const loc = url.loc[0];
  return !loc.includes(FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX);
};

const filterUrlsFR = (url: Url) => {
  const loc = url.loc[0];
  return loc.includes(FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX);
};

const updateLoc = (url: Url) => {
  if (url.loc && url.loc[0]) {
    url.loc[0] = url.loc[0].replace(
      FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX,
      FRENCH_URL_DESIRED_AUTHORITY
    );
  }
  return url;
};

const updateFrenchXhtmlURLs = (url: Url) => {
  if (url['xhtml:link']) {
    url['xhtml:link'].forEach((link) => {
      if (link.$.hreflang === 'fr') {
        link.$.href = link.$.href.replace(
          FRENCH_URL_INVALID_AUTHORITY_AND_PATH_PREFIX,
          FRENCH_URL_DESIRED_AUTHORITY
        ); // Update the href value
      }
    });
  }
  return url;
};

const sitemapApi = async (
  req: NextApiRequest,
  res: NextApiResponse
): Promise<NextApiResponse | void> => {
  const {
    query: { id },
  } = req;

  // Resolve site based on hostname
  const hostName = req.headers['host']?.split(':')[0] || 'localhost';
  const site = siteResolver.getByHost(hostName);

  // create sitemap graphql service
  const sitemapXmlService = new GraphQLSitemapXmlService({
    endpoint: config.graphQLEndpoint,
    apiKey: config.sitecoreApiKey,
    siteName: site.name,
  });

  // if url has sitemap-{n}.xml type. The id - can be null if it's sitemap.xml request
  const sitemapPath = await sitemapXmlService.getSitemap(id as string);

  // Determine language of current site
  let lang = 'localhost';
  if (process.env.PUBLIC_FR_HOSTNAME && hostName.includes(process.env.PUBLIC_FR_HOSTNAME)) {
    lang = 'fr';
  } else if (process.env.PUBLIC_EN_HOSTNAME && hostName.includes(process.env.PUBLIC_EN_HOSTNAME)) {
    lang = 'en';
  }

  // if sitemap is match otherwise redirect to 404 page
  if (sitemapPath) {
    const isAbsoluteUrl = sitemapPath.match(ABSOLUTE_URL_REGEXP);
    const sitemapUrl = isAbsoluteUrl ? sitemapPath : `${config.sitecoreApiHost}${sitemapPath}`;
    res.setHeader('Content-Type', 'text/xml;charset=utf-8');

    return new AxiosDataFetcher()
      .get(sitemapUrl, {
        responseType: 'stream',
      })
      .then((response: AxiosResponse) => {
        if (lang === 'localhost') {
          response.data.pipe(res);
          return;
        }
        // BEGIN CUSTOMIZATION - Filter the sitemap per domain/language, and set the French domain to French URLs.

        // Need to prepare stream from sitemap url
        const dataChunks: Buffer[] = [];
        response.data.on('data', (chunk: Buffer) => {
          dataChunks.push(chunk);
        });

        response.data.on('end', () => {
          // Concatenate the data chunks to get the complete XML content
          const xmlData = Buffer.concat(dataChunks).toString();

          // Now, parse the XML data into an object using xml2js
          parseString(xmlData, (err, result) => {
            if (err) {
              // Respond with an error so the request does not hang when parsing fails
              console.error('Error parsing XML:', err);
              res.status(500).end();
              return;
            }
            // Use the result object with existing code to update the loc property
            if (lang === 'en') {
              result.urlset.url = result.urlset.url.filter(filterUrlsEN);
              result.urlset.url = result.urlset.url.map(updateFrenchXhtmlURLs);
            } else if (lang === 'fr') {
              result.urlset.url = result.urlset.url.filter(filterUrlsFR);
              result.urlset.url = result.urlset.url.map(updateLoc);
              result.urlset.url = result.urlset.url.map(updateFrenchXhtmlURLs);
            }

            // Convert the modified object back to XML format
            const xmlBuilder = new Builder();
            const modifiedXml = xmlBuilder.buildObject(result);

            // Send 'modifiedXml' in the response, keeping the same charset as the other branches
            res.setHeader('Content-Type', 'text/xml;charset=utf-8');
            res.send(modifiedXml);
            // END CUSTOMIZATION
          });
        });
      })
      .catch(() => res.redirect('/404'));
  }

  // this approach if user goes to /sitemap.xml - under it generate xml page with list of sitemaps
  const sitemaps = await sitemapXmlService.fetchSitemaps();

  if (!sitemaps.length) {
    return res.redirect('/404');
  }

  const SitemapLinks = sitemaps
    .map((item) => {
      const parseUrl = item.split('/');
      const lastSegment = parseUrl[parseUrl.length - 1];

      return `<sitemap>
        <loc>${getPublicUrl()}/${lastSegment}</loc>
      </sitemap>`;
    })
    .join('');

  res.setHeader('Content-Type', 'text/xml;charset=utf-8');

  return res.send(`
  <sitemapindex xmlns="http://sitemaps.org/schemas/sitemap/0.9" encoding="UTF-8">${SitemapLinks}</sitemapindex>
  `);
};

export default sitemapApi;
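
One detail worth checking when a constant is derived from an optional environment variable: + is evaluated before ||, so a fallback written after a concatenation never applies. The sketch below contrasts the two forms using hypothetical helper names (buildPrefixNaive and buildPrefixGuarded are not part of the original solution) and a plain parameter in place of process.env.

```typescript
// Hypothetical helpers; enHostname stands in for an optional environment variable.

// Equivalent to `enHostname + '/fr' || ''`: the concatenation runs first and
// always yields a non-empty string, so the '' fallback is never used.
const buildPrefixNaive = (enHostname: string | undefined): string => {
  const concatenated = enHostname + '/fr';
  return concatenated || '';
};

// Checks the variable before concatenating, so a missing value yields ''.
const buildPrefixGuarded = (enHostname: string | undefined): string =>
  enHostname ? `${enHostname}/fr` : '';

const naiveMissing = buildPrefixNaive(undefined);
const guardedMissing = buildPrefixGuarded(undefined);
const guardedPresent = buildPrefixGuarded('www.trainingwebsite.ca');

console.log(naiveMissing); // undefined/fr
console.log(guardedMissing); // (empty string)
console.log(guardedPresent); // www.trainingwebsite.ca/fr
```

An empty prefix is still not a usable configuration, since every URL contains the empty string; in practice both hostnames need to be set on each Vercel instance for the filtering to make sense.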

That wraps up this blog series on configuring multiple language-based domains in Vercel for Sitecore XM Cloud. I hope the series proves useful to anyone working through domain configuration in a multilingual setup with Vercel and Sitecore XM Cloud.



Related Links

  • Guide To Fixing Setup Errors: Sitecore Headless SXA, Next.js, XM Cloud Dev
  • Sitecore XM Cloud: How To Setup Sitecore Headless SXA And Next.js
  • Accessing Sitecore Databases within a Local Docker Environment
