Controlling Search Visibility in Sitecore Search with Custom Metadata
When you set up Sitecore Search in XM Cloud to index your website, there are often pages you want to keep out of the search index, such as landing pages, the search results page itself, or gated content.
Unfortunately, the Sitecore documentation is thin on how to set this up. In this post I will walk through adding a custom field to your page template and configuring Sitecore Search to exclude from its index any page that has the field set.
How It Works
Each page needs a custom meta tag whose property attribute is excludeFromSearch. The rendered tag looks like this:
<meta property="excludeFromSearch" content="true">
From there Sitecore Search will use a Document Extractor to extract this property and discard any results that have the value set to true.
Setting up the Backend
Start by creating a base template called _Search, and on it set up any custom properties relevant to search that you want on your Page template. Since this is a base template, create it at the Foundation level; a path such as /sitecore/templates/Foundation/Common/Pages/_Search is a reasonable choice.
From there, create a Search section and, under it, a property called excludeFromSearch. Set its type to Checkbox.
You may also want to create some additional properties such as searchResultsTitle and searchResultsDescription which can be populated by the content author. But for the sake of this article we’ll only be focusing on the excludeFromSearch property.
Next, in order to follow best practices, set the Title and Display Name for this property to Exclude from Search.
Next, make sure this template is set as a base template of your Page template. In this example project the Page template is at /sitecore/templates/Project/client-website/Page. On its Content tab, in the Base template field, find the _Search template in the left-hand list and use the right arrow to add it.
Once that is done, any page created from this Page template will include the Exclude from Search property.
At this point, create a page that has Exclude from Search checked. It will come in handy later when you validate your configuration.
Setting up the Frontend
Next, update your front-end code to output this custom property in the markup. The exact implementation will vary with your solution, but you will ultimately want something like the following:
import Head from 'next/head';
import { MetadataProps } from './Metadata';
import { CustomPageMetaFieldsType } from 'lib/types';

export const MetadataSearch: React.FC<MetadataProps> = ({ route }) => {
  // Pull the search-related fields off the current route.
  const { excludeFromSearch, searchResultsTitle, searchResultsDescription } = (
    route as CustomPageMetaFieldsType
  )?.fields;

  // The content attribute expects a string, so fall back to string defaults.
  const excludeFromSearchValue = excludeFromSearch?.value?.toString() ?? 'false';
  const searchResultsTitleValue = searchResultsTitle?.value?.toString() ?? '';
  const searchResultsDescriptionValue = searchResultsDescription?.value?.toString() ?? '';

  return (
    <Head>
      {/* This is the tag the Sitecore Search extractor reads. */}
      <meta property="excludeFromSearch" content={excludeFromSearchValue} />
      <meta property="searchResultsTitle" content={searchResultsTitleValue} />
      <meta property="searchResultsDescription" content={searchResultsDescriptionValue} />
    </Head>
  );
};
This relies on a metadata.ts file with the following code:
import { Field, ImageField } from '@sitecore-jss/sitecore-jss-nextjs';
import { CustomField } from '../fields';
import { CommonPageRouteMetaDataFieldsType } from './page';

// ConfigType and TagType are project types whose definitions are not shown in this excerpt.

export type MetaCollectionType = {
  name: string;
  value: string;
};

type MetaFieldTypes =
  | CustomField
  | Field<string>
  | ImageField
  | ConfigType
  | TagType[]
  | MetaCollectionType[]
  | Field<number>
  | Field<boolean>;

export type CustomPageMetaFieldsType = CommonPageRouteMetaDataFieldsType & {
  fields: {
    [key: string]: MetaFieldTypes;
    Title: Field<string>;
    MetaKeywords: Field<string>;
    MetaDescription: Field<string>;
    // Search
    excludeFromSearch: Field<boolean>; // checkbox field
    searchResultsTitle: Field<string>;
    searchResultsDescription: Field<string>;
    // OG Tags
    OpenGraphTitle: Field<string>;
    OpenGraphDescription: Field<string>;
    OpenGraphImageUrl: ImageField;
    OpenGraphType: Field<string>;
    OpenGraphSiteName: Field<string>;
    OpenGraphAdmins: Field<string>;
    OpenGraphAppId: Field<string>;
    // Tags
    SxaTags: TagType[];
  };
  templateName?: string;
};
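Because the extractor compares the meta content against the exact string true, it can help to normalize the checkbox value before rendering it. The helper below is a minimal sketch, not part of the original solution: the name toMetaFlag and the CheckboxLike type are hypothetical, and it assumes the field value may arrive as a boolean, as 1, or as a string such as "1" or "true", depending on how the field is serialized in your project.

```typescript
// Hypothetical shape: only the `value` of the checkbox field matters here.
type CheckboxLike = { value?: boolean | number | string | null };

// Normalize a checkbox-style field to the literal strings 'true' or 'false',
// so the rendered meta tag always matches what the extractor compares against.
function toMetaFlag(field?: CheckboxLike | null): 'true' | 'false' {
  const raw = field?.value;
  if (typeof raw === 'boolean') {
    return raw ? 'true' : 'false';
  }
  if (typeof raw === 'number') {
    return raw === 1 ? 'true' : 'false';
  }
  if (typeof raw === 'string') {
    const text = raw.trim().toLowerCase();
    return text === 'true' || text === '1' ? 'true' : 'false';
  }
  // Missing field or missing value: treat the page as searchable.
  return 'false';
}
```

In the component above, the result could be passed straight to the content attribute of the excludeFromSearch meta tag in place of the toString() call.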
Configuring Sitecore Search
Now that both the frontend and backend are set up to use your new custom field, you need to configure Sitecore Search to exclude any page where this field has been checked.
First, go to https://portal.sitecorecloud.io/ and click on the Search app followed by the Open App button on the right panel. For this example we’ll be working with our Non-Prod environment. This will take you to the Customer Engagement Console (CEC).
Once in the CEC you will want to go to Sources on the left side and then click Edit on your Advanced Crawler.
Once in the Source Settings for the Advanced Crawler scroll down until you see the Document Extractors section then click Edit.
We will be using a JS Extractor for this example. Find your JS Extractor and click the Edit button next to the Tagger.
You should already have something like the following:
We want to add logic that reads the excludeFromSearch meta tag and returns null when its value is true. Returning null causes the page to be excluded when the content is indexed.
Add the following before the return statement:
// Read the flag rendered by the frontend and skip the page when it is set.
const excludeFromSearch = $('meta[property="excludeFromSearch"]')?.attr('content');
if (excludeFromSearch === 'true') return null;
Your final JS code for the Tagger should now look like:
Click on Save.
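To show how the pieces above fit together, here is a rough sketch of a tagger with the exclusion check in place. It is not the extractor from this project: it assumes the entry point has the shape extract(request, response) with response.body acting as a jQuery-style selector function (as the $ in the snippet above suggests), and the returned attribute names (name, description, type, url) are placeholders for whatever your source is configured with. The CEC expects plain JavaScript, so the type annotations are there only to make the sketch self-contained and would need to be removed before pasting.

```typescript
// Assumed minimal shapes for what the crawler hands to the tagger.
type Selection = { attr(name: string): string | undefined; text(): string };
type Query = (selector: string) => Selection;
type CrawlResponse = { body: Query };

function extract(_request: unknown, response: CrawlResponse) {
  const $ = response.body;

  // Skip the page entirely when the frontend has flagged it as excluded.
  const excludeFromSearch = $('meta[property="excludeFromSearch"]').attr('content');
  if (excludeFromSearch === 'true') return null;

  // Otherwise return one document for the page (placeholder attributes).
  return [
    {
      name: $('title').text(),
      description: $('meta[name="description"]').attr('content'),
      type: $('meta[property="og:type"]').attr('content') || 'website_content',
      url: $('meta[property="og:url"]').attr('content'),
    },
  ];
}
```

Keeping the check as the first statement after $ is assigned means an excluded page is skipped before any other attribute is read.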
Testing the Configuration
At this point I highly recommend testing your configuration. This requires that your frontend code is deployed to your Vercel endpoint, that your backend templates are in place, and that you have a test page or two that you know are being indexed, one of which has the Exclude from Search property checked. Once everything is set up and published, add two test validation URLs, one for the page with the Exclude property set and one for a page without it, then click Validate.
After Validating you should see something like the following:
As you’ll notice, the first page, which has the Exclude from Search property set, shows the word Excluded in red text. If you’re seeing this, you have successfully set up Sitecore Search to exclude pages based on a custom field!
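If a page you expected to be excluded is not marked Excluded, one thing worth checking is whether the deployed HTML actually contains the meta tag with the value true. The function below is a small sketch for that check; the name readExcludeFlag is hypothetical, and it uses a simple regular expression instead of a real HTML parser, so it assumes the tag is rendered with quoted attributes as in the example earlier in this post.

```typescript
// Returns true when the page source contains
// <meta property="excludeFromSearch" content="true">, in either attribute order.
function readExcludeFlag(html: string): boolean {
  // Find the meta tag carrying the excludeFromSearch property.
  const tag =
    /<meta\b[^>]*\bproperty=["']excludeFromSearch["'][^>]*>/i.exec(html)?.[0] ?? '';
  // Pull the content attribute out of that tag, if present.
  const content = /\bcontent=["']([^"']*)["']/i.exec(tag)?.[1];
  // Mirror the extractor: only the exact string "true" excludes the page.
  return content === 'true';
}
```

You could paste the page source from View Source on each test URL into this function, or feed it HTML fetched by whatever HTTP client you already use.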