Indexing strategies control how and when Sitecore content is indexed for use with the ContentSearch API.
The out-of-the-box indexing strategies in Sitecore 7 work very well and provide you with many options to manage your indexes.
Strategies:
- IntervalAsynchronousStrategy
- ManualStrategy
- OnPublishEndAsynchronousStrategy
- RebuildAfterFullPublishStrategy
- RemoteRebuildStrategy
- SynchronousStrategy
The Problem
We recently ran into an issue where the traditional strategies were not producing the expected results.
There was a dependency between indexed documents that was not accounted for during development. When content in one part of the tree changed, multiple documents needed to be updated. It happens.
We needed a full reindex to occur on every publish.
Unfortunately, this meant the RebuildAfterFullPublishStrategy didn't address our need. Because the items were published individually via workflow approval, a Full Publish never occurred to trigger this strategy. But it was close.
The Code
So close that we used Sitecore.ContentSearch.Maintenance.Strategies.RebuildAfterFullPublishStrategy as the starting point for our own custom strategy.
Below is the complete code for our RebuildAfterAnyPublishStrategy class. It's also available as a Visual Studio project on GitHub.
using System;
using Sitecore.Configuration;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Diagnostics;
using Sitecore.ContentSearch.Maintenance;
using Sitecore.ContentSearch.Maintenance.Strategies;
using Sitecore.Data;
using Sitecore.Diagnostics;

namespace Fishtank.IndexingStrategies
{
    public class RebuildAfterAnyPublishStrategy : IIndexUpdateStrategy
    {
        // Fields
        protected ISearchIndex index;
        protected const string ClassName = "Fishtank.IndexingStrategies.RebuildAfterAnyPublishStrategy";

        // Methods

        // The database name comes from the <param desc="database"> element in the config.
        public RebuildAfterAnyPublishStrategy(string database)
        {
            Assert.IsNotNullOrEmpty(database, "database");
            this.Database = Factory.GetDatabase(database);
            Assert.IsNotNull(this.Database, string.Format("Database '{0}' was not found", database));
        }

        // Called on every publish end; queues Run() with the OperationMonitor.
        protected virtual void Handle()
        {
            OperationMonitor.Register(new Action(this.Run));
            OperationMonitor.Trigger();
        }

        // Attaches the strategy to the index and subscribes to the publish end event.
        public virtual void Initialize(ISearchIndex index)
        {
            Assert.IsNotNull(index, "index");
            CrawlingLog.Log.Info(string.Format("[Index={0}] Initializing {1}.", index.Name, ClassName), null);
            this.index = index;
            if (!Settings.EnableEventQueues)
            {
                CrawlingLog.Log.Fatal(string.Format("[Index={0}] Initialization of {1} failed because event queue is not enabled.", index.Name, ClassName), null);
            }
            else
            {
                EventHub.PublishEnd += (sender, args) => this.Handle();
            }
        }

        // Performs the full rebuild of the index.
        public virtual void Run()
        {
            CrawlingLog.Log.Info(string.Format("[Index={0}] {1} triggered.", this.index.Name, ClassName), null);
            if (this.Database == null)
            {
                CrawlingLog.Log.Fatal(string.Format("[Index={0}] OperationMonitor has invalid parameters. Index Update cancelled.", this.index.Name), null);
            }
            else
            {
                CrawlingLog.Log.Info(string.Format("[Index={0}] Full Rebuild.", this.index.Name), null);
                IndexCustodian.FullRebuild(this.index, true);
            }
        }

        // Properties
        public Database Database { get; protected set; }
    }
}
The main difference between RebuildAfterFullPublishStrategy and our RebuildAfterAnyPublishStrategy class is the event we attach to:
// Inside RebuildAfterFullPublishStrategy.cs - Sitecore Strategy
public virtual void Initialize(ISearchIndex index)
{
    // removed code
    EventHub.FullPublishEnd += (EventHandler) ((sender, args) => this.Handle());
}

// Inside RebuildAfterAnyPublishStrategy.cs - Our New Strategy
public virtual void Initialize(ISearchIndex index)
{
    // removed code
    EventHub.PublishEnd += (EventHandler) ((sender, args) => this.Handle());
}
Now we just need to update our configs.
The Configuration
For Lucene, the configuration for the indexing strategies is in Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config. It can also be patched in from any file located under /App_Config/Include.
Defining the strategy here allows it to be re-used across multiple indexes.
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <indexUpdateStrategies>
        <rebuildAfterAnyPublishStrategy type="Fishtank.IndexingStrategies.RebuildAfterAnyPublishStrategy, Fishtank.IndexingStrategies">
          <param desc="database">web</param>
        </rebuildAfterAnyPublishStrategy>
      </indexUpdateStrategies>
    </contentSearch>
  </sitecore>
</configuration>
And this is the configuration for our ContentSearch index. Note that we've changed the value under index > strategies > strategy.
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <contentSearch>
      <configuration type="Sitecore.ContentSearch.LuceneProvider.LuceneSearchConfiguration, Sitecore.ContentSearch.LuceneProvider">
        <indexes hint="list:AddIndex">
          <index id="site_web" type="Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex, Sitecore.ContentSearch.LuceneProvider">
            <Configuration ref="siteSearch/siteSearchIndexConfiguration" />
            <param desc="name">$(id)</param>
            <param desc="folder">$(id)</param>
            <!-- This initializes the index property store. Id has to be set to the index id -->
            <param desc="propertyStore" ref="contentSearch/databasePropertyStore" param1="$(id)" />
            <!-- ############################## -->
            <!-- START: IMPORTANT CHANGE! -->
            <!-- ############################## -->
            <strategies hint="list:AddStrategy">
              <!-- NOTE: the order of these controls the execution order -->
              <strategy ref="contentSearch/indexUpdateStrategies/rebuildAfterAnyPublishStrategy" />
            </strategies>
            <!-- ############################## -->
            <!-- STOP: IMPORTANT CHANGE! -->
            <!-- ############################## -->
            <locations hint="list:AddCrawler">
              <crawler type="Sitecore.ContentSearch.SitecoreItemCrawler, Sitecore.ContentSearch">
                <Database>web</Database>
                <Root>/sitecore/content/site/Home</Root>
              </crawler>
            </locations>
          </index>
        </indexes>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>
Closing
It's a fair bit to digest, but it's really just a few simple parts:
- Create your indexing strategy class. I'd recommend building on the foundation laid by an existing indexing strategy
- Define your new strategy in configuration > sitecore > contentSearch > indexUpdateStrategies
- Configure your content search index to use the new strategy
It's worth mentioning that we changed the index type to SwitchOnRebuildLuceneIndex. This keeps a live index available for searches while a full rebuild is running.
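For comparison, here is a sketch of the one attribute that changes. The "before" type is an assumption: this post does not show the original index definition, so the stock Lucene index type is used for illustration. Child elements are omitted because they stay the same.

```xml
<!-- Before (assumed): the standard Lucene index type, which rebuilds the live index in place -->
<index id="site_web" type="Sitecore.ContentSearch.LuceneProvider.LuceneIndex, Sitecore.ContentSearch.LuceneProvider">
  <!-- params, strategies, and locations omitted -->
</index>

<!-- After: rebuilds into a secondary copy and switches over when the rebuild completes -->
<index id="site_web" type="Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex, Sitecore.ContentSearch.LuceneProvider">
  <!-- params, strategies, and locations omitted -->
</index>
```

Since this strategy rebuilds on every publish, the switch matters more here than it would with an incremental strategy.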
I don't recommend this indexing strategy for general use. We only use it in one very specific circumstance. But I hope it illustrates how you can build a custom indexing strategy when necessary.
Thanks for reading. This article was authored using Markdown for Sitecore.