To get started with Algolia, you first need to create an account. Once you have an account, you can access Algolia's features, including fast search, advanced analytics, and a variety of customization options that let you tailor the search experience to your specific needs.
You will need to create an index to add your documents to. You may have one or multiple indexes depending on your application's needs. You might keep different indexes for different environments, such as Development, UAT, and Production, and the same approach can be used to separate content types or different language versions.
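One way to implement the environment split described above is to derive the index name from the current environment. A minimal sketch, where the base name and naming convention are our assumptions:

```typescript
// Hypothetical helper (names assumed): derive an environment-specific index
// name so Development, UAT, and Production content never mix.
function getIndexName(baseName: string, env: string = process.env.NODE_ENV ?? 'development'): string {
  return `${baseName}_${env}`;
}
```

Calling `getIndexName('pages')` would then resolve to something like `pages_development` locally and `pages_production` in production.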
You will need to create a new API Key for your front-end application with the permissions to add new objects to the index.
Create a Next.js application and install the necessary modules.
npm install algoliasearch cheerio node-fetch
Set up an .env file at the root of your solution to hold your Algolia variables. The ALGOLIA_SEARCH_API_KEY will be equal to the API key we just created above.
ALGOLIA_APP_ID='ALGOLIA_APP_ID'
ALGOLIA_SEARCH_API_KEY='ALGOLIA_SEARCH_API_KEY'
ALGOLIA_INDEX_NAME='ALGOLIA_INDEX_NAME'
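Because a missing or misspelled variable only surfaces later as a failed Algolia call, a small guard can fail fast at startup instead. This is a hypothetical helper, not part of the original setup:

```typescript
// Hypothetical helper: throw immediately if a required environment variable
// is missing, rather than letting an empty value fail later at request time.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Using `requireEnv('ALGOLIA_APP_ID')` wherever the variables are read surfaces configuration mistakes immediately.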
First, we must define all the fields we want to see in our index. Then, create a type and store it in a separate file for application-wide referencing.
export type PageResult = {
  title: string;
  content: string;
  url: string;
  objectID: string;
};
Here we have defined a simple endpoint that takes in a URL and passes it to a function that indexes the page. This file lives under pages/api and is called crawler.ts.
import type { NextApiRequest, NextApiResponse } from 'next';
import { crawlPageAndIndex } from '@/util/crawlIndex';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  const { url } = req.query as { url: string };
  if (!url) {
    res.status(400).send('Missing URL parameter');
    return;
  }
  await crawlPageAndIndex(url);
  res.status(200).send('OK');
}
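The handler only checks that the parameter is present. A stricter check, an assumption beyond the original code, could also reject values that are not parseable http(s) URLs before crawling them:

```typescript
// Hypothetical validation: only accept absolute http/https URLs, rejecting
// anything the URL parser cannot handle or other schemes like ftp:.
function isValidUrl(value: string): boolean {
  try {
    const parsed = new URL(value);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch {
    return false;
  }
}
```

In the handler this would replace the bare `if (!url)` check, returning a 400 for malformed values as well as missing ones.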
We can hit this endpoint by going to the following URL.
https://<APPLICATION URL>/api/crawler?url=<URL TO CRAWL>
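When triggering the crawl from code or scripts, encoding the target URL matters, since the crawled page may itself contain query parameters. A small sketch, where the helper name is ours:

```typescript
// Build the crawler endpoint URL; encoding the target prevents its own
// query string from breaking the crawler request's query string.
function buildCrawlerUrl(appUrl: string, target: string): string {
  return `${appUrl}/api/crawler?url=${encodeURIComponent(target)}`;
}
```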
The following code was added to a subfolder called /util under the root of our application. We have included examples of how to fill the object fields we have defined above. However, how this happens in a production-ready application will likely need to be driven by the business requirements.
import cheerio from 'cheerio';
import fetch from 'node-fetch';
import algoliasearch from 'algoliasearch';
import { PageResult } from './types';

const APP_ID = process.env.ALGOLIA_APP_ID ?? '';
const API_KEY = process.env.ALGOLIA_SEARCH_API_KEY ?? '';
const INDEX = process.env.ALGOLIA_INDEX_NAME ?? '';

export async function crawlPageAndIndex(url: string) {
  const page = await crawlPage(url);
  await addToAlgolia(url, page);
}

async function crawlPage(url: string) {
  const response = await fetch(url);
  const html = await response.text();
  const c$ = cheerio.load(html);
  // Use the page h1 tag as our title
  const title = c$('h1').map((i, el) => c$(el).text()).get();
  // Get all header content
  const headings = c$('h1, h2, h3, h4, h5, h6').map((i, el) => c$(el).text()).get();
  // Get all page paragraphs
  const paragraphs = c$('p').map((i, el) => c$(el).text()).get();
  // Combine headers and paragraphs and trim content to avoid going over the allowed field size
  const content = [headings.join(' '), paragraphs.join(' ').slice(0, 500)]
    .filter(s => s.trim().length > 0)
    .join(' ');
  const page: PageResult = {
    objectID: url,
    title: title[0],
    url: url,
    content: content,
  };
  return page;
}

async function addToAlgolia(url: string, page: PageResult) {
  const client = algoliasearch(APP_ID, API_KEY);
  const index = client.initIndex(INDEX);
  await index.saveObject(page);
}
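The slice(0, 500) above trims by characters, but Algolia's per-record size limit is measured in bytes (the exact limit depends on your plan), so multi-byte content can still overshoot. A byte-aware alternative, an assumption on our part rather than part of the original code, looks like this:

```typescript
// Trim a string so its UTF-8 encoding fits within maxBytes; a character-based
// slice can overshoot when the text contains multi-byte characters.
function truncateToBytes(text: string, maxBytes: number): string {
  const encoder = new TextEncoder();
  let result = text;
  while (encoder.encode(result).length > maxBytes) {
    result = result.slice(0, -1); // drop one character at a time
  }
  return result;
}
```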
Note: We need to create a unique objectID both to avoid duplicates and to be able to update existing records. In this example we have used the URL, but this could be any unique identifier.
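If raw URLs are too long or contain characters you would rather not use as identifiers, one option (our assumption, not from the original article) is to hash the URL into a short, stable objectID:

```typescript
import { createHash } from 'crypto';

// Hypothetical helper: hash the URL so the objectID is short and stable;
// the same URL always produces the same ID, so re-crawling updates the
// existing record instead of creating a duplicate.
function makeObjectID(url: string): string {
  return createHash('sha256').update(url).digest('hex').slice(0, 32);
}
```

The 32-character truncation is arbitrary; any length that keeps collisions implausible for your content volume works.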
Algolia has made searching our content very simple with its InstantSearch library of components. In our example, we have created a simple text-based search component using these components.
import { InstantSearch, SearchBox, Hits } from 'react-instantsearch-dom';
import algoliasearch from 'algoliasearch';
import { useState } from 'react';
import styles from './Search.module.css';

const ALGOLIA_APP_ID = process.env.ALGOLIA_APP_ID ?? '';
const ALGOLIA_SEARCH_API_KEY = process.env.ALGOLIA_SEARCH_API_KEY ?? '';
const ALGOLIA_INDEX_NAME = process.env.ALGOLIA_INDEX_NAME ?? '';

const algoliaClient = algoliasearch(ALGOLIA_APP_ID, ALGOLIA_SEARCH_API_KEY);

function Search() {
  const [query, setQuery] = useState('');

  const handleSearch = (event: React.SyntheticEvent<HTMLInputElement, Event>) => {
    setQuery(event.currentTarget.value);
  };

  return (
    // Pass our index name and the search client into the InstantSearch component's parameters
    <InstantSearch indexName={ALGOLIA_INDEX_NAME} searchClient={algoliaClient}>
      <div className={styles.input}>
        <SearchBox onChange={handleSearch} />
      </div>
      {query && (
        // This component displays all of the search results
        <div className={styles.results}>
          <Hits hitComponent={Hit} />
        </div>
      )}
    </InstantSearch>
  );
}

// Define our result template here
function Hit({ hit }: { hit: any }) {
  return (
    <div className={styles.hit}>
      <a href={hit.url}>{hit.title}</a>
      <p>{hit.content}</p>
    </div>
  );
}

export default Search;
The first place to start when you run into errors while attempting to index content is the logs.
Also ensure the objectID field is set to a unique value for each page.