Creating a Keyword Faceted Search in dotCMS 2.1

Aug 06, 2012


This post is a step-by-step guide to creating a keyword faceted search in dotCMS 2.1.

We will show you how to create a new “keywords” tag field on your content types, use the tags as meta keywords on your templates and use ElasticSearch to create a keyword faceted search.

A faceted search will display the number of hits within the search that match each tag, allowing the users to drill down by each specific keyword and refine their search results.

Click here for more information on how the Faceted Site Search works in dotCMS 2.1.

Let’s go over the steps:

Changes to the content types and templates

1. Create a keywords tag field

Select the content types where you want to perform the faceted site search. Add a “tags” tag field on each content type.

2. Add content

Add keywords to the “tags” field in all the content to be searched.

3. Modify your templates

Modify the detail page template for each selected content type to add the following code. The code assumes there is a “tags” field on your content type and outputs its values as meta keywords in the HTML. If the page has no URL-mapped content or no tags, it falls back to the page's own keywords. You can add this code either to your Template or to a Header Container.

#if($URLMapContent && $URLMapContent.tags)
  ## Build a comma-separated keyword string from the tags
  #set($strKeywords = "")
  #foreach($keyword in $URLMapContent.tags)#set($strKeywords = "$strKeywords$keyword,")#end
  <meta name="keywords" content="$strKeywords">
#else
  <meta name="keywords" content="$!{HTMLPAGE_KEYWORDS}">
#end

4. View the source code

Open one detail page for each content type selected to make sure the keywords are displaying. The source code should look like this:

<meta name="keywords" content="Gas,Oil,Prices,Investment,">
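
The trailing comma in that output suggests each tag is appended followed by a comma. As a minimal sketch of the same string-building logic outside Velocity (Python is used here purely for illustration, and `meta_keywords` is a hypothetical helper, not part of dotCMS), joining the tags instead avoids the trailing comma and escapes any characters that could break the attribute:

```python
from html import escape


def meta_keywords(tags):
    """Build a <meta name="keywords"> tag from a list of tag strings."""
    # Drop empty entries and surrounding whitespace, then join with commas
    # so that no trailing comma is left behind.
    cleaned = [t.strip() for t in tags if t and t.strip()]
    content = ",".join(cleaned)
    # Escape quotes and angle brackets so a tag cannot break the attribute.
    return '<meta name="keywords" content="%s">' % escape(content, quote=True)


print(meta_keywords(["Gas", "Oil", "Prices", "Investment"]))
```

The trailing comma is harmless for indexing, so this is a cosmetic refinement rather than a required change.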

Indexing the Content

1. Create new Site Search Index

Use the index name “SiteSearch” and make it your default site search index.

2. Create a new Site Search Job

Include either the entire host, or the paths to all the detail pages for each content type selected.

In our example, we added the “tags” field to the content types: Products, News, Blogs, Events. In this case, we can choose to only include the paths to their detail pages:

/products/*, /news/*, /blog/*, /events/*

Note: To test your site search job, use a cron expression. You may wish to run it every 5 minutes at first (until you are happy with the results), then change it to once a day or so.

This cron expression will run the job every 5 minutes: 0 0/5 * * * ?
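
To see why this expression fires every 5 minutes, here is a small Python sketch that splits it into fields and expands the minutes field. It assumes the scheduler uses the six-field Quartz format (seconds, minutes, hours, day of month, month, day of week), which the `?` field suggests; `expand_increment` is a hypothetical helper written for this illustration and handles only the `start/step` form.

```python
def expand_increment(field, low, high):
    """Expand a Quartz-style 'start/step' field into the values it matches."""
    start, step = field.split("/")
    first = low if start == "*" else int(start)
    return list(range(first, high + 1, int(step)))


# Assumed Quartz field order:
# seconds, minutes, hours, day-of-month, month, day-of-week
expression = "0 0/5 * * * ?"
seconds, minutes, hours, day_of_month, month, day_of_week = expression.split()

# Minutes "0/5" means: start at minute 0, then every 5th minute.
print(expand_increment(minutes, 0, 59))
```

Under that reading, the job runs at second 0 of minutes 0, 5, 10, and so on, in every hour of every day.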

3. Test Site Search

After the job has finished, test your site search by going to the “Test Site Search” tab and entering any search term.

Creating the Widget to Display the Site Search Results

1. Create a new page

Create the page where you will display your search results. In our example, we created the page /home/site-search.html

2. Add a New Widget to Display Search Results

You can download or view the entire code on our demo site under: //shared/vtl/widgets/full-site/site-search.vtl

The code takes a search term $q, strips any double quotes from it, and builds a Lucene query that is restricted to the current host.

#if($UtilMethods.isSet($q))
  ## QUERY: strip double quotes from the search term
  #set($runQ = $q.replaceAll("\"", ""))
  ## add a + so the term is required
  #set($runQ = "+$runQ")
  ## restrict the search to the current host
  #set($runQ = "$runQ +host:$host.identifier")
#end
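
The query-building steps can be sketched in Python to make the resulting string easier to inspect. This is an illustration only: `build_site_query` and the host id `host-123` are hypothetical, and adding the `+` only when it is missing is an assumption, not something the widget code confirms.

```python
def build_site_query(q, host_id):
    """Mirror the Velocity steps: strip quotes, require the term, add the host."""
    run_q = q.replace('"', "")
    # Prefix with + so the search term is required.
    # (Adding it only when missing is an assumption.)
    if not run_q.startswith("+"):
        run_q = "+" + run_q
    # Restrict the results to a single host.
    return "%s +host:%s" % (run_q, host_id)


print(build_site_query('"gas prices"', "host-123"))
# prints: +gas prices +host:host-123
```

Note that in Lucene query syntax the leading `+` applies only to the first term, so with the default OR operator a multi-word search such as `gas prices` requires `gas` but treats `prices` as optional.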

Then it calls the sitesearch api to search on the default index using the query $runQ:

## null for the first argument searches the "default" index
#set($results = $sitesearch.search(null, "$!runQ ", $start, $end))

If there are results, it returns them by looping through each result and displaying the Title, URL, Highlights, Modified Date and the number of matches in the document.

#foreach($detail in $results.results)
  <div class="resultResult">
    <div class="resultTitle"><a href="$detail.uri">$detail.title</a></div>
    <div class="resultUrl">$detail.url</div>
    #foreach($highlight in $detail.highlights)
      <div class="resultSummary">${highlight}...</div>
    #end
    <div class="resultsNum">modified: $detail.modified</div>
    <div class="resultsNum">${detail.highlights.size()} match(es) in document</div>
  </div>
  #set($i = $math.add($i, 1))
#end

3. Add Code to Display Search Facets

First, we create the keywords facet query. It reuses the previously built Lucene query $runQ and adds a terms facet on the keywords field, limited to the 10 most frequent terms. Note that this only retrieves the facets; it does not filter by keyword yet.

## Keywords Facet
#set($kwFacetsQry = '{
  "query" : { "query_string" : {"query" : "$runQ"} },
  "facets" : {
    "keywords" : { "terms" : { "field" : "keywords", "size" : 10 } }
  }
}')
## Evaluate the string so that $runQ is replaced by its value
#set($kwFacetsQry = $render.eval($context, $kwFacetsQry))
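
Because the facet query is assembled by string interpolation, a double quote inside `$runQ` (for example from a quoted multi-word keyword) would produce invalid JSON. The Python sketch below builds the same request body with a JSON serializer, which escapes such characters. The function name `keyword_facet_query` is hypothetical, and the sketch only illustrates the shape of the query; it is not the dotCMS API.

```python
import json


def keyword_facet_query(run_q, field="keywords", size=10):
    """Return the facet request body as a JSON string."""
    body = {
        # The same Lucene query used for the main search.
        "query": {"query_string": {"query": run_q}},
        # A terms facet: the `size` most frequent values of `field`.
        "facets": {field: {"terms": {"field": field, "size": size}}},
    }
    # json.dumps escapes embedded double quotes and backslashes.
    return json.dumps(body)


print(keyword_facet_query('+gas +keywords:"oil prices"'))
```

In the Velocity widget, the same effect would require escaping the quotes in `$runQ` before it is placed inside the JSON string.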

Call the sitesearch api method getFacets. If using the default index, we can send null as the first parameter. If not, we should send the search alias.

#set($kwFacets = $sitesearch.getFacets(null, $kwFacetsQry))

Once this runs, we have the facets in $kwFacets. We then loop through the facet entries and display each keyword’s term and count on the page.

## $facet: the "keywords" facet from $kwFacets (how it is extracted is not shown in this excerpt)
#if(!$UtilMethods.isSet($facet.entries()) || $facet.entries().size() == 0)
  <em>No Results</em>
#else
  (<a href="javascript:addKeyword('')">clear</a>)
  #foreach($term in $facet.entries())
    ## Assumed condition: link every term except the currently selected keyword
    #if("$!keyword" != "$term.term")
      <li><a href="javascript:addKeyword('$term.term')">$term.term</a> ($term.count)</li>
    #elseif($keyword == $term.term)
      <li><b>$term.term</b> ($term.count)</li>
    #end
  #end
#end
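
The branching in this step (no entries, selected keyword, other keywords) can be summarized in a short Python sketch. It is an illustration under assumptions: `render_facet` is a hypothetical function, the entries are modeled as plain `(term, count)` pairs, and a `?keyword=` link stands in for the `addKeyword` call used by the widget.

```python
from html import escape
from urllib.parse import quote


def render_facet(entries, selected=None):
    """Render (term, count) pairs as an HTML list of facet links."""
    if not entries:
        return "<em>No Results</em>"
    items = []
    for term, count in entries:
        label = escape(term)
        if term == selected:
            # The active keyword is shown in bold and is not a link.
            items.append("<li><b>%s</b> (%d)</li>" % (label, count))
        else:
            # Other keywords link to the same page with the keyword set.
            link = "?keyword=" + quote(term)
            items.append('<li><a href="%s">%s</a> (%d)</li>' % (link, label, count))
    return "<ul>%s</ul>" % "".join(items)


print(render_facet([("Gas", 3), ("Oil prices", 1)], selected="Gas"))
```

Escaping the term before writing it into the page matters in the Velocity version too, since a keyword containing a quote would otherwise break the `addKeyword('...')` call.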

Each keyword is also linked to a JavaScript function, addKeyword, which stores the term in a hidden field on the form and submits the form.

function addKeyword(term) {
  document.getElementById("keyword").value = term;
  document.getElementById("keyword").form.submit(); // submit the form that owns the hidden field
}

When the page loads again it checks to see if there is a keyword in the request and adds it to the query $runQ. This will allow the drill down and refinement of the current search results.

#if($keyword.contains(" "))
  ## Multi-word keyword: wrap it in quotes so it is matched as a phrase
  #set($runQ = "$runQ +keywords:${esc.q}$keyword${esc.q}")
#else
  #set($runQ = "$runQ +keywords:$keyword")
#end
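
As a rough Python sketch of this drill-down step, the function below appends a required `keywords` clause and quotes multi-word keywords so they are matched as a phrase. The name `add_keyword_filter` is hypothetical, and skipping empty keywords and stripping embedded quotes are assumptions added for safety, not behavior taken from the widget.

```python
def add_keyword_filter(run_q, keyword):
    """Append a required keywords clause to an existing Lucene query."""
    # Assumption: an empty keyword means "no filter" (the "clear" link).
    if not keyword:
        return run_q
    # Assumption: remove embedded quotes so they cannot unbalance the query.
    keyword = keyword.replace('"', "")
    if " " in keyword:
        # Quote multi-word keywords so they are searched as one phrase.
        return '%s +keywords:"%s"' % (run_q, keyword)
    return "%s +keywords:%s" % (run_q, keyword)


print(add_keyword_filter("+gas +host:host-123", "Oil prices"))
# prints: +gas +host:host-123 +keywords:"Oil prices"
```

Without the quotes, only the first word would be bound to the `keywords` field and the second word would be searched in the default field.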

Voila! Your keyword faceted search is done!

You can see all the above code, and test this search, on the demo site at:

ElasticSearch will search all the meta fields on your HTML pages. You can perform a similar filtering process using the meta description or author tags.

<meta name="description" content="This is my content description field" />
<meta name="author" content="Maria Ahues Bouza" />

Email me if you have any questions or feedback concerning this post.
