Elasticsearch Examples

All content in dotCMS is indexed by Elasticsearch. The dotCMS Enterprise Edition exposes an Elasticsearch endpoint that can be used to query the content store with native Elasticsearch queries written in the Elasticsearch JSON format.

This page includes several basic example queries. The queries are presented as curl commands that can be run against the dotCMS starter site or the dotCMS demo site. They can also be tested in the ElasticSearch Portlet by removing the first and last lines of each example, leaving just the JSON search string.
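
As a rough illustration of preparing an example for the portlet, the sketch below saves one example to a temporary file and strips its first and last lines with sed. The variable names are hypothetical, and the approach assumes the example is laid out exactly as on this page (curl line first, closing quote last).

```shell
#!/bin/sh
# Save one example from this page to a temporary file (hypothetical setup).
example_file=$(mktemp)
cat > "$example_file" <<'EOF'
curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query" : {
            "match_all" : {}
        },
        "size":5
    }
'
EOF

# Delete line 1 (the curl line) and the last line (the closing quote),
# leaving only the JSON body to paste into the portlet.
json_body=$(sed '1d;$d' "$example_file")
printf '%s\n' "$json_body"
```

The printed text is the bare JSON search string, starting at the opening brace and ending at the closing brace.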

Basic Queries

These queries perform basic searches using common ElasticSearch features. Also see Query by language using a Range, below, for how to query a range of values.

Match All Content and Limit the Results

This query matches all items in the content store, but only returns the first 5 items.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query" : {
            "match_all" : {}
        },
        "size":5
    }
'
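
To page through results instead of fetching only the first few, the standard Elasticsearch "from" parameter can be combined with "size". The sketch below only builds the request body. The function name build_page_query is hypothetical, and it assumes the dotCMS endpoint passes "from" through to Elasticsearch unchanged.

```shell
#!/bin/sh
# build_page_query PAGE PAGE_SIZE
# Prints a match_all request body for the given 1-based page number.
build_page_query() {
    page=$1
    page_size=$2
    # "from" is the zero-based offset of the first hit to return.
    offset=$(( (page - 1) * page_size ))
    printf '{"query":{"match_all":{}},"from":%d,"size":%d}\n' "$offset" "$page_size"
}

# Body for the third page of 5 results (hits 11 through 15).
build_page_query 3 5

# To send it (assumes the demo site is reachable):
# curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d "$(build_page_query 3 5)"
```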

Match in All Fields

This query uses the “_all” field to search for a string in all fields of all content. For more information, please see the documentation on the _all keyword.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "_all": "gas"
                    }
                }
            }
        }
    }
'

Match Multiple Terms

This query only returns items which match all of the following conditions:

  • An item of the News Content Type.
  • Includes the “investing” category.
  • Has the tag “gas”.
  • Contains the word “jean” in the byline field.

Note that all field names and values are converted to lowercase in the search index.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "contenttype": "news"
          }
        },
        {
          "term": {
            "categories": "investing"
          }
        },
        {
          "term": {
            "news.tags": "gas"
          }
        },
        {
          "term": {
            "news.byline": "jean"
          }
        }
      ]
    }
  }
}
'
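
Because field names and values are lowercased in the index, it can help to normalize them before building a query. The following is a minimal sketch of a helper that turns FIELD=VALUE pairs into a bool/must query of term clauses. The function name build_must_query is hypothetical, and the sketch does not escape quotes or backslashes in values.

```shell
#!/bin/sh
# build_must_query FIELD=VALUE...
# Prints a bool/must query with one term clause per FIELD=VALUE pair.
build_must_query() {
    clauses=""
    for pair in "$@"; do
        # Lowercase the whole pair to match how the index stores names and values.
        pair=$(printf '%s' "$pair" | tr '[:upper:]' '[:lower:]')
        field=${pair%%=*}
        value=${pair#*=}
        # Append the clause, adding a comma only between clauses.
        clauses="${clauses:+$clauses,}{\"term\":{\"$field\":\"$value\"}}"
    done
    printf '{"query":{"bool":{"must":[%s]}}}\n' "$clauses"
}

build_must_query contentType=News news.tags=Gas
```

The output can be passed to curl with -d "$(build_must_query ...)" in place of the hand-written JSON body.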

Find Files using a Regular Expression (Regex)

This query uses a regular expression to return all files that end in .jpg. It is a good example of how to use a regular expression to query the index fields.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "regexp": {
                "path": ".*\\.jpg"
            }
        }
    }
'

List All Sites

This query returns all sites in your dotCMS installation:

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "contentType": "host"
                    }
                }
            }
        }
    }
'

List All Content Types and Counts for a Specific Site

This query uses the ElasticSearch aggregations feature to return a list of all the Content Types on a specific site (where the site id is 48190c8c-42c4-46af-8d1a-0cd5db894797, which is the site id of the dotCMS starter site).

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "conHost": "48190c8c-42c4-46af-8d1a-0cd5db894797"
                    }
                }
            }
        },
        "aggs" : {
            "tag" : {
                "terms" : {
                    "field" : "contentType",
                    "size" : 100   //the number of aggregations to return
                }
            }
        },
       "size":0    //the number of hits to return

    }
'

Filter Search Results by Title and Date

The following query pulls all items of the “News” Content Type with “retirement” in the title field that were published on or after midnight January 1, 2015, but before midnight January 1, 2016:

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
    {
       "query": {
          "filtered": { 
               "query": {
                    "query_string" : {
                       "query" : "+news.title:retirement"
                    }
                },
                "filter": {
                    "range" : {
                        "news.sysPublishDate" : {
                            "gte": "2015-01-01 00:00:00", 
                            "lt": "2016"
                        }
                    }
                }
            }
        }
    }
'

Note:

  • You may leave out either the start or end of the range (e.g., the "gte" or "lt" terms).
    • If you leave out the start of the range, the query will find all content up to the end of the range (and vice-versa).
  • You may also specify the end of the range as "now" (e.g., "lt": "now") to find all content with dates up to the current time when the query is run.
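
The notes above can be sketched as a small helper that builds the range clause with either bound left out. The function name build_date_range is hypothetical. The output is only the "range" clause, which would replace the one in the "filter" section of the query above.

```shell
#!/bin/sh
# build_date_range FIELD START END
# Prints a range clause; pass an empty string to leave out either bound.
build_date_range() {
    field=$1
    start=$2
    end=$3
    bounds=""
    if [ -n "$start" ]; then
        bounds="\"gte\":\"$start\""
    fi
    if [ -n "$end" ]; then
        # Add a comma only when a start bound is already present.
        bounds="${bounds:+$bounds,}\"lt\":\"$end\""
    fi
    printf '{"range":{"%s":{%s}}}\n' "$field" "$bounds"
}

# No start bound: everything published before the moment the query runs.
build_date_range news.sysPublishDate "" now
```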

Use Lucene Query Syntax

When submitting Elasticsearch queries, you may always use the simpler Lucene query syntax by providing a Lucene query string within an Elasticsearch "query_string" term, as follows:

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
    {
        "query": {
            "query_string" : {
                "query" : "+news.title:retirement"
            }
        }
    }
'

Query by Language

These queries demonstrate how to retrieve content in specific languages. Your language IDs will vary based on how your site is built; for more information, please see the Configuring Languages documentation.

Return a single piece of content by language

This query returns a single piece of content by Identifier, limiting the results to language ID 1 (the default language):

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "languageId": 1
              }

            },
            {
              "term": {
                "identifier": "c1857ef4-fdbd-4e08-a4f4-bd2ff68ea60b" 
              }
            }
          ]
        }
      }
    }
'

Query by language using a range

This query returns all results which have the word “gas” in their titles and have a language ID in the range from 2 to 20 (excluding results in the default language, which is language ID 1):

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": [
                    {
                        "term": {
                            "title": "gas"
                        }
                    },
                    {
                        "range": {
                            "languageid": {
                                "from": 2,
                                "to": 20
                            }
                        }
                    }
                ]
            }
        }
    }
'

Using the File Path

The following queries search based on the path to an item within the dotCMS Site Browser file tree. Note that since these queries reference specific locations in the tree, they will only return Page and File Content Types.

Return a single image based on the path

This query returns a single image based on the path within your Site Browser tree:

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
    {
      "query": {
        "bool": {
          "must": 
            {
              "term": {
                "path": "/images/404.jpg"
              }
            }

        }
      }
    }
'

List Pages within a specific folder

This query returns all Pages whose parent folder is /services/:

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
{
   "query": {
      "bool": {
         "must": [
             {
                 "term": {
                 "parentpath": "/services/"
                 }
             },{
                 "term": {
                 "basetype": "5"   //basetype 5=pages
                 }
             }
         ]
      }
   }
}
'

Geolocation

You may perform Geolocation queries on any Content Types which include a latlong field that contains latitude and longitude coordinates. Please see How Content is Mapped to ElasticSearch for more information on adding a latlong field to your content.

Filter by Distance: Return only results near the user

This query filters results to only display items within 2000 km of the user's location (“News near you:”):

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
   {
      "query": {
         "filtered": {
            "query": {
               "match_all": {}
            },
            "filter": {
               "geo_distance": {
                  "distance": "2000km",
                  "news.latlong": {
                     "lat": 37.776,
                     "lon": -122.41
                  }
               }
            }
         }
      }
   }
'
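
When the coordinates come from elsewhere (for example, a lookup performed before calling curl), the same request body can be generated from parameters. This is only a sketch: the function name build_geo_query is hypothetical, and it keeps the "filtered" query form and the news.latlong field used in the example above.

```shell
#!/bin/sh
# build_geo_query LAT LON DISTANCE_KM
# Prints a filtered query that keeps only items within DISTANCE_KM of the point.
build_geo_query() {
    lat=$1
    lon=$2
    distance_km=$3
    printf '{"query":{"filtered":{"query":{"match_all":{}},"filter":{"geo_distance":{"distance":"%skm","news.latlong":{"lat":%s,"lon":%s}}}}}}\n' \
        "$distance_km" "$lat" "$lon"
}

# Same coordinates and radius as the example above.
build_geo_query 37.776 -122.41 2000
```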

Sort by Distance: Sort results based on distance from the user

Similar to the previous query, this query sorts search results so that News items closest to the user are displayed at the top of the search results:

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
   {
      "sort" : [
         {
            "_geo_distance" : {
               "news.latlong" : {
                  "lat" : 42,
                  "lon" : -71
               },
               "order" : "asc",
               "unit" : "km"
            }
         }
      ],
      "query" : {
         "term" : { "title" : "gas" }
      }
   }
'

Automatically Finding Visitor Geolocation

When performing a geolocation query using a curl command, you must supply the latitude and longitude for the query. However, when performing a geolocation query from within dotCMS, you can automatically find the geolocation coordinates for the current user. The following code performs the same query as the Filter by Distance example above, but uses the Elasticsearch Viewtool and Visitor Geolocation to base the results on the visitor's automatically determined coordinates:

#set($geolocationFromSession = $session.getAttribute("geolocation"))
#if(!$UtilMethods.isSet($geolocationFromSession))
    #set($locationURL = "http://www.geoplugin.net/json.gp?ip=$request.getRemoteAddr()")
    #set($geolocation = $json.fetch("$!locationURL"))
    $session.setAttribute("geolocation", $geolocation)
#else
    #set($geolocation = $session.getAttribute("geolocation"))
#end

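#set($latitude = ${geolocation.geoplugin_latitude})
#set($longitude = ${geolocation.geoplugin_longitude})

## Velocity does not interpolate single-quoted strings, so the query is written
## with LAT and LON placeholders that are replaced with the coordinates below.
#set($query = '{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "geo_distance": {
                    "distance": "2000km",
                    "news.latlong": {
                     "lat": LAT,
                     "lon": LON
                    }
                }
            }
        }
    }
}')
#set($query = $query.replace("LAT", "$latitude").replace("LON", "$longitude"))

#set($results = $estool.search($query))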
#foreach($news in $results)
  <p>$news.title</p>
#end

Other Common Features

These examples show how to implement other common website features with ElasticSearch queries, including tag clouds and search suggestions.

Tag Cloud: Return an aggregated list of tags and counts

This query uses the ElasticSearch aggregations feature to provide a list of tags with the counts for each tag for the News Content Type to enable the creation of tag clouds on your site:

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/search -d '
    {
       "query": {
          "query_string": {
             "query": "+contenttype:news"
          }
       },
       "aggs" : {
          "tag" : {
             "terms" : {
                "field": "tags",
                "size" : 20
             }
          }
       },
       "size":0    
    }
'

Suggestions: Generate Suggestions (Did you mean?) based on content title

This query uses the suggest feature to suggest results which are close to the user's entered query (“Did you mean … ?”):

curl -H "Content-Type: application/json" -XPOST http://localhost:8080/api/es/raw -d '
   {
      "suggest" : {
         "title-suggestions" : {
            "text" : "gs pric rollrcoater",
            "term" : {
               "size" : 3,
               "field" : "title"
            }
         }
      }
   }
'