Search in dotCMS uses ElasticSearch, an open source (Apache 2), distributed, RESTful search engine built on top of Apache Lucene.
ElasticSearch provides a fast and efficient search solution that allows you to start with one server and scale up very easily.
You can find out more about ElasticSearch at elasticsearch.org.
Since ElasticSearch is built on top of Lucene, you won’t need to modify any of the existing search queries in your widgets.
ElasticSearch provides the ability to scale the search index differently from the content. For example, you can have 10 servers running dotCMS and only 3 servers running the index.
The ElasticSearch index is divided into two indexes, Live and Working. This allows a server that is not used for content edits to hold only a Live index.
ElasticSearch also allows you to transfer indexes between environments by downloading an index from one environment and restoring it to another one.
To see a list of all active indices, go to System → Maintenance → Index.
Each index has the following properties:
- Status: An index can be Active or Inactive. You need at least one active Live and one active Working index.
- Index Name: The name of the index. This name matches the folder name where the index is stored in the file system.
- Count: Number of objects stored in the index.
- Shards: Number of Shards in the index.
- Replicas: Number of Replicas for the index.
- Size: Size of the index.
- Health: Displays the health of the index, which can be green, yellow or red. Green indicates everything is working properly, yellow indicates a non-critical error (search may work but with degraded performance), and red indicates a critical error (search may not work at all). For instance, if you have two servers and 3 replicas, the Health will be yellow because ES can’t find enough servers to hold all 3 replicas, but if the data in your index becomes corrupted, the Health will be red.
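The yellow case above follows from how ElasticSearch places shard copies: a replica is never assigned to the same server as its primary or as another copy of the same shard, so an index needs at least replicas + 1 servers before every copy can be placed. The sketch below is a simplified model of that rule, not the logic dotCMS or ElasticSearch actually runs. The function name and the `primaries_available` flag are hypothetical and exist only for illustration.

```python
def expected_health(nodes: int, replicas: int, primaries_available: bool = True) -> str:
    """Simplified model of index health (hypothetical helper, not a dotCMS API)."""
    if nodes < 1 or not primaries_available:
        # A missing or corrupt primary shard means some data cannot be searched.
        return "red"
    # Each replica must live on a different server from its primary and from
    # the other replicas, so at most (nodes - 1) replicas can be assigned.
    if replicas > nodes - 1:
        return "yellow"
    return "green"


# The two-server, three-replica example from the Health description.
print(expected_health(nodes=2, replicas=3))
```

Under this model, lowering the replica count or adding servers until replicas is at most servers minus 1 is what would bring a yellow index back to green.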
From this page you can create a new index by clicking on Add-Index and selecting whether you want to create a Working or Live index. You’ll also need to enter the number of shards for your new index. Once your index is created, it won’t contain any data until you activate it.
To download an index, right click on it and select Download Index. This will save a zip file containing your index to your computer. You can then use this file to Restore the index on this server or on another dotCMS server running the same version.
You can also right click on an index to Clear the Index, which deletes all the objects stored in the index but keeps the index itself, or to Delete the index, which removes it entirely from the file system.
The ElasticSearch configuration variables can be found in the src-conf/dotmarketing-config.properties file.
These are some of the variables you may want to modify:
By default the number of shards is set to 2. Having more shards enhances indexing / writing performance and allows you to distribute a big index across disks. However, more shards on the same disk can actually slow down search performance. Make sure your ulimit -n is set to more than 1024.
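As a quick way to check the open-file limit mentioned above, the following sketch reads the soft limit that `ulimit -n` reports and compares it with 1024. It uses Python's standard `resource` module, which is only available on Unix-like systems. The function names are hypothetical, and the value reflects the user and shell the script runs under, so it only matches the dotCMS process if it is run in the same environment.

```python
import resource  # Unix-only standard library module


def limit_is_sufficient(soft_limit: int, minimum: int = 1024) -> bool:
    """Return True when the open-file limit is strictly above `minimum`."""
    # RLIM_INFINITY means "no limit", which always satisfies the requirement.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > minimum


def current_soft_limit() -> int:
    """Soft limit on open file descriptors for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    soft = current_soft_limit()
    print(f"open-file limit: {soft}, sufficient: {limit_is_sufficient(soft)}")
```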
Having more replicas enhances the search / read performance and improves the cluster availability. If you are running a cluster, the number of replicas should be approximately half your number of servers plus 1.
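The rule of thumb above can be written as a small calculation. This sketch reads "half your number of servers plus 1" as floor(servers / 2) + 1, which is one possible interpretation of the sentence. It also caps the result at servers - 1, an added assumption based on the fact that ElasticSearch cannot assign more replicas than there are other servers to hold them. The function name is hypothetical.

```python
def suggested_replicas(servers: int) -> int:
    """Approximate replica count for a cluster of `servers` nodes.

    Interprets the guideline as floor(servers / 2) + 1 (an assumption) and
    caps it at servers - 1, since extra replicas could not be assigned.
    """
    if servers < 2:
        # A single server has nowhere to place a replica.
        return 0
    return min(servers // 2 + 1, servers - 1)


for n in (1, 2, 4, 10):
    print(n, "servers ->", suggested_replicas(n), "replicas")
```

Because the guideline itself says "approximately", treat the result as a starting point rather than a required value.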
You should store your index on a fast drive. These settings hold the paths where the Live and Working indexes are stored, and they will generally match the config value for DYNAMIC_CONTENT_PATH.
These settings allow you to configure ElasticSearch on a cluster. If your network supports multicast, we recommend enabling multicast discovery; if it does not, you need to define the list of hosts running dotCMS that you want running an index.
If your index becomes corrupt, follow these steps to have dotCMS recreate it:
- Stop dotCMS.
- Delete or rename the esdata folder (es.path.data).
- Start dotCMS. It will show an error in the logs about not finding an index; this is normal.
- Go to System → Maintenance and click on the Index tab.
- Click to start a full Reindex. This will recreate the folder under esdata.
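If you prefer renaming the data folder over deleting it, the second step can be scripted. The following is a minimal sketch, assuming you pass in whatever path es.path.data points to in your installation and that dotCMS is already stopped. The function name and the `.bak-<timestamp>` suffix are hypothetical choices, not dotCMS conventions.

```python
import time
from pathlib import Path


def set_aside_index_data(data_dir: Path) -> Path:
    """Rename the index data folder so it can be recreated by a full reindex.

    `data_dir` should be the folder that es.path.data points to.
    Run this only while dotCMS is stopped.
    """
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no index data folder at {data_dir}")
    # Keep the old data next to the original location under a timestamped name.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = data_dir.with_name(f"{data_dir.name}.bak-{stamp}")
    data_dir.rename(backup)
    return backup
```

Keeping the renamed folder until the full reindex has finished leaves you a copy of the old index data in case you want to inspect it later.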