The Index tab in the System -> Maintenance tool allows you to manage indexes for single or clustered dotCMS instances.
The detail area displays the following information for each index:
|Status||Status of the index (which working/live indexes are active).|
|Index Name||Identifier of the index, and whether the index is live or working.|
|Created||Creation timestamp of the index.|
|Count||Number of objects in the index.|
|Shards||Number of Elasticsearch shards (underlying sub-indexes) in the index.|
|*Replicas||Number of copies of a given index (*only available on clustered instances).|
|Size||Size of the index (in Megabytes).|
|Health||Colored icon indicating whether the index or index “replica” is being used by a dotCMS instance:|
When you create a new index, you may specify the number of shards in the index. Elasticsearch abstracts the index so that several “shards” (sub-indexes) can aggregate results when a query is performed. This makes multiple separate shards behave like a single index, but improves performance and scalability because Elasticsearch only updates the shard(s) it needs instead of updating the whole index every time.
Shards provide a trade-off in performance. As the number of shards increases, updating the index gets faster, but queries against the index may become slightly slower (multiple shards may need to be accessed to perform the query, especially if it is a complex query).
- If you have a site with a large database and/or frequent content updates, you may want to consider increasing the number of shards to reduce the time it takes to re-index content.
- If you have a site with a great deal of front-end traffic, you may want to minimize the number of shards to maximize query performance.
- Achieving the right balance requires a needs assessment and follow-up testing.
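As a rough illustration of what the shard count corresponds to at the Elasticsearch level, the sketch below builds the settings body that a plain Elasticsearch create-index request accepts (`number_of_shards` and `number_of_replicas` are standard Elasticsearch index settings). The helper name `index_settings` is hypothetical, and dotCMS builds its own request when you add an index from this tab, so treat this as orientation rather than as what dotCMS actually sends.

```python
import json


def index_settings(shards: int, replicas: int = 0) -> str:
    """Return a JSON settings body for an Elasticsearch create-index request.

    Hypothetical helper for illustration only; dotCMS creates its
    indexes for you through the Index tab.
    """
    if shards < 1:
        raise ValueError("an index needs at least one shard")
    if replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # Example: an index split into 4 shards with 1 replica of each shard.
    print(index_settings(shards=4, replicas=1))
```

In Elasticsearch the number of primary shards is set when the index is created, which is consistent with specifying the shard count at the moment you add a new index.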
Actions Available on Indexes
The following right-click options are available on any index:
|Restore Index Snapshot||Replaces the current index with a previously downloaded index snapshot.|
|Download Index Snapshot||Downloads a copy (snapshot) of the selected index.|
|Close-Index||Closes an index, blocking it from performing read/write operations (so it has nearly no overhead on the server).|
|Open-Index||Re-opens a closed index.|
|Clear Index||Clears the index (to prepare for a restore).|
The following additional options are available only on clustered instances:
|Update Number of Replicas||Changes the number of replicated indexes.|
|Deactivate Index||Disables writing to the index.|
|Delete Index||Removes the index from the cluster.|
Note: Clearing or deactivating a live index displays a popup warning that site visibility may be affected.
- However, when you are troubleshooting potential index issues, these actions allow you to “clean” an index before restoring a copy of that index. This is a much faster and lighter way to test or resolve an issue than a complete site re-index.
Adding a Sharded Index
Use the Add-Index button to add a new working/live index and define the number of shards for that index:
Step 1) Click the “Add-Index” button to create a new live/working index.
Step 2) When prompted, specify the number of shards for the new index.
Step 3) Back up the current live/working index with the “Download Index Snapshot” right-click option, then restore it into the index created in step 1 with the “Restore Index Snapshot” right-click option.
Important Note: Indexes take up memory. Unused or old indexes should be removed using the “Delete Index” right-click option.
Unless configured otherwise, index shards are stored in the /dotsecure/esdata/ directory from the root of dotCMS (e.g. /dotserver/tomcat-X.x.xx/webapps/ROOT/dotsecure/esdata in the default dotCMS distribution).
However, Elasticsearch allows you to store each shard in a different location, so you can distribute shards across separate folders or separate disks if desired.
To configure a different location for your Elasticsearch indexes, you must modify the DYNAMIC_CONTENT_PATH property in your dotmarketing-config.properties file.
Note: It is strongly recommended that all changes to the dotmarketing-config.properties file be made through a properties file extension.
Index Replicas (For Cluster Implementations Only)
When using a cluster, you can create replicas of an index to distribute and mirror the index across multiple servers in the cluster. To change the number of replicas of an index, right-click on the index and select Update Number of Replicas.
Managing Indexes via the API
Index management actions that can be performed from the dotCMS backend can also be performed using the REST API (for example, via curl commands). For more information, see the RESTful API to Manage Indexes documentation.
Note on Number of Replicas
Setting the proper number of replicas for the dotCMS Elasticsearch index can be confusing. It is important to understand that the number of replicas does not equal the number of servers in your cluster.
The number of replicas is how many times you want your index to be copied. For example, if you are running a two-node cluster, you should set your Elasticsearch replicas to “1”. This means that there is the original index on one server and 1 replica, or copy, on the other server, so both servers have a copy of all index entries.
Therefore the “rule of thumb” for the proper number of replicas is:
- For clusters with fewer than 5 servers: Set replicas to one less than the number of servers in the cluster. Examples:
  - 2 servers: 1 replica
  - 4 servers: 3 replicas
- For clusters with 5 or more servers: Set replicas to 1/2 the number of servers in the cluster (rounded up). Examples:
  - 5 servers: 3 replicas
  - 8 servers: 4 replicas
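The rule of thumb above can be written as a small function. This is a minimal sketch of the arithmetic only; the name `recommended_replicas` is hypothetical and is not part of dotCMS or Elasticsearch.

```python
import math


def recommended_replicas(servers: int) -> int:
    """Apply the rule of thumb for the number of index replicas.

    Fewer than 5 servers: one less than the number of servers.
    5 or more servers: half the number of servers, rounded up.
    """
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    if servers < 5:
        return servers - 1
    return math.ceil(servers / 2)


if __name__ == "__main__":
    # Reproduce the examples from the list above.
    for servers in (2, 4, 5, 8):
        print(f"{servers} servers: {recommended_replicas(servers)} replica(s)")
```

For the cluster sizes listed above, this returns 1, 3, 3, and 4 replicas respectively. A single-server instance gets 0 replicas, since there is no second server to hold a copy.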
When a new server is joined to the cluster (see cluster doc), Elasticsearch automatically recognizes the server and begins replicating to it. When a server is removed from the cluster or goes offline, the replicated index may display a “yellow” or inactive status if the number of nodes configured in the cluster does not at least match the number of replicated indexes.
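To reason about when a “yellow” status is likely, it helps to know that Elasticsearch does not place a replica on the same node as another copy of the same shard, so every copy (the original plus each replica) needs its own node. The sketch below encodes only that one constraint. The function name `expected_index_health` is hypothetical, and real clusters can report yellow for other reasons (for example, while a replica is still being copied), so treat the result as an estimate.

```python
def expected_index_health(nodes: int, replicas: int) -> str:
    """Estimate index health from node count and configured replicas.

    Simplified model: each index copy (1 original + N replicas) must
    live on a different node, so all replicas can only be assigned
    when nodes >= replicas + 1.
    """
    if nodes < 1:
        raise ValueError("at least one node must be online")
    if replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    return "green" if nodes >= replicas + 1 else "yellow"


if __name__ == "__main__":
    # A 4-server cluster configured with 3 replicas: every copy has a node.
    print(expected_index_health(nodes=4, replicas=3))
    # One server goes offline: one replica can no longer be assigned.
    print(expected_index_health(nodes=3, replicas=3))
```

Under this model, lowering the number of replicas after permanently removing a server would let the index return to a fully assigned state.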
The Elasticsearch index may also be configured to run as a “stand-alone” Elasticsearch server and connect to each node in a dotCMS cluster.