Unpacking the dotCMS GraphQL API: Concepts, Features, and Use Cases
GraphQL is a powerful query language designed to make data retrieval more efficient and flexible. Unlike traditional REST APIs, where clients often over-fetch or under-fetch data, GraphQL allows clients to specify exactly what they need—and nothing more. This makes GraphQL particularly well-suited for content delivery in modern web applications.
When used in a content management system (CMS) like dotCMS, GraphQL offers several advantages:
Single Endpoint for All Queries: With GraphQL, you don’t need multiple endpoints for different resources. A single /api/v1/graphql endpoint lets you fetch all your content in one request.
Precise Data Retrieval: GraphQL avoids over-fetching and under-fetching by letting you define which fields you want in the response. This reduces payload size and improves performance, especially on content-heavy websites.
Self-Documenting Schema: GraphQL APIs are introspective: you can explore the schema and understand the structure of the available data through the built-in introspection tooling, without relying on external documentation.
Nested and Related Data: GraphQL enables retrieving related content in a single query. For example, you can fetch an article with author details, tags, and media assets in one request, simplifying client-side development.
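As a rough illustration of the single-endpoint idea, the sketch below prepares one HTTP POST to /api/v1/graphql using only the Python standard library. The host name cms.example.com, the function name build_graphql_request, the sample query, and the absence of any authentication header are assumptions made for illustration; nothing here is taken from a dotCMS client library.

```python
import json
from urllib.request import Request

GRAPHQL_PATH = "/api/v1/graphql"  # the single endpoint named in this article


def build_graphql_request(base_url, query, variables=None):
    """Return a urllib Request that POSTs one GraphQL query.

    base_url is a hypothetical host; nothing is sent until the Request
    is passed to urllib.request.urlopen().
    """
    payload = {"query": query}
    if variables:
        # Values for $variables declared in the query, if any
        payload["variables"] = variables
    return Request(
        base_url.rstrip("/") + GRAPHQL_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(
    "https://cms.example.com",  # assumed host, replace with your instance
    "{ BlogCollection(limit: 5) { title } }",
)
print(request.full_url)
```

Every query, whatever content it targets, goes through the same URL; only the JSON body changes.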
If you’re new to GraphQL or want to learn the basics, I recommend starting with this article on the dotCMS blog. It covers step-by-step how to work with GraphQL in dotCMS and explains the fundamentals.
Here, we’ll focus on key concepts, advanced use cases, and practical tips to help you get the most out of the dotCMS GraphQL API. Whether you’re exploring the Search tool, working with Collections, or diving into subqueries, this guide is designed to highlight features and approaches that may not be immediately obvious.
Understanding the Structure of the dotCMS GraphQL API
Let’s start by exploring how the dotCMS GraphQL API is organized. A solid understanding of its structure is essential to making the most of this powerful tool.
Collections: Exploring Content by Type
At its core, the dotCMS GraphQL API uses Collections to allow you to query content associated with a specific Content Type. A Collection represents all the content entries (or “contentlets”) tied to a given type, giving you structured access to fields and metadata.
For example, if you have a Blog Content Type, you can query its content through the BlogCollection root field in your GraphQL API. This approach makes it straightforward to retrieve general fields associated with that type, like titles, authors, or published dates, without requiring advanced filtering.
A basic query for a Blog collection might look like this:
{
  BlogCollection(limit: 5) {
    title
    author
    publishDate
  }
}
This approach works seamlessly for well-defined Content Types, allowing you to explore your data efficiently and focus on structured queries.
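Since every Collection root field follows the same naming pattern (the Content Type name followed by Collection), the query text can be generated from a type name and a list of fields. The helper below is a minimal sketch; collection_query is a hypothetical name, and the helper does not check that the fields actually exist on the type.

```python
def collection_query(content_type, fields, limit=None):
    """Build a query against the <ContentType>Collection root field."""
    args = f"(limit: {int(limit)})" if limit is not None else ""
    selection = "\n".join(f"    {name}" for name in fields)
    return f"{{\n  {content_type}Collection{args} {{\n{selection}\n  }}\n}}"


print(collection_query("Blog", ["title", "author", "publishDate"], limit=5))
```

Called with the arguments shown, the helper prints the same Blog query as above.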
Using the Search Root Field to Access Content
The “Search” root field in the dotCMS GraphQL API is one of its most powerful tools, designed for broad and flexible content retrieval. Unlike Collections, which focus on a single Content Type, Search allows you to query across all content types using a single request, making it ideal for dynamic and global searches.
How Search Works
The Search root field leverages ElasticSearch’s indexing and querying capabilities to perform Lucene-based queries. This allows you to filter, sort, and paginate content efficiently, even when dealing with complex data sets or mixed content types.
A simple Search query looks like this:
{
  search(query: "+title:GraphQL", limit: 5) {
    title
    contentType
    modDate
  }
}
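Because Search can return items of several content types in one list, client code often needs to sort the results out afterwards. The sketch below groups the items of a response by the contentType field selected in the query above. The response values are made up for illustration and the function name group_by_content_type is hypothetical.

```python
from collections import defaultdict


def group_by_content_type(response):
    """Group the titles under data.search by their contentType field."""
    grouped = defaultdict(list)
    for item in response["data"]["search"]:
        grouped[item["contentType"]].append(item["title"])
    return dict(grouped)


# Hypothetical response for the search query above (made-up values).
sample = {
    "data": {
        "search": [
            {"title": "GraphQL basics", "contentType": "Blog"},
            {"title": "GraphQL landing page", "contentType": "htmlpageasset"},
            {"title": "GraphQL tips", "contentType": "Blog"},
        ]
    }
}
print(group_by_content_type(sample))
```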
Before we dive deeper, let’s take a quick look at the primary fields available in every piece of content, regardless of whether it’s retrieved through the Search tool or a Collection.
interface Contentlet {
  creationDate: String!
  modDate: String!
  title: String
  titleImage: Binary
  contentType: String!
  baseType: String!
  live: Boolean!
  working: Boolean!
  archived: Boolean!
  locked: Boolean!
  conLanguage: Language!
  identifier: ID!
  inode: ID!
  host: Site
  folder: Folder
  urlMap: String
  owner: User
  modUser: User
  _map(key: String, depth: Int = 0, render: Boolean = null): JSON
  publishDate: String
  publishUser: User
}
These fields represent the core fields available for every content item in dotCMS. Fields like title, modDate, and identifier can be queried directly in GraphQL without additional context or prefixes.
However, if a specific content type includes fields beyond the ones defined in the Contentlet interface, those fields must be accessed using the content type prefix in your queries. This behavior reflects the structure of dotCMS’s ElasticSearch index and is crucial to writing efficient and accurate queries.
Let’s consider HtmlPageAsset, which implements Contentlet and has additional fields like url and seoDescription:
type HtmlPageAsset implements Contentlet {
  title: String!
  url: String!
  siteOrFolder: SiteOrFolder!
  template: String!
  showOnMenu: Boolean!
  sortOrder: Int!
  cacheTtl: Int!
  friendlyName: String
  advancedTab: String
  redirectUrl: String
  httpsRequired: Boolean!
  seoDescription: String
  seoKeywords: String
  pageMetadata: String
}
These additional fields, such as url, template, and seoDescription, are specific to the HtmlPageAsset type. To query these fields, you must use the prefix “htmlpageasset” when constructing your queries.
Querying through Primary Fields
Since HtmlPageAsset implements the Contentlet interface, you can directly query any field defined in Contentlet, such as title or identifier, without using a prefix:
{
  search(query: "title:GraphQL") {
    title
    identifier
    modDate
  }
}
This query works because title is part of the Contentlet interface, which means it is globally indexed and accessible across all content types.
Querying through Fields Specific to a Content-type
If you want to query for content using fields specific to a content type, such as matching an HtmlPageAsset by its url field, you must include the “htmlpageasset” prefix. For example:
{
  search(query: "htmlpageasset.url:/home") {
    title
    identifier
    ... on htmlpageasset {
      url
      seoDescription
      pageMetadata
    }
  }
}
Even with the PageBaseTypeCollection root field, the query string still needs the prefix:
{
  PageBaseTypeCollection(query: "htmlpageasset.url:/home") {
    identifier
    title
    url
    seoDescription
  }
}
This rule applies to all content types.
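The prefix rule can be captured in a small helper that decides how a field should be written inside a Lucene query string. This is only a sketch: UNPREFIXED_FIELDS lists the names that appear without a prefix in this article's query strings, not an authoritative list of indexed fields, and lucene_field and term are hypothetical names.

```python
# Names written without a prefix in this article's query strings
# (an illustrative set, not a complete list of globally indexed fields).
UNPREFIXED_FIELDS = {
    "title", "identifier", "modDate", "contentType", "conhost",
    "languageId", "categories", "deleted", "working", "live",
}


def lucene_field(content_type, field):
    """Return the field name to use inside a Lucene query string."""
    if field in UNPREFIXED_FIELDS:
        return field
    # Fields specific to a content type need the content type prefix
    return f"{content_type}.{field}"


def term(content_type, field, value):
    """Build one required (+) clause for a field/value pair."""
    return f"+{lucene_field(content_type, field)}:{value}"


print(term("htmlpageasset", "title", "GraphQL"))  # +title:GraphQL
print(term("htmlpageasset", "url", "/home"))      # +htmlpageasset.url:/home
```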
The page Root Field: A Direct Tool for Working with Pages
In addition to search and collections, the page root field is a specialized tool in the dotCMS GraphQL API designed to interact directly with content of type HtmlPageAsset. Thus, it is the go-to option for retrieving full pages along with their metadata, layout, and associated widgets.
The page root field is useful for scenarios where you need all the details of a page in a single query. For example:
{
  page(url: "/home") {
    identifier
    title
    url
    seodescription
    containers {
      widgets {
        title
        widgetCode
      }
    }
  }
}
Choosing the Right Tool for Your Use Case
With search, collections, and page, the dotCMS GraphQL API offers flexibility to interact with content based on your needs. Each root field organizes and retrieves content differently:
Search:
• Best for broad or cross-content queries.
• Allows filtering by any indexed field and supports Lucene syntax.
• Useful when querying multiple content types or combining filters.
Collections:
• Focuses on structured access to specific content types.
• Best for retrieving fields directly tied to a known Content Type.
Page:
• Designed exclusively for HtmlPageAsset content.
• Ideal for retrieving full pages, including layout, metadata, and contentlets.
The dotCMS GraphQL API’s full schema definition can be found here
The ElasticSearch Dependency
Here’s where things get more interesting—and perhaps less obvious. The dotCMS GraphQL API is tightly coupled with the underlying ElasticSearch index. Every query you perform through the API relies on ElasticSearch to retrieve and deliver content.
This dependency brings both power and responsibility:
Power: ElasticSearch enables fast and flexible querying. By leveraging Lucene syntax and well-maintained indices, the API can deliver highly optimized responses, even for complex queries.
Responsibility: The health of your ElasticSearch index directly impacts the API’s performance. Your content will be returned quickly and accurately if the index is up-to-date and optimized. But if the index is unhealthy—or worse, missing—the API won’t function properly. This is an important consideration, as a non-functioning index means no data can be retrieved.
To maximize the potential of the GraphQL API, it’s crucial to:
Understand the Structure of Your Content: Knowing the fields, relationships, and indexing strategies for your Content Types helps you write efficient queries.
Maintain a Healthy Index: Regularly monitor and optimize your ElasticSearch setup to ensure the API performs as expected.
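Because an unhealthy or missing index can surface as a failed query, client code may want to check every response before using it. The sketch below relies only on the standard GraphQL response shape (a top-level data key and an optional errors list). The exact errors dotCMS reports for index problems are not shown in this article, so the sample message is made up, and GraphQLQueryError and extract_data are hypothetical names.

```python
class GraphQLQueryError(RuntimeError):
    """Raised when a GraphQL response carries errors or no data."""


def extract_data(response):
    """Return response["data"], or raise if the query did not succeed."""
    errors = response.get("errors")
    if errors:
        # Each error entry is expected to be a mapping with a "message" key
        messages = "; ".join(str(err.get("message", err)) for err in errors)
        raise GraphQLQueryError(messages)
    data = response.get("data")
    if data is None:
        raise GraphQLQueryError("response contained no data")
    return data


try:
    # Made-up failure response, for illustration only
    extract_data({"errors": [{"message": "index unavailable"}], "data": None})
except GraphQLQueryError as exc:
    print(f"query failed: {exc}")
```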
Why Structure Matters
Ultimately, the effectiveness of your queries depends on how well you understand the structure of your content and how it is indexed. By knowing which fields are available and how they’re stored in ElasticSearch, you can take full advantage of features like filtering, sorting, and precise field access.
Practical Applications of GraphQL in dotCMS
Now that we’ve explored the structure and tools of the dotCMS GraphQL API, let’s dive into some practical use cases to highlight its versatility and power. GraphQL’s flexibility shines when handling complex content queries, and dotCMS provides the tools to make these tasks efficient and developer-friendly. For the complete dotCMS GraphQL documentation go here.
Using Fragments to Build Polymorphic and Well-Structured Queries
One of the most powerful features of GraphQL is its ability to handle polymorphic content efficiently, making it ideal for scenarios where data from multiple types of content needs to be retrieved and processed. In dotCMS, fragments are invaluable for creating reusable, clean, and type-aware queries, especially when working with tools like Page.
What Are Fragments?
Fragments allow you to define a reusable set of fields for a specific content type or interface. This reduces redundancy, improves maintainability, and makes handling polymorphic data in your queries much easier. For example, you can:
Create a fragment for common fields shared across content types.
Define type-specific fragments for additional fields unique to a particular content type (e.g., Blog, Image, FileAsset).
For a detailed explanation of fragments and their usage, refer to the dotCMS GraphQL documentation on Fragments.
Practical Example: Querying Polymorphic Content with page
To illustrate how fragments can be used in a real-world scenario, let’s consider the page root field. The page tool is designed to retrieve HtmlPageAsset content and its associated layout, metadata, and containers. On a given page, multiple types of contentlets—like blogs, images, and files—may be displayed. Using fragments, we can construct a query to handle these different types in a structured and reusable way.
Let’s examine a real-world scenario. You want to query a page located at /index and:
Retrieve the layout, metadata, and containers.
Handle various types of contentlets (Blog, Image, FileAsset) displayed on the page.
Use fragments to streamline the query and ensure reusable definitions for shared and type-specific fields.
Here’s the full example query that can be used on the dotCMS demo site:
fragment CommonFields on DotContentlet {
  identifier
  inode
  conLanguage {
    id
  }
  modDate
}
fragment BlogFields on Blog {
  teaser
  author {
    firstName
    lastName
  }
  publishDate
}
fragment ImageFields on Image {
  fileName
  tags
}
fragment FileAssetFields on FileAsset {
  hostFolder {
    folderPath
    folderTitle
  }
  fileName
  fileAsset {
    versionPath
    mime
    size
  }
}
query GetPageContent($url: String!) {
  page(url: $url) {
    inode
    title
    url
    seodescription
    containers {
      identifier
      containerContentlets {
        uuid
        contentlets {
          ... on DotContentlet {
            ...CommonFields
          }
          ... on Image {
            ...ImageFields
          }
          ... on FileAsset {
            ...FileAssetFields
          }
          ... on Blog {
            ...BlogFields
          }
        }
      }
    }
    layout {
      body {
        rows {
          columns {
            widthPercent
            containers {
              identifier
            }
          }
        }
      }
    }
  }
}
How It Works
Fragments Defined at the Top:
CommonFields: Shared fields from the DotContentlet interface, like identifier, inode, and modDate.
BlogFields, ImageFields, and FileAssetFields: Fields specific to their respective content types.
Using Fragments in the Query:
The contentlets field within containerContentlets may include a mix of Blog, Image, FileAsset, or other types.
By using the ... on Type syntax, GraphQL dynamically resolves the correct fragment for each contentlet type.
Efficiently Querying Containers:
The query retrieves all contentlets from a page’s containers, handling each type appropriately with reusable fragments.
Given the query, a result like the following is expected (this sample response is for a page whose url is /gql):
{
  "data": {
    "page": {
      "inode": "abf8fac2-9523-4c4a-b1af-76a13857dca8",
      "title": "gql",
      "url": "/gql",
      "seodescription": null,
      "containers": [
        {
          "identifier": "SYSTEM_CONTAINER",
          "containerContentlets": [
            {
              "uuid": "uuid-1",
              "contentlets": [
                {
                  "identifier": "1bc770847157279127fdca116f03c7e1",
                  "inode": "0e447dd0-ebba-46b2-b465-11480ce63530",
                  "conLanguage": {
                    "id": 1
                  },
                  "modDate": "2025-01-27 20:58:11.115",
                  "fileName": "281689755_716682682940236_217640095419815907_n.jpg",
                  "tags": []
                },
                {
                  "identifier": "2f383e7fa65ae22bc87eb754671a78da",
                  "inode": "7b5fdc1b-c1fd-4af0-b102-1a1f328a9642",
                  "conLanguage": {
                    "id": 1
                  },
                  "modDate": "2024-06-01 02:27:22.747",
                  "hostFolder": {
                    "folderPath": "/application/themes/travel/images/",
                    "folderTitle": "images"
                  },
                  "fileName": "logo.png",
                  "fileAsset": {
                    "versionPath": "/dA/7b5fdc1bc1/fileAsset/logo.png",
                    "mime": "image/png",
                    "size": 2493
                  }
                },
                {
                  "identifier": "2b100ac7-07b1-48c6-8270-dc01ff958c69",
                  "inode": "4b337160-a5a3-42a7-b272-7b506603cb72",
                  "conLanguage": {
                    "id": 1
                  },
                  "modDate": "2024-08-07 18:24:47.909",
                  "teaser": " French Polynesia has.",
                  "author": [
                    {
                      "firstName": "John",
                      "lastName": "Smith"
                    }
                  ],
                  "publishDate": "2024-08-07 18:24:48.121"
                }
              ]
            }
          ]
        }
      ],
      "layout": {
        "body": {
          "rows": [
            {
              "columns": [
                {
                  "widthPercent": 100,
                  "containers": [
                    {
                      "identifier": "SYSTEM_CONTAINER"
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  }
}
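On the client side, a polymorphic response like the one above usually has to be flattened and dispatched by type. The sketch below walks the containers and guesses each contentlet's type from the fragment-specific keys it carries. The sample is a trimmed copy of the response above, and iter_contentlets and describe are hypothetical names. Selecting the standard GraphQL __typename meta-field in the query would be a more robust way to tell the types apart than inspecting keys.

```python
def iter_contentlets(response):
    """Yield every contentlet found in the page's containers."""
    for container in response["data"]["page"]["containers"]:
        for entry in container["containerContentlets"]:
            yield from entry["contentlets"]


def describe(contentlet):
    """Guess which fragment matched, based on the keys that are present."""
    if "teaser" in contentlet:
        return "Blog"
    if "fileAsset" in contentlet:
        return "FileAsset"
    if "tags" in contentlet:
        return "Image"
    return "DotContentlet"


# Trimmed copy of the response shown above
sample = {"data": {"page": {"containers": [{
    "identifier": "SYSTEM_CONTAINER",
    "containerContentlets": [{"uuid": "uuid-1", "contentlets": [
        {"identifier": "1bc770847157279127fdca116f03c7e1", "tags": []},
        {"identifier": "2f383e7fa65ae22bc87eb754671a78da",
         "fileAsset": {"mime": "image/png", "size": 2493}},
        {"identifier": "2b100ac7-07b1-48c6-8270-dc01ff958c69",
         "teaser": " French Polynesia has."},
    ]}],
}]}}}

print([describe(c) for c in iter_contentlets(sample)])
# ['Image', 'FileAsset', 'Blog']
```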
Using Subqueries to Fetch Related Content
In dotCMS, relationships between content types allow you to retrieve related content in a structured and efficient way. By leveraging GraphQL subqueries, you can query for a parent content type, drill down into its related content, and apply additional filters to ensure you fetch only the data that matches your specific requirements.
Subqueries are particularly useful when:
You need to fetch content that has a relationship to another content type.
You want to filter or conditionally include the related content based on specific criteria.
You aim to maintain clean and efficient queries without over-fetching unrelated data.
Example Use Case: Fetching Movies and Their Studios
Let’s consider an example where we have two content types:
Movie: Represents a film with fields like title and a relationship field studio that links it to a production studio.
Studio: Represents a movie studio with a name field.
The goal is to retrieve all movies with titles starting with “Toy” and produced by a specific studio ("Walt Disney Pictures"). We can filter the related Studio content using a subquery to match the desired studio name.
Here’s the GraphQL query:
query ContentAPI {
  MovieCollection(query: "+title:Toy*", sortBy: "Movie.title") {
    title
    studio(query: "+studio.name:\"Walt Disney Pictures\"") {
      name
    }
  }
}
How it Works
MovieCollection:
The query parameter (+title:Toy*) fetches all movies with titles starting with “Toy.”
The sortBy parameter (Movie.title) ensures the results are sorted alphabetically by movie title.
Studio Subquery:
The query parameter (+studio.name:"Walt Disney Pictures") filters related Studio content to include only those studios with the specified name.
Given the query, the following result is expected:
{
  "data": {
    "MovieCollection": [
      {
        "title": "Toy Story",
        "studio": [
          {
            "name": "Walt Disney Pictures"
          }
        ]
      },
      {
        "title": "Toy Story 2",
        "studio": [
          {
            "name": "Walt Disney Pictures"
          }
        ]
      },
      {
        "title": "Toy Story 3",
        "studio": [
          {
            "name": "Walt Disney Pictures"
          }
        ]
      }
    ]
  }
}
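The studio subquery embeds a quoted Lucene phrase inside a GraphQL string, which is why the inner quotes are escaped with backslashes. When the query text is generated in code, that escaping is easy to get wrong by hand. The sketch below uses json.dumps for it, on the assumption that JSON and GraphQL string literals escape quotes and backslashes the same way; graphql_string is a hypothetical helper name.

```python
import json


def graphql_string(lucene):
    """Quote a Lucene query so it can be embedded as a GraphQL string literal."""
    return json.dumps(lucene)


studio_filter = graphql_string('+studio.name:"Walt Disney Pictures"')
query = (
    "query ContentAPI {\n"
    '  MovieCollection(query: "+title:Toy*", sortBy: "Movie.title") {\n'
    "    title\n"
    f"    studio(query: {studio_filter}) {{\n"
    "      name\n"
    "    }\n"
    "  }\n"
    "}"
)
print(query)
```

The printed text matches the hand-written movie query above, including the escaped quotes.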
Querying Tags and Category Fields
When working with content in dotCMS, fields like tags and categories provide a powerful way to filter and organize your content. Using GraphQL, you can construct queries to retrieve content that matches specific tags or categories and even combine conditions for greater flexibility.
Here’s an example GraphQL query using the ProductCollection root field to retrieve products that have the tag "surfing" and belong to the category "water":
{
  ProductCollection(
    query: "+contentType:Product +(conhost:48190c8c-42c4-46af-8d1a-0cd5db894797 conhost:SYSTEM_HOST) +Product.tags:\"surfing\" +languageId:1 +(categories:water) +deleted:false +working:true"
  ) {
    title
    tags
    categories {
      name
    }
  }
}
Let’s dig into an explanation of this query:
Content-Type:
+contentType:Product: Ensures the query retrieves only Product content.
Host Filtering:
+(conhost:48190c8c-42c4-46af-8d1a-0cd5db894797 conhost:SYSTEM_HOST): Filters content to include only items hosted on the specified site or the system host.
Tags:
+Product.tags:"surfing": Filters the results to include only products tagged with "surfing".
Note: The tags field requires the content type prefix (Product.tags).
Categories:
+(categories:water): Filters the results to include only products belonging to the "water" category.
Note: Unlike tags, the categories field does not require a content type prefix.
Additional Filters:
+languageId:1: Ensures the query retrieves content in the specified language.
+deleted:false: Excludes deleted content.
+working:true: Ensures the query retrieves content in a working state (drafts).
Alternatively, we can include +live:true to guarantee that we’re seeing published content.
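The clauses explained above can also be assembled in code, which keeps the spacing and quoting consistent. The following is a minimal sketch; product_query and its parameters are hypothetical names. Treating live as a replacement for working, rather than an addition to it, is one reading of the alternative mentioned above.

```python
def product_query(site_id, tag, category, language_id=1, live=False):
    """Assemble the Lucene string used in the ProductCollection example."""
    clauses = [
        "+contentType:Product",
        f"+(conhost:{site_id} conhost:SYSTEM_HOST)",
        f'+Product.tags:"{tag}"',          # tags need the content type prefix
        f"+languageId:{language_id}",
        f"+(categories:{category})",       # categories do not need a prefix
        "+deleted:false",
        "+live:true" if live else "+working:true",
    ]
    return " ".join(clauses)


print(product_query("48190c8c-42c4-46af-8d1a-0cd5db894797", "surfing", "water"))
```

With the arguments shown, the output is the Lucene string from the example query, before the quotes are escaped for GraphQL.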
Query Variant: Using OR for Multiple Tags
If you want to retrieve products with either the tag "surfing" or "fishing", you can modify the query as follows:
{
  ProductCollection(
    query: "+contentType:Product +(conhost:48190c8c-42c4-46af-8d1a-0cd5db894797 conhost:SYSTEM_HOST) +(Product.tags:\"surfing\" Product.tags:\"fishing\") +languageId:1 +(categories:water) +deleted:false +working:true +variant:default"
  ) {
    title
    tags
    categories {
      name
    }
  }
}
In this example, +(Product.tags:"surfing" Product.tags:"fishing") adds an OR condition for the tags field, matching content that has either "surfing" or "fishing" as a tag.
Assuming the site contains products in the "water" category that are tagged with "surfing" or "fishing", the query will return results like:
{
  "data": {
    "ProductCollection": [
      {
        "title": "Surfing Board",
        "tags": ["surfing", "water"],
        "categories": [
          {
            "name": "water"
          }
        ]
      },
      {
        "title": "Fishing Rod",
        "tags": ["fishing"],
        "categories": [
          {
            "name": "water"
          }
        ]
      }
    ]
  }
}
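The OR group from the variant query generalizes to any number of values: a required group (the leading +) that contains one unprefixed clause per value, so that any one of them may match. A small sketch, with any_of as a hypothetical helper name:

```python
def any_of(field, values):
    """Build a required group in which any one of the values may match."""
    return "+(" + " ".join(f'{field}:"{value}"' for value in values) + ")"


print(any_of("Product.tags", ["surfing", "fishing"]))
# +(Product.tags:"surfing" Product.tags:"fishing")
```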
When in doubt, you can always use the “Show Query” functionality available in the Content Search portlet of the dotCMS admin. This tool lets you construct your filters visually and generate the corresponding Lucene query, which you can then refine to suit your needs. It’s an invaluable resource for building complex queries, especially with tags, categories, or other indexed fields.
Working effectively with the dotCMS GraphQL API requires understanding how your content is indexed and queried. As we’ve seen throughout this article, crafting precise queries often involves using parameters like working:true, live:true, and deleted:false to ensure you’re retrieving only relevant, currently active content.
For more advanced use cases, we encourage you to explore the raw fields available in the dotCMS ElasticSearch index. These fields offer a reliable way to perform complex matches and allow for greater flexibility when querying indexed content. Raw fields, while powerful, require a deeper understanding of your content structures and how they are mapped in the index. You can find more details about their usage in the dotCMS documentation.
It’s also crucial to remember that the dotCMS GraphQL API depends entirely on the ElasticSearch index. This means the health and optimization of your index directly impact the performance and reliability of your queries. A well-maintained index ensures timely and accurate results.
Key Takeaways
Understand Your Index: Get familiar with how content structures are mapped to the ElasticSearch index to write more effective queries.
Use Raw Fields: Raw fields provide a safer and more accurate alternative for text matching.
Maintain the Index: Ensure your ElasticSearch index is optimized and healthy to maximize the performance of the GraphQL API.
Leverage Search Parameters: Use working:true, live:true, and deleted:false to filter active content and avoid retrieving outdated or irrelevant data.
By combining these best practices with the flexibility of GraphQL, you can unlock the full potential of dotCMS to deliver powerful, efficient, and dynamic content solutions.