Asset Storage documentation for the dotCMS Content Management System

File Assets in dotCMS are located in both the database and the filesystem. The metadata for an asset is stored in the database, while the hard assets (binary images, text, CSS, etc.) are located on the file system.

Asset Properties

The properties of each content item (e.g. the data in the fields of the item's content type) are stored in the dotCMS database. All content items are stored in a single table in the dotCMS database, with the columns of the table holding the values of different fields (properties) based on the Content Type of the item.

Hard Assets Location

Content Types which include file content (including both Pages and File Assets) also have file assets which are saved in the file system based on their inodes. The file assets are saved in the assets folder in a b-tree format: the first two characters of the asset's inode name two levels of subfolders, and the inode of the asset is used as the folder name beneath them. For example, a PDF with an inode of 71b8a1ca-37b6-4b6e-a43b-c7482f28db6c would be located in the following location from the root of the dotCMS distribution folder:

/dotserver/tomcat-X.x.xx/webapps/ROOT/assets/7/1/71b8a1ca-37b6-4b6e-a43b-c7482f28db6c.pdf
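The mapping from inode to folder can be sketched as a small shell function. This is only an illustration that assumes the layout seen in the example paths in this document (first character, second character, then the full inode). The function name asset_dir_for_inode is hypothetical and is not part of dotCMS, and the sketch prints only the folder prefix, not the file name.

```shell
# Hypothetical helper (not a dotCMS command): print the folder prefix for an
# asset inode, assuming the layout assets/<1st char>/<2nd char>/<inode>.
asset_dir_for_inode() {
    inode="$1"
    first=$(printf '%s' "$inode" | cut -c1)    # first character of the inode
    second=$(printf '%s' "$inode" | cut -c2)   # second character of the inode
    printf 'assets/%s/%s/%s\n' "$first" "$second" "$inode"
}

asset_dir_for_inode 71b8a1ca-37b6-4b6e-a43b-c7482f28db6c
# prints: assets/7/1/71b8a1ca-37b6-4b6e-a43b-c7482f28db6c
```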

HardLinks

In order to minimize storage requirements, dotCMS uses hardlinks when storing versions of the same asset. In essence, this means that uploaded files, images or videos in dotCMS are only stored once. If further edits are made to the metadata surrounding that asset, which create new versions of the content in dotCMS, the file is unmodified and is still stored only once across all of those versions. As a demonstration:

In our starter site, we have the file /images/404.jpg. You can see that this image has a version inode of 249eeb5c-7002-48e8-9ef3-ea6cd8ea9043. Looking at that stored image on dotCMS’s /assets filesystem, you can see it stored under the /assets directory with a size of 47K and a filesystem inode of 62669676.

$ ls -lih ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg
62669676 -rw-r--r--  5 will  staff    47K Jul 30 17:14 ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg

Now if I make edits in dotCMS to the 404 content (let’s say I want to change the title of the image or set show on menu=true), dotCMS will create new versions of the content, but under the covers the image file for each new version is stored as a hardlink to the original image. And this is where the magic happens: a hardlink is just another directory entry pointing to the same data, so it takes up almost no disk space. You can test this by editing the content a few times, then running find on the filesystem and printing the filesystem inode of each file.

$ find ./assets -name 404.jpg -exec ls -i {} \;
62669676 ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg
62669676 ./assets/4/a/4a352130-523d-44bc-934a-f63e7af4779a/fileAsset/404.jpg
62669676 ./assets/9/c/9c6b1880-c78e-42e4-94d9-725a50a99235/fileAsset/404.jpg
62669676 ./assets/3/0/305d7840-7b1d-45e3-8be1-e6bf8aeb697e/fileAsset/404.jpg

You can see they all have the same inode, 62669676, which means they are all just hardlinks to the same data on disk, which is stored only once. You can verify this by running du on all the 404.jpg files found:

$ du -shc \
> ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg \
> ./assets/4/a/4a352130-523d-44bc-934a-f63e7af4779a/fileAsset/404.jpg \
> ./assets/9/c/9c6b1880-c78e-42e4-94d9-725a50a99235/fileAsset/404.jpg \
> ./assets/3/0/305d7840-7b1d-45e3-8be1-e6bf8aeb697e/fileAsset/404.jpg
 48K    ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg
 48K    total

The original image was 47k. Storing 4 versions of the image using hardlinks takes up only 48k rather than the expected 188k (47k*4). Now if I edit the /images/404.jpg content again, and this time choose to upload a new image instead, things look different. Let’s say I replace the 404.jpg with another jpg that is 100k and save my content, creating a new version. If I run my find again, I get:

$ find ./assets -name 404.jpg -exec ls -i {} \;
62700210 ./assets/0/c/0cef7994-2bc4-4fdc-82f7-f74ac57270f9/fileAsset/404.jpg
62669676 ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg
62669676 ./assets/4/a/4a352130-523d-44bc-934a-f63e7af4779a/fileAsset/404.jpg
62669676 ./assets/3/0/305d7840-7b1d-45e3-8be1-e6bf8aeb697e/fileAsset/404.jpg
62669676 ./assets/9/c/9c6b1880-c78e-42e4-94d9-725a50a99235/fileAsset/404.jpg

And you can see that the inode list now has two unique inodes in it: 62700210 and 62669676. To check how much disk space is now being taken up by these 5 versions of content, we can run our du again, and it returns the space taken by these 5 files:

$ du -shc \
> ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg \
> ./assets/4/a/4a352130-523d-44bc-934a-f63e7af4779a/fileAsset/404.jpg \
> ./assets/9/c/9c6b1880-c78e-42e4-94d9-725a50a99235/fileAsset/404.jpg \
> ./assets/3/0/305d7840-7b1d-45e3-8be1-e6bf8aeb697e/fileAsset/404.jpg \
> ./assets/0/c/0cef7994-2bc4-4fdc-82f7-f74ac57270f9/fileAsset/404.jpg
100K    ./assets/0/c/0cef7994-2bc4-4fdc-82f7-f74ac57270f9/fileAsset/404.jpg
 48K    ./assets/2/4/249eeb5c-7002-48e8-9ef3-ea6cd8ea9043/fileAsset/404.jpg
148K    total
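The same experiment can be reproduced outside dotCMS in a scratch directory. The sketch below is only an illustration: the helper name count_unique_inodes and the v1/v2/v3 folder names are made up for this example, and the hardlink is created by hand with ln rather than by dotCMS. It counts how many distinct filesystem inodes sit behind all files of a given name.

```shell
# Hypothetical helper (not a dotCMS command): count the distinct filesystem
# inodes behind every regular file named "$2" under directory "$1".
count_unique_inodes() {
    find "$1" -name "$2" -type f -exec ls -i {} \; \
        | awk '{print $1}' | sort -u | wc -l | tr -d ' '
}

# Scratch demonstration; these paths are invented and are not dotCMS paths.
demo=$(mktemp -d)
mkdir -p "$demo/v1" "$demo/v2" "$demo/v3"
printf 'original image bytes' > "$demo/v1/404.jpg"
ln "$demo/v1/404.jpg" "$demo/v2/404.jpg"          # like a metadata edit: hardlink, same inode
printf 'replacement image' > "$demo/v3/404.jpg"   # like a new upload: new file, new inode

count_unique_inodes "$demo" 404.jpg   # prints 2 (three files, two inodes)
rm -rf "$demo"
```

Running the helper against a real assets folder with the name of an uploaded file should give the number of physically distinct copies, which is what du ends up counting.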

Assets Location

Default Assets Location

The default filesystem location for the binary asset files is the assets subdirectory within the dotCMS application folder (/dotserver/tomcat-X.x.xx/webapps/ROOT/assets).

Changing the Path to the Assets Folder

You may change the location where the hard assets are stored by changing the ASSET_REAL_PATH property in the dotmarketing-config.properties file:

ASSET_REAL_PATH=/var/data/dotcms/assets

Note: It is strongly recommended that all changes to the dotmarketing-config.properties file be made through a properties extension file.
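Because dotCMS stores asset versions as hardlinks, it may be worth checking a candidate directory before pointing ASSET_REAL_PATH at it. The following is a minimal sketch of such a check, not a dotCMS tool: the function name check_asset_dir and the probe file name are assumptions made for this example. It succeeds only if the directory exists, is writable, and allows a hardlink to be created inside it.

```shell
# Hypothetical pre-flight check (not part of dotCMS): succeed only if "$1"
# is a writable directory on a filesystem that supports hardlinks.
check_asset_dir() {
    dir="$1"
    [ -d "$dir" ] && [ -w "$dir" ] || return 1
    probe="$dir/.hardlink-probe.$$"            # temporary probe file name
    printf 'probe' > "$probe" || return 1
    if ln "$probe" "$probe.link" 2>/dev/null; then
        rm -f "$probe" "$probe.link"           # clean up both directory entries
        return 0
    fi
    rm -f "$probe"
    return 1
}

# Example usage with the path from the property above:
if check_asset_dir /var/data/dotcms/assets; then
    echo "directory looks usable"
else
    echo "directory is missing, not writable, or lacks hardlink support"
fi
```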