Privacy Best Practices

Last Updated: Feb 6, 2020
documentation for the dotCMS Content Management System

dotCMS is committed to respecting the privacy of our customers, complying with all applicable regulations and standards, and providing our customers the tools they need to provide the same privacy commitment to their customers and site visitors.

Since dotCMS is a customizable platform, each specific implementation of dotCMS will have different privacy implications, and each customer is fully responsible for complying with appropriate privacy regulations and standards. To assist customers, this document lists a number of privacy best practices to help you ensure that you respect the privacy of your site visitors and comply with all appropriate privacy regulations and standards. Because dotCMS can be customized so extensively, these recommendations cannot be comprehensive, and are provided solely as a guide to help you in your privacy efforts.

This document is not intended as legal or expert advice, or as hard or comprehensive guidelines. The only recommendations provided here are those that relate specifically to privacy implementations in dotCMS, with regard to dotCMS capabilities and features as they were reviewed for dotCMS internal use. They are shared in order to help you perform your own privacy review and implementation, and should only be used to help you identify areas which may have particular impact within your dotCMS implementation.

We recommend that you read this entire document before publishing your own sites with dotCMS, and we strongly recommend that you implement these best practices for any production server, especially any server which collects information from individuals that you do not have a pre-existing legal (e.g. contractual, employment, etc.) relationship with.

Procedural and Organizational Best Practices

Compliance with privacy regulations and standards is not just a technical activity; it requires full support from the organization, and may require review and/or modification of existing practices and procedures to succeed.

As with all other recommendations in this document, these recommendations are necessarily incomplete, as each organization and each dotCMS implementation are different, and you must perform your own review and take your own steps to ensure you respect your visitors' privacy and comply with all regulations and standards.

The following are some basic guidelines you may wish to review as a basis for your own initial privacy implementation and ongoing compliance efforts.

Privacy Impact Assessment

In order to assist your compliance efforts, it is often recommended that you perform a Privacy Impact Assessment (PIA) to clearly identify all data you collect and how it is used. The EU GDPR specifically requires that certain organizations perform a more rigorous version of a PIA called a Data Protection Impact Assessment (DPIA).

Regardless of what privacy regulations or standards you wish to adhere to, a thorough PIA often includes information on all of the following uses of data:

  • Collection: All data (including potentially de-anonymizing data) that is collected about users and visitors.
  • Processing: The ways all the collected data is processed.
  • Retention: What data is retained, where it's kept, and for how long.
  • Sharing: What data is shared, and with whom (including data you may store or upload to a cloud service).
  • Removal: How data can be removed (including data which has been shared).

The PIA is typically an internal document used to guide and document your compliance with various privacy regulations, and is often the best first step to take in managing your privacy practices and compliance.
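
If you wish to keep the PIA data inventory in a structured form, the five uses of data listed above can be recorded per data item. The following is a minimal sketch only; the class name, field names, and sample values are hypothetical and are not part of dotCMS.

```python
from dataclasses import dataclass, fields


@dataclass
class DataInventoryEntry:
    """One row of a hypothetical PIA data inventory."""
    name: str         # the data item, e.g. "email address"
    collection: str   # where and how it is collected
    processing: str   # how it is processed
    retention: str    # where it is kept, and for how long
    sharing: str      # who it is shared with ("none" if not shared)
    removal: str      # how it can be removed


def incomplete_fields(entry: DataInventoryEntry) -> list[str]:
    """Return the names of the fields that are still blank for this entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Hypothetical example entry with two undocumented areas.
newsletter_email = DataInventoryEntry(
    name="email address",
    collection="newsletter sign-up form",
    processing="used to send the monthly newsletter",
    retention="",
    sharing="email delivery provider",
    removal="",
)
print(incomplete_fields(newsletter_email))  # ['retention', 'removal']
```

A check like this may help you spot data items whose retention or removal handling has not yet been documented.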

Data Minimization

The EU GDPR explicitly advocates the concept of data minimization, to ensure that no personal data is collected which is not specifically required by the organizational or business purpose for which it is collected. Therefore, it is strongly recommended that, once you have identified how you use personal data, you review all personal data collected and determine if it is necessary to collect and store that data. If any personal data you collect is not strictly required for the intended purpose, consider stopping the collection of that data and removing all previously collected data of that type.

Privacy Policy and Transparency

It is recommended that you update your Privacy Policy to comply fully with the GDPR and other privacy regulations, and ensure both that the Privacy Policy is easily accessible from all locations on your web properties and applications, and that data handling practices are clearly displayed when data is collected. Specifically:

  • Ensure that your Privacy Policy describes all aspects of your data use (e.g. Collection, Processing, Retention, Removal policies/procedures, and Sharing).
  • Include a link to your Privacy Policy from all Pages on your site (such as from a common footer).
  • Include a link to your Privacy Policy on or near any form which collects user data.
  • At the point where any user data is collected (e.g. via a form or user registration page), explicitly state all data that will be collected and how it will be processed, stored, and shared.

Server Location

The GDPR applies additional scrutiny to personal data which is collected in the EU and stored or processed outside the EU. To minimize the risk and scrutiny of any personal data you may collect, if you collect any personal (or potentially de-anonymizing) data from EU persons, you may wish to review your hosting, data storage, and data processing processes (and third-party vendors) to ensure they all reside within the EU. For example, if you have an AWS private cloud, and collect personal data within the EU, consider selecting an AWS region located within the EU.

Data Retrieval and Removal Requests

The GDPR and other privacy regulations require both that users may request all information you have collected from them (personal or otherwise), and that users may request that you remove all data collected on them at any time.

To ensure that you can comply promptly with these requests, you may wish to consider implementing Content Types, Workflow steps and/or Workflow actions to accommodate personal data retrieval requests and removal requests. Alternately, you may wish to implement automated processes to receive and process these requests.

  • For example, you may wish to create a separate Data Request Content Type which both assists in retrieving and supplying all appropriate visitor data, and which documents that you have received and responded to that request.
    • The same or a similar Content Type could be used to track and manage data removal requests (including custom fields or back-end Pages that perform appropriate searches and/or removal tasks).
  • Alternately, you may wish to create a form where users can request to receive or remove any information you have collected on them, and handle these requests automatically.

Special Note

After processing a removal request, we recommend that you flush the dotCMS cache to ensure all personal data is removed from all memory, disk, and network caches. Depending on how these requests are handled (e.g. manual or automated processing) and the volume of requests you receive, you may wish to either manually flush the cache after a removal request, or perform periodic cache flushes to ensure no personal data remains in caches for a significant length of time after a removal request.


Technical Best Practices

Data Security

Properly handling private data by necessity requires that the data is collected and retained securely. It is strongly recommended that, as part of your privacy compliance efforts, you read and implement appropriate security practices to ensure that any private data you handle is protected from unauthorized access (from both authenticated and unauthenticated users).

In addition, it is recommended that at any point where you collect, process, or share private data, you implement appropriate security measures to ensure that the data is not exposed to potential interception, injection, or other unauthorized access. For example, it is strongly recommended that all forms which collect user data be accessible only via HTTPS, and that any data shared with any other systems be sent via highly secure channels.

  • Implement consent cookie technology on your web properties.
  • Integrate with an Identity Provider (IdP) for 3rd party authentication and user account storage and maintenance.

Potentially De-Anonymizing Data

Geolocation

dotCMS supports Geolocation features that enable you to tag content to a specific location (latitude and longitude), and to perform geolocated content searches based on a selected location. In addition, since dotCMS 4.0.0, dotCMS has supported a feature that enables you to automatically geolocate site visitors based on the visitors' IP address using the MaxMind GeoIP2 service.

None of these geolocation features are enabled by default; however the dotCMS starter site, which is included with the dotCMS distribution, contains both geolocated sample content (including Content Types that support geolocated content) and automatic geolocation for site visitors. If you built your site based on the dotCMS starter site Themes and Templates using dotCMS 4.0.0 or later, you may have automatic geolocation enabled in your site, even if you do not explicitly use it or collect the data.

The automatic geolocation provided in the dotCMS starter site is based on a free geolocation service which is imprecise, so it can not be used to uniquely locate or identify users based on user location. Customers may upgrade to the paid version of the same service, and if you have upgraded to a professional version of the service, or if you have implemented another automatic geolocation service, the geolocation information you receive may present some privacy implications.

  • If you do not use automatic visitor geolocation, it is recommended that you disable automatic geolocation, or verify that your site does not submit any visitor data to a geolocation service.
    • In the dotCMS starter site, automatic geolocation code is contained in the file /application/vtl/pages/user-tab/user-info-tab.vtl.
    • You may also search for files which contain automatic geolocation code by entering the query +catchall:*geoplugin* in the Query Tool on the dotCMS back-end.
  • If you use geolocation services for your content, it is recommended that you review how this data is handled and include this data in your privacy assessment.

Historical and Logging Data

During the normal course of operation, dotCMS records and stores historical information about content access and use. This is a necessary function of the system, both to maintain high levels of security and to enable customers to comply with other regulations and regulatory frameworks. Historical data collected and maintained by dotCMS includes access logs, content history, user action records, and several log files which record server and system operational tasks and events.

All of the historical information collected by default is necessary for proper operation, protection, and maintenance of the system, and as such can not (and should not) be disabled. Since you can not eliminate this information, it is important that you ensure that all of this information is properly secured, both within the dotCMS system and on your dotCMS server.

  • Review your log files and dotCMS system content to determine if any personal or potentially de-anonymizing information is included which is not necessary.
  • Consider whether tracking information is needed, and disable it if it is not necessary.
  • Ensure that all system information (including content and log files) that contains personal or potentially de-anonymizing data is properly secured from unauthorized access.
    • This includes data within the dotCMS system, on the dotCMS server, and at any point where data may be transferred (including transferring log information or data to an external service).
  • If your log files contain any potentially de-anonymizing data (such as IP addresses), obfuscate/pseudonymize this information in your logs, or aggregate your logging information in a SIEM solution that provides GDPR support and tooling.
    • Most systems that collect and store logs (such as Splunk and Kibana) provide tools to ensure the GDPR compliance of the aggregate data stored in them.
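
As one possible way to obfuscate IP addresses before logs are stored or forwarded, each address can be replaced with a keyed hash. The sketch below is illustrative only: the function name, the key, and the sample log line are assumptions, it handles IPv4 addresses only, and it is not a dotCMS feature.

```python
import hashlib
import hmac
import re

# Matches dotted-quad IPv4 addresses; IPv6 is out of scope for this sketch.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def pseudonymize_ips(line: str, secret_key: bytes) -> str:
    """Replace each IPv4 address in a log line with a keyed, truncated hash."""
    def replace(match: re.Match) -> str:
        digest = hmac.new(secret_key, match.group(0).encode("ascii"), hashlib.sha256)
        return "ip-" + digest.hexdigest()[:16]

    return IPV4_PATTERN.sub(replace, line)


# Hypothetical key; in practice it would be stored separately from the logs.
key = b"example-key-stored-outside-the-logs"
original = '203.0.113.7 - - [06/Feb/2020:10:15:00 +0000] "GET /index HTTP/1.1" 200'
masked = pseudonymize_ips(original, key)
print(masked)
```

Because the same address always maps to the same token under a given key, the result is pseudonymized rather than anonymous: anyone holding the key can test whether a known address appears in the logs, so the key should be protected accordingly.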

Tracking and Profiling Data

In addition to personal data which may uniquely identify a user, many privacy regulations also require that you identify and properly handle any data which may be used to track user behavior or profile users. It is recommended that you review the following areas of your implementation to ensure you respect visitor privacy and comply with appropriate regulations and standards.

Cookies

Since cookies can contain personal data and may be used to track user behavior, review your use of cookies within your site to ensure that you respect your visitors' privacy and comply with all applicable standards and regulations. Specifically:

  • Require Opt-in for All Personal Data Stored in Cookies
    • Do not use cookies that store any personal information unless your users specifically opt-in to allow this.
    • Explicit opt-in (rather than opt-out) is a requirement of some privacy regulations, including the EU GDPR.
  • Allow Users to Opt-Out of All Other Cookie Usage
    • For any cookies that do not store personal information, allow users to opt-out of their usage.
    • One of the simplest ways to do this is to install a tool that implements consent cookie technology, to enable site visitors to opt out of the use of all cookies (including the default dotCMS cookies).
    • For example, https://cookieconsent.insites.com provides a tool that you can integrate into your dotCMS site to allow site users to prevent all cookie use by your site.
  • Record Visitor Cookie Consent
    • If you use consent cookie technology, consider including the value of the visitor's consent cookie in your application server access logs to ensure that visitor consent has been obtained.
  • Do Not Require Cookie Use for Any Services Unless Strictly Necessary
    • Do not provide any services on your site that will fail to work if cookies are not enabled, unless use of cookies is actually required for the service to work.
    • The EU GDPR indicates that visitors should not be penalized (such as by preventing their use of your public services) for not providing personal information, unless the information is necessary for the service to work.

Eliminating Cookies Completely

Some cookies are required for dotCMS to render pages dynamically; therefore when you use dotCMS to dynamically deliver pages, you will not be able to completely prevent the use of cookies. Note that if you use a third-party tool to prevent dotCMS from using cookies completely, pages may not render correctly in some cases.

However you can completely eliminate cookies in two ways:

  • Deliver your content using Static Publishing rather than dynamic rendering.
  • Build your own client to store and retrieve dotCMS content via the REST API.
    • Note that when using your own client application, you can still take advantage of dotCMS dynamic rendering (using Containers, Templates, Velocity Code, etc.) by using the Page Layout API to render the HTML your client application will deliver to the end-user.

Clickstream Tracking

dotCMS Clickstream tracking is used to collect aggregate analytics on Page and site visits and engagement, as displayed in the Site Dashboard. Clickstream data is only processed and displayed in aggregate, and therefore does not uniquely identify or track the browsing habits or behaviors of any individual users; however, in order to identify unique users, Clickstream tracking does store visitor IP addresses for a short period of time (2 days), and records the Pages that each individual IP address accesses on your site.

Since Clickstream tracking does not provide a way to identify individual users or to profile individual user behavior in any identifiable way, it does not in general pose privacy concerns. However if you do not use the Clickstream features, it is recommended that you disable Clickstream tracking to eliminate the need to identify or address this feature in your privacy review.

Personalization

By default, dotCMS personalization features perform a very limited form of tracking which allows you to deliver content which is personalized for specific users of your site. There are a number of limits to this feature that strictly limit its privacy impact, including that the information is not persisted beyond the user session, the information is tied to the user session via a non-identifying UUID (and therefore cannot be used to profile or de-anonymize any individual), and the information is only available when accessed directly via code or when using the $dotcontent.pullPersonalized() Velocity viewtool method.

However if you are not using dotCMS personalization features (specifically personalized content pulls), you may wish to disable tag accrual in dotCMS. Note:

  • This is not strictly necessary, as the tags accrued via personalization features:
    1. Are implemented in a way consistent with pseudonymisation (not being tied to any identifiable user), and
    2. Are not persisted longer than the user session, so present little to no risk of any privacy violation.
  • However if you are not using the personalized content pull feature, you do not need the tag accrual feature, and disabling this feature will eliminate the need to identify and account for this minimal tracking.

Login Information

dotCMS supports both back-end logins (typically for members of your organization and possibly partners) and front-end logins (for various types of site visitors, including potentially anonymous users accessing your site using a pseudonym). It is often necessary to collect personally identifying information for users who you wish to give special access to your site, content, or services; however you should review all information required for your site registrations or logins and ensure that it is actually necessary (see Data Minimization, above) and that it is properly protected and managed in compliance with all appropriate regulations and standards.

Note:

  • All User Account information in dotCMS is subject to full security controls and Role permissions.
  • User Account information may only be accessed by users with both appropriate permissions and an additional separate authorization to view User information.
  • All dotCMS User Accounts are assigned a random User ID which is used to identify the user, and it is this ID which is accessible even from within code in your dotCMS system.
    • The User ID provides a level of pseudonymization (see Pseudonymization, below), since even if a security breach were to expose a User ID to an unauthorized party, no personal data for that user would be accessible (or processed in any way) unless the code in your specific dotCMS implementation explicitly accesses that data.

Third-Party Authentication

dotCMS provides the ability for you to integrate with a number of 3rd party authentication systems such as OAuth (social media accounts), SAML, and custom authenticators. To integrate with an external authentication system, dotCMS must inherently send potentially private data (such as an email address or user name) to the external system, and must receive potentially private data (such as an account name or other identifying information) from the external system.

When dotCMS receives data from an external authentication system, dotCMS may store the information returned by the external system in a dotCMS User Account. This means that, even when integrating with an external system, your dotCMS system may contain personal data about your authenticated users that was received from the external system.

This information sharing (both data sent and received) is necessary for security and authorization, just as when you are using native dotCMS User Accounts. However since it does involve sharing of potentially private data, it is recommended that if you integrate with an external authentication system, you review the authentication to identify what personal information is entered by users when they authenticate (such as a user name or email on the remote system) and what data is retrieved and stored by dotCMS when the user is authenticated.

User Uploaded Images and Files

If you allow users to upload files or images (other than files or images from a controlled source, such as stock images), there is a potential that uploaded files and images can include metadata which contains personal data or potentially de-anonymizing data.

Image files frequently contain metadata with personal or personally identifying information (such as precise geolocation information); if you do not collect any personal data on your site, metadata in uploaded images may be the area that presents the greatest privacy implications for your site.

How dotCMS Handles Metadata

By default, dotCMS:

  • Extracts all metadata from uploaded files and images uploaded as a file Content Type (such as File Asset).
    • The metadata extracted from existing file Content Types may be viewed from the Metadata tab when viewing the file in the content editing screen.
  • Makes all extracted metadata available for searching within dotCMS.
  • Displays all images unaltered (which means that all metadata in the image may be accessed by anyone who can view and download the image).

Limiting or Preventing File and Image Metadata

Therefore it is recommended that you address the potential for personal or de-anonymizing data to be contained in uploaded file metadata using one or more of the following steps:

  • Identify if any users (especially unauthenticated users or users who are allowed to be authenticated without providing personal data) are able to upload files or images from an uncontrolled source.
  • Identify how files and images are provided for display or download.
  • If necessary, implement steps to limit or eliminate handling of file and image metadata.
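
When reviewing how files and images are provided, it may help to check whether a given JPEG actually carries Exif metadata (the segment that typically holds camera and geolocation tags). The following is a simplified sketch using only the standard library; the function name and the sample byte strings are hypothetical, and it does not handle every JPEG variant or other metadata formats such as XMP.

```python
import struct


def jpeg_has_exif(data: bytes) -> bool:
    """Return True if the JPEG byte string contains an Exif (APP1) segment."""
    if data[:2] != b"\xff\xd8":           # every JPEG starts with the SOI marker
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:             # not positioned on a marker: give up
            return False
        marker = data[pos + 1]
        if marker == 0xDA:                # SOS: compressed image data follows
            return False
        # The 2-byte length counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False


# Tiny hand-built byte strings (not viewable images) to exercise the parser.
exif_payload = b"Exif\x00\x00" + b"placeholder"
app1 = b"\xff\xe1" + struct.pack(">H", len(exif_payload) + 2) + exif_payload
with_exif = b"\xff\xd8" + app1 + b"\xff\xda"
without_exif = b"\xff\xd8\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00" + b"\xff\xda"

print(jpeg_has_exif(with_exif), jpeg_has_exif(without_exif))  # True False
```

A check of this kind only reports that Exif data is present; it does not remove it.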

The following are several methods which can be used to limit handling of metadata:

Consider changing from uploading files and images as File Asset content to embedding files and images in content via Binary Content Type fields.

  • Metadata is not extracted or indexed from images and files saved as Binary fields.
  • However metadata is still included in the uploaded file or image and may still be accessible if the file or image is directly viewed or downloaded.
    • Therefore, even if you use this method, you may still want to implement one of the other methods below as well.

Consider restricting file and/or image uploads from content creators.

Here are some possible methods to limit the potential for uploads to contain personal data:

  • Pre-upload all files and images that your image creators may use in content (for example, uploading stock photos from a pre-selected source), or
  • Restrict access to the File Asset Content Type (and other file Content Types) to only users with an understanding of the potential privacy issues and knowledge of how to inspect and modify uploaded images for personal data.

Consider using image filters to display all images which are uploaded by content contributors.

When an image is uploaded to dotCMS and then displayed without any filters, the original image is displayed, including all metadata tags. However when image filters are used, all original metadata tags on the image files are removed, and only the metadata tags added by dotCMS (related only to image characteristics such as resolution and file size) are included in the resulting image.

Note that you can apply a filter to convert an image file to the same format. For example, you can use /filter/Png for a PNG file or /filter/Jpeg for a JPG file, to strip the original metadata without transforming or changing the format of the image.

You can ensure image filters are used in two ways:

  1. Recommended: Edit all uploaded image files using the dotCMS image editor and apply at least one filter to each image.
    • Note that if you check the “Compress” checkbox in the image editor, the “/Jpeg” filter will be applied, without changing the appearance of the image.
  2. Alternate: When referencing images by URL (such as in HTML <img> tags), append image filter URL parameters to all image URLs.
    • For image links manually inserted into content (such as in Textarea fields which contain HTML code), this requires manual editing of each link.
    • However if you are displaying images via Widget code, or by placing content inside a Container on a page, a single change to the appropriate code (in the Widget or Container) can apply an appropriate filter for all images.
    • Note that this method will not work well for images which are uploaded into content using the WYSIWYG editor. If your content contains WYSIWYG fields you may wish to consider either:
      • Replacing WYSIWYG fields with Textarea fields, and performing all text formatting in appropriate Widgets or Containers, or
      • Modifying the TinyMCE configuration to disable insertion of files and images in the WYSIWYG field.
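
The URL-based approach (option 2 above) can be sketched as a small helper that picks the same-format filter for a file and appends it to the image URL. This is an illustration only: the function name is hypothetical, the base URL in the usage example is a placeholder rather than a real dotCMS path, and the exact URL form for image filters should be checked against the documentation for your dotCMS version.

```python
from pathlib import PurePosixPath

# Filter names as given in the text above; other formats are left unchanged here.
SAME_FORMAT_FILTERS = {".png": "Png", ".jpg": "Jpeg", ".jpeg": "Jpeg"}


def with_same_format_filter(image_url: str, file_name: str) -> str:
    """Append a same-format filter segment to an image URL, if one is known."""
    suffix = PurePosixPath(file_name).suffix.lower()
    filter_name = SAME_FORMAT_FILTERS.get(suffix)
    if filter_name is None:
        return image_url  # unknown format: return the URL untouched
    return image_url.rstrip("/") + "/filter/" + filter_name


# "/placeholder/image/url" stands in for whatever image URL your code builds.
print(with_same_format_filter("/placeholder/image/url", "photo.JPG"))
# /placeholder/image/url/filter/Jpeg
```

In dotCMS itself, the equivalent logic would more likely live in the Velocity code of the Widget or Container that builds the image URL.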

Custom Code

As an extensible framework, dotCMS gives you the ability to extend the capabilities of your site using many different types of custom code, including Velocity code, Javascript code, Java code (via plugins), and more. dotCMS has performed an in-depth analysis of all code within dotCMS to ensure that it meets privacy requirements and regulations, but dotCMS clearly cannot analyze any code you write or integrate with your dotCMS system. Therefore we strongly recommend that you review all of your own code to ensure that you understand how it may impact privacy, including:

  • Collection: Any personal (including potentially de-anonymizing) data collected.
    • We recommend that you pay particular attention to any code which uses visitor information (such as an IP address) to retrieve data from an external service or database (a geolocation service, for example).
  • Processing: How any personal data is processed (even if it was collected from another source, such as being imported or collected from a regular dotCMS form or data input).
    • For example, if you collect a visitor's IP address, and use a service to determine geolocation information on that visitor, this is a form of processing.
  • Retention: What personal data is retained, where it's retained, and for how long.
    • Note that if you collect any personal data in your code and store it in a normal dotCMS Content Type, then it will be retained on the dotCMS server for an indefinite period of time.
  • Sharing: What personal data is shared with any other parties, and who it is shared with.
    • This includes, for example, a cloud service provider or cloud hosting service.
  • Removal: How any personal data collected can be removed.
    • Note that if you collect any personal data in your code and store it in a normal dotCMS Content Type, then you may remove it using normal dotCMS content removal processes.
    • However if you share any data, then additional processes or integrations may be needed to ensure any removal requests are shared and respected throughout the data chain.

There are numerous ways in which you may implement custom code in dotCMS. It is recommended that you review these areas to identify and review any custom code in your specific implementation:

  • Velocity code (in files, Widgets, and textarea fields in content)
  • Javascript code (in files, Widgets, and textarea fields in content)
  • Java code (only if you have added plugins to your dotCMS implementation)
  • Custom fields in Content Types
  • Custom Workflow Actions

Pseudonymization

The GDPR introduces a new concept in data protection law: “pseudonymization”. Pseudonymized data is data that is neither completely anonymous nor directly identifying. Pseudonymization uses random identifiers or one-way hashes to separate collected data from a visitor's personal information, thus making it impossible to connect an individual person with a data event (such as the viewing of a web page) without additional information that is held separately.

Pseudonymization significantly reduces the risks associated with processing personal data, while also maintaining the data's utility. For this reason, the GDPR creates incentives for controllers to pseudonymize the data that they collect. Although pseudonymous data is not exempt from the Regulation altogether, the GDPR relaxes several requirements on controllers that use the technique.

The long term data dotCMS stores about a visitor does not contain any information that can re-identify the visitor without a unique random UUID only known by the visitor and stored at the visitor's client (typically the browser of the visitor). dotCMS does not connect the UUID of any visitor to any personal information about the user. In addition, the only tracking and profiling information dotCMS collects (via click tracking and tag accrual for personalization) is associated with the non-personally-identifiable UUID. Therefore, by default, dotCMS provides pseudonymization of all data it collects.

However, in your specific implementation it is possible to associate personal information entered by a visitor with the visitor's UUID, and if this is done, then it becomes possible to directly associate the personal information with both any tracking and profiling information collected by dotCMS and any additional data collected by your specific implementation. In this case, it is recommended that you implement pseudonymization to explicitly separate personal data from all other data your implementation collects. Specifically:

  • If you store any personal data for your site users and visitors, consider implementing pseudonymisation to separate the personal data from all identifying information.
    • For example, you may wish to keep personal or potentially de-anonymizing data in separate, unrelated Content Types with randomized keys which are associated in a third content type.
  • Implement JWT for all back-end accesses, including access with the REST API.
    • JWT uses randomized IDs to authorize all accesses, providing a level of pseudonymization for requests.
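
The idea of keeping personal data and other data under separate randomized keys, associated only through a third record, can be sketched as follows. This is a conceptual illustration in plain Python, not dotCMS code: the class and method names are hypothetical, and in dotCMS each dictionary would correspond to a separate Content Type.

```python
import uuid


class PseudonymizedStore:
    """Keeps personal data and activity data apart, joined only by a link table."""

    def __init__(self) -> None:
        self.personal = {}  # personal_key -> personal data
        self.activity = {}  # activity_key -> list of recorded events
        self.links = {}     # personal_key -> activity_key (to be held separately)

    def register(self, personal_data: dict) -> str:
        """Store personal data under a random key and create an unrelated activity key."""
        personal_key = str(uuid.uuid4())
        activity_key = str(uuid.uuid4())
        self.personal[personal_key] = dict(personal_data)
        self.activity[activity_key] = []
        self.links[personal_key] = activity_key
        return personal_key

    def record_event(self, personal_key: str, event: str) -> None:
        """Record an event under the activity key only, never next to personal data."""
        self.activity[self.links[personal_key]].append(event)

    def forget(self, personal_key: str) -> None:
        """Handle a removal request: delete personal data, link, and activity."""
        activity_key = self.links.pop(personal_key)
        self.personal.pop(personal_key)
        self.activity.pop(activity_key)


store = PseudonymizedStore()
visitor = store.register({"email": "visitor@example.com"})
store.record_event(visitor, "viewed /pricing")
store.forget(visitor)
```

Without access to the link table, the activity records cannot be tied back to a person, which is the property pseudonymization is meant to provide.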

Handling Existing Data

Some privacy regulations (including the EU GDPR) do not distinguish between data collected before the regulation start date and data collected after the regulation is in effect; in other words, there is no grandfathering of pre-existing data. This means that, in addition to reviewing your site and ensuring that you properly handle any new personal data collected, you must also ensure that any previously collected personal data complies with the appropriate privacy regulations and standards.

It is recommended that, during your privacy assessment, you review the existing data on your dotCMS properties, and make decisions about the disposition of that data. This includes, for example, whether the data should be retained or removed, and whether it is necessary to inform users that you possess the data (if you have not received a prior authorization from the user to collect and retain it).

Third Party Integrations

If you pass any data from dotCMS to any third party services, libraries, or tools, you should identify what information is being passed and what the third party resource does with that information. If any information is shared with a third party service or site (such as by passing user or website information to an analytics service), you should identify what information is shared, consider whether or not all of that information needs to be shared, and consider any options the third party provider may offer to reduce or eliminate potential privacy impacts from sharing the data.

For example, Google Analytics is used by many dotCMS customers to perform website analytics. In order to provide analytics, information about site visitors is passed to Google Analytics, including the IP addresses of visitors, which are considered personally identifying information. However, Google Analytics provides an anonymizeIP option that instructs Google Analytics to store IP addresses without retaining the last octet of the address, ensuring that IP addresses cannot be used to uniquely identify visitors; if you use Google Analytics, you may wish to consider enabling this flag to reduce the potential privacy impacts of sharing visitor IP addresses with a third party vendor (in this case Google).
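
To make the effect of dropping the last octet concrete, the sketch below zeroes the final octet of an IPv4 address. It only illustrates the general technique described above; it is not Google's implementation, and the function name is hypothetical.

```python
import ipaddress


def drop_last_octet(ip: str) -> str:
    """Zero the final octet of an IPv4 address (raises ValueError for other input)."""
    # strict=False lets a host address be given; the /24 mask clears the last octet.
    network = ipaddress.IPv4Network(f"{ip}/24", strict=False)
    return str(network.network_address)


print(drop_last_octet("203.0.113.77"))  # 203.0.113.0
```

After truncation, up to 256 addresses share the same stored value, so a single stored address no longer points to one specific visitor.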

Privacy Contact Information

dotCMS can not offer consulting on privacy issues related to your specific dotCMS implementation, but we are happy to answer your privacy-related technical questions about specific dotCMS Enterprise and dotCMS Cloud features, and any inquiries about dotCMS corporate compliance with various privacy regulations.

  • For Technical Assistance:
    • If you are a dotCMS Enterprise or dotCMS Cloud customer, please contact dotCMS support to request technical assistance with dotCMS products and features.
  • For Compliance Information:
    • For questions about dotCMS corporate compliance, please contact the dotCMS Data Privacy Officer at privacy@dotcms.com.
