3rd Party Data - Strategies

Integrating dotCMS with data from a 3rd party system is a common use case. Generally, there are four strategies for integrating external data, each with benefits and drawbacks. They are outlined below:

Remote Calls, Web Services

If the 3rd party system is real-time and highly available, we recommend writing one or more custom viewtools that read data from the external system (via web services, for example) for use on pages and content in dotCMS. This lets editors build and manage content in dotCMS around external data, with the two joined by a shared key. eCommerce sites are a typical example: the "hard" eCommerce data, such as SKU, pricing and shipping, comes from an ERP, while the page and product information, images and reviews come from dotCMS. URLs can contain the shared key, e.g. http://shop.dotcms.com/shop/model/TC-P65ST60. In this case, the shared key is read on the dotCMS page via Velocity and then used by the custom viewtool both to call a web service that pulls the external data and to query and show the associated dotCMS content.
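
The shared-key flow can be sketched in a language-neutral way. The Python below only illustrates the logic and is not dotCMS viewtool code (a real viewtool would be written in Java). The names `extract_shared_key` and `build_product_view`, the two lookup callables, and the stub records are all hypothetical; the URL layout follows the example above.

```python
from urllib.parse import urlparse


def extract_shared_key(url: str) -> str:
    """Return the last non-empty path segment, e.g. the model number."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no shared key in URL: {url!r}")
    return segments[-1]


def build_product_view(url, fetch_erp_record, find_cms_content):
    """Merge 'hard' ERP data with CMS-managed content under one key.

    fetch_erp_record and find_cms_content are hypothetical callables that
    stand in for the web service call and the dotCMS content query.
    """
    key = extract_shared_key(url)
    return {
        "key": key,
        "erp": fetch_erp_record(key),      # sku, pricing, shipping
        "content": find_cms_content(key),  # page copy, images, reviews
    }


# Stubbed lookups (made-up values) so the sketch runs without any
# external system.
erp = {"TC-P65ST60": {"sku": "TC-P65ST60", "price": "100.00"}}
cms = {"TC-P65ST60": {"title": "Example product page", "reviews": 3}}

view = build_product_view(
    "http://shop.dotcms.com/shop/model/TC-P65ST60", erp.get, cms.get
)
print(view["key"], view["erp"]["price"], view["content"]["title"])
```

Passing the lookups in as callables keeps the key handling separate from however the external system is actually reached.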

The beauty of using a custom viewtool is that the viewtool and 3rd party data can even be used in custom fields in dotCMS to provide pick lists or custom controls for back end users on the content editing screen to ease the management and "weaving" of the content.

Batch Job (Data Only)

If the data does not need to be real-time and is “read only” in the CMS, we suggest a nightly batch job that brings the data into a database accessible to dotCMS. You can then use the SQL Macro, or write a custom viewtool that reads the data via JDBC, to display it on your site. Again, because the data is accessible in dotCMS and Velocity, it can be used in custom fields when a content item is being edited. This is the way to go if your data does not need to be real-time and does not need "management", by which we mean searching, editing, versioning, etc. An example is a corporate website that imports employment opportunity data from an external HR system into a second database (on the same DB server as dotCMS), which dotCMS then reads for use on the corporate website. The data in the database could be updated nightly (via an external job) from a CSV file that is sent to the servers, though any method of pushing and syncing the database would be fine.
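
A minimal sketch of such a nightly load follows, under several assumptions: the CSV column names (`job_id`, `title`, `location`) and the `jobs` table are invented for illustration, and an in-memory SQLite database stands in for whatever database dotCMS can actually reach. The swap happens inside a single transaction so that a failed load rolls back and leaves the previous data in place.

```python
import csv
import io
import sqlite3


def load_jobs(conn: sqlite3.Connection, csv_file) -> int:
    """Replace the contents of the jobs table with the rows in csv_file."""
    rows = [(r["job_id"], r["title"], r["location"])
            for r in csv.DictReader(csv_file)]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs "
        "(job_id TEXT PRIMARY KEY, title TEXT, location TEXT)"
    )
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("DELETE FROM jobs")
        conn.executemany("INSERT INTO jobs VALUES (?, ?, ?)", rows)
    return len(rows)


# Stand-in for the CSV file delivered by the HR system (made-up rows).
export = io.StringIO(
    "job_id,title,location\n"
    "J-1,Editor,Remote\n"
    "J-2,Developer,Miami\n"
)
conn = sqlite3.connect(":memory:")
loaded = load_jobs(conn, export)
titles = [t for (t,) in conn.execute("SELECT title FROM jobs ORDER BY job_id")]
print(loaded, titles)
```

A full replace is the simplest option for read-only data; an incremental sync would only be worth the extra complexity for very large exports.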

Batch Job (Import Content)

If your data needs to be manageable (versioned, editable, full-text searchable) once it is in dotCMS, then you might write an importer script (you could start with, and heavily customize, the content importer plugin). The script could retrieve data modifications and send them to dotCMS via web services, CMIS or even as a CSV file. As an example, a corporate website could import employee profiles for the entire company from an ERP. Once imported, the employees are fully searchable, and profile changes, images and head shots can be managed in the CMS itself.
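
The "retrieve data modifications" step can be sketched as a comparison between the previous export and the current one. This is an illustration only: `diff_records`, the employee ids and the field names are hypothetical, and the delivery of the changes to dotCMS (web services, CMIS or CSV) is deliberately left out.

```python
def diff_records(previous: dict, current: dict) -> dict:
    """Compare two exports keyed by a stable id and classify the changes."""
    return {
        "created": {k: current[k] for k in current.keys() - previous.keys()},
        "updated": {k: current[k]
                    for k in current.keys() & previous.keys()
                    if current[k] != previous[k]},
        "deleted": sorted(previous.keys() - current.keys()),
    }


# Made-up snapshots of an employee export, keyed by employee id.
last_run = {
    "E1": {"name": "Ana", "dept": "Sales"},
    "E2": {"name": "Ben", "dept": "IT"},
}
this_run = {
    "E1": {"name": "Ana", "dept": "Marketing"},  # changed department
    "E3": {"name": "Cy", "dept": "IT"},          # new hire
}

changes = diff_records(last_run, this_run)
print(changes)
```

Sending only the created, updated and deleted records keeps each run small and avoids creating a new version of content that has not changed.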

In all three of the above scenarios, the data will appear seamlessly on your websites, both to your site search indexes and to web crawlers: the remote data will just be part of the page. You can also use dotCMS’ Velocity tooling, including the built-in page and/or block caches, to make these 3rd party calls more performant.
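
To illustrate why caching helps with remote calls, here is a small time-based cache that also falls back to a stale value when the remote system fails. It is a generic sketch and not how dotCMS page or block caches are implemented or configured; `CachedLookup` and `slow_fetch` are hypothetical names.

```python
import time


class CachedLookup:
    """Wrap a slow or flaky lookup with a time-based cache."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # fresh hit: no remote call
        try:
            value = self._fetch(key)
        except Exception:
            if entry is not None:
                return entry[1]  # remote failed: serve the stale copy
            raise                # nothing cached yet: surface the error
        self._entries[key] = (now, value)
        return value


calls = []


def slow_fetch(key):
    """Stand-in for a 3rd party call; records each invocation."""
    calls.append(key)
    return f"data for {key}"


lookup = CachedLookup(slow_fetch, ttl_seconds=60)
lookup.get("TC-P65ST60")
lookup.get("TC-P65ST60")  # served from the cache
print(len(calls))
```

The time-to-live is the trade-off to tune: a longer value shields the site from the external system but shows older data.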

JavaScript "Includes"

That said, there is a fourth method for incorporating 3rd party data into your website: JavaScript.

If you just want to display data from a 3rd party web API that is remotely hosted and may not be as available as your sites, then we suggest using asynchronous JavaScript to load the data and HTML and then “write” that data into placeholder divs on your site. This is a good solution when you are pulling in data from a 3rd party that is unreliable, such as a Twitter feed, and you don’t want to stop your page from rendering if that data is unavailable. The downside is that it adds complexity to developing your sites and the data is not indexed by search engines.


Recap

Remote Calls/WebServices

Pros

  • Real Time
  • Separates Data from Content
  • Data lives on the page - all data and content will be searchable by dotCMS site search and webcrawlers
  • Custom viewtool and data can be used in custom fields

Cons

  • dotCMS will only be as fast and as highly available as the external system.  
  • External data cannot be “managed” through dotCMS, where managed = searched, edited, versioned

Read only data (from separate db)

Pros

  • Separates Data from Content
  • Data lives on the page with content - all data and content will be searchable by dotCMS site search and webcrawlers
  • Viewtool and data can be used in custom fields
  • dotCMS provides a SQL Macro for quickly mocking up integration
  • Does not rely on a remote call for data (only a DB link)

Cons

  • External data cannot be “managed” through dotCMS, where managed = searched, edited, versioned
  • The out-of-the-box Elasticsearch-based search tooling does not work on this data.

Batch Content Imports

Pros

  • Imported data can be managed, searched, edited, versioned and permissioned in dotCMS
  • You can use all of dotCMS native content tooling for content access and site building

Cons

  • Keeping data in sync between two systems
  • Maintaining scheduled/batch processes
  • Permissions to the data are stored in dotCMS (this might be a pro too)

JavaScript "Includes"

Pros

  • Web Developer friendly, no Java development effort
  • Does not tax dotCMS in any way (though it might slow browser page rendering)
  • Good for when you want real-time data from an unreliable or remote source

Cons

  • The normal cross-browser and cross-platform JavaScript woes
  • JS content does not show up for search engines, either internally or externally.



Last Note: Use Custom Fields to help your content editors

dotCMS cannot stress enough the importance of using custom fields to simplify tasks for your content editors. Custom fields let you (as the web developer) build custom controls on the content editing screen with the full power of JavaScript and Velocity; they can be used to show or hide fields, tabs and pages, build pick lists, and more.