Service Data Forward Cache
How can a service get quick access to often-used and non-static reference or configuration data to improve overall service performance?
Services often need to access configuration or reference data to apply a variety of logic pertaining to the execution of their core service logic. Because retrieving that data is crucial, utility services are often deployed to access it. These utility services perform the actual data retrieval, e.g. by executing query logic. The consequence is a significant performance decrease for the service that needs the reference data.
Many SOA platforms allow for caching in one way or another. Enhance the utility service logic to store the data in the cache once it has been retrieved, and allow for cache (record or dataset) expiry to make sure the data is refreshed periodically.
Create a (decoupled) cache which can be accessed by various services. Cached elements should be normalized as much as possible to allow for maximum reuse. The cached records can be stored and accessed via keys which retrieve cache data fields, records or datasets, depending on the cache data granularity.
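A decoupled, key-based cache could be sketched as follows. This is a minimal illustration, not a production design: the `SharedCache` class, its method names, and the key scheme (granularity plus identifier) are all assumptions made for the example.

```python
# Hypothetical sketch of a decoupled cache shared by multiple services.
# The cache data granularity is expressed in the key: a "field", a
# "record" or a whole "dataset" can be stored and retrieved.

class SharedCache:
    """A minimal in-process cache keyed by (granularity, identifier)."""

    def __init__(self):
        self._store = {}

    def put(self, granularity, key, value):
        self._store[(granularity, key)] = value

    def get(self, granularity, key):
        # Returns None on a cache miss.
        return self._store.get((granularity, key))

cache = SharedCache()
cache.put("record", "customer:42", {"name": "ACME", "tier": "gold"})
record = cache.get("record", "customer:42")
```

Normalizing entries (one record per key, rather than service-specific result blobs) is what allows different services to reuse the same cached data.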
Cache occupies memory. Up-front analysis of the actual caching need and a smart caching strategy must be used to ensure most effective use of memory. Overuse must be prevented as this could dramatically increase the need for memory, which directly influences the need for larger hardware.
Table: Profile summary for the Decoupled Service Data Forward Cache pattern.
Note: The service cache may be enhanced to be utilizing a distributed cache to share data across all relevant nodes in the SOA.
Services often need to access reference or configuration data to execute their core service logic. When the resource this data is retrieved from is slow, this can have a serious performance impact, because the slow resource negatively affects the service's autonomy.
Figure 1 - A utility service retrieves reference or configuration data from an unacceptably slow resource
The utility service that is used to access the slow resource uses a cache to store retrieved values, records or datasets. The next time the same service is invoked, it first checks the cache contents to see whether (valid) data is present. If all required data is in the cache, it uses the cached data instead of running a query against the slow resource. For any data that is partially missing, the slow resource is still accessed, and the cache contents are updated with the newly retrieved data. Applying a cache to improve performance is a way of storing the data closer to the consumer, a pattern also commonly known as a forward cache.
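The lookup flow described above, including the partial-miss case, can be sketched as a read-through function. The `slow_resource_query` function is a hypothetical stand-in for the utility service's real query logic against the slow resource.

```python
# Illustrative read-through lookup: serve what the cache already holds,
# query the slow resource only for the missing keys, then update the
# cache with the newly retrieved data.

def slow_resource_query(keys):
    # Placeholder for an expensive lookup (e.g. a database query).
    return {k: f"value-of-{k}" for k in keys}

def read_through(cache, keys):
    hits = {k: cache[k] for k in keys if k in cache}
    missing = [k for k in keys if k not in cache]
    if missing:                   # partial miss: fetch only what is absent
        fetched = slow_resource_query(missing)
        cache.update(fetched)     # refresh the cache with the new data
        hits.update(fetched)
    return hits

cache = {"a": "cached-a"}
result = read_through(cache, ["a", "b"])
```

After this call, `"b"` has been fetched from the slow resource once and is served from the cache on subsequent invocations.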
Figure 2 - The utility service uses a cache to store retrieved records into. Once present in the cache, the data can be used instead of the slow resource.
Most SOA platforms allow for deploying a cache. How sophisticated that cache is depends on the technology used by the platform vendor. The basic application of this pattern allows for every service to store data in the cache and use it at a later point in time.
To avoid multiple copies of the same data cached in various places, appropriate governance on how data is cached and accessed (identified) must be applied.
Optionally, a distributed cache manager can be used to share cache contents across the SOA. A distributed cache manager is costly technology, so cheaper alternatives to synchronize cache contents across systems may be necessary.
Cached data that is never refreshed is of no use to most consumers, so if the same data resides in the cache for prolonged periods, it should at some point be considered stale or 'dirty'. Dirty cache data is useless data: it can be removed from the cache as it is no longer current. Many ways of keeping cache contents up to date exist(*). A mechanism to refresh data in the cache must be implemented. Which refresh policy applies to which data depends on the data type, business need and scenario; often a single policy is insufficient for all data.
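One common refresh policy is a time-to-live (TTL): each entry carries a timestamp, and an entry older than the TTL is treated as dirty and evicted on access. The class below is an illustrative sketch; the names and TTL value are assumptions, not part of any particular platform's API.

```python
# Sketch of a TTL-based refresh policy: expired entries are considered
# dirty and are removed on access, forcing the next lookup to hit the
# slow resource again.
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, stored_at)

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # dirty: evict, caller must refetch
            return None
        return value

cache = TTLCache(ttl_seconds=0.05)
cache.put("rate", 1.21)
fresh = cache.get("rate")      # still within the TTL
time.sleep(0.06)
stale = cache.get("rate")      # past the TTL: evicted, returns None
```

A real deployment would likely use different TTLs per data type, reflecting the point above that a single policy rarely fits all data.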
Generally speaking, the more the caching-enabled version of the utility service is used, the better its performance should become, as more and more data resides in the cache. By using cached data, fewer (or no) external queries against the slow resource are necessary, which increases performance and effectively increases service autonomy.
Designing cache management logic can be cumbersome and adds complexity to the service design.
Caching data impacts the amount of required memory resources. Overuse of cached data can become very expensive.
Caching data should be kept close to the service actually retrieving from the slow resource, as caching in all consumers or consumer services would dramatically impact the required amount of memory.
This pattern increases the behavioral predictability of the service as it moves away from slow resource access, hence increasing its autonomy.
(*) Getting "current" data into the cache can be done in many ways, not limited to the following list:
- Store an expiry date/time with each atomic set of data (i.e. a record) and remove the record from the cache as soon as a cache query occurs after the expiry date/time. This forces the service implementation to use the slow resource instead and gives limited (indirect) control over when cache contents are refreshed.
- Let the cache management system flag datasets as dirty based on cache policies. The cache management system will subsequently remove data from the cache as soon as it is marked as being dirty.
- Use an event driven approach to update cache data contents as soon as changes to the underlying data happen. This would result in the most "current" cache contents.
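The event-driven option in the last bullet can be sketched with a simple publish/subscribe mechanism. The `EventBus` class here is a hypothetical stand-in for real messaging middleware; the topic and payload names are illustrative assumptions.

```python
# Sketch of event-driven cache refresh: a change event on the underlying
# data immediately updates the cached copy, keeping cache contents as
# "current" as possible.

class EventBus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        for handler in self._subscribers.get(topic, []):
            handler(payload)

cache = {"customer:42": {"tier": "silver"}}
bus = EventBus()

# Keep the cache current: update the entry whenever the source changes.
bus.subscribe("customer.changed",
              lambda evt: cache.update({evt["key"]: evt["data"]}))

# The system owning the underlying data publishes a change event.
bus.publish("customer.changed",
            {"key": "customer:42", "data": {"tier": "gold"}})
```

Compared to expiry-based policies, this approach trades extra integration effort (every data change must be published) for the freshest possible cache contents.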