Introduction
The Drupal RESTful module has a multitude of caching options, and sorting through them on your own for the first time can be slow. This article will help you get started with Drupal RESTful caching.
NOTE: The RESTful 2.x module was recently released. This article focuses on the 1.x version of the module, and the techniques below may not work with any other release.
Your caching options can be controlled at various levels in code. Knowing which layer your application needs is just as important as knowing how to configure it, but we'll start with the how and move on to the why later.
Start with Drupal RESTful Caching
To start caching your endpoint, first set render to TRUE under the render_cache key in the plugin file. RESTful skips caching your endpoint if this setting is FALSE, which is the default value. In addition, Drupal RESTful ships with support for the entitycache module for entity-based endpoints.
Here's what a typical flow looks like for an endpoint:
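function viewEntity($id) {
  // Context used to build the cache key: entity type and entity ID.
  $context = array(
    'et' => $this->getEntityType(),
    'ei' => $id,
  );
  // Return the cached payload if there is one.
  $cached_data = $this->getRenderedCache($context);
  if (!empty($cached_data->data)) {
    return $cached_data->data;
  }
  // Perform the expensive work and construct the payload.
  $values = construct_payload();
  // Store the payload for the next request.
  $this->setRenderedCache($values, $context);
  return $values;
}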
$context is the entity context: the bundle name, entity ID and any other metadata relevant to constructing your cache key. In most cases, the bundle name, entity type and ID suffice. RESTful fills in other contextual data such as the endpoint name and GET parameters. RESTful builds cache keys in a structured way so that it is easy to do CRUD operations in bulk. For instance, clearing all caches for the "articles" endpoint would be something like clear("articles::*").
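The bulk-clearing idea can be sketched without Drupal at all. The snippet below is only an illustration: the in-memory array stands in for a cache bin, clear_by_prefix is a hypothetical helper (not part of RESTful), and the prefix includes the API version because the default keys begin with it.

```php
<?php
// Hypothetical in-memory stand-in for a cache bin: cache ID => payload.
$bin = array(
  'v1.0::articles::uu1::paet:node::ei:23' => 'article 23',
  'v1.0::articles::uu1::paet:node::ei:42' => 'article 42',
  'v1.0::comments::uu1::paet:comment::ei:7' => 'comment 7',
);

/**
 * Returns the bin without the entries whose ID starts with the given prefix.
 *
 * A trailing "*" in $pattern is treated as a wildcard marker and stripped.
 */
function clear_by_prefix(array $bin, $pattern) {
  $prefix = rtrim($pattern, '*');
  foreach (array_keys($bin) as $cid) {
    if (strpos($cid, $prefix) === 0) {
      unset($bin[$cid]);
    }
  }
  return $bin;
}

// Drop every cached entry for the "articles" resource, keep the rest.
$remaining = clear_by_prefix($bin, 'v1.0::articles::*');
print_r(array_keys($remaining));
```

Because each logical section of the key is delimited, a shorter or longer prefix selects a wider or narrower set of entries.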
Within the RESTful project, RestfulBase.php houses all the caching primitives: getRenderedCache, setRenderedCache, clearRenderedCache and generateCacheId. The last of these, generateCacheId, constructs the cache key based on the $context supplied to that endpoint.
Preventing Cache-Busting
It is also worth noting that Drupal RESTful caching allows you to override the key generation logic on a per-endpoint basis. This is especially useful when you want to build a custom cache key.
While working on Legacy.com, we had to build a cache key that ignores specific GET parameters. By default, generateCacheId builds a different key for each of the following endpoints:
- articles/23?foo=123456
- articles/23?foo=567898
- articles/23?foo=986543
Though a different key for each of these calls makes sense in most cases, it is redundant in others. In our case, we return the same payload for all three of the calls above. To change this behavior, we ended up overriding generateCacheId.
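The core of such an override is dropping the irrelevant parameters before they reach the key. The following is a minimal standalone sketch of that idea, not the code we shipped: cache_id_ignoring is a hypothetical function, the base key is made up for illustration, and in a real resource the same filtering would live inside your generateCacheId override.

```php
<?php
/**
 * Appends query-string parameters to a base cache ID, skipping ignored ones.
 *
 * Hypothetical helper: each kept parameter becomes "<first two letters>:<value>",
 * mirroring the truncated names seen in RESTful's default cache IDs.
 */
function cache_id_ignoring($base, array $query, array $ignored) {
  $parts = array($base);
  foreach ($query as $name => $value) {
    if (in_array($name, $ignored, TRUE)) {
      // Parameters that do not change the payload must not split the cache.
      continue;
    }
    $parts[] = substr($name, 0, 2) . ':' . $value;
  }
  return implode('::', $parts);
}

// Illustrative base key for articles/23.
$base = 'v1.0::articles::ur3::paet:node::ei:23';

// The three calls from the list above now share one cache ID.
$first = cache_id_ignoring($base, array('foo' => '123456'), array('foo'));
$second = cache_id_ignoring($base, array('foo' => '567898'), array('foo'));
$third = cache_id_ignoring($base, array('foo' => '986543'), array('foo'));

var_dump($first === $second && $second === $third);
```

Parameters that are not on the ignore list still produce distinct keys, so only the listed ones stop splitting the cache.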
The setRenderedCache, getRenderedCache, and clearRenderedCache methods operate on the default cache controller, which can be specified in the plugin using the class key inside render_cache. This value defaults to DrupalDatabaseCache.
The default can also be replaced with your preferred caching backend. In our case, we use the memcache module and set this value to MemCacheDrupal. Again, Drupal RESTful allows you to configure caching backends on a per-endpoint basis.
Managing Caching Bins
Cache backends have the concept of bins, an abstraction for grouping similar data together. Examples from Drupal core are cache_filter and cache_variable.
Every endpoint has a bin setting in the plugin file, which is cache_restful unless we explicitly specify otherwise. It is advisable to store high-traffic endpoints in dedicated bins.
There is also an expire setting for each endpoint, which dictates the cache expiration for that endpoint. This defaults to CACHE_PERMANENT, which means that the entry is never removed until it is explicitly cleared.
The alternative is CACHE_TEMPORARY, which indicates that the entry will be removed in the next round of cache clearing.
These are the very same constants used in the cache_* calls of Drupal's cache interface. There is a middle ground too, which isn't documented: the expire value can be set in seconds. This deviates from Drupal's convention of specifying it as a timestamp.
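Put together, the backend, bin and expiry settings might look like the fragment below. Treat it as a sketch: the bin name cache_restful_videos is a made-up example, and the one-hour lifetime simply follows the observation above that expire accepts a number of seconds.

```php
<?php
// Sketch of a 'render_cache' section for a RESTful 1.x plugin definition.
$render_cache = array(
  // Caching is skipped unless this is TRUE.
  'render' => TRUE,
  // Cache controller class; DrupalDatabaseCache is used when omitted.
  'class' => 'MemCacheDrupal',
  // Hypothetical dedicated bin for a high-traffic endpoint.
  'bin' => 'cache_restful_videos',
  // Lifetime in seconds (one hour).
  'expire' => 3600,
);
```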
Varying Caching by Role or User
Some endpoints need to be cached per role, and some per user. This is controlled by the granularity setting, which takes either DRUPAL_CACHE_PER_USER or DRUPAL_CACHE_PER_ROLE. The right choice depends to some extent on your authentication mechanism too.
We wrote our own authentication mechanism and had a user created exclusively for the API and serving the endpoints. We gave this user an exclusive role and configured per-role caching for all the endpoints.
Here's how the plugin configuration looks for one of our endpoints:
$plugin = array(
  'label' => t('Recommended Videos'),
  'resource' => 'recommended_videos',
  'name' => 'recommended_videos__1_1',
  'entity_type' => 'node',
  'bundle' => 'video',
  'description' => t('Get all recommended videos for a given article.'),
  'class' => 'RecommendedVideosResource__1_1',
  'authentication_types' => array(
    'my_custom_token',
  ),
  'minor_version' => 1,
  'render_cache' => array(
    'render' => TRUE,
    'expire' => CACHE_TEMPORARY,
    'granularity' => DRUPAL_CACHE_PER_ROLE,
  ),
  // Custom settings.
  'video_sources' => array('youtube', 'vimeo'),
);
The Anatomy of a Cache Key
A cache key using the default key generation logic looks like this:
v1.0::recommended_videos::uu1::paet:node::ei:105486::fo:123::ba:abcd
The corresponding endpoint URL looks like this:
/api/v1.0/recommended_videos/105486?foo=123&bar=abcd
The first part is the API version, followed by the resource name, which is "recommended_videos". The next part starts with either "uu" or "ur", depending on whether the granularity is per user or per role; it is "uu1" in this case. Next is the entity type, "et:node", with the prefix "pa". This is followed by the entity ID part, which is "ei:105486" in this case.
The last part is the list of GET parameters foo and bar, with their names truncated to two letters. Each logical section is separated by "::" so that it is easy to do a selective purge. For example, wiping out all entries for v1.0 of the API would be a call to clear("v1.0::*").
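To make the layout concrete, here is a small sketch that reassembles the example key from its sections. compose_cache_id is a hypothetical stand-in written for this article, not the module's own key generation, and it assumes every context and GET parameter name is cut to its first two letters.

```php
<?php
/**
 * Assembles a cache ID shaped like the default one described above.
 *
 * Hypothetical stand-in for the module's key generation: sections are joined
 * with "::" and parameter names are cut to their first two letters.
 */
function compose_cache_id($version, $resource, $account_part, array $context, array $query) {
  $params = array();
  foreach (array_merge($context, $query) as $name => $value) {
    $params[] = substr($name, 0, 2) . ':' . $value;
  }
  return 'v' . $version . '::' . $resource . '::' . $account_part . '::pa' . implode('::', $params);
}

$cid = compose_cache_id(
  '1.0',
  'recommended_videos',
  'uu1',
  array('et' => 'node', 'ei' => 105486),
  array('foo' => '123', 'bar' => 'abcd')
);
echo $cid, "\n";

// Splitting on "::" recovers the logical sections used for selective purges.
$sections = explode('::', $cid);
print_r($sections);
```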
Note that a GET for a collection of resources, such as the latest comments, results in a viewEntity call for each item in the collection and as many cache entries. If you want a single cache entry for the whole collection, you have to build your payload yourself and call setRenderedCache, as shown in the endpoint workflow snippet earlier in this article.
Other Considerations
Be Diligent, Validate Cache Strategies Early
RESTful is designed to be modular from the ground up and lets you control caching settings for every endpoint. Such a high level of control is both good and bad. Digging through an issue for hours because some settings for an endpoint are misconfigured isn't fun for anyone. Unless the settings are clear and explicit, issues are hard to debug and sort out.
Be diligent and validate your caching strategy from the beginning.
Memcache Stampeding
Another thing to look out for is memcache stampeding. Memcache stampeding occurs when a missing key results in simultaneous fetches from the database server, resulting in a high load. Memcache is designed to prevent too many requests from piling up.
In our work with Legacy.com, we reduced the need to pass these requests to Memcache by properly managing our Varnish layer. We will detail how we fixed the stampeding issue and constructed a Drupal RESTful caching strategy in a later post.
Drupal RESTful Caching Resources
- A video tutorial series on RESTful by one of its authors.
- The proverbial TODOMVC implemented with Drupal RESTful backend and Angular in the frontend.
- Provides a RESTful endpoint to the panels layout and configuration. One of the modules contributed back to community from the Legacy project.