Drupal 8 API: types of information in Drupal

Pre-requisites

None.

What does a CMS need to keep track of?

Content management is the whole point of a CMS: it is, after all, a (c)ontent (m)anagement (s)ystem. Early CMSes merely kept track of content (events, blog posts, comments), but as sites grew in complexity, CMSes also needed to work out how that content should be displayed.

The three complicating factors

How content appears depends on:

  1. the situation, e.g. as a full page or as a teaser
  2. the site visitor, e.g. an administrator or a member of the public
  3. changes over time, e.g. regular tasks that run and need to remember when they last ran.

This imposes three sets of conditions on content, and each set needs stored information to inform the CMS's decision on what to do.

The three new required sources of information

We can tackle these three complicating factors, respectively, by defining three new sources of information:

1. Configuration is the name given to information required by the site to make decisions about content presentation and general site running. How an event's location field should be displayed (as a dynamic map, or just as longitude and latitude coordinates) is configuration, as is the account number for Google Analytics, which gets notified every time an event is viewed. The email address to contact for site updates is also configuration, as is whether or not updates should be checked at all. Even the message shown to visitors while the site is in maintenance mode is configuration (although in Drupal 8 the on/off switch for maintenance mode itself is kept in state). Configuration is characterized by its semi-permanence (it rarely changes, and is usually only changed by an administrator) and by its semantic specificity (related configuration options can be bundled together and named based on their meaning).

2. Session is the name given to information required by the site to make decisions about what each site visitor should be able to see and do. Whether a site visitor is logged in, and if so then what roles they have, is determined by each site visitor's unique session. Temporary information, relating to the current site visitor only, can be stored in the session, although it may be lost if the site visitor logs out. Session is characterized by its temporariness—it only persists as long as a given site visitor's login, and disappears along with any browser cookies—and by its visitor specificity—information in each session is only available to the site visitor whose session it is.

3. State is the name given to information required by the site to make decisions about content presentation and general site running, which might change over time. When the regular "cron" task is run to perform various administrative functions—updating the site search, processing any queued tasks in bulk—it references the last time it was run, which is stored in state. Once it has finished running, the cron task updates this time stored in state. When caches are cleared, and Drupal packages up all CSS files into a single file to improve network performance, then each page needs to know where to find this file, and so its filename is stored in state. State is characterized by its temporariness—it only persists until Drupal or some user action causes it to be changed, and then the precise new content of state is decided by Drupal itself, rather than storing user input—and its semantic specificity—like configuration, each state option is named based on its meaning.

Including content itself, then, we have four sources of information that Drupal needs to have knowledge of, and be able to manipulate and display.
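
The distinction between the three new sources can be sketched in code. The following Python is only a toy model, not Drupal's actual API (Drupal itself is written in PHP, and its real APIs are covered separately): the class name Site, its methods and the key names are all hypothetical, chosen just to show who changes each kind of information and how long it lives.

```python
import time


class Site:
    """Toy model of configuration, session and state (hypothetical, not Drupal's API)."""

    def __init__(self):
        # Configuration: named, semi-permanent, changed by an administrator.
        self.config = {"event.location_display": "map"}
        # State: named, changed by the site itself as it runs.
        self.state = {}
        # Sessions: one private store per visitor, keyed by a cookie value.
        self.sessions = {}

    def set_config(self, name, value):
        """An administrator changes a setting; it then stays put."""
        self.config[name] = value

    def log_in(self, cookie, roles):
        """Start a session that only this cookie's holder can reach."""
        self.sessions[cookie] = {"roles": list(roles)}

    def log_out(self, cookie):
        """Ending the session discards everything stored in it."""
        self.sessions.pop(cookie, None)

    def run_cron(self, now=None):
        """Read the last run time from state, then record the new one."""
        previous = self.state.get("cron_last")
        self.state["cron_last"] = time.time() if now is None else now
        return previous


site = Site()
site.log_in("cookie-abc", ["administrator"])
first = site.run_cron(now=1000)    # no earlier run recorded, so None
second = site.run_cron(now=2000)   # sees the time stored by the first run
site.log_out("cookie-abc")         # session data is gone; config and state remain
```

Note that nothing a visitor or administrator types in ends up in state: the cron time is decided by the site itself, which is what separates it from configuration in this model.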

Side-by-side comparison

The comparison table below condenses the above information, and adds a little more, to show at a glance the differences between the four types of information in Drupal. The distinction between the four types can be a bit vaguer in practice (when is something state and not configuration?), but the table nonetheless tries to distinguish between them for most circumstances:

Different types of information in Drupal, compared side by side.
Type           Lifetime        Changed by           Identified by       Accessed using
-------------  --------------  -------------------  ------------------  -----------------------
Content        Semi-permanent  Maintainer (editor)  Numeric* (or UUID)  Entity API
Session        Login/browser   Visitor's actions    Visitor's cookie    Session manager service
State          Varies          Drupal itself*       Name                State API
Configuration  Semi-permanent  Maintainer (admin)   Name                Config API

*Usually: this comparison table is meant to be a very generalized overview, so there will always be exceptions.

The key addition in this table, compared to our discussion above, is the right-hand column of APIs. We'll discuss these later, so we don't cover them now; as we do, we'll link back here.

It's tempting to include another line in this table, for caching. Historically, it's been difficult to work out where state should live in Drupal. Configuration has always been something that, in principle and practice, should be exportable to form a kernel of "how your site should behave": this export could be used to partly restore a damaged website, or be re-used in a different website to share configuration that was hard to get right in the first place. Because state can change arbitrarily often, it isn't suited to being stored alongside configuration, as it would make these exports look (falsely) out of date.

For this reason, state has in the past often ended up in the cache, as this seemed to suit its volatility. However, this puts strain on any caching strategy, because caching should fundamentally be a way to improve performance, and the site should work fine without it. While the state at any given moment is volatile, state as a long-term concept still needs to exist in some form or another. Drupal 8 has finally split it out from caching, and so we don't treat the cache as an information resource, only as a performance tactic.
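
To illustrate why state fits neither the configuration export nor the cache, here is a small Python sketch. It is a toy model under assumed names (export_config, clear_cache, css_url and the key names are hypothetical, not Drupal functions): the export contains configuration only, so it does not change when state does, and emptying the cache costs time but loses no information.

```python
import json

config = {"site.mail": "admin@example.com"}   # exportable, semi-permanent
state = {"cron_last": 1000}                   # volatile, but must persist
cache = {}                                    # disposable performance layer


def export_config():
    """Serialize configuration only; state is deliberately left out."""
    return json.dumps(config, sort_keys=True)


def css_url():
    """Return the aggregated CSS location, using the cache when possible."""
    if "css_url" not in cache:
        # Slow path: rebuild the value from information held in state.
        cache["css_url"] = "/files/css/" + state["css_file"]
    return cache["css_url"]


def clear_cache():
    """Emptying the cache must never change what the site does."""
    cache.clear()


before = export_config()
state["cron_last"] = 2000           # state changes as the site runs
state["css_file"] = "css_abc.css"   # filename chosen by the site itself
after = export_config()             # identical to `before`

url_cached = css_url()
clear_cache()
url_rebuilt = css_url()             # same answer, rebuilt from state
```

Had the CSS filename lived only in the cache, clear_cache() would have destroyed the one record of where the file is; keeping it in state is what lets the cache stay purely a performance tactic.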

What you should see

Most of these tutorial blogposts end in a demonstration you can implement yourself. However, the key point of information storage in Drupal is that the four types differ both in meaning and use, and in how you access them. So each of the four APIs will have its own demonstration, and we don't repeat those here.

But congratulations! You have learnt about these different types of information in Drupal.

Further reading etc.