Building a Drupal site with Behaviour-Driven Development

(This article first appeared on the Agile Collective blog.)

The Global Canopy Programme (GCP) needed to retrieve news syndicated from many public sources, manage it via an internal application, then re-syndicate it reliably to several public-facing websites. This application, called Forest Desk, needed to be described and built “just in time”, both to fit the clear initial requirements and to adapt to any discoveries made along the way.

Background

I was asked to work with Agile Collective, as a technical partner, to help GCP build Forest Desk using everyone's current preferred framework of Drupal 7. It promised to be a neat, compact site, with the potential for good in-house user experience and great performance. However, while the fundamental goals of the project were clear, all parties agreed that the project had to:

  • support rapid iteration of features and just-in-time specification of the details,
  • while also reassuring everyone that ongoing development was stable and correct.

With those two demands in mind, the paradigm of behaviour-driven development or BDD was decided on very early in the project. Like test-driven development more generally, BDD improves understanding of, and adherence to, requirements. This is done via a 3-step process for each requirement:

  1. Formalize the requirement: in this case, write it down as a semi-human-readable set of “scenarios” using a limited subset of English, called Gherkin.
  2. Use software to turn these scenarios into automated tests, and run them even before the required functionality is in place, to show they “fail as expected” in its absence.
  3. Build the required functionality (tweaking the scenarios where they were initially unclear) until the test passes.

This simple BDD process could be followed for each piece of functionality in turn, and embedded in a wider project-management framework like Scrum. Such an environment can lead not just to a usable end product, but also to tried and tested functionality underpinning it, both now and in the future.

Getting agreement

Everyone brings their own assumptions to a project, and two people can often use the same words and phrases to describe quite different ideas. The traditional way to sound out such potentially dangerous scope misalignment has been to do a lot of up-front discovery. But this doesn’t just delay a project: it can even provide a false sense of security, because the findings can become out of date, or be so large that they hide inconsistencies which only become apparent as the deadline approaches.

Instead of a discovery-phase approach, each requirement is sounded out in the "just in time" fashion alluded to earlier: only when required, and even then only in terms of the behaviours of the website as the user tries to attain certain goals. Gherkin allows expression of this behaviour from the site user’s perspective in a series of scenarios like:

  • Given
    • I am logged in
    • And I am on the homepage
  • When
    • I click on “My account”
    • And I click on “Edit profile”
  • Then
    • I should see a “Your username” element
    • And I should see a “Your password” element

Although bits of the above can be omitted, there’s not much deviation permitted from the format. And because it’s so restrictive, it requires people to spell out their thoughts more clearly, eliminating misunderstandings.
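
To give a feel for how restrictive the format is, here is a minimal sketch in Python of a reader for step lines like those above. It is an illustration only, not how Behat itself parses Gherkin: the function name parse_steps is hypothetical, and real Gherkin files also carry Feature and Scenario headers that this sketch ignores.

```python
import re

# A step line must start with one of these keywords.
STEP_RE = re.compile(r"^(Given|When|Then|And|But)\s+(.*)$")


def parse_steps(scenario_text):
    """Return (keyword, text) pairs, resolving And/But to the preceding keyword."""
    steps = []
    current = None
    for raw in scenario_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = STEP_RE.match(line)
        if match is None:
            # Anything outside the fixed format is rejected outright.
            raise ValueError(f"Not a valid step: {line!r}")
        keyword, text = match.groups()
        if keyword in ("And", "But"):
            if current is None:
                raise ValueError(f"{keyword!r} cannot start a scenario")
            keyword = current
        else:
            current = keyword
        steps.append((keyword, text))
    return steps


SCENARIO = """
Given I am logged in
And I am on the homepage
When I click on "My account"
And I click on "Edit profile"
Then I should see a "Your username" element
And I should see a "Your password" element
"""

steps = parse_steps(SCENARIO)
for keyword, text in steps:
    print(f"{keyword:5} {text}")
```

Each “And” line is attached to the Given, When or Then that precedes it, and any line that does not fit the pattern is rejected, which is the property that forces people to spell out their thoughts.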

Still following the principle of just-in-time specification, all stakeholders allowed the project to proceed even when the functionality wasn't completely fleshed out. This allowed subtle, hard-to-translate bits of functionality to remain un(der)described at the specification stage; for example:

  • Given
    • I am logged in
    • And I press a button and some custom functionality happens
  • Then
    • I should see new content retrieved from some remote syndication

The vaguer bits of the specification above (pressing a button so that “some custom functionality happens”, and seeing content from “some remote syndication”) are not commonly understood by programs which can read Gherkin. But the point is that such a test will fail, and draw attention to itself rather than permit those details to be forgotten. In the meantime, doing this allows the project to keep moving, accepting that either everyone gets involved to flesh such scenarios out further, or such sections are left to the developer’s judgment as an implementation detail.
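
The idea that an underspecified step draws attention to itself can be sketched as follows. This is a simplified Python illustration under stated assumptions, not Behat's actual mechanism: the step decorator, the two registered handlers and the undefined_steps helper are all hypothetical names invented for the example.

```python
import re

STEP_DEFINITIONS = []  # (compiled pattern, handler) pairs


def step(pattern):
    """Register a handler for step text matching the given regular expression."""
    def decorator(func):
        STEP_DEFINITIONS.append((re.compile(pattern), func))
        return func
    return decorator


# Hypothetical handlers standing in for the steps a tool already understands.
@step(r"^I am logged in$")
def logged_in():
    return True


@step(r'^I click on "([^"]+)"$')
def click(label):
    return True


def classify(step_text):
    """Return 'defined' if some registered pattern matches, else 'undefined'."""
    for pattern, _handler in STEP_DEFINITIONS:
        if pattern.match(step_text):
            return "defined"
    return "undefined"


def undefined_steps(step_texts):
    """List the steps that no registered handler can interpret."""
    return [text for text in step_texts if classify(text) == "undefined"]


scenario = [
    "I am logged in",
    "I press a button and some custom functionality happens",
    "I should see new content retrieved from some remote syndication",
]

missing = undefined_steps(scenario)
for text in missing:
    print("UNDEFINED:", text)
```

Only the first step matches a known pattern; the other two are reported, so they stay visible until someone either rewrites them in understood terms or writes code to support them.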

Letting computers understand requirements

Once requirements were spelled out in the limited syntax above, in text files on disk, a separate application called Behat could read the text files, visit the website in development, and see if the functionality was built yet.

Behat can act just like a human being sat in front of a web-browser (with some limitations), but can also utilise the rest of a developer’s computer to set the website up (again, with some limitations). What this means in practice is that, for example, one need never specify particular user accounts in tests, or have particular content already set up on a development copy of the website. Behat can prepare content and accounts for tests, ready for the tests to then make use of them.

A successful run by Behat would result in a computer screen full of literally green lights: each line of the scenario goes green if Behat both understands it and also can actually fulfil it: click on this link; or put this text into this input box.

[Figure: A running BDD test, using Behat. Green lights mean go!]

However, if anything was either broken or not yet written, the developer would receive clear red alerts that action was required.

I was also able to extend Behat’s knowledge of what constitutes a comprehensible action on a website. It already knew about clicking links, or hitting submit buttons, but a key requirement was to get it to understand RSS, the format for syndicating content on the web. I was able to do this very quickly with custom PHP code.
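
The project's custom step code was written in PHP for Behat and is not reproduced here. As a rough sketch of the kind of check such a step might perform, the following Python example parses an RSS 2.0 document and asserts that it contains an item with a given title. The sample feed, the URLs in it and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, standing in for one fetched from the site under test.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Forest Desk</title>
    <link>http://example.com/</link>
    <description>Example feed</description>
    <item>
      <title>Palm oil update</title>
      <link>http://example.com/news/1</link>
    </item>
  </channel>
</rss>"""


def feed_item_titles(rss_xml):
    """Return the title of every <item> in an RSS document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="") for item in root.iter("item")]


def assert_feed_has_item(rss_xml, expected_title):
    """Fail loudly if no item in the feed has the expected title."""
    titles = feed_item_titles(rss_xml)
    if expected_title not in titles:
        raise AssertionError(
            f"No item titled {expected_title!r}; found {titles!r}"
        )


# The body of a step such as: Then the feed should contain "Palm oil update"
assert_feed_has_item(SAMPLE_FEED, "Palm oil update")
print(feed_item_titles(SAMPLE_FEED))
```

A step built this way fails with a readable message when the item is missing, which is what turns a line of the scenario red.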

In no time at all, I would have a new Gherkin requirement, being interpreted by Behat, and showing lots of red warning lights: because by that point, it was only specified, not built! But at the same time, older functionality would still show green lights, indicating that more recent work was not conflicting with the requirements of somewhat older work.

Building the functionality

Many of the technologies that the project built upon already existed, and were tried and tested. The RSS standard was chosen for syndicating data, and Drupal has very good support for both consuming this standard from remote websites (using the Feeds module) and also repackaging it for consumption by other websites (using Views and Views RSS). Because this was intended as an internal project, I also implemented a clear, plain, crisp administrative design throughout, provided by the Adminimal theme.

When it came to building each feature, once its scenario was specified in Gherkin syntax, most of the work was possible using Drupal content types, views and other configuration, which could be stored alongside the codebase using the Features module. This meant that configuration changes could be made and saved, and that it was always possible to return to the previous state of configuration should any BDD tests no longer pass.

A few small customizations were necessary, but I could isolate these:

  1. When a website user was deciding whether to re-syndicate content, the page showing that content needed a link to the exact remote RSS feed that was its source.
  2. When the content was being re-syndicated from Forest Desk to another GCP website, not only did its content tagging need to be preserved (e.g. this content pertains to “Palm oil”) but the tagging needed to end up in the right vocabulary on the remote site (e.g. “Palm oil” is a commodity, not a country or piece of legislation). This required a custom extension to RSS.
  3. Similarly, when content tagged with geographical locations (e.g. longitude and latitude points) needed to be re-syndicated, the GeoRSS standard needed to be implemented to re-display the locations in the right format.

However, each customization, in effect, supported a very specific piece of functionality; removing it only affected that piece, leaving the rest of the website functioning correctly.
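
The second and third customizations can be illustrated with a short Python sketch. It is not the project's code, and the exact RSS extension used on Forest Desk is not described here. As an assumption, the sketch carries the vocabulary in the optional domain attribute that RSS 2.0 defines on the category element, and encodes a location as a GeoRSS Simple point (latitude then longitude, separated by a space). The coordinates are made up.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)


def build_item(title, tags, point=None):
    """Build an RSS <item>. tags is a list of (vocabulary, term) pairs;
    point is an optional (latitude, longitude) pair."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    for vocabulary, term in tags:
        # Assumption: the vocabulary travels in the category's domain attribute.
        category = ET.SubElement(item, "category", {"domain": vocabulary})
        category.text = term
    if point is not None:
        latitude, longitude = point
        geo = ET.SubElement(item, f"{{{GEORSS_NS}}}point")
        geo.text = f"{latitude} {longitude}"
    return item


def read_item(item):
    """Recover the (vocabulary, term) pairs and the point from an <item>."""
    tags = [(c.get("domain"), c.text) for c in item.findall("category")]
    point_text = item.findtext(f"{{{GEORSS_NS}}}point")
    point = tuple(float(p) for p in point_text.split()) if point_text else None
    return tags, point


item = build_item(
    "Palm oil update",
    [("commodity", "Palm oil")],
    point=(-0.5, 117.0),  # illustrative coordinates only
)
xml_text = ET.tostring(item, encoding="unicode")
tags, point = read_item(ET.fromstring(xml_text))
print(xml_text)
print(tags, point)
```

Round-tripping the item shows the receiving site getting both the term and the vocabulary it belongs to, along with the location, so it can file “Palm oil” as a commodity rather than as a country or piece of legislation.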

Summary

Behaviour-driven development permitted a website to be specified and built quickly and within agile principles. It also meant that at the end of the project a successfully running test suite was delivered along with the website: a suite that could be automated for peace of mind both during the acceptance phase and for any future ongoing development. This means that changes of all sizes can be made with much greater confidence.

Moreover, the resulting suite of BDD tests is a living, breathing part of the website project in its own right, indirectly serving the site’s users by guaranteeing a minimum quality of experience and functionality. As the website itself grows, the suite will not just support the growth, but will be able to grow with it and become richer at the same time.