When PHP closures are an anti-pattern that can block testing and maintenance

While I was still moving my skills in the direction of PHP, its object orientation always felt a bit spindly and scaffold-y compared to the fluid malleability of Python, the language I'd used for a long time before then. So it was with some pleasure that I saw PHP start to support closures in 5.3.0. Defining closures on the fly, then assigning closures as object properties.... Why, it would only be a short step from here to ducktyping, monkeypatching and all that other crazy partying that just seemed so exciting from the wrong side of the fence!

Seven years on, and while I still love the expressive power and just-in-time flexiblity of closures, I think they often end up a substitute for much better patterns in code. Closures are quick to cobble together, compared to object classes, interfaces, inheritance, traits, autoloading: you just start typing right where you are. And while that can be powerful, it can often result on code that can't be tested in isolation, or even becomes too powerful to control.

Here are two anti-patterns I've seen recently, which you would do well to avoid in your own code: however much you might (quite understandably) want to use closures elsewhere.

The closure where an interface would promise more

The biggest anti-pattern—the one that caused the most headaches—was in passing closures in to an object at several different points, in order to handle several different species of other-objects.

For example, let's imagine a single iterator over an incoming data stream (e.g. rows of a CSV file). For a particular type of input (e.g. address imports) this iterator needs to cede control to some functions to do the heavy lifting of processing the data; for a different type (e.g. product imports), something else has to do the lifting.

Here's roughly how this was implemented:

<?php
class Task
{
    public function do($csvFilename)
    {
        $that = $this;
        $preparationClosure = function(array $csvTable) use ($that) {
            $that->storage = [];
        };
 
        $perRowClosure = function(array $csvRow) use ($that) {
            $that->storage[] = $csvRow;
        };
 
        $finalizationClosure = function(array $csvTable) use ($that) {
            $that->applicationContainer->database->save($this->storage);
        }
 
        // Now iterate!
        $csvIterator = new CsvIteratorClass($csvFilename);
        $csvIterator->iterate(
            $preparationClosure, $perRowClosure, $finalizationClosure
        );
    }
}

Some people will look at this and howl, and some will look at it and think it's fine. There's a lot to discuss about it, and I'm sure there are several people out there who can't imagine code like this ever ending up in their projects. But you have to bear in mind that this was a distributed team of different levels of experience—much like my varied, if modest, readership—and there were deadlines, and complex requirements, and no clear conventions to point to.... A junior developer began the work, and the rest of what they did was so valuable that it was never refactored.

From the perspective of this blogpost—that closures have a habit of bringing in antipatterns—what's the real problem with this code? Well, the closures (the antipattern) are in effect acting as a precise opposite of dependency injection (the pattern): they're making the business logic implicit in the closures maximally untestable in isolation outside of the container class or method. This rules out unit tests and sometimes even integration tests in favour of system tests, which are generally more involved, slower and harder to cover all the code-level cases. Also, the write-many-times aspect of closures means they can rapidly diverge drastically on different code paths, even if the real-world situations are very similar. They don't help with consistent application of domain knowledge or modelling.

This situation is crying out for the code to be turned inside-out, and the three closures added to an object implementing some kind of interface (probably itself inheriting from \Iterator. Here's a simple first pass:

<?php
interface CsvIteratorInterface extends \Iterator
{
    protected $storage;
 
    // Every closure becomes a method.
    protected function preparation();
    protected function perRow();
    protected function finalization();
}

This interface could then be implemented to iterate over a particular CSV file that needed to be saved to the database:

<?php
class AddressCsvIterator implements CsvIteratorInterface
{
    // Dependency injection: set database either
    // explicitly or using services.
    protected $database;
 
    public function setDatabase(Database $database)
    {
        $this->database = $database;
    }
 
    // Interface methods now defined below.
    /* ... */
}

Now, everywhere that you call this code, you can demand the interface and your code will know what it's getting. The original Task#do() method becomes:

<?php
class Task
{
    public function do(CsvIteratorInterface $csvIterator)
    {
        foreach ($csvIterator as $row) {
            // Do nothing.
        }
    }
}

Lots more refinements immediately spring to mind, in part because of the empty foreach()—maybe the Task should handle saving to the database, or maybe the iterator should have its own method to iterate over itself—but none of these were possible until the closures were turned into a class, and the class used to inspire an interface.

When we went down this route, the code rapidly became decoupled from everything else, making it DI-able and testable in isolation: and therefore much more maintainable. Maintenance problems arising from this part of the code—always very hard to debug—largely went away.

The closure passed down to an infrastructure component

Let's say you're calling an existing method on an object, and you want to do something if the method fails somehow. The classic example I've seen is caching using the PECL Memcached library:

<?php
 
$cachedItem = $cache->get(
    $key, 
    function ($memc, $key, &$value) use ($localObject) {
        /* ... */
    }
);

Within the comments would be (potentially a lot of) logic, involving business/domain elements and interactions, in order to fill the cache in the first place. Either that, or that same logic would be obfuscated behind a single domain-level call.

This feels wrong: as a caching solution, Memcache(d) sits at the lower level of infrastructure: it's rarely something involved in business logic and it certainly shouldn't have an effect on that logic. Passing a closure down into this layer is like dropping a deep-sea exploration vehicle into murky depths, to perform some complex technological magic among the silt and eerie, valve-bristling creatures that live down there. It's an exciting thing to do when you really need to, but it's not necessarily a procedure boring enough to use as a permanent foundation of something else.

All of that is gut feeling, that the natural order of things are somehow inverted here (which as I say often seems exciting, but wanton excitement isn't necessarily our goal.) We can make this gut feeling more concrete if we think in terms of injection again: what are we injecting into what? Ideally, when a service X (e.g. a database) is injected into a service Y (e.g. a business component) then X should be from our business perspective dumber and more reliable than Y. X is the helper and Y is the doer. That's not what we've got here.

When we write tests for a business/domain component, to test it in isolation, we have to mock out the infrastructure. This decision—what to mock, what to test—is irretrievably bound up with assumptions about what, for the purposes of our testing, will likely behave in certain ways. But in this situation, a business component—the logic required to populate the cache in the first place—is being passed into a mocked infrastructure component. As soon as you mock out your infrastructure, the two business components no longer talk to each other directly, which fragments the domain.

This all sounds a bit theoretical, but: how easy is it to test this code, involving all the non-infrastructure objects? Quite hard:

<?php
class BusinessComponentTest
{
    class testFoo
    {
        $testable = new BusinessComponent();
        $other = new OtherBusinessComponent();
        $cache = $this->getMock('Memcached');
        $cache->expects($this->once())->method('get')
            ->willReturn(['data']);
 
        $testable->setCache($cache);
        $testable->setHelper($other);
 
        // Never touches $other.
        $testable->foo();
    }
}

The $other business object is never touched, whether we mock it or not. Do the two business objects integrate correctly? We can't know for sure, unless we write a system test, or start embedding logic in your mock's methods; something like:

<?php
class BusinessComponentTest
{
    class testFoo
    {
        $testable = new BusinessComponent();
        $other = new OtherBusinessComponent();
        $cache = $this->getMock('Memcached');
        $cache->expects($this->once())->method('get')
            ->will($this->returnCallback(function($key, $closure) {
                $closure();
            });
 
        $testable->setCache($cache);
        $testable->setHelper($other);
 
        // Never touches $other.
        $testable->foo();
    }
}

Rapidly you'll find yourself adopting different mocking patterns in different tests: maybe you'll even create an unmocked DummyMemcached class, and start peppering your tests with that instead; now you have two problems. Far more testable intead, if not as neat to look at (as a submersible diving vessel exploring uncharted territories) is to make the cache call, check the response, then call whatever would populate the cache:

<?php
 
$cachedItem = $cache->get($key);
if ($cachedItem === null) {
    $data = $localObject->doSomething();
    $cache->set($key, $data);
}

It's not as neat, but it tests. It separates. It injects.

Summary

Closures are a great shorthand for logic, especially when you want to e.g. use array_map to perform some calculation on every member of an array. But when overused they can spoil injectability, testability and maintainability. As an application's use of closures grows, it's worth always looking out for opportunities to turn them into more "old-fashioned" objects, interfaces and even control flow. After some ten years or so of programming in PHP full time, I'm finally reconciled to the fact that it isn't Python, or Javascript; as such, it has its own, specific patterns.

Add new comment