A WTF at the heart of your Drupal feed aggregation

Embedding JSON in XML. Hah, that's ridiculous, right? Almost as ridiculous as running a successful blog in .NET/ASP. Well, RSS can combine with JSON to quickly get a Drupal site to consume complex data structures over a webservice.

Drupal's core Aggregator module understands RSS2.0 with no tweaking, putting the text in the <description/> element into the content of quasi-node objects, so you can aggregate all sorts of syndicated content. You could build your own Google Reader if you liked that sort of thing, with articles from the BBC sitting alongside those from the Guardian.

So far so boring. And, on one level, it doesn't get much more interesting than that: Aggregator understands neither Atom XML (rich content) nor RSS that contains Dublin Core fields. There's therefore a limit to how much you can extend the actual XML format.

But what if you get a remote application to produce an RSS feed like this:

<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
  <channel>
    <title>Hello, world</title>
    <link>http://example.com</link>
    <description>Recent updates</description>
    <language>en</language>
    <item>
      <title>Sample JSON encoded content</title>
      <link>Foo</link>
      <description>
        {"text": "This is some lovely JSON text"}
      </description>
      <pubDate>Mon, 24 Nov 2008 22:07:03 +0000</pubDate>
      <guid isPermaLink="false">none</guid>
    </item>
  </channel>
</rss>

"What if?" Well, you get a quasi-node of content whose body contains the literal JSON text. Not terribly exciting. But Drupal's powerful themeing system means you can override the way that such content is .

Drop a file into your theme's directory called aggregator-item.tpl.php and containing the following:

<?php
$data = json_decode($content);
print $data->text;
?>

Voilà! You've unpacked the JSON data packet and accessed the content. And the packet, being JSON, can contain however much hierarchical data that you want. You could essentially encode whatever you liked at the webservice side and unpack it at the webconsumer side. You can't pickle objects very easily, unfortunately, but my recommendation is to avoid doing that sort of thing.

(You might need to empty your cache, if you've got any sort of zealous cacheing switched on. And this specific example will only work on PHP 5.2, unfortunately: json_decode() is a recent addition to the already-polluted default PHP namespace. You could use the PHP serialize() format if you've got an older version of PHP, or some other serialized data format that PHP can understand.)

If you were building all this from scratch, then of course you'd use either XML or JSON throughout, and not this weird hybrid solution. If you were building it from scratch. And if you are building it from scratch: let me know when you're done.