Pretending that Javascript is XSL, part 3: hCard to vCard

In previous posts (part 1, part 2) I established the possibility that there were advantages to making Javascript more functional, to bring it in line with CSS and XSL. I didn’t say what these were, particularly, but I then provided a few bits and pieces on top of jQuery to make Javascript just that: functional and quasi-XSL in its behaviour.

Now I’d like to start exploiting that behaviour, and I’m going to use the hCard microformat to illuminate its use. Briefly: a microformat is a set of agreed HTML classes used to invisibly encode structured semantic data in HTML; hCard is the implementation in HTML of the vCard specification for “virtual business card” files, using microformat classes. If you mark up people’s addresses using the hCard classes, then it’s possible to automate the conversion from hCard-enabled HTML to vCards, meaning you can click on buttons on webpages and have a vCard served up to you containing the contact information present in the webpage, verbatim, in a format you can put into your address book of choice.

One of the most-used conversion methods is Brian Suda’s X2V, a web service which converts XHTML with hCard markup into vCards and then presents them to the site visitor. It uses XSL; in fact, that was what got me thinking about this whole system. Brian’s work is neat, although his own server takes a hit every time someone uses the web service (and it only works on XHTML, not non-XML HTML). What if, I thought, we could get the browser to do it instead, by implementing template-like functional Javascript?

Anyway, below we find a couple of hCards, culled more or less directly from the Microformats examples page.

Frank Dawson
Lotus Development Corporation
work address (mail and packages):
6544 Battleford Drive

Raleigh NC 27613-3502

U.S.A.
+1-919-676-9515 (w, vm)
+1-919-676-9564 (wf)

Netscape Communications Corp.
work address:
501 E. Middlefield Rd.

Mountain View, CA 94043

U.S.A.
+1-415-937-3419 (w, vm)
+1-415-528-4164 (wf)

They look like slightly unstructured HTML, don’t they? That’s sort of the point. But hidden in the HTML are vCard classes. How do we tease them out with Javascript?

Well, there’s a question to be asked before that, I suppose, which is: why would we follow your method, and not someone else’s? What’s so good about functional Javascript? Good question. Well, if every hCard looked like the above, then you could write some completely procedural Javascript to turn it into a vCard. No problem.

But what if the order of the content was changed? The hCard (indeed, microformats in general) has quite a malleable structure, with some classes sometimes appearing on elements inside other elements, and sometimes not. What if there were more telephone numbers and email addresses, and what if they turned up in all sorts of different orders? These are just HTML classes, after all. With procedural Javascript you could start writing switch/case statements to cover every eventuality, and essentially come up with one big, unavoidably recursive function. It’ll be hard to structure, hard to maintain and completely unmodular. A document-driven method of extracting the vCard, on the other hand, doesn’t need to worry about all the various different combinations of nested elements: it would just keep one eye on context and process whatever it found. Also, the development cycle could be faster, because any template can be replaced without disturbing the others: just call template() again with the new rule.
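To make the document-driven idea concrete without needing a browser, here is a minimal sketch. It is not the treewalker() and template() pair from part 2: makeNode(), rules and walk() are hypothetical stand-ins that use plain objects instead of DOM elements and class names instead of CSS specifiers. The point is only that the same three rules cope with the fields arriving in a different order and at a different depth.

```javascript
// Hypothetical stand-in for a DOM element: classes, own text, child nodes.
function makeNode(classes, text, children) {
  return { classes: classes, text: text || "", children: children || [] };
}

// One rule per class. Each rule receives the node plus a descend() callback,
// which plays the part that this.default() plays in the post.
var rules = {
  vcard: function (n, descend) { return "BEGIN:VCARD\n" + descend() + "END:VCARD\n"; },
  fn: function (n, descend) { return "FN:" + n.text + "\n" + descend(); },
  org: function (n, descend) { return "ORG:" + n.text + "\n" + descend(); }
};

// Apply the first matching rule; with no match, just keep walking downwards.
function walk(n) {
  var descend = function () { return n.children.map(walk).join(""); };
  for (var i = 0; i < n.classes.length; i++) {
    var rule = rules[n.classes[i]];
    if (rule) { return rule(n, descend); }
  }
  return descend();
}

// Same data twice: once flat, once reordered with ORG inside an unclassed wrapper.
var cardA = makeNode(["vcard"], "", [
  makeNode(["fn"], "Frank Dawson"),
  makeNode(["org"], "Lotus Development Corporation")
]);
var cardB = makeNode(["vcard"], "", [
  makeNode([], "", [makeNode(["org"], "Lotus Development Corporation")]),
  makeNode(["fn"], "Frank Dawson")
]);

console.log(walk(cardA));
console.log(walk(cardB));
```

Neither card needed its own code path: the walker emits whatever it meets, in document order.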

Let’s instead assume you’re following my every word. For this next bit, you’ll need Firefox and Firebug, or to stuff all these instructions into a single file. Otherwise, you’ll have to take my word for it. Firstly, I’ve included jQuery on every page of my blog, so if you’ve got the ‘bug then you don’t have to resort to my insert-JS bookmarklet to squirt it in.

So: first, create the treewalker() and template() functions from part 2. Next, assign treewalker() to body and everything below it:

template("body, body *", treewalker);
template("body, body *", treewalker, "default");

You could restrict this assignment to everything within the .vcard elements, by giving the relevant CSS specifier instead, if there were a lot of content outside the hCards. It would speed up the initial setup phase, but it does complicate the demonstration so I’ve left that refinement out.

Remember we ran the treewalking before? Do that now:

var result = document.body.treewalk();

All being well, you should get a blank string back. Now it’s time to start adding some alternative rules with template(). Try this:

template(".vcard", function() { return "BEGIN:VCARD\n" + this.default() + "END:VCARD\n"; });

Now run the treewalker again. Oh, each hCard has just given you a vCard! An… empty vCard. Isn’t that great? Um. We can add to that, though:

template(".vcard .fn", function() { return "FN:" + $(this).text() + "\n" + this.default(); });
template(".vcard .org", function() { return "ORG:" + $(this).text() + "\n" + this.default(); });

Now document.body.treewalk() doesn’t just return a vCard for every hCard, but it knows about names and organisations. Also, because we keep including the call to this.default() in our overrides, we still treewalk into any element inside the FN or ORG containers.

What about emails? Well, in the source we can spot an a.email element up there, so let’s give the following a whirl:

template(".vcard .email", function() { return "EMAIL;TYPE=internet:" + this.href.replace(/mailto:/, "") + "\n" + this.default(); });

Try running document.body.treewalk() again. Hm. I don’t know about you, but I’m getting an error from that. Ah, wait: sometimes we have span.email rather than a.email. Spans don’t have @href attributes. Well, we could change the above rule and immediately reapply it using template() with no ill effects. But instead let’s keep it in place, and use a more specific specifier to override it just on spans:

template(".vcard span.email", function() { return "EMAIL;TYPE=internet:" + $(this).find(".value").text() + "\n" + this.default(); });

Re-run the treewalker. It now finds all email hCard elements and brings them out into the vCards!
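Why does the second rule win on spans but leave the links alone? The sketch below assumes that a rule is simply attached to every element its specifier matches, so a later and narrower assignment replaces the earlier one only where it applies. That is my reading of the behaviour rather than the actual code from part 2, and everything here is a hypothetical stand-in: assignRule(), the predicate functions, the plain-object elements and the example.com addresses.

```javascript
// Hypothetical stand-ins for two hCard email elements (addresses are made up).
var elements = [
  { tag: "a", classes: ["email"], href: "mailto:someone@example.com" },
  { tag: "span", classes: ["email"], value: "other@example.com" }
];

// Stand-ins for the specifiers ".email" and "span.email".
function isEmail(el) { return el.classes.indexOf("email") !== -1; }
function isSpanEmail(el) { return el.tag === "span" && isEmail(el); }

// Assumed behaviour: attach fn to every matching element, replacing any earlier rule.
function assignRule(matches, fn) {
  elements.forEach(function (el) {
    if (matches(el)) { el.rule = fn; }
  });
}

// First rule: fine for links, but a span has no href to call replace() on.
assignRule(isEmail, function () {
  return "EMAIL;TYPE=internet:" + this.href.replace(/mailto:/, "") + "\n";
});

var failedOnSpan = false;
try {
  elements[1].rule.call(elements[1]);
} catch (e) {
  failedOnSpan = true; // TypeError: this.href is undefined on the span
}

// Narrower rule: replaces the first one on spans only.
assignRule(isSpanEmail, function () {
  return "EMAIL;TYPE=internet:" + this.value + "\n";
});

var out = elements.map(function (el) { return el.rule.call(el); }).join("");
console.log(failedOnSpan, out);
```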

I’ll leave you with one more demonstration, for the slightly more complex TELephone field. As you can see above, there are lots of “types” for this field (Work, VoiceMail, etc.) and these sit in child elements of the telephone element. So we need to assign overrides to both the telephone element and its children.

Here’s a rule for the TELephone container:

template(".vcard .tel", function() {
  var t = "TEL";
  // Run defaults to get types where appropriate
  t += this.default().replace(/,/, ";") + ":";
  // See if we’ve got a “value” child
  var val = $(this).find(".value");
  return t + (val.length ? val.text() : $(this).text()) + "\n";
});

This method is a bit more complex because we need default() just to collect the .type children, and then we reach down ourselves to get the value. Maybe if we could give a specifier argument to the default behaviour, e.g. default('.type') first, then default('.value')… But that’s a project for another day, I think. Right now, let’s assign a rule to the type children and then run our treewalker:

template(".vcard .tel .type", function() {
  var jQ = $(this);
  return "," + (jQ.attr("title") ? jQ.attr("title") : jQ.text());
});
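The two TEL rules cooperate through plain string handling: each type child contributes a comma plus its name, and the container then swaps only the first comma for a semicolon. Here is that string logic on its own, as a sketch with no DOM involved. telLine() is a hypothetical helper, and the type names "work" and "voice" are illustrative values rather than ones read from the cards above.

```javascript
// Hypothetical helper reproducing the string handling of the two TEL rules.
function telLine(types, value) {
  // What the .type rules return, concatenated: ",work,voice"
  var typeString = types.map(function (t) { return "," + t; }).join("");
  // What the .tel rule does: a regex without the g flag replaces only the first comma.
  return "TEL" + typeString.replace(/,/, ";") + ":" + value + "\n";
}

console.log(telLine(["work", "voice"], "+1-919-676-9515"));
// → "TEL;work,voice:+1-919-676-9515\n"
console.log(telLine([], "+1-919-676-9515"));
// → "TEL:+1-919-676-9515\n"
```

With no type children at all there is no comma to replace, so the line degrades gracefully to a bare TEL field.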

Result? You should now have Javascript which can produce vCards (currently without geographical address support, as I don’t have time and you might get bored) from the hCard microformat. It’s easy to extend, easy to maintain and, in my opinion, fairly concise. Here’s the whole shebang, less the two framework functions from my previous posts:

// Start with body
template("body, body *", treewalker);
template("body, body *", treewalker, "default");
// vCard wrapper
template(".vcard", function() { return "BEGIN:VCARD\n" + this.default() + "END:VCARD\n"; });
// FN and ORG
template(".vcard .fn", function() { return "FN:" + $(this).text() + "\n" + this.default(); });
template(".vcard .org", function() { return "ORG:" + $(this).text() + "\n" + this.default(); });
// Email - A and SPAN tags
template(".vcard .email", function() { return "EMAIL;TYPE=internet:" + this.href.replace(/mailto:/, "") + "\n" + this.default(); });
template(".vcard span.email", function() { return "EMAIL;TYPE=internet:" + $(this).find(".value").text() + "\n" + this.default(); });
// TEL
template(".vcard .tel", function() {
  var t = "TEL";
  // Run defaults to get types where appropriate
  t += this.default().replace(/,/, ";") + ":";
  // See if we’ve got a “value” child
  var val = $(this).find(".value");
  return t + (val.length ? val.text() : $(this).text()) + "\n";
});
// TEL types
template(".vcard .tel .type", function() {
  var jQ = $(this);
  return "," + (jQ.attr("title") ? jQ.attr("title") : jQ.text());
});

And that’s it. I hope the approach comes in useful. By next year, you’ll have hCard-enabled pages, with vCard conversion in the browser. Happy Christmas!