subtleties

There seems no CSS code smell because it is all CSS code smell

The biggest CSS code smell of all is all too frequently /* hidden from view */

The concept of code smells is admirably applied to CSS in this blogpost. CSS is rarely considered as a "proper" programming language, so it's nice to see it treated with as much rigour as other languages receive.

However, the more I read it, the more I worry that the biggest code smell I come across in CSS isn't even mentioned; worse, it's actually demonstrated, by the very CSS examples and accompanying text. You see, I'm worried that nowhere in the article is a single mention of comments, their use or their abuse. Meanwhile, over around 100 lines of quoted code, there are only two copies of the same comment, a comment that adds so little to the reader's understanding that it exemplifies bad comments.

I would suggest if there's one cross-platform, cross-technology "code smell" then it's not commenting your code. But certainly there's a belief I've encountered among frontend developers that good CSS is, by its declarative nature, self-explanatory; that LESS or SASS, combined with judicious attention to nesting, mean that you don't need any accompanying human-readable documentation. Maybe that's true; but if we're talking about code smells, then: if you discovered this culture in any other programming community, wouldn't you do a double-take?

Of course, the brilliant programmer can in theory code without comments; well-written, lucid code can be readable enough that extra human documentation isn't required. But who's that good a programmer, all the time? I'm certainly not, especially when it comes to CSS. What proportion of CSS authors, with a keen eye for visuals and understanding of how to achieve them, are also really good programmers, in the sense that they understand and appreciate dependency, modularity, refactoring, maintainability, etc.? Meanwhile, how does the Dunning-Kruger effect influence the opinions of the very worst programmers, bolstering their belief that they too are good enough to code without comments? How much CSS work is done by designers, or junior developers, who can be skilful in their own fields, but not trained to write good code?

The simplicity and accessibility of writing CSS means that, if we're not careful, we will all end up sitting on the plateau of mediocrity, because a plateau is a comfortable place to be. Rigorous coding, through attention and attendance to code smells, is a great way to start to climb the foothills of expertise. So we can't let the assumptions of the prevailing culture keep us on that plateau: let's train ourselves to comment our CSS, until we're good enough to train ourselves not to.

Pressflow minor versions and double leading slashes

Between Pressflow 6.22.102 and 6.22.104, there was a small but substantial change. The url() function no longer normalizes leading slashes.

This brought it back in line with Drupal 6.x behaviour, but has also led to a number of our sites showing links with two leading slashes. Browsers interpret <a href="//…"> as a reference to a domain, not a resource (so as if it were http://…) and this leads to broken URLs.

The problem arises when a Drupal path is passed through url() (or calling functions like l()) more than once. This can happen in your own code, or it can happen in Views: if you use Views fields, and render a path out, but then e.g. exclude it from display and use its token elsewhere.

If you’re building your own links with Views fields, don’t retrieve a node path field [path] for use, as this immediately gets rendered with a leading slash and is then unusable as a link token elsewhere: it will get a second leading slash. Instead, get the bare node ID [nid], and build links of the form node/[nid]. When these get passed through url() one time only, they get turned into the friendly [path] aliases anyway.
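The double-slash effect is easy to reproduce outside Drupal. The sketch below is illustrative Python, not Pressflow code: url() here is a hypothetical stand-in that simply prefixes an assumed base path of "/" without normalizing, and example.com is a placeholder host. It shows a path picking up a second slash when it goes through the function twice, and how standard URL resolution then treats the result as a host name.

```python
from urllib.parse import urljoin

BASE_PATH = "/"  # assumption: the site is installed at the web root


def url(path: str) -> str:
    """Hypothetical stand-in for a url() that does not normalize leading slashes."""
    return BASE_PATH + path


once = url("node/12")   # rendered a single time
twice = url(once)       # the already-rendered path is rendered again

page = "http://example.com/blog/"          # placeholder page the link sits on
resolved_once = urljoin(page, once)        # stays on the same host
resolved_twice = urljoin(page, twice)      # "//node/12" is read as host "node"

print(once, "->", resolved_once)
print(twice, "->", resolved_twice)
```

Rendering node/[nid] exactly once avoids the second pass entirely, which is why the bare node ID is the safer token to carry around.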


Learn to love CSS margins

Margins concern themselves with how different elements negotiate space on the page, so you don't have to.

Most web developers, when they find they have no other choice than to turn their hand to CSS, find margins a pain in the rear. This is because, unlike padding, the margins on a given element don't always appear the same on the page; the resulting space depends on what other elements are near them. But while rigid predictability is sometimes the most important aspect of some page structure, other arrangements of elements need a more subtle approach.

Padding is easy

Padding feels like a physical thing, analogous to the metal spacers once used in letterpress typesetting. It's like a second, inner border: inviolable, rigid in its shape. So a programmer "knows where he is" with padding, and if you convert ems or points to pixels on the fly you can pretty much "count the number of pixels in an element's padding" based on the original CSS declarations themselves combined with the size of that one element. You don't need to worry about what the neighbouring content will do, unless you've already made it behave weirdly with your own CSS rules applied to e.g. the other element's overflow.

Padding dimensions A,B,C,D around element w x h

You know where you are with padding. A declaration of { padding: A B C D } on an element of browser dimensions w × h yields an element area of wh and padding area of w(A+C) + h(B+D) + AB + BC + CD + DA. Always.
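As a quick check of that arithmetic, here is a small Python sketch. It assumes w × h is the content area (content-box sizing) and that A, B, C, D are the top, right, bottom and left paddings in the same unit; the function names are made up for illustration. It computes the padding area two ways: as the outer box minus the content box, and with the expanded formula.

```python
def padding_area(w, h, a, b, c, d):
    """Padding area as (outer box) minus (content box).

    a, b, c, d are top, right, bottom, left padding, as in `padding: A B C D`.
    """
    outer = (w + b + d) * (h + a + c)
    return outer - w * h


def padding_area_expanded(w, h, a, b, c, d):
    """The same area written out term by term."""
    return w * (a + c) + h * (b + d) + a * b + b * c + c * d + d * a


# Example values in pixels (arbitrary, for illustration only).
example = (200, 100, 10, 20, 30, 40)
print(padding_area(*example), padding_area_expanded(*example))
```

The four product terms AB, BC, CD and DA are the corner squares; together they equal (A+C)(B+D).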

This means that padded elements will always take up the amount of extra space that you specify. On the one hand, this is very useful for pixel-perfect design implementation. If you have headers and footers - site livery - which have pretty much the same content on every page of a site, you can add padding to the relevant elements and know that they will look just right. Because padding is simple, it's also less likely to be buggy (you avoid the IE6 double margin bug on your floated-left and -right devices).

However, padded elements are simple in another sense: they're stupid. Sometimes it feels like they're behaving like a man who's forgotten about the plank he's carrying across his shoulders, and they barge other elements out of the way when a more subtle effect is required.

Margins are hard

Margins, unlike paddings, pay attention to what's around them. The top and bottom margins of adjacent elements will, in certain circumstances, collapse onto each other. This sounds like either: just something you have to take into account; or, worse, a tricksy annoyance. But it's really because margins are not about pixel perfection, in the sense of making sure that a given element is a precise number of pixels from its parent, or from the top of the page.

Elements decide where they are with margins. The top element 1 has { margin-bottom: M }, whereas the lower element 2 has { margin-top: N }. The actual space between the elements (assuming no padding or borders) is M or N, whichever is greater: max(M,N).
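A minimal Python sketch of that rule follows. For the non-negative margins described above it reduces to max(M, N). The handling of negative margins goes beyond the case stated here and follows the general collapsing rule in the CSS 2.1 specification (largest positive margin plus most negative margin); the function name is made up for illustration.

```python
def collapsed_gap(margin_bottom, margin_top):
    """Vertical gap between two adjoining block elements after margin collapse.

    Assumes no padding, border or clearance separates the two margins.
    """
    margins = [margin_bottom, margin_top]
    largest_positive = max((m for m in margins if m > 0), default=0)
    most_negative = min((m for m in margins if m < 0), default=0)
    return largest_positive + most_negative


print(collapsed_gap(20, 30))   # both positive: the larger one wins
print(collapsed_gap(20, -5))   # mixed signs: the negative margin pulls the gap in
```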

Margins are intended to improve legibility and overall presentation when the designer cannot predict in advance the order of every single element. Content flow on the page can be improved by margins in a way that padding can't achieve: they attempt to lay out the content using whitespace that's as aesthetically and readably pleasing as possible, given many combinations of elements. This can be more important than you think in pages with a lot of (potentially user-generated) content: when you have lists, headings, paragraphs and blockquotes all jumbled together in any old order, how can you dictate the padding between any given two elements?

No rules only heuristics

Margins sound like they provide a lot of flexibility, but how should they be used in practice? If margin-top and margin-bottom are likely to collapse into those of other elements, is there a straight choice to be made between them; or, as a (probably too discursive) Stack Overflow question has it, which do you prefer and why?

The simple answer is that there is no simple answer. This is a conversation you must have with the original designer, if you're not sure what to implement. But generally, unless the design has some odd layout that screams "ask the designer what they meant by this", here is a rough guide - a set of heuristics rather than rules - to establishing good margin and padding usage in your CSS. You don't need to follow these to the letter; but they're a good starting point.

  • Paragraphs and ordered/unordered/definition lists should have symmetrical top and bottom margins. This is because they're "symmetrical" elements: a paragraph typically doesn't care about the content before or after it, but is just part of the page flow. It also means that the paragraph--list gap is the same as the list--paragraph gap, which makes a kind of semantic sense. Depending on the "tightness" of the design whitespace, paragraphs will typically have 0.5-1.5 em vertical margins; lists could have as much as 2em.
  • Headings need bigger margins above than below. This is because while a heading's text marks the beginning of one section, the whitespace above it implies the *end* of the previous section. Headings should therefore bind more tightly to what's below them.
  • Blockquotes can be spaced vertically like lists. They also need horizontal spacing, which browsers typically apply with margins. You should be able to do this too, although padding will work just as well and will also provide you with a decent canvas for things like speech bubble backgrounds. 
  • Structural elements like divs shouldn't need any margins but can inherit the ones from their children. If they're in the header and footer, use padding to make sure they sit correctly next to other elements on all pages.
  • However, overall header, footer and section containers (whether this means the dedicated new HTML5 elements or just divs with particular IDs) should have some vertical margins on them, so that there are still nice amounts of space between them and any accidentally un-margined content (or if a paragraph next to the footer would otherwise end up too close.)
  • Horizontal margins don't collapse. Because of the aforementioned bugs in IE6, I would typically use horizontal padding instead, unless you're sure that the element is never going to float. The end result is almost the same either way. Exceptions to this rule include when you have a limited choice of markup (you can't use nested divs and horizontal padding instead, for example.) You should still test them.

You can start by blindly applying the points above, but please don't assume that they will invariably lead you to the specific design you have to implement. You'll always have to take that step back and consider: what margin will this element "like to have", when it appears on the page? What if this paragraph element (for example) is adjacent to another paragraph, or a heading, or a list? And what might the heading or list have to say about its own margins in that case?
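To see how those heuristics play out when elements arrive in an arbitrary order, here is a rough Python sketch. The margin values (in em) are illustrative assumptions chosen within the ranges suggested above, with the heading figures invented for the example, and the gap between neighbours is taken as the larger of the two adjoining margins.

```python
# (margin-top, margin-bottom) in em; illustrative values, not a recommendation.
MARGINS = {
    "p": (1.0, 1.0),           # symmetrical
    "ul": (2.0, 2.0),          # symmetrical, a little roomier
    "blockquote": (2.0, 2.0),  # spaced like lists
    "h2": (2.5, 0.5),          # more space above than below
}


def gaps(sequence):
    """Return (upper, lower, gap) for each adjacent pair of elements."""
    result = []
    for upper, lower in zip(sequence, sequence[1:]):
        bottom_of_upper = MARGINS[upper][1]
        top_of_lower = MARGINS[lower][0]
        result.append((upper, lower, max(bottom_of_upper, top_of_lower)))
    return result


for upper, lower, gap in gaps(["h2", "p", "ul", "p", "h2"]):
    print(f"{upper} -> {lower}: {gap}em")
```

With these numbers the paragraph-to-list gap matches the list-to-paragraph gap, and a heading sits closer to the paragraph below it than to the one above it, without any rule mentioning a specific pair of elements.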

When "it depends" is the best answer

Because margins have a cleverness to them, the temptation is often to out-smart them; to hem them in with extra CSS rules; in effect, to use tricks which make margins behave more like padding. I don't mind these so much, but I think they miss the point of margins, which is: like CSS itself, they do some of the typesetting work, so we don't have to figure it out every time.

This doesn't answer the burning question, though: margin-top or margin-bottom? Who's right? I think it's worth going through some of the discussion on that Stack Overflow question and examining it in the context of "intelligent margins", or maybe "margins giving us intelligent elements" to see whether there can ever really be a last word in the great debate between top and bottom.

I always use margin bottom, which means there is no unnecessary space before the first element.

If you're using margins at all, then you're saying that it's up to the elements to decide what is and isn't unnecessary. If a paragraph needs a bit of air to flow around it, then it will probably still need that if it's at the top of the content area; by using margin-bottom, you're denying the first paragraph its air. Of course, you can fix this by making sure that whatever elements paragraphs are in are always somehow padded away from the elements above them, but you lose the cleverness of margin-collapse during the fixing. Besides, for every design that calls space before the first element "unnecessary", there will be another design which makes that same claim about space after the last element.

Depends on context. But, generally margin-top is better because you can use :first-child to remove it.

Yeah I usually use margin-bottom as well and then assign a last class to the last one in the bunch.

These both seem to be falling into the same trap as the previous question, but coming to that trap from a different direction. Having to establish special rules for special elements, in order to make a container wrap tightly around its contents, seems an odd thing to do. If your container is tight, then you have to implement extra rules to make sure that container keeps its distance from other elements. Which you can force with other layout elsewhere, but it's what margins are meant to guarantee.

Incidentally, as an aside:

This really depends on what you're designing it for and why. Something you could do, which is helpful, is setup generic styles for default padding/margins you commonly will be using

I would assign spacing on elements in a semantic fashion, and use contextual selectors to define behavior for that collection of elements.

I agree with the second poster completely here. In fact, one of the reasons I didn't put a glib "it depends" answer on the SO question in the first place was that the first of these two posters started his response in the same way, but then the rest of his answer was wrong enough that I was wary of joining an "it depends" camp.

But as I thought more and more about the original question, and all its SO answers, I realised there was another reason that I didn't want to answer it directly. There is simply no answer within the context defined by the question itself, and shaped by all the existing replies, that I would consider valid. No combination of margin-top, margin-bottom and rules about contexts can by itself help dictate good CSS practice. I did think initially that the question should be closed as subjective, but maybe instead it would be better to answer it: not, "it depends"; but, "it's a bigger problem than the question permits."


Wordpress might not be better than Drupal, but it's still a worry

In other news, I've got massive piles of apples and oranges I'm trying to get rid of. It's to make room for some coal. Can you give me a hand?

Recently a blogpost discussed Jen Lampton's superb conference session: "Wordpress is better than Drupal: developers take note." The author said that they really liked the session, but

  • Wordpress isn't actually better than Drupal
  • Because you can't compare Wordpress and Drupal
  • So the usability issues arise from a false comparison
  • And here's what's going to solve our problems instead

In the first paragraph or two it managed to simultaneously both praise Jen's talk and completely deny the core message of it. I think that highlights a number of trends in the way that the Drupal community as an aggregate (though by no means as a whole) tends to think about usability versus technology. Actually, I think the communities behind most large open-source projects behave this way, and it's ones like Wordpress or Ubuntu that are actually unusual. But I think it's worth talking about this phenomenon in the context of Drupal.

Apples and oranges

Anyone who argues a point of view can be sure they've ruffled some feathers, when someone else complains that they're "comparing apples and oranges." It's a definite sign that the complainer doesn't like the problem that's been raised, and instead wants to treat an informal analogy as if it were a keystone in a tense, logical structure; then attack the analogy instead. It's like hurling around cries of "ad hominem". It doesn't really prove or solve anything. Apart from anything else, what if my point is to compare apples with oranges?

Whatever you compare Drupal to, whatever your motives, whatever important (often fundamental) issues you want to highlight about the way Drupal does things... eventually you'll have to compare Drupal to something that isn't Drupal. You will have to compare the apple to some other fruit. That's the only comparison worth making. Are we just going to forever compare Drupal 5 to Drupal 4.7, Drupal 6 to Drupal 5, and gradually Drupal 7 to Drupal 6, purely as a teleological exercise in patting ourselves on the back? Of course not! And however close the something else might be to being Drupal, there'll be some slight difference, which will inevitably provide an opportunity for hairsplitting. Maybe we could call it fruitsplitting instead, because that difference in variety of fruit can be used as an excuse, if not a reason, to dismiss your broader motivations in making the comparison.

"X isn't a CMS." "Y isn't in PHP." "Z isn't aspect-oriented programming." "P is aspect-oriented programming, but it's intended to be purer than Drupal, with pointcuts." "Q doesn't have a genial tousle-headed giant as its benevolent dictator."

Apples aren't oranges, but then some apples aren't the same as other apples, either. Apples come in all sorts of varieties. So you can even compare apples with apples, if you like, and someone will come along and protest. Except they won't so often these days. Want to know why? Well, here's a story about apples, from the ever-readable George Monbiot.

Briefly: there used to be a huge market in different apple varieties. Hundreds and hundreds of varieties: Belle de Boskoop, Michaelmas Red, Brown Cockle. And then the supermarkets came along, and they decided that they could essentially compel their customers to standardize on a handful of apple types: Royal Gala, Jazz, Cox's. They did it, because the customer didn't really mind what type of fruit he was eating.

And that was essentially the end of the line for most types of apples. "Extinct, extinct, extinct." Most of them just gone. The free market effectively wiped out those varieties, ruthlessly and ignorantly. It reduced the number of breeds to a tiny fraction of what it was. People compared apples with other, different apples, and they just picked... apples. Nobody will fruitsplit between apple varieties now, because they're just all apples.

Drupal already has plenty of greengrocers, fruiterers of the programming world, people who rightfully and meticulously care about, and tend with love to, our apples and our oranges. But the prize, the broad user base, the real success story... that all lies in attracting non-greengrocers. And they'll just think: "Apples. Oranges. Other apples. Whatever." They probably won't care about different types of fruit; they've been told that their website has to have its vitamin C for the day, so they'll get the apple that's easier to get into, and to hell with your awkward-to-peel oranges. If your business model depends on selling individually wrapped fruit to other greengrocers, then good luck with that. I'm off to take some coals to Newcastle.

Who's going to tell the customers?

The blogpost suggests that:

Instead of comparing a toolset and a product, there are other, appropriate comparisons that businesses and developers should be making...

Well, it's wonderful that I know that now. But who's going to tell everyone else to change their ways? Is there going to be some sort of government-funded outreach programme? Will we all be making some sort of world tour? I try to picture a random drupal.org user appearing, like a superhero in a flash, in offices around the globe. "Hark! I hear someone in Mumbai making comparisons between Wordpress and Drupal that appear reasonably valid to them based on their expectations derived from: Drupal websites, Wordpress websites, drupal.org and wordpress.org. Yet I must intervene, swiftly! To the nearest train station!"

Anyone who decides that the solution is to tell businesses and developers what comparisons are appropriate will end up like Wowbagger the Infinitely Prolonged, travelling in time so that they can tackle simultaneous disparagements of Drupal. We can't rely on that: Drupal ultimately has to be able to say this for itself: it should be obvious to these businesses and developers what Drupal should be compared to. Otherwise these businesses and developers will continue to make comparisons based on the (admittedly incomplete and sometimes wrong) information they have available to them. 

Saying what comparisons end users ought to be making is a lot like that other canard, "educating the user." Unless you're talking about a site's administrative user base, people at a client who you could count on one hand, that's a pretty mammoth task. Who gets to ensure that such universal education about the vagaries of, say, module weights, or the vagaries of CCK field display, eradicates doubt in the same way that we once got rid of smallpox? Similarly, if Drupal's success against Wordpress depends on people only ever making the comparisons we want them to make, we're in trouble.

There's also a tendency to call Drupal a CMS, right up to the point where someone complains it can't do what a CMS does out of the box. Then it's called something else. A toolset. A content management framework. A management framework toolkit. A content toolkit framework management system tool. It might deflect debate, but ultimately the jargon just puts people off. You might have won the battle with potential new users, but the war goes to whichever CMS with limitations accepts from the outset that it's a CMS with limitations, and tackles those limitations---in Drupal's case, most notably user experience and getting started with it if you've never heard of Drupal before---head on.

The documentation on Drupal.org is pretty impenetrable to the average newbie, but if you can figure it out then you get the impression that Drupal is basically a CMS, a bit like Wordpress only with other (waves hands) stuff. So it must have a WYSIWYG editor, only somehow more so, right...? And the user experience of Drupal 7 is a great improvement, much like improvements that have already happened in Wordpress; but, despite the fantastic news that we're moving to beta releases, only greengrocers are using that right now. If we want non-Drupal users to make the right comparisons about Drupal, today, will we really fix it by arguing about distributions; products; toolsets; downloads; projects; modules; features...?

The technology will save us, of course

Here's a handy tip. Technology alone will probably never save you. You only have to have sat through Jen's talk, and to have read about Hagen's similar Wordpress-Drupal-Joomla walkthroughs in the unconference on DrupalCon Monday (I blogged about Hagen's marvellously excruciating walkthroughs on the g.d.o usability group) to know that the technology won't save us. Or rather, specific bits won't save us. Aegir won't save us. Products won't save us. Distributions won't save us. The drupal.org redesign probably won't save us either, because that's nearly finished and yet we're still denying the very usability and expectation problems that are still in plain sight.

Don't get me wrong. All of these things are great tech---really fantastic, mindblowing tech---and they solve specific problems that developers come across. I think drush make is one of the most exciting bits of Drupal I've ever used. But I'm a developer! Of course I will think that! Not only do I know how to set a Drupal site up in 15 minutes, but I also know from experience how to avoid the many, many ways in which it will take you two days. These tools don't fix the fact that Wordpress is easy to use. Wordpress just works. Drupal needs tools to help you install it, and command-line fu before you can even make those damn upgrade errors go away, go away, please just go away.

You can't wait for technology to solve these things. Distributions won't change the fact that you can click around the drupal.org website for ten minutes and still not be completely sure what it is, or how to use it. The learning curve is still too large; and I'm just talking about how you use drupal.org, not how you use Drupal. Until everyone in the world is re-educated to know, and care, about what a distribution is, and how it's subtly different from just getting the blasted thing to work, then distributions won't save us.

What will save us is usability. User experience reviews. Jen Lampton's talk. Hagen Graf's talk. Watching users try to get to grips with this thing we love, when they just don't love it so much, won't excuse it as much. Feeling the cringe when a new user falls into the same trap over and over again, and knowing you won't be able to appear on their shoulder to help them. Then, though: this is key. You can't just say, "well, we'll let the themers fix that," or "maybe the forums can discuss it," or even: "when users ask for the wrong thing it is rarely, if ever, the right answer to humor them." Instead, you have to accept the application has a usability problem; and work out a plan to fix it, right there and then. Developers, deciding to fix usability problems above functionality problems. Again and again and again. Test, observe, cringe, fix, repeat.

It's a slog. But it could work where technology won't, because ultimately end-user functionality with serious usability problems will end up massively underused, and as far as the greater course of Drupal is concerned it might as well not be written.

Does this matter?

I feel like I've gone on a bit, and if you feel like that too then I apologise. After all, the usability community in Drupal is healthy and growing. People are starting to take usability seriously. But I feel like there are still no UX standards to point to, with the same weight and depth as the coding standards. Talk is silver; code is gold; UX is presumably 24-carat meh. There are still no easy ways of sharing usable interface patterns as easily as one might share patches. Usability is still a teenager, and it's easy to freeze it out of the conversation in favour of adults talking about established technical preferences. Usability needs cultivating, and the user's behaviour needs to be king, whatever comparisons that user might make. We're right to worry about Wordpress, but that's a good sign! It means that Drupal is doing something right. It's worrying about the user.

Maybe, returning to the original blogpost, "the time is right for Drupal products." I'm happy to agree with that. Featurization and ease of deployment certainly suggest that productization will swiftly follow. But that has nothing whatsoever to do with the usability talks. It really won't solve the fundamental problems that Jen's talk tried to address.

One thing she said really stuck with me; as a developer, I took note. Well, I took lots of notes, but here's what stood out for me:

"Wordpress is behind us, but they can move fast, and they're looking at us."

With version 3.0, Wordpress quietly turned itself into a CMS. Maybe they saw fields were working well for Drupal, and thought: we'll have that. "It's simple but it'll do: let's ship!" Bang. Instant easy CMS. Usable too.

That's the point, in the end. Usability is not a nice-to-have. It's the canary in the cage, the indicator species: when usability suffers, it's telling you about lots of other attitudes that lead to the user being made unhappy. Whereas Wordpress emphasizes usability; it's friendly to the user out of the box; with things like rich-text editing that the user wants to have without any issues; with straightforward media management; and with upgrade methods so simple they're frightening.

Drupal? Drupal's a brilliant, smart, well-oiled, massively functional framework (it's far more fun to develop with than Wordpress.) It's a CMS, except when it's not (but when it's not, it's still a rich seam for developers to mine.) It's still tricky for newbies to install (but developers learn the tricks.) Image handling is odd, and if you pick the wrong method early on you're in for a lot of pain later (but developers know which one to pick.) Upgrading modules is not a one-click experience (unless you're a developer and you've played with drush.)

Yet with all that in mind, here's a question. If you'd never heard of Drupal or Wordpress before, and you were 90% of the internet, rather than a developer; if you were the same sort of user as a client's marketing guy, who's never written a line of PHP in his life; based on the experience of trying to get each of them up and running yourself, which would you choose? Apple or orange?

A Drupal view serving multiple tabs

Views, tabs and menus generally shouldn't be this hard, but here's how to get a multi-tabbed view working.

Drupal's menu hierarchy is a big and complex beast. It acts as both the repository for registered menu callback handlers (and their associated permissions handlers) and as a way of building more mundane frontend menus for people to click round. It serves both static hierarchical side menus and also dynamic tabbed contextual menus: if you're on a user's profile page, there are "View" and "Edit" tabs, with "Account" and maybe "Profile" sub-tabs under "Edit"; yet this menu hierarchy doesn't exist in any real sense. This complexity, and this slight disconnect between all the various bits of menu.inc and menu.module, means that menu often gets exposed to other modules in a counter-intuitive way.

Say you've got a listing of content (using Views, naturally). You want this listing to sit at /posts . But you also want an advanced search to sit at /posts/advanced , and you need tabs across the top of the page. Should be easy, right? Create a view, put it on a menu path, in a menu, then create a second view and, oh, I don't know: put it as a menu tab? As a default menu tab? As a normal menu item under the menu item you just built? Or maybe build a page, and put that on the menu, then create two views as child menu tabs, or maybe one as the default menu tab, or....

After lots of fiddling, I realized you could take advantage of Views' ability to clone displays within a single view, to solve this fairly straightforwardly.

  1. Start off by building your "Default" display---the initial display a view gives you---so that it matches the field and filter criteria you want to sit at /posts . 
  2. Then create a "Page" display to handle the standard search. Set it to be a normal menu item, at path /posts . Check it's on the menu hierarchy.
  3. Create a second "Page" display with exactly the same configuration. Under "Page settings", give it the path '/posts/default' and make it a "Default menu tab" with the parent as "Already exists."
  4. You now have a tab to handle your default, non-advanced search, and a menu hierarchy entry to tie it to. Further tabs should now be straightforward.
  5. Your advanced search can now be created as a third "Page" display. Similarly to #3 above, give it a path '/posts/advanced', but make it an ordinary "Menu tab" rather than a second default tab.

The trick here is to remember that two tabs need four view displays: the view's core display; a display that never gets seen but sits on the menu hierarchy; and two displays for the standard and advanced tabs. There are other ways to do this with fewer view displays, but only by having e.g. a module handle the main menu entry or one of the tabs. Ultimately two tabs need three menu handlers, and three menu handlers need a four-display view.
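The bookkeeping can be sketched as plain data. This is illustrative Python rather than Views or Drupal code: the Display class, the display names and the menu labels are all made up for the example, and it assumes the advanced display is an ordinary tab sitting beside a single default tab.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Display:
    name: str
    kind: str             # "default" or "page"
    path: Optional[str]   # None when the display has no path
    menu: Optional[str]   # "normal", "default tab", "tab", or None


# Hypothetical description of the four displays in the single view.
DISPLAYS = [
    Display("Default", "default", None, None),
    Display("Listing", "page", "posts", "normal"),
    Display("Standard tab", "page", "posts/default", "default tab"),
    Display("Advanced tab", "page", "posts/advanced", "tab"),
]


def menu_entries(displays):
    """Return (path, menu type) for every display that registers a menu entry."""
    return [(d.path, d.menu) for d in displays if d.menu is not None]


for path, menu in menu_entries(DISPLAYS):
    print(f"/{path}: {menu}")
```

Four displays yield three menu entries: the core display contributes none, and each of the other three owns exactly one path.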

 

Postcode lookup must not suck

Because people won't put up with much if there's no benefit for them anyway. 

Recently my wife and I were trying to work out why she couldn't submit her address details to a website, even though I could. As we watched her behaviour in filling out the form, we encountered error after error: or rather, exceptional circumstance after exceptional circumstance. And it was clear that very few of the circumstances had been considered, that error handling was the absolute bare minimum, that the form was set up to be almost a trial to use. The postcode lookup part of the process was probably the source of the most unhandled exceptions: difficult if not impossible for the power user to flow through; unwieldy for the standard user; of almost no benefit at all to the web newbie.

People still think of their workflows on the web like they're workflows. Over here there's the start; over there is the goal; somewhere in between there might be some intermediate stages, but ultimately you go from over here to more or less over there eventually. It comes as something of a shock to most people that their beautiful webform does not encompass a workflow: the web has holes all over it; the user is a ball bearing and your application is a pinboard:

Antique pinball machine

Above we see a slightly clumsy metaphor for your web application. The end point of your own particular "workflow" isn't even something visually obvious like the shark's head. It's, oh, let's say that small metallic gate in the bottom right with the red doors (luckily for you) open. The best you can hope for is that the user caroms through your website hitting as few pins as they have to and ends up in one of a trillion end points, that they don't close the browser, or reach a form error, or silently lose their submission, or navigate elsewhere in irritation.

In fact, it dawned on my wife first, and then gradually on me, that postcode lookups are not intended to directly benefit the user filling them in. Instead, they're meant to force the user---remember that phrase---to provide a canonical address, and not the address. That is, the user comes to your site with an opinion about where they live and limited good will about your "product", and the postcode lookup is a mechanism for forcing them to discard the former, while the application as a whole is trying hard to get them to keep hold of the latter.

Good luck with that.

Richard Rutter figured out the dirty secret behind postcode lookups---that they're not for the user---long before my wife and I did. In order to mitigate this natural tension between forcing the user and keeping them happy, he's done a sizeable chunk of work to condense the postcode lookup pattern here. Along with a quite lively and informed conversation in the comments, this post nails much of the core of the pattern that lookup needs. Much of what's frankly miserable about using a postcode lookup is indeed tackled there, but there's an important omission that I think needs dealing with. Roughly speaking, that is: never force your users to pretend to be someone they're not.

As an example, consider Spotify. Since inheriting a slightly ropey eMac, I've been able to listen to Spotify, and I like it. I think Spotify is a gamechanger in the field of streaming music. I've heard albums on Spotify that I would never have bought. And yet: I would never consider purchasing Spotify Premium. The obvious barrier for me is that I use Linux, and there are no native Linux binaries for the Spotify desktop client.

People keep telling me that Spotify runs really well on the Windows emulator WINE. I'm sure it does. But that misses a more fundamental point: if something wants me to enter into a relationship with it, commercial or otherwise, I should not have to pretend to be a demographic I'm not in order that the relationship can be properly fulfilling. I'm not a Windows user, and it's an affront to a paying customer to expect them to make out that they're a type of user that they're not if they want to buy your stuff. More concisely: I don't take offence at the interface requirements; I take offence at what they imply about the respect for my needs as a user.

With that in mind, consider the first decision block in Richard's workflow:

does the user know their postcode?

As this is the first pin the user's pinball hits, then this is the one that alters their final resting place the most. This is the critical pin. And what does it tell me, a user who knows his postcode but also knows his address, and doesn't want to bother with lookup? It tells me that I have to pretend to not know my postcode if I want to be in a situation where I don't have to put it in. I have to play games with the application, and mask my true intentions. No, the postcode lookup system must allow for users who simply do not want to fill out their postcode. Postcode lookups should therefore begin with a simple choice of two buttons by a traditional form label. The buttons should not make any assumption about the user's reasons for choosing either route:

Address:  [LOOK UP POSTCODE]
               [JUST LET ME TYPE MY ADDRESS]

When you press the first button, both buttons should still remain on the page: the user might decide they wanted to press the second one after all. In fact, as we're probably using AJAX here, the minimum necessary modification to the form is just to add a postcode box (and to move and maybe change that button) like this:

Address:  [ postcode ]   [LOOK IT UP]
               [JUST LET ME TYPE MY ADDRESS]

You should still present the user with the ability to change their address. The "speed bump" of having to press the button is what works in your favour, is what gets you access to canonical data; anything more than a bump will run the risk of walloping the user right off your pinboard.

If at any stage the user clicks on the second button, the webform should then change to this view:

Address:  [ Address, either as a set of textfields or as a textarea ]
               [ postcode ]   [LOOK IT UP INSTEAD]

For simplicity, I'm almost tempted to drop the lookup button altogether at this point: the user has made their choice, after all. But you should never make it difficult for users to go back in a workflow, especially when (almost certainly) the browser back button will have been disabled by these shenanigans.
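The three views above can be sketched as a tiny state machine. This is only an illustration: the state names, the VIEWS and TRANSITIONS tables and the press function are all hypothetical, and the actual lookup request behind "LOOK IT UP" is left out. The point it tries to make is that every view keeps a route to the other way of entering an address.

```javascript
// The three form views from the mock-ups, keyed by a hypothetical state name.
const VIEWS = {
  choice: { fields: [], buttons: ["LOOK UP POSTCODE", "JUST LET ME TYPE MY ADDRESS"] },
  lookup: { fields: ["postcode"], buttons: ["LOOK IT UP", "JUST LET ME TYPE MY ADDRESS"] },
  manual: { fields: ["address", "postcode"], buttons: ["LOOK IT UP INSTEAD"] }
};

// Which view each view-switching button leads to.
// "LOOK IT UP" runs the lookup itself, which is outside this sketch.
const TRANSITIONS = {
  choice: { "LOOK UP POSTCODE": "lookup", "JUST LET ME TYPE MY ADDRESS": "manual" },
  lookup: { "JUST LET ME TYPE MY ADDRESS": "manual" },
  manual: { "LOOK IT UP INSTEAD": "lookup" }
};

// Return the next view; buttons that do not switch views leave the state alone.
function press(state, button) {
  const next = TRANSITIONS[state][button];
  return next === undefined ? state : next;
}

let state = "choice";
state = press(state, "LOOK UP POSTCODE");            // now "lookup"
state = press(state, "JUST LET ME TYPE MY ADDRESS"); // now "manual"
state = press(state, "LOOK IT UP INSTEAD");          // back to "lookup"
console.log(state, VIEWS[state].fields);
```

Neither the "lookup" view nor the "manual" view is a dead end, which is the property the prose above is arguing for.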

In comment 10 on his blogpost, Richard agrees with the idea of one big textarea for the address: possibly even including the postcode in that textarea! Again, the simplicity is appealing. You could do all sorts with this to make the user's experience easier: regex matching behind the scenes would retrieve the postcode; the address could even be automatically split into lines rather than setting real estate aside for the user to split them up themselves. It sounds great, but it's not an established pattern, and I think a lot of users---especially power users---would mistrust it. Better to go with Address 1, Address 2 etc. even though from a data perspective they're a horror, slightly improved---but, again, made more complex for the user---by labelling the last address line as "town". But this last detail is up to you. Do some A/B testing. See how it goes.
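As a rough sketch of what that behind-the-scenes matching might look like: the POSTCODE pattern below is a simplified guess at the general shape of a UK postcode, not a validator, and parseAddress is a hypothetical helper name.

```javascript
// Simplified UK postcode shape: outward code (e.g. "SW1A"), optional space,
// inward code (e.g. "2AA"). An assumption for illustration only: it does not
// check that the postcode exists, and it ignores special cases.
const POSTCODE = /\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b/i;

// Pull the first postcode-shaped string out of a free-text address and
// split whatever is left into lines.
function parseAddress(text) {
  const match = POSTCODE.exec(text);
  const postcode = match ? (match[1] + " " + match[2]).toUpperCase() : null;
  const rest = match ? text.replace(match[0], "") : text;
  // Split on commas or line breaks, trim each piece and drop empty ones.
  const lines = rest
    .split(/[,\n]+/)
    .map(function (piece) { return piece.trim(); })
    .filter(function (piece) { return piece.length > 0; });
  return { lines: lines, postcode: postcode };
}

const parsed = parseAddress("10 Downing Street, London\nsw1a2aa");
console.log(parsed.postcode); // SW1A 2AA
console.log(parsed.lines);    // "10 Downing Street" and "London"
```

Even this toy version shows why users might mistrust the approach: a house name or flat number that happens to look like a postcode would be silently removed from the address.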

Richard's workflow, with the addition of the basic prototypes above, permits us to move towards as usable a system as postcode lookup will ever be. Usability means the least number of pins on your pinboard, and exactly the right pins, the ones that nudge and tilt the user just enough. And so we end up with a system that still satisfies your original remit---to nudge the user towards using your shiny, expensive, time-consuming, postcode lookup service, with all its concomitant costs in development and maintenance---while catering for the users who simply do not want to, who will never want to, and who will actively object to your site giving them short shrift if they try not to use it.

I'll make a prediction here, that the users who try not to submit a postcode will tend to be the users you want: the digital natives, the users in flow, the people who will buy ten things from a polished user interface without even stopping to think about it. When they reach that very first pin on the board, they briefly want to be your application's friends on the web. Your application should consider itself honoured. The least it could do in return is be polite.

Implementing a columnar grid system on Graceful Exits

Hierarchical grid comes next. And some proper design chops, maybe.

In the spirit of getting it out there, I've now begun implementing a columnar grid system on my site. As discussed in Mark Boulton's excellent talk at DrupalCon Paris 2009, and reported briefly in my live notes from the conference, such grid systems are a basic underpinning of consistency and visual clarity on a site. You start at the grid, then decide how your actual page elements are going to fit into it.

This means you can begin with a columnar grid of five evenly-sized columns (such as de Standaard's website, one of Mark's examples) but then build e.g. a two- or three-column layout on top of it (or, as de Standaard seem to have done, a sort of 2.5-column layout). Your columns need not be evenly spaced, and you don't even need to use columns at all: modular and hierarchical grids can be used to great effect, although they suit print somewhat more than the web. It's hard to overstress the importance of helping the user to make these implicit connections: the grid gives your site visual predictability, and means your users are more likely to find what they want; it's implicitly tied to both visual aesthetics and also basic site user experience and usability.

I decided to have a seven-column grid for this site, as that gave me a lot of flexibility in page layouts: a three-column page could be 2-3-2 or 1-5-1. Consistency of field combinations is important, but as with the Guardian's website, as long as your underlying columnar grid lets users make that implicit connection without realising it, you can still push the boundaries of inconsistency on top of that. I decided on a field width of 124px, with a 15px margin between fields. Seven columns and six margins make 958px, which is comfortably close to the equally comfortable yet ultimately arbitrary 960px limit I'm personally used to.
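The arithmetic is easy to check. Here is a minimal sketch using the 124px field width and 15px margin above; the function names are made up for illustration and don't come from any grid framework.

```javascript
// Grid dimensions taken from the text: seven 124px fields, 15px margins.
const FIELD_WIDTH = 124;
const MARGIN = 15;
const COLUMNS = 7;

// Width in pixels of a block spanning `span` adjacent fields,
// including the margins that fall between those fields.
function spanWidth(span) {
  return span * FIELD_WIDTH + (span - 1) * MARGIN;
}

// Widths of each block in a layout such as [2, 3, 2] or [1, 5, 1].
function layoutWidths(spans) {
  const total = spans.reduce(function (sum, span) { return sum + span; }, 0);
  if (total !== COLUMNS) {
    throw new Error("Layout must use all " + COLUMNS + " fields");
  }
  return spans.map(spanWidth);
}

console.log(spanWidth(COLUMNS));      // 958
console.log(layoutWidths([2, 3, 2])); // 263, 402, 263
console.log(layoutWidths([1, 5, 1])); // 124, 680, 124
```

In both layouts the three blocks plus the two 15px margins between them add back up to 958px.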

So far, as I've not exposed much of the site, you won't see the effect of a lot of this, although the blog index should now be broadly following a 2-3-2 layout. When I've got the full content for the finalized IA in place then you should see more of the effect of being able to vary page layout while sticking to the seven-column grid. It looks scrappy now, and will probably look scrappy then---I'm not exactly a perfectionist when it comes to design---but I'm far prouder of it than any other site design I've done from scratch.

Tonight we're gonna parse like it's 1997

Opinions are like closing angle brackets: everyone's got one, but some stick out more than others, depending on your kerning

Via Sean McGrath comes a reasonably lucid and comprehensible redux of the argument about whether or not the XML standard should stipulate (or should have stipulated) draconian error handling. I hope I'm not misrepresenting Avery when I boil a lot of it down to his three broad "real-world" examples:

  1. Not well-formed XML, produced by a legacy application that takes ages to fix, is rejected by draconian parser
  2. Not well-formed XML is accepted by a permissive parser
  3. Well-formed XML is accepted by draconian parser

and I hope he's also happy for me to then state that his argument consists broadly of the suggestion that 1 and 2 are together more likely than 3, hence permissive parsers obtain for you the lion's share of the "real world" parsing instances; or, if you prefer, via a slightly more complicated profit-and-loss argument, that making your parser permissive, and sanctioning permissive parsers, contributes a lower overall cost through lumbering us with poor legacy applications, divided up among all the parsing events, than having to fix those legacy applications.
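To make the three cases concrete, here is a toy sketch of the two attitudes. It is emphatically not a real XML parser: it only looks at tag nesting, and the function names (tags, parseDraconian, parsePermissive) are invented for this example.

```javascript
// Toy tokenizer: recognises only <name ...>, </name> and <name ... /> tags.
// Text content, comments, CDATA and so on are ignored.
function tags(xml) {
  const found = [];
  const pattern = /<(\/?)([A-Za-z][\w.-]*)[^<>]*?(\/?)>/g;
  let m;
  while ((m = pattern.exec(xml)) !== null) {
    found.push({ name: m[2], close: m[1] === "/", selfClosing: m[3] === "/" });
  }
  return found;
}

// Draconian: give up at the first nesting error.
function parseDraconian(xml) {
  const open = [];
  for (const tag of tags(xml)) {
    if (tag.selfClosing) continue;
    if (!tag.close) { open.push(tag.name); continue; }
    if (open.pop() !== tag.name) {
      throw new Error("Not well-formed: unexpected </" + tag.name + ">");
    }
  }
  if (open.length > 0) {
    throw new Error("Not well-formed: unclosed <" + open.pop() + ">");
  }
  return true;
}

// Permissive: guess what the author meant, carry on, and list the repairs.
function parsePermissive(xml) {
  const open = [];
  const repairs = [];
  for (const tag of tags(xml)) {
    if (tag.selfClosing) continue;
    if (!tag.close) { open.push(tag.name); continue; }
    const index = open.lastIndexOf(tag.name);
    if (index === -1) { repairs.push("ignored stray </" + tag.name + ">"); continue; }
    while (open.length > index + 1) repairs.push("auto-closed <" + open.pop() + ">");
    open.pop();
  }
  while (open.length > 0) repairs.push("auto-closed <" + open.pop() + ">");
  return repairs;
}

const legacy = "<item><title>Hello</item>";            // cases 1 and 2
const clean = "<item><title>Hello</title></item>";     // case 3
console.log(parsePermissive(legacy)); // one repair: auto-closed <title>
console.log(parseDraconian(clean));   // true
try { parseDraconian(legacy); } catch (e) { console.log(e.message); }
```

Note how much more guesswork lives in the permissive version even at this scale: every repair rule is a decision that another parser might make differently.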

However, an application of Postel's Law to the process of implementation should not be confused with being able to apply it to the original specification. And besides, do those examples really portray the real world, nearly twelve years after the argument first took place? How much XML is out there, and how much of it is bad XML, and how much of it remains bad XML for long enough for it to cause a problem? I don't think it's clear that draconian error handling in the wild has held back RSS syndication, Google Maps, web services, or RDF so much as that, beyond a certain tipping-point (say, 2002?) they've ensured the rapid takeup thereof (with the possible exception of RDF until recently, for its own reasons).

XML is unbelievably popular today, so popular and routine in its use that you almost don't know it's there in most applications, and I think---purely from my own experience---it's plausible to suggest that that's at least partly because consumption of XML is easy; in turn, this is because basic production quality is enforced.

HTML (SGML) is a format (specification) that, because of its messiness (its complex rules), its parsing permissiveness (its potential for misunderstandings), and a whole host of cultural reasons (ditto), was terribly hard to write reliable consumption software for. Even now, there's around half a dozen good browsers, and in part that's surely because of the entry barrier to writing browsers: permissive parsing of real-world mistakes remains a complex task.

I also have dim and partial knowledge of SGML in the old-skool publishing industry, where a licence for fully-featured SGML software could set you back tens of thousands of pounds six or seven years ago, and that price didn't seem to be heading down under market pressure. In comparison, XML parsing is cheap, easy, and ubiquitous. There are free and open-source CMS and blogging packages that can do it; I have access to dozens of command-line tools that can do it; publication, syndication and webservice consumption are things that happen, almost as though nature intended it that way. A lot of that must surely be down to XML's combination of rule simplicity and parser rigour. As Dave Winer says on the subject of Postel's Law and XML:

I yearn for just one market with low barriers to entry, so that products are differentiated by features, performance and price; not compatibility. Compatibility should be expressed in terms of formats, not products.... Anyway, the other half of Postel's Law is just as interesting, but so far no one is commenting. Think about it, if everyone followed the second half, the first half would be a no-op. You could be fully liberal in an afternoon or less.

Mark Pilgrim's history of draconianism versus tolerance seems to consist of a lot of tolerantists pontificating about what they've decided the "draconian" argument is: I can't believe that Tim Bray, even if he really were a lone voice, would have been such a reluctant paper tiger. But like the 1997 tolerantists, I've thus far waded in with my own interpretation of events. And despite dealing with XML on a daily basis, I find that during so many of the tasks I have to accomplish the XML layer is able to fade almost completely into the background.

Of all the problems I encounter at work, well-formedness of XML comes up very rarely compared to those concerning the quality and stability of my own algorithms, application control flow, scaling and coping with heavy load, and logging and bailing out. Whether XML's ease of use in 2009 is a result of the small rule set in XML making well-formedness easy, or the initial decision in favour of draconian parsing, all decided back in 1997, we'll probably never be able to tell. All that's certain is that there'll always be opinions about it, and somewhere in the rambling above is mine.

For all your idiomatic English needs

If your copy, rewritten and redrafted with a broad audience in mind, no longer engages: scrap it, and write as if to a respected, friendly colleague.

Some six months ago I received a newsletter from a respected company, active in open source, and providing graded services including a reliable free one. However, the first paragraph of that newsletter (ostensibly written by the CEO) said:

We hope you are enjoying the — service for all your — needs. We are passionate about our customer promise: to provide the best online — solution in the market with a focus on ease of use, personalization, security, and privacy. To keep you updated, this monthly newsletter highlights places to use —, new features, the latest market news, and other solutions that you might find interesting. Thanks again for using the — service and please let us know how we are doing.

Today I received an email from Matt Mullenweg. Here's how it began:

If you were living under a rock you might have missed our 2.7 release, which included the most significant interface update in WordPress' short history and has been pretty well-received.

It's also been pretty bug-free, which is why there was a longer-than-normal period of time before an update.

We won't fault you for the rock thing, but for rockers and curmudgeons-who-never-upgrade-to-a-.0-release...

It's probably unfair to single out the first newsletter---after all, lots of companies end up with flat, generic and slightly spammy copy in their newsletters---but my reaction to Mullenweg's email reminded me really vividly of my reaction to the email from the other company. It takes guts to write in Mullenweg's style when you know you're talking to a large and varied audience, but I think it's paid off (to the extent that I'll forgive him the minor typo that just slipped off the end of that quote!)

There's no hard and fast rule for this, except that if you're not interested in your copy then nobody else is going to be. Another tip is to get Matt Mullenweg to write it for you. He's probably got lots of free time now that 2.7.1 is out.

When whitespace isn't whitespace, but it is white [:space:]

It might be whitespace, but it's not being entirely candid with you.

After much wrestling with hexdumps, Matthew highlighted an issue for us today of the stealthy ninja linebreak. Here it is. Are you ready? Right: "
"

Did you spot it? Unlike all the other linebreaks in this Wordpress post, it hasn't been converted to a <br/> or <p/> tag, so Wordpress didn't spot it either. Not entirely fair of me to expect it to, though, as strictly speaking it's the line separator, \u2028. It has a nonidentical twin brother, the paragraph separator, \u2029. A shady pair of characters, these two: intended for printing use rather than computer use, like many of the other (horizontal) spacings in the General Punctuation code chart.

The reason that Matthew even noticed something was going funny in the first place was that Coldfusion's JSStringFormat doesn't escape it, but some output streams entirely filter it out. Javascript would see whitespace and sometimes die with formatting problems, but text dumps of the database revealed nothing at the command line. It was a sort of Mandelbreak, appearing and disappearing as if at random until he revealed it by dumping the actual bytes.

Ultimately, though, if you're filtering out high-end codepoints, why should you care? Well, imagine someone typed this into your site:

ja
vascr
ipt:alert("foo!");

If that looks like a blatant Javascript link to you (it does to me, on Firefox 3), then check the source. So: on the round trip from user input, through filters, into the database, back out again and then to whatever rendering system you're using: can you be certain that those line separators wouldn't just magically disappear, leaving you with some cross-site scripting?
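A minimal sketch of that round trip, assuming a hypothetical filter that checks for javascript: URLs and a hypothetical later stage that silently drops the separators (neither function comes from any real framework):

```javascript
// The obfuscated input: "javascript:" broken up by U+2028 LINE SEPARATOR.
const input = "ja\u2028vascr\u2028ipt:alert(\"foo!\");";

// Hypothetical input filter that rejects javascript: URLs.
function looksLikeScriptUrl(text) {
  return /^\s*javascript:/i.test(text);
}

// Hypothetical later stage (an output stream, say) that drops the separators.
function dropSeparators(text) {
  return text.replace(/[\u2028\u2029]/g, "");
}

console.log(looksLikeScriptUrl(input));                 // false: the filter lets it through
console.log(looksLikeScriptUrl(dropSeparators(input))); // true: the link has reassembled itself
```

The order of operations is what matters here: a check that runs before the characters disappear is checking a different string from the one that eventually gets rendered.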

Normalizing whitespace---reducing all whitespace clusters down to a single space---is a useful way of at least taking the sting out of such odd characters. Matthew initially mentioned as a problem that:

Firefox's Javascript parser (and possibly others) treats it as an end-of-line if it encounters one

But this can be put to good use: assuming it's consistent in its parsing at all levels, Javascript should have an innate understanding of whitespace beyond the ASCII character set. So it should spot the stealth spaces and be able to normalize them. Putting '\s' in the input field on this tester script does indeed return a list that includes both \u2028 and \u2029.
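If you don't have a tester script to hand, something like the following sketch does a similar job. The function name is made up, and the exact list it returns depends on the Unicode version your Javascript engine implements, so no full output is given here.

```javascript
// List every code point in the Basic Multilingual Plane that the
// engine's \s matches, formatted as \uXXXX escapes.
function whitespaceCodePoints() {
  const matches = [];
  for (let code = 0; code <= 0xffff; code++) {
    if (/\s/.test(String.fromCharCode(code))) {
      matches.push("\\u" + code.toString(16).padStart(4, "0"));
    }
  }
  return matches;
}

const list = whitespaceCodePoints();
console.log(list.join(" "));
console.log(list.includes("\\u2028"), list.includes("\\u2029")); // true true
```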

So if you're accepting user input then your first set of gatekeepers could be this code, running when a form is submitted:

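// Collapse every run of whitespace (including \u2028 and \u2029) to one space.
// The "g" flag matters: without it only the first run in each field is replaced.
jQuery(":input").each(function () {
  var field = jQuery(this);
  field.val(field.val().replace(/\s+/g, " "));
});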

You should never depend solely on browser-side sanitization, of course, as whatever you get the browser to do, a cross-site scripter can fake having done.

An equivalent server-side solution would be to use a regex parser that supports, whether implicitly as '\s' or explicitly as '[:space:]', POSIX character classes. These are regular expression terms in square brackets, based on a special colon-delimited marker: "[:space:]" means "any one of all the ASCII whitespace characters, plus any character which has the Unicode \p{Z} character property" (character properties are Unicode's equivalent of character classes). Picking a handful of technologies, Coldfusion claims to support POSIX classes; Python will support them in version 2.7 (a quick test suggests that '\s' in the re module is limited to catching only ASCII whitespace); and Opera was patched in version 9.52 only a couple of months ago to prevent XSS attacks utilizing these very characters.
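For comparison, here is a minimal sketch of the same normalization in Javascript itself, assuming a server-side runtime such as Node.js that supports Unicode property escapes. The function name normalizeWhitespace is hypothetical, and the \p{Z} in the pattern is belt-and-braces on top of what '\s' already covers.

```javascript
// Collapse any run of whitespace or Unicode separator characters
// (general category Z: Zs, Zl, Zp) into a single ASCII space.
function normalizeWhitespace(text) {
  return text.replace(/[\s\p{Z}]+/gu, " ").trim();
}

console.log(normalizeWhitespace("ja\u2028vascr\u2029ipt:  alert")); // ja vascr ipt: alert
console.log(normalizeWhitespace(" a\u00a0\tb \n"));                 // a b
```

Normalizing to a visible space, rather than deleting the characters, means the broken-up "javascript:" stays broken up instead of quietly reassembling.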

Until there's a straightforward way, in a large and distributedly developed project like blogging software or a framework, to catch all the looky-likey high-codepoint whitespace, then only whitelisting---of characters and of markup---will really guarantee that the content you let into the system will resemble---both in security terms and byte by byte---the content you finally release. So how safe is your site, and how robust your offline content workflows?
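As a sketch of what character whitelisting could look like: the whitelist below (printable ASCII plus tab and newline) is a deliberately narrow assumption for illustration, and a real site would need a far wider set to cope with accented letters and other scripts. The names ALLOWED and disallowed are hypothetical.

```javascript
// Hypothetical whitelist: printable ASCII plus ordinary tabs and newlines.
const ALLOWED = /^[\x20-\x7E\n\t]$/;

// Return the distinct code points in `text` that are not on the whitelist,
// formatted as U+XXXX, so they can be logged or shown to the user.
function disallowed(text) {
  const seen = new Set();
  for (const ch of text) {
    if (!ALLOWED.test(ch)) {
      seen.add("U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
    }
  }
  return Array.from(seen);
}

console.log(disallowed("ja\u2028vascr\u2029ipt")); // U+2028 and U+2029
console.log(disallowed("plain text\n"));           // empty list
```

Reporting the offending code points, rather than silently stripping them, keeps the stored content byte-for-byte identical to what was accepted.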
