.php? .cgi? .who-cares?

Simon makes the case for disambiguated URLs. He’s right, largely. I would add one proviso, though: URLs need to be hackable by the developer as well as by the user. The internal spaghetti that translates URLs to files in Django and Ruby on Rails is an initial barrier to developing with those systems. It’s easy to lose track as you edit some file in the controller directory, called something random that sort of relates to the camel-cased name of the class you’re writing, which is then called by some random include in the views directory, but which appears on the front end as a URL like ‘/bill-clinton/favourites/sandwiches/fillings’. Once you know the trick it’s easy, but then you can say that about sawing a woman in half. Or a sandwich.

Given that the examples in the post and its comments imply it anyway, it’s probably not worth mentioning that omitting your default directory index should go hand in hand with including the final slash on such links, to prevent a double hit on the webserver when it auto-redirects example.com/foo to example.com/foo/. But I’ll mention it for completeness, because I have an eighty-column mind. (Is that still standard behaviour for a webserver, by the way?)
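For what it’s worth, here is a minimal sketch in Python of the sort of link tidying I mean. The function name and the "no dot in the last segment means it’s a directory" rule are my own assumptions for illustration, not anything from Simon’s post, and a real site might well need a smarter test.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Add a trailing slash to directory-style links.

    Assumption (hypothetical rule): a last path segment without a dot
    is a directory, so the server would otherwise redirect to add the slash.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Leave the path alone if it already ends in a slash or looks like a file.
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(with_trailing_slash("http://example.com/foo"))
print(with_trailing_slash("http://example.com/foo/index.php"))
```

Run over the links in your templates before they go out, this would mean nobody has to take the redirect in the first place.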

Omitting your default directory index is favourable from a utilitarian point of view anyway: it guarantees that you can, for example, switch to a static holding page when you’ve got problems with your dynamic site, or at least it reduces the number of redirects you’d have to set up if you moved between scripting languages (admittedly, switching to RESTlish would eliminate those altogether). In fact, I just took advantage of that very ambiguity, on the upcoming new Quiet little Lies site, to change a Location: redirect into something a bit fancier in PHP.

Canonicalization I’ve never liked, though. It seems to imply that all your data should have a hierarchical treelike structure, which is OK only if that’s the case. What if I want to be able to find the photos bit of Simon’s profile, then another day Simon’s subsection of the photos section? Having something retrievable from only one place is a sure way to reduce the number of people who can find it when they want it: this is the canonical, if you will, problem with real-life bookshops, where if you’re not a tourist you can often struggle to find an Ordnance Survey map, unless you try to think like a tourist. And I’m just not fat enough to think like the typical tourist visiting Oxford’s Blackwell’s.
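To make the photos example concrete, here is a rough Python sketch of a resolver that accepts either ordering of the two path segments and lands on the same thing. The USERS and SECTIONS sets and the resolve function are hypothetical names of my own; it is only meant to show that two routes to one resource needn’t be complicated.

```python
# Hypothetical data: the known users and the known site sections.
USERS = {"simon"}
SECTIONS = {"photos"}


def resolve(path):
    """Map /user/section/ or /section/user/ to the same (user, section) pair.

    Returns None when the path does not name exactly one user and one section.
    """
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) != 2:
        return None
    first, second = segments
    if first in USERS and second in SECTIONS:
        return (first, second)
    if first in SECTIONS and second in USERS:
        return (second, first)
    return None


print(resolve("/simon/photos/"))
print(resolve("/photos/simon/"))
```

Both calls give back the same pair, so whichever way round I happen to think of it that day, I end up in the same place.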

Comments

I like to do all the programming with ugly URLs, then add the pretty versions later. The framework I use, Fusebox, plays well with this method. I don't know about Django, but I think with RoR you *need* the pretty URL. Bummer.

Fusebox plays nicely, as long as you code the fusebox itself with transparency in mind. I've been administering a legacy fusebox-like application at work and it's been pretty hairy to say the least: everything shares the same variable scope, there are case-sensitivity problems under Linux, and the code is one big tangle. It's probably not the fault of the fusebox premise, but it doesn't help in this case.

Really, you can't beat a URL corresponding to an actual file location for developer transparency. If your mapping between fuseaction and URL is transparent, then at least you know where to look for it, and it's easier to go from ?fuseaction=publish to view/publish.x than, say, to go grepping for mod_rewrite regexps.
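As a rough sketch of what I mean by a transparent mapping, here it is in Python rather than in any real Fusebox code. The function name, the "view" directory default and the ".x" extension are placeholders of mine taken from the example above, and it ignores Fusebox's circuit.fuseaction naming entirely.

```python
import re
from urllib.parse import parse_qs, urlsplit


def fuseaction_to_path(url, view_dir="view", ext=".x"):
    """Turn ?fuseaction=NAME into VIEW_DIR/NAME.EXT (a hypothetical convention).

    Returns None if there is no fuseaction or it contains anything
    other than letters, digits and underscores.
    """
    query = parse_qs(urlsplit(url).query)
    values = query.get("fuseaction")
    if not values:
        return None
    action = values[0]
    # Refuse anything that could wander outside the view directory.
    if not re.fullmatch(r"[A-Za-z0-9_]+", action):
        return None
    return f"{view_dir}/{action}{ext}"


print(fuseaction_to_path("/index.cfm?fuseaction=publish"))
```

The point is that a developer can do the same translation in their head, without having to read the code at all.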

Fusebox has its problems, agreed. Particularly older implementations. I've seen some pretty crazy piles of code myself. The problems of scope can be dealt with if one is careful, or uses functions/classes. I see your point now about URL transparency. One URL = one file is definitely not the case in most Fusebox applications.