Universal re'locator

What happens when nobody will take responsibility for a standard that the web relies on?

RSS, the standard millions of us use to syndicate content and to read other people’s syndicated content, was originally invented by Ramanathan Guha at Netscape, for use on its my.netscape.com portal. Soon afterwards Netscape lost interest in the format, leaving it ownerless until it was picked up by a development community spearheaded by UserLand Software. RSS 0.91 became 1.0 and 2.0, yet despite its deprecation the granddaddy of them all, 0.91, is still around and in use, arguably because of the vast overcomplications in its immediate successor and the divisions those caused in the community.

The problem with that is as follows. Every time someone views an RSS 0.91 syndication feed with certain types of syndication software, their computer attempts to fetch the DTD from a location on the my.netscape.com portal: the URL is hardcoded into the feed’s document type declaration, and it’s how the software establishes which XML format it’s dealing with. So this URL gets plenty of hits:

http://my.netscape.com/publish/formats/rss-0.9.dtd
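For reference, a typical RSS 0.91 feed opens with a document type declaration along these lines (the public identifier here is quoted from memory, so treat the exact strings as illustrative):

<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.9.dtd">

A validating parser dutifully dereferences the system identifier, the URL above, every time it loads the feed.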

Which is great, until Netscape decide—legitimately, one might argue—to update the my.netscape portal and get rid of the DTD. Which they did, at the start of the year. At that point, a good portion of the syndication lights go out across the world. And although we now have a moratorium until July 2007, nothing has really been solved in the long run.

Anyway, Netscape shouldn’t have to support the bandwidth of millions of DTD downloads for a standard they declared defunct—when did they sign the don’t-be-evil contract?—and maybe people should “just” move to a newer version of RSS, or Atom. But this whole episode is an Ozymandian warning of what is to come. We’ve reached the point where the URLs of industry (one-time) giants are simply no longer to be trusted as the location of standards.

One day Microsoft, and Sun, and IBM, will cease to exist, and their websites will become the 22nd-century equivalent of Google-adsensed search engines (Google, of course, will be around forever, more’s the pity). Sooner or later something really horrible will happen to the open communities: say, PURL disappearing for good, taking things like the Dublin Core XML specification with it. We need to work out now how to deal with the loss of specifications and standards, with the unreliability of the URL as a locator for DTDs and schemata. Or is the only lesson we can draw from history that we’re destined to wander from standard to standard as the specifications drop off the radar, leading the nomadic life of those standardized today, obsolete tomorrow?

Comments

Isn't this what the Public identifier is for? I've not fiddled with RSS, but the equivalent for XHTML 1.0 is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

where even if the URL is unavailable the "-//W3C//DTD XHTML 1.0 Strict//EN" should be sufficient to identify the DTD that, frankly, every RSS reader should have cached locally anyway.

(Relatedly, ISTR this is why there wasn't supposed to be anything downloadable at the URL that defines a namespace.)
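To make the caching point concrete: here's one sketch of how a reader could map public identifiers onto local copies, using the resolver hook in Python's lxml (the identifier-to-file table and the cache/ paths are invented for the example, not anything a real reader ships with):

from lxml import etree

class CachedDTDResolver(etree.Resolver):
    # Hand the parser a local copy for identifiers we recognise,
    # so validation survives the canonical URL going dark.
    LOCAL_COPIES = {
        "-//W3C//DTD XHTML 1.0 Strict//EN": "cache/xhtml1-strict.dtd",
        "-//Netscape Communications//DTD RSS 0.91//EN": "cache/rss-0.91.dtd",
    }

    def resolve(self, system_url, public_id, context):
        local = self.LOCAL_COPIES.get(public_id)
        if local is not None:
            return self.resolve_filename(local, context)
        return None  # unknown identifier: let lxml's default resolution take over

parser = etree.XMLParser(load_dtd=True, no_network=True)
parser.resolvers.add(CachedDTDResolver())
doc = etree.parse("feed.xml", parser)

With no_network=True the parser never goes near my.netscape.com at all; anything not in the table simply fails to load, which at least fails fast instead of hammering someone else's server.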

I don't know what you mean by "for", really. I can't take that public identifier and use it to validate the content. You might as well say that you can treat the URL that follows it as a string that isn't really a URL, just an identifier, and junk the PUBLIC identifier as redundant. It doesn't solve the lookup problem.

Whether or not you cache the standard locally is a separate issue too, I think. If there are millions of copies of the DTD about the place then, as far as reliability is concerned, that's as bad as having no single authoritative copy. And if your cache has no expiry time after which it goes and rediscovers the authoritative standard, then you run the risk of being unable to refresh a somehow-polluted cache.
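To put numbers on that trade-off, a cache along these lines is roughly what I have in mind; the TTL, the helper name and the in-memory dict are all made up for illustration:

import time
import urllib.request

TTL_SECONDS = 30 * 24 * 60 * 60  # re-check the authoritative copy roughly monthly
_cache = {}  # url -> (fetched_at, dtd_bytes)

def get_dtd(url):
    entry = _cache.get(url)
    if entry is not None:
        fetched_at, body = entry
        if time.time() - fetched_at < TTL_SECONDS:
            return body  # still fresh: no network hit
    try:
        # Expired or missing: rediscover the authoritative standard.
        with urllib.request.urlopen(url) as response:
            body = response.read()
    except OSError:
        if entry is not None:
            return entry[1]  # authoritative copy gone: serve the stale one
        raise
    _cache[url] = (time.time(), body)
    return body

The expiry lets a polluted cache heal itself eventually, and the stale fallback covers the day the URL finally 404s; neither helps, of course, if the copy you cached in the first place was the polluted one.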

I'm glad to know there's some reason why you can't download anything at the URL that defines a schema namespace, though: it's a shame it doesn't really make up for the fact that any process whereby you might try to automate schema referencing, downloading and validating is fundamentally screwed by having bugger-all to download at the xmlns: URL, even if the relevant authoritative standards are still alive and well.