Very flat, Excel

Database design, as an intellectual exercise, can be tremendously satisfying. As one builds up relations and constraints, making your database just rigid enough to support the more intelligent model soon to be layered over it, it’s possible to feel a sense of future-problems-solved: this check will prevent the model from trying to call a service a finished product; this trigger will roll back any unpassworded changes to a given provider. At the end, a complex relational system, which is normalized just to the point where normalization yields diminishing returns, is a work of craft if not of art.

So it’s always unsettling when the client then asks for such heavily relational data to be exported to a spreadsheet.

There are many excellent reasons why applications sit on top of a database instead of a spreadsheet, but most of them are hard to explain to clients, many of whom tend to use spreadsheets only as highfalutin Word tables. I calculated that I could express the data relevant to a single provider in one spreadsheet (multiple worksheets); it would therefore take around 1000 files, which the client understandably declined to accept.

Clearly a compromise was necessary, and in this instance it came about by asking the client what they actually needed: in this case, there were two separate requirements pulling in opposite directions that, singly, were easy to solve: their technical advisor wanted a schema, or at any rate a UML diagram, so they could build a companion application along similar lines to ours; their project co-ordinator wanted a checklist of the provider data we had, to avoid duplication during import and see how to proceed with their own data audits.

Exit gracefully: After having finally teased out the two separate requirements, and dealt with one by a simple dump of the database schema, I identified the tables which could, just about, be flattened. I planned ahead by checking exactly how many were involved in each many-to-one relationship—no more than two, in the areas I agreed to flatten—and eventually was able to promise the equivalent of an address book for providers. For the task in hand, this was more than adequate.

When building a database for a client, and one which might see some reuse or multiple simultaneous use, always make sure you can justify each constraint or relation not just in terms you can understand but in ones that stress the benefit to the client’s well-tended data: there will be a benefit to the data, so this isn’t as hard as it sounds. But, as you build, look for where you might need to make quick simplifications in future. Identify the quasi-flat areas you might be able to isolate, because sooner or later the client is bound to ask for data in that shape.