postgresql

Deserving of a serious LAMPing

Reinventing primary keys, one horrifying integrity error at a time.

The LAMP-stack community frequently shows their disdain for foreign and primary keys in databases, and sometimes with reason borne of experience. MySQL historically has been little more than a nice language for comparing a set of unrelated spreadsheets, so referential integrity has had to happen at the application layer or not at all. As such, careful MySQL users put a lot of work into ensuring referential integrity, without the help of the database.

But the eyes of even the most hardened LAMPer would widen, if he were to dip into a conversation to find someone saying the following about something they’ve built over PostgreSQL:

Yeah, I could add the current date/time to the end, so as long as you don’t add more than one [entry] with the same name within 1 second, it’ll be fine.

It might take you a while to work out (a) what that’s referring to (b) what kind of error in thinking it demonstrates and (c) what it implies about the quality of the underlying code. By that time—if you’ve worked at all with that sort of programming—your head may well be in your hands.

SELECT needle FROM haystack;

Ever wondered just where in a complex database you stored that particular bit of information? Or ever had to reverse-engineer someone else’s database, about which you know nothing, with only the output from an application to guide you? In these cases it’s handy to be able to search every text cell, in every column, in every table.I found a solution to this problem on the web, specific to Microsoft SQL Server. This solution uses SQL Server’s inbuilt system tables to find all the table and column names and thus search the entire database. Although it doesn’t seem to run fast enough to be of use in a live environment (several seconds for a large database) it’s still a useful development tool.

Unable to find an equivalent for PostgreSQL, I decided to write the following. The function and datatype can be loaded into a current database by e.g. typing:

i ‘/path/to/file/f_search_all.sql’

at the psql command prompt when connected to the database you want to search. Then run:

SELECT * FROM f_search_all(’some search text’);

to retrieve any matching tables and columns, plus a sample of the text that matched.

Here is the code: http://www.jpstacey.info/blog/files/code/f_search_all.sql.

As with the SQL Server solution, this works by referencing PostgreSQL’s system tables, in this case pg_class and pg_catalog. Most of the code consists of loops over static SELECTs and so is quite speedy: the only slow point is the EXECUTEion of dynamic code where the table and column names (stored in varchars) are evaluated as part of a SELECT string. Interestingly, these same system tables power the psql backslash commands. If you type:

set ECHO_HIDDEN t

within psql, and then e.g. examine a table using

d table_name

then psql will show you the queries it has made to provide you with the well-formatted output (thanks to Ian Bicking for that tip).

Blog category:

Subscribe to RSS - postgresql