What constitutes the perfect data migration?

Data migration can be defined loosely as the process of transferring structured information from one format, supported by one data store, to a possibly different format, supported by a typically different data store.

But what does it mean to migrate data "perfectly"? Data is, after all, not just text. Preserving encoding is a sine qua non, but what about the structure? When we migrate, we endeavour to preserve as much of the information encapsulated in both form and content as we possibly can.

But what does it mean to preserve information? When we manipulate information, we can do so in such a way that the original data can be completely reconstructed from the result, via some (usually hypothetical) inverse of the process we just applied. Such a manipulation is lossless. Lossy manipulations might involve discarding fields from the data, or squashing two fields together in such a way that they can no longer be prised apart again, even in theory. Such manipulations do not preserve information: once the migration is done, we cannot undo its decisions at a later date to retrieve the granularity we have lost. A perfect data migration, therefore, might be one where the information is preserved not only in theory but in practice too.
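To make the distinction concrete, here is a minimal sketch in Python. The record layout and the function names (rename_fields, restore_fields, squash_names) are invented purely for illustration. Renaming fields has an obvious inverse, so it is lossless; squashing two name fields into one maps different inputs to the same output, so no inverse can exist.

```python
# Illustrative only: the field names and helpers below are hypothetical.

def rename_fields(record):
    """Lossless: every value survives under a new key."""
    return {"given": record["first_name"], "family": record["last_name"]}


def restore_fields(record):
    """Inverse of rename_fields: recovers the original record exactly."""
    return {"first_name": record["given"], "last_name": record["family"]}


def squash_names(record):
    """Lossy: two fields are merged into one, separated by a space."""
    return {"full_name": record["first_name"] + " " + record["last_name"]}


a = {"first_name": "Mary Ann", "last_name": "Smith"}
b = {"first_name": "Mary", "last_name": "Ann Smith"}

# The rename round-trips, so nothing has been lost.
assert restore_fields(rename_fields(a)) == a

# Two different records collapse to the same output, so no function
# could tell us which of them we started with.
assert a != b
assert squash_names(a) == squash_names(b)
```

The last assertion is the whole argument in miniature: once two distinct inputs share an output, the boundary between the fields is gone for good.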

But where might the difference between theory and practice arise? The simpler the practical implementation of any manipulation, the less likely it is that its theoretical perfection will be spoiled by practical issues: a network glitch; an unforeseen encoding problem; a loose constraint not satisfied; an indexing protocol no longer supported. It follows that the only truly perfect migration operation is the identity operation: field by field; record by record; structure by structure; format by format; relationship by relationship; character by character. We might say that the most perfect migration is a straight, byte-by-byte copy of the original database's storage mechanism to a new instance: data untouched; schema unchanged.
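As a rough sketch of such an identity migration, assume (as a simplification) that the whole data store lives in a single file on disk. The helper names sha256_of and copy_and_verify are hypothetical; the copy uses Python's standard shutil.copyfile, and the comparison of SHA-256 digests is simply one way of checking that the bytes at the destination match the bytes at the source.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, target: Path) -> bool:
    """Copy source to target byte for byte, then compare digests.

    Returns True only if the target's contents hash to the same value
    as the source's.
    """
    shutil.copyfile(source, target)
    return sha256_of(source) == sha256_of(target)
```

Calling copy_and_verify(Path("old.db"), Path("new.db")) on a hypothetical pair of paths would report whether the copy arrived intact. Even here the digest check is a practical safeguard rather than a proof.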

But is that a strict enough requirement? After all, copying can introduce errors; filesystems might store bytes differently; and who knows how the way the data is implicitly laid out on disk affects the speed of writing or retrieval? No, copying one byte at a time might not be perfect enough... Arguably, the one truly perfect data migration is no migration at all. Leave the data where it is, and let its structure remain identical to itself.

A frivolous point to make, perhaps. But it is a sobering thought when you set out to migrate data: it turns out that the Hippocratic oath for someone planning to do so is not merely "first, do no harm" but rather "first, do nothing at all".