Data protection opt-ins are not true/false and they're not even multiple true/false

The standard developer approach to data protection is to put a checkbox at the end of a form and let a content editor specify the text that's going to go there. People then submit the form, and the state of that checkbox—checked (true) or unchecked (false)—is stored alongside whatever else they submitted.

But how an organization implements its data protection policies, and what paper trail results from that procedure, can have legal ramifications. Is it sufficient to just store a checkbox's state? And if not, what might provide both a more honest record of the site visitor's wishes, and also a reasonable audit trail as a deliverable resulting from enforcing the data-protection policies?

I'm a developer and not a lawyer. Data protection, even within the UK alone, is a complex subject which groups like the CFG can help with, and it's a subject where best practice changes all the time (and is likely to change again in 2018).

But I think it's illuminating to go through the discovery process that I recently helped one of my clients to negotiate. This was done in order to convince one of their other suppliers of the complexity of the solution they required; meanwhile, the supplier was understandably keen to push for whatever solution was the simplest from their software's perspective. I set out the thought processes below, if only because if you ever find yourself in the same discussion, you might also be able to make use of some of the same responses I gave!

The single checkbox: a true/false

As mentioned above, this is the first solution anyone ever implements. It looks something like this:

  • From time to time we might want to get in touch with you over email. If you don't want us to contact you, please tick this box: ☐  — store true if checked, false if unchecked

The marketing department fights to make the procedure opt-out; the user advocates (if there are any) fight to make the procedure opt-in. And once decided, there it must stay.

From the perspective of this blogpost, the point is that this system is only really of any merit if your organization has one webform (perhaps a "contact us" form?) and you never, never change the text of the label to something that means something else.

Especially don't change it from opt-in to opt-out: from the perspective of someone wanting to honestly audit your policies, even the potential to do this could cast doubt on all of the data protection responses you've collected thus far. Where do the trues-mean-yes stop, and the falses-mean-yes begin?
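To make that auditing problem concrete, here is a minimal Python sketch. Everything in it is hypothetical (the Submission record, the LABEL_SWITCHED_ON date, the may_contact function); it only illustrates that a bare boolean cannot be interpreted without knowing which wording was live when it was collected.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Submission:
    submitted_on: date
    checkbox: bool  # the only thing stored about the visitor's wishes


# Hypothetical date on which the label changed from opt-out to opt-in wording.
# If nobody wrote this date down, the stored booleans cannot be interpreted.
LABEL_SWITCHED_ON = date(2017, 6, 1)


def may_contact(sub: Submission) -> bool:
    """Interpret a stored boolean using information that is not in the record."""
    if sub.submitted_on < LABEL_SWITCHED_ON:
        # Old label: "If you don't want us to contact you, please tick this box"
        return not sub.checkbox
    # New label: "If you want us to contact you, please tick this box"
    return sub.checkbox


before = Submission(date(2017, 5, 1), checkbox=True)
after = Submission(date(2017, 7, 1), checkbox=True)

# The same stored value means opposite things on either side of the change.
print(may_contact(before), may_contact(after))  # False True
```

The stored data is identical in both records; the meaning lives entirely in a date that sits outside them.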

This solution was clearly not flexible enough for the size of the client, and the number of incoming submission streams it had: we started looking at other options.

Two checkboxes: two true/falses

What if you have the potential to create many webforms? This might be via Drupal's Webform module, or some large third-party campaign platform: or it might be something a developer needs to build each time. Either way, you need to have a policy and a solution that work together.

The simplest first-step improvement on the previous is to have two checkboxes, with two different names: either of which you can drop on any given form:

  • Click this box to opt in: ☐  or
  • Click this box to opt out: ☐

This situation is certainly an improvement, and solves the immediate problem of perhaps needing opt-in for some situations, and opt-out for others.

But it still limits you to specific wording: if, for example, your wording signs people up to occasional emails, how can you then at a later date justify changing the label wording to sign people up to occasional SMS alerts? Even if you only wanted to do it for new subscribers, merely changing the text on the same data would require a lot more logic to work out which subscribers had clicked what, and when, and what that meant. And worse, if the wording was entirely changeable, what if a novice copy editor (or an oversight) led to the opt-in checkbox having opt-out text?

Once again, all of these eventualities would cast doubt on any data already harvested: a sign that the quality and fidelity of the data being stored might indeed be worth doubting.

Two checkboxes: storing the text

At this point the provider, who still wanted to be able to store simple data if possible, asked us: why not send you the text, if someone clicks on the button? That way, you always have a record of what they've selected. So for the two possible types of button—opt-in or opt-out—there are four possible situations that could be anticipated:

  • Click this box to opt in: ☐  — store nothing
  • Click this box to opt in: ☒  — store opt-in label text
  • Click this box to opt out: ☐  — store nothing 
  • Click this box to opt out: ☒  — store opt-out label text

At first glance this seems to solve a lot of our problems: it stores the sense of what the user intended, not just the state of a checkbox in an HTML form. But it quickly dawned on us that the situation was ambiguous: for a given submission from a site visitor, how does an automated system know if the absence of a text label meant they had opted in, or opted out? Worse, what if the person who edited the form simply forgot to add a checkbox, and our policy was for opt-out checkboxes? It would look like nobody had opted out; and someone was bound to complain.
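The ambiguity is easy to reproduce. The following Python sketch is illustrative only: store_text_only and the two label constants are hypothetical names, not the supplier's actual system. It shows three quite different situations collapsing into the same stored value.

```python
from typing import Optional

OPT_IN_LABEL = "Click this box to opt in"
OPT_OUT_LABEL = "Click this box to opt out"


def store_text_only(label: Optional[str], checked: bool) -> Optional[str]:
    """Store the label text only when the box is ticked.

    A label of None stands for a form whose editor forgot the checkbox.
    """
    if label is not None and checked:
        return label
    return None


records = {
    "opt-in box left unticked": store_text_only(OPT_IN_LABEL, False),
    "opt-out box left unticked": store_text_only(OPT_OUT_LABEL, False),
    "checkbox missing from form": store_text_only(None, False),
}

# All three situations store exactly the same thing: nothing.
print(set(records.values()))  # {None}
```

An automated system reading those records has no way to tell "did not opt in" from "did not opt out" from "was never asked".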

Storing text and true/false for every label

The eventual solution agreed on was to store the sense and the state: form submissions would result in storage of both the label (the sense of what the site visitor had seen) and the state of the checkbox (what they had decided to agree to, or not). This way, the paper trail was unequivocal: when someone had opted in, it was clear what they had opted into; and vice versa for opt-outs.
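As a rough sketch of what "sense and state" might look like as a stored record, assuming Python and hypothetical names throughout (ConsentRecord, record_from_form and the label constants are mine, not the client's schema):

```python
from dataclasses import dataclass
from typing import Optional

OPT_IN_LABEL = "Click this box to opt in"
OPT_OUT_LABEL = "Click this box to opt out"


@dataclass(frozen=True)
class ConsentRecord:
    label: str     # the sense: the exact wording the visitor saw
    checked: bool  # the state: whether they ticked the box


def record_from_form(label: Optional[str], checked: bool) -> Optional[ConsentRecord]:
    """Build a record for every checkbox shown, ticked or not.

    A label of None stands for a form with no data protection checkbox;
    that case stays visibly different from an unticked box.
    """
    if label is None:
        return None
    return ConsentRecord(label=label, checked=checked)


unticked_in = record_from_form(OPT_IN_LABEL, False)
unticked_out = record_from_form(OPT_OUT_LABEL, False)
no_checkbox = record_from_form(None, False)

# Unticked opt-in and unticked opt-out are now distinguishable,
# and a missing checkbox is distinguishable from both.
print(unticked_in != unticked_out, no_checkbox is None)  # True True
```

The cost is one extra stored field per checkbox; in return, each record can be read on its own, without reference to the form's edit history.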

An improvement at the point of storage was to match the sense to a set of previously approved label texts, in order to enforce a coherent data protection policy across the entire organization's forms: submissions could not reach the CRM unless the text matched an approved list. If someone set up a form with unapproved label text on the data protection checkbox(es), then someone would be alerted by the first incorrect submission, and the text could be changed to match one of the approved alternatives.
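A minimal sketch of that matching step, again in Python and again under assumptions: the approved wordings, the Sense enum, the whitespace normalisation and the UnapprovedLabelError exception (standing in for "someone would be alerted") are all hypothetical, and the mapping from sense and state to "may we contact this person?" is a policy decision that would need proper legal input.

```python
from enum import Enum


class Sense(Enum):
    OPT_IN = "opt-in"
    OPT_OUT = "opt-out"


# Hypothetical approved wordings, each mapped to what ticking the box means.
APPROVED_LABELS = {
    "Click this box to opt in": Sense.OPT_IN,
    "Click this box to opt out": Sense.OPT_OUT,
}


class UnapprovedLabelError(ValueError):
    """Raised so the submission is held back and somebody is alerted."""


def consent_given(label: str, checked: bool) -> bool:
    """Resolve sense plus state to a decision, rejecting unapproved wording."""
    key = " ".join(label.split())  # assumption: collapse stray whitespace only
    try:
        sense = APPROVED_LABELS[key]
    except KeyError:
        raise UnapprovedLabelError(f"Unapproved label text: {label!r}") from None
    # Opt-in: a tick means yes. Opt-out: a tick means no.
    return checked if sense is Sense.OPT_IN else not checked


print(consent_given("Click this box to opt in", True))   # True
print(consent_given("Click this box to opt out", True))  # False
```

Because the lookup fails loudly on the first submission with unknown wording, a mislabelled form is caught early, before doubtful records accumulate in the CRM.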

Finally, with this storage of complex data, both sense and state, there was a solution that all parties could agree on. My client was able to use the counterexamples to convince the supplier, not only that it was a reasonable solution for the client's purposes, but that it could also help the rest of the supplier's client roster tighten their processes for enforcing data protection policies.

Summary

Data protection policies change all the time (and benefit from proper legal advice, which again I stress is out of the scope of this blogpost!). When a checkbox's label can vary from user submission to user submission, whether quickly (every submission) or slowly (potentially every few months), then what must be stored is both the sense and the state: the wording of what the user was asked to agree to; and whether they did so.

This storage of two data elements for each submission helps provide a more honest audit trail for how users have responded to data protection questions, whatever the internal policy of the organization.

For a large and complex organization, with submissions including data protection information arriving from many different sources, the sense can be matched to a list of different possible statements that a submitter might be meant to see. This permits clearer and neater storage of what someone submitted, and helps the organization enforce policy decisions across all those different sources.