PII is Dead


Colin O'Malley

Sep 14, 2016 • 3 min read

For anyone involved in privacy in the late 90’s and early aughts, ‘PII,’ or ‘Personally Identifiable Information,’ had a very specific, bright-line meaning. PII referred to the data that needed to be protected: email, phone number, postal address, etc. Everything else was effectively harmless. Websites and marketers could go virtually unregulated for privacy practices if they simply resisted the temptation to touch PII. But a series of gaffes and marketing tech innovations have made it patently obvious that wide categories of data beyond PII have the potential to ‘identify’ an individual and to produce messaging so personal that it can shake the ‘private’ sense an individual has when browsing the internet. With this history in mind, we really should not have been surprised when the FTC began to declare (1, 2) that all manner of device IDs and associated data were also ‘PII.’ Or rather … Maybe we should have been surprised that they used the term at all, as it has largely outlived its usefulness.

PII seemed to reach ‘peak utility’ in the late 90’s, when DoubleClick and the FTC were sorting through the implications of the Abacus merger, and the parties effectively agreed to refrain from combining cookie-level data with PII. A peace was born. But cracks emerged quickly.

Search data stripped of PII was intentionally released, and researchers demonstrated that it could be used to identify individuals. Programmatic ad buying emerged, allowing the widespread application of 3rd party data on top of individual ad network browsing data. 1st parties emerged with their own cookie data, on top of which they could deploy their own PII. Data aggregators emerged that partnered with a network of 1st parties to collect PII and sell this PII on top of the programmatic infrastructure. Statistical ID technologies emerged that allowed the most basic attributes, like user fonts and browser version, to be leveraged in device identification. And there were significant market consolidations, with 1st party custodians of PII like Google and Yahoo merging with 3rd parties, and dominant new players like Facebook emerging with 1st to 3rd party capabilities.
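To make the statistical ID point concrete, here is a minimal sketch of how a handful of individually harmless attributes can be combined into a stable device identifier. It is an illustration only, not any vendor's actual method: the function name `device_fingerprint`, the attribute names, and the sample values are all hypothetical, and real systems draw on many more signals.

```python
import hashlib


def device_fingerprint(attributes: dict) -> str:
    """Combine non-PII browser attributes into one stable identifier.

    Hypothetical helper for illustration; no email, phone number,
    or postal address is involved at any point.
    """
    # Sort the keys so the same attributes always yield the same string,
    # regardless of the order in which they were collected.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    # Hash the combined string and keep a short prefix as the ID.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Two made-up visitors who differ only in their installed fonts.
visitor_a = {
    "browser_version": "Firefox 48.0",
    "fonts": "Arial,Calibri,Helvetica",
    "screen": "1920x1080",
}
visitor_b = {
    "browser_version": "Firefox 48.0",
    "fonts": "Arial,Garamond,Helvetica",
    "screen": "1920x1080",
}

id_a = device_fingerprint(visitor_a)
id_b = device_fingerprint(visitor_b)
print(id_a != id_b)  # the two visitors get different identifiers
```

None of the inputs would have counted as PII under the old bright-line definition, yet the resulting ID can follow a device from site to site just as a cookie would.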

Collectively, these steps served to make data: a) easily transportable across parties and contexts, b) easily used for very personal messaging, and c) regularly harnessed for creative purposes that were not contemplated by either technologists or policy makers when the original rules of the road were documented. Data across all categories have ended up playing an important role in identification and messaging, with no single data element or category playing a gatekeeper role. Clearly, one party’s agreement to refrain from PII use would not be sufficient to preserve a user’s privacy, especially when privacy can be impacted without PII and multiple 3rd parties might also be engaged with the user at the same location, with access to a full range of additional data, including PII.

PII was a useful term when everyone knew what it meant, and it properly wrapped itself around the primary categories of data that we wanted protected. Now that it is leaking all over the place, we can either expand the term to include an ever-increasing list of data categories, bounded only by the creativity of next month’s industry innovations or a privacy researcher’s experiments, or we can stop the madness and give the term a proper burial.

When the EU dropped the notion of PII in data protection law more than two decades ago, in favor of ‘Personal Data,’ with an inherently expansive scope, many of us in the US thought the Europeans were crazy. They now look prescient. The FTC, many years later, is effectively conceding the point by cramming ‘Personal Data’ into this shell of a term. Perhaps they have to, as it is the policy device they have in front of them, locked into the firmament with existing law and precedent. The FCC is also moving in a similar direction with its newly proposed rulemaking.

The lawyers will have a field day fighting over the finer points of how the phrase is used or not used in existing privacy policies and contracts and how that needs to change (or not) and how to deal with existing legislation that leverages a term in flux.

The broader implications for the rest of us are clear:

Manage all data as if it needs to be protected. Because it does.
Don’t expect consumers to understand privacy assurances that are limited to your use of ‘PII,’ because that doesn’t map to their current conception of personal privacy.
The EU will still be an outlier in global data protection law, but it is also looking more like an innovator, with US regulators increasingly looking across the pond and borrowing concepts. Personal data, right to be forgotten, …

We had a good run, PII. It’s not you, it’s us. We need to move on.
