This week’s Ashley Madison hack is all the buzz in infosec. Actually, it’s broader than that: we’re seeing a huge amount of coverage in the mainstream media too. After all, it has all the right ingredients: shady underworld characters, threatening demands and, of course, sex.
But there’s a unique angle with this one that I found particularly interesting, and that’s the premise on which the hackers claim to have broken into AM’s systems. One of the bugbears clearly outlined in their accompanying letter of demand was that AM had been charging customers $19 for their “Full Delete” service, which allegedly netted them a tidy $1.7M last year. That’s right: you had to pay them in order for them to remove your data! Alas, according to the attackers, “Full Delete” means anything but, and that’s where things get interesting.
I get that AM saw a lucrative revenue stream in charging people to remove their data because, let’s face it, it’s the sort of site where someone may later rethink the choice they made and want to cover their tracks in a hurry. But removing all traces of someone’s identity is far harder than it sounds once you consider all the points where they leave a trace.
The easy one is AM’s database, which inevitably contains customers, their interactions, their messages and anything else to do with the day-to-day transactional nature of running a web property. Even then, you’re probably talking about a relational database with referential integrity, which means it’s not as simple as yoinking a record; all sorts of other records depend on that one. So there’s the technical challenge of how you deal with removing the customer (it may just mean scrubbing identifiable data), and then there’s the social side of how it’s dealt with (what do those with an interaction history with the individual now see?).
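To make the referential integrity point concrete, here’s a minimal sketch using Python’s built-in `sqlite3` module. The `customer` and `message` tables, their columns and the helper names are all hypothetical and invented for illustration; they say nothing about how AM’s database is actually structured. It shows a hard delete being refused because messages still reference the customer, and the scrubbing alternative that blanks identifiable fields while leaving the row in place.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customer (
            id           INTEGER PRIMARY KEY,
            email        TEXT,
            display_name TEXT
        );
        CREATE TABLE message (
            id           INTEGER PRIMARY KEY,
            sender_id    INTEGER NOT NULL REFERENCES customer(id),
            recipient_id INTEGER NOT NULL REFERENCES customer(id),
            body         TEXT
        );
        INSERT INTO customer VALUES (1, 'alice@example.com', 'Alice');
        INSERT INTO customer VALUES (2, 'bob@example.com', 'Bob');
        INSERT INTO message  VALUES (1, 1, 2, 'Hello Bob');
        """
    )
    return conn


def try_hard_delete(conn: sqlite3.Connection, customer_id: int) -> bool:
    """Attempt to delete the customer row outright; report whether it worked."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM customer WHERE id = ?", (customer_id,))
        return True
    except sqlite3.IntegrityError:
        # Other rows still reference this customer, so the delete is refused.
        return False


def scrub_customer(conn: sqlite3.Connection, customer_id: int) -> None:
    """Blank the identifiable fields but keep the row so references stay valid."""
    with conn:
        conn.execute(
            "UPDATE customer SET email = NULL, display_name = 'Deleted user' "
            "WHERE id = ?",
            (customer_id,),
        )
        conn.execute(
            "UPDATE message SET body = NULL WHERE sender_id = ?",
            (customer_id,),
        )


if __name__ == "__main__":
    db = make_demo_db()
    print("hard delete succeeded:", try_hard_delete(db, 1))
    scrub_customer(db, 1)
    print(db.execute("SELECT * FROM customer WHERE id = 1").fetchone())
```

Note what the scrub leaves behind: the message row still exists and still links the scrubbed account to Bob, which is exactly the social question of what the other party now sees.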
But then there are the backups of that data: the same records will exist somewhere onsite and probably somewhere offsite too. They may well be incremental or rolling backups, so there are now multiple copies floating around, which is great for disaster recovery but means that “deleting” is not a single operation.
Now let’s move on to all the ancillary data about the individual that’s indirectly collected as they use the service. The web server logs, for example, hold personally identifiable information (and yes, an IP address is personally identifiable) that includes the activity of users as they navigate the site. Third party services that integrate with the site also have access to an extensive amount of information; a public CDN or social media widgets, for example, know a huge amount about individuals’ movements based on uniquely identifiable attributes of their browser and location, combined with the address of the page they’re embedded on.
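As an illustration of how identifiable a log line is, here’s a rough sketch of one common mitigation: truncating the client IP address before the log is retained. It assumes the address is the first space-separated field on each line (as in the common access log format); the function names, the sample line and the /24 and /48 prefix lengths are my own choices for the example, not anything AM is known to do.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the host part of an address: keep a /24 for IPv4, a /48 for IPv6."""
    ip = ipaddress.ip_address(address)  # raises ValueError if not an IP
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def anonymize_log_line(line: str) -> str:
    """Truncate the client address, assumed to be the line's first field."""
    host, sep, rest = line.partition(" ")
    try:
        return anonymize_ip(host) + sep + rest
    except ValueError:
        # First field isn't an IP address, so leave the line untouched.
        return line


if __name__ == "__main__":
    sample = (
        '203.0.113.77 - - [20/Jul/2015:10:00:00 +0000] '
        '"GET /profile/123 HTTP/1.1" 200 512'
    )
    print(anonymize_log_line(sample))
```

Even after truncation, the requested path and timestamp remain in the line, so this reduces what the log gives away rather than removing it.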
In AM’s case, the hackers claim that payment records for the site’s users are still retained and, paradoxically, that this includes the financial transaction for the “Full Delete” service! They illustrate how this information about one particular gentleman can easily be matched to the fact that he enjoys dressing up in lingerie. And of course there will be traces of the financial transaction in various other places, including on the guy’s credit card statement.
There is no such thing as “Full Delete”; there are only degrees of delete that go from zero to something less than one hundred percent.