Researchers Call Clickstream Data ‘So, So Dangerous’
Supposedly anonymized “clickstream” data collected by ad-targeting firms easily can be tied to individuals, according to two German researchers who sussed out the porn-browsing habits of a judge, the drug preferences of a politician and evidence in an ongoing cybercrime investigation using data provided to them by the firms.
Svea Eckert and Andreas Dewes revealed the results of their research at the recent Def Con hacking conference in Las Vegas. Their conclusion: In the wrong hands — which, in their opinion, is any hands — clickstream data can be dangerous. At the very least, collecting clickstreams is an egregious violation of privacy; at worst, it’s tantamount to a criminal enterprise.
The researchers began by gathering clickstream data about three million German citizens from some of the largest advertising data services brokers worldwide. Clickstreams comprise information about internet users’ browsing habits, generally collected by browser cookies and add-ons. The streams are very detailed, cataloging webpage visits, links clicked, search terms, purchases, videos viewed and other behaviors.
Eckert and Dewes determined 95 percent of the data they obtained from the aggregators was collected by 10 popular browser extensions that did not make clear to users the extent of data the extensions collected.
According to the researchers, the firms that provided data for the study “anonymize” the material by connecting a list of sites and links visited to a customer identifier. That way, targeted ads may be delivered to consumers without exposing the consumer’s identity.
However, Dewes noted that comparing the clickstream data with easily obtained public information allowed him and Eckert to correlate customer identifiers with distinct individuals. The pair used news and entertainment links shared on social media, YouTube videos users had rated or on which they had commented, forum posts, blog comments and photos posted online.
“With only a few domains you can quickly drill down into the data to just a few users,” he said.
In many cases, Dewes revealed, the clickstreams included links to users’ personal social media admin pages, making personal identification simple.
“The public information available about users is growing, so it’s getting easier to find the information to do the de-anonymization,” he said.
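The drill-down Dewes describes can be pictured as a repeated set intersection: each publicly known link rules out every customer identifier whose history lacks it. The following is a minimal toy sketch of that idea, not the researchers' actual tooling. The function name `candidates`, the identifiers and the `.example` domains are all hypothetical.

```python
from typing import Dict, Iterable, Set


def candidates(clickstreams: Dict[str, Set[str]], public_domains: Iterable[str]) -> Set[str]:
    """Return the pseudonymous IDs whose history contains every publicly observed domain."""
    remaining = set(clickstreams)
    for domain in public_domains:
        # Keep only the IDs that visited this domain; the pool shrinks with each one.
        remaining = {cid for cid in remaining if domain in clickstreams[cid]}
    return remaining


# Made-up data: three "anonymized" customer identifiers and the domains they visited.
clickstreams = {
    "user-001": {"news.example", "video.example", "forum.example"},
    "user-002": {"news.example", "shop.example"},
    "user-003": {"news.example", "video.example", "blog.example"},
}

print(sorted(candidates(clickstreams, ["news.example"])))
# ['user-001', 'user-002', 'user-003']
print(sorted(candidates(clickstreams, ["news.example", "video.example"])))
# ['user-001', 'user-003']
print(sorted(candidates(clickstreams, ["news.example", "video.example", "forum.example"])))
# ['user-001']
```

In this toy data set, three publicly shared links are enough to single out one identifier. That is the sense in which "a few domains" can narrow millions of records to a handful of users.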
Eckert said she found the intimacy of the portraits created from the clickstreams alarming. Although many of the profiles revealed nothing illegal, quite a few revealed information that could expose individuals to public ridicule or, in extreme cases, extortion.
“This could be so creepy to abuse,” she later told the BBC. “You could have an address book and just look up people by their names and see everything they did.
“After the research project, we deleted the data because we did not want to have it close to our hands anymore,” she added. “We were scared that we would be hacked.”
Eckert said she is now convinced plans like the U.K. government’s proposed age-verification scheme for porn-site visitors easily could go awry.
“If you are strong on data protection, then you should not be allowed to [collect personal information about users],” she said.
In cases where collecting personally identifiable details is essential, imposing limits on the length of time for which the data could be kept would lessen the potential for disaster if clickstreams were leaked or hacked, she added.
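A retention limit of the kind Eckert suggests could, in principle, be enforced by routinely discarding records older than a fixed window. The sketch below is only an illustration under assumed conditions: the 30-day window, the record layout and the `purge_expired` function are hypothetical and do not describe any particular firm's practice.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; the article does not name a specific period.
RETENTION = timedelta(days=30)


def purge_expired(records, now, retention=RETENTION):
    """Return only the records collected within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


# Made-up records for a single pseudonymous customer identifier.
now = datetime(2017, 8, 1, tzinfo=timezone.utc)
records = [
    {"customer_id": "user-001", "url": "https://news.example/a",
     "collected_at": now - timedelta(days=3)},
    {"customer_id": "user-001", "url": "https://forum.example/b",
     "collected_at": now - timedelta(days=90)},
]

kept = purge_expired(records, now)
print(len(kept))  # 1
```

The shorter the window, the less browsing history a leak or breach could expose for any one person.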
“You have to be very careful,” she said. “It’s so, so dangerous.”
Image © TMarchev.