This article was written by Samuel Lengen, Research Associate at Data Science Institute, University of Virginia, and originally appeared on The Conversation, a not-for-profit news site dedicated to unlocking ideas and knowledge from academic experts.
Newly proposed legislation by U.S. Senators Mark R. Warner and Josh Hawley seeks to protect privacy by forcing tech companies to disclose the "true value" of their data to users.
Specifically, companies with more than 100 million users would have to provide each user with an assessment of the financial value of their data, as well as reveal revenue generated by "obtaining, collecting, processing, selling, using or sharing user data." In addition, the DASHBOARD Act would give users the right to delete their data from companies' databases.
As a researcher exploring the ethical and political implications of digital platforms and big data, I'm sympathetic to the bill's ambition of increasing transparency and empowering users. However, estimating the value of user data isn't simple and won't, I believe, solve privacy issues.
The data collected by tech companies consists not just of traditional identifying information such as name, age and gender. Rather, as Harvard historian Rebecca Lemov has noted, it includes "Tweets, Facebook likes, Twitches, Google searches, online comments, one-click purchases, even viewing-but-skipping-over a photograph in your feed."
In other words, big data contains the mundane yet intimate moments of people's lives. And, if Facebook captures your interactions with friends and family, Google your late night searches, and Alexa your living room commands, wouldn't you want to know, as the bill suggests, what your "data is worth and to whom it is sold"?
However, calculating the value of user data isn't that simple. Estimates of what user data is worth vary widely. They range from less than a dollar for an average person's data to a slightly more generous US$100 for a Facebook user. One user sold his data for $2,733 on Kickstarter. To achieve this number, he had to share data including keystrokes, mouse movements and frequent screenshots.
Sadly, the DASHBOARD Act doesn't specify how it would estimate the value of user data. Instead, it explains that the Securities and Exchange Commission, an independent federal government agency, "shall develop a method or methods for calculating the value of user data." The commission, I believe, will quickly realize that estimating the value of user data is a challenging undertaking.
More than personal
The proposed legislation aims to provide users with more transparency. However, privacy is no longer solely a matter of personal data. Data shared by a few can provide insights into the lives of many.
Facebook likes, for example, can help predict a user's sexual orientation with a high degree of accuracy. Target has used its purchase data to predict which customers are pregnant. The case garnered widespread attention after the retailer figured out a teen girl was pregnant before her father did.
Such predictive ability means that private information isn't just contained in user data. Companies can also infer your private information, based on statistical correlations in the data of a number of users. How can the value of such data be reduced to an individual dollar value? It is more than the sum of its parts.
What's more, this ability to use statistical analysis to identify people as belonging to a group category can have far-reaching privacy implications. If service providers can use predictive analytics to guess a user's sexual orientation, race, gender and religious belief, what is to stop them from discriminating on that basis?
Having been let loose, predictive technologies will continue to work even if users delete their part of the data that helped create them.
Control through data
The sensitivity of data depends not just on what it contains, but on how governments and companies can use it to exert influence.
This is evident in my current research on China's planned social credit system. The Chinese government plans to use national databases and "trustworthiness ratings" to regulate the behavior of Chinese citizens.
Google's, Amazon's and Facebook's "surveillance capitalism," as author Shoshana Zuboff has argued, also uses predictive data to "tune and herd our behavior toward the most profitable outcomes."
In 2014, revelations about how Facebook experimented with its feed to influence the emotional state of users ended in a public outcry. However, this instance just made visible how digital platforms, in general, can use data to keep users engaged and, in the process, generate more data.
Data privacy is as much about big tech's ability to shape your personal life as about what it knows about you.
Who is harmed
The truth is that datafication, with all its privacy implications, does not affect everyone equally.
Big data's hidden biases and networked discrimination continue to reproduce inequalities around gender, race and class. Women, minorities and the financially poor are most strongly affected. UCLA professor Safiya Umoja Noble, for example, has shown how Google search rankings reinforce negative stereotypes about women of color.
In light of such inequality, how could a numerical value ever capture the "true" value of user data?
The proposed legislation's lack of specificity is disconcerting. However, even more troubling might be its insistence that data transparency will be achieved by revealing monetary value alone. Numeric assessments of financial worth don't reflect data's power to predict our actions or guide our decisions.
The DASHBOARD Act aims to make the business of data more transparent and empower users. However, I believe that it will fail to fulfill this promise. If lawmakers want to tackle data privacy, they need to regulate not just data monetization, but more widely address the value and cost of data in people's lives.