In the wake of AI tech: why the US should adopt GDPR immediately

Currently, the United States does not require companies to provide consumers with any legal basis for collecting their data [1]. Worse, even as big tech companies continually train and release AI-based language models, the federal government has yet to establish any fundamental principles of data protection. This patchwork governance of different AI models poses a real threat to every individual whose data is shared on the internet. To combat this nonconsensual invasion of privacy, the US should implement a universal data protection model that establishes stricter limits on its largest data-collecting corporations and guarantees and expands the fundamental rights of consumers.

US data privacy regulations are riddled with legal loopholes that let companies deceive consumers and collect excessive amounts of data. Apart from a handful of industry- or age-specific regulations such as the GLBA, FERPA, and COPPA, most regulations do not obligate corporations to ask users before collecting their data. Nor are there restrictions on the subsequent use of that data, including its sale to third parties. This absence of regulation discourages investment in data protection systems, one of the main breach-prevention mechanisms. The result: the US sits at the top of global rankings of countries with the most data breaches per capita.

This problem is now more relevant than ever. The lack of uniform regulation across the country makes it a perfect environment for AI-based tech companies to collect excessive amounts of data without ever facing legal accountability for how they protect and process it. For instance, the models are trained on datasets that might include, and (once the model is released) publicly expose, sensitive information such as names, phone numbers, and addresses [2]. Beyond personal identifiers, training datasets often include paid content such as articles from The Economist or Harvard Business Review, depriving creators of a deserved source of income. In effect, consumers gain a legal shortcut around paywalls: they can simply ask the AI to generate and expand on summaries of these articles. And this is not even the worst exploitation of data by AI tech companies. A look at Article 1 of OpenAI’s privacy policy [3] makes clear that all 100 million users of ChatGPT are subject to some of the most questionable and excessive data collection practices. These include, but are not limited to, collecting information about browsing activity, sharing data with unspecified third parties, and setting no expiration date on data retention. That is to say, by using OpenAI’s products, a user is stripped, without any explicit consent, of the right to privacy in any web activity. Worse still, this sensitive and private data can not only serve as training material for further iterations of the models but may also be sold or shared with any third parties OpenAI partners with.

OpenAI, Microsoft, and other emerging AI-focused companies will not change their methods of data collection and processing unless they are required to. This is precisely why the only means of protecting consumers is a law that strictly regulates the exchange of data between company and consumer. Implementing a GDPR-like data protection model would close most of the legal loopholes that currently put consumers at the greatest risk. The GDPR [4] is an EU regulation that limits the extent to which companies can exploit users’ data and obligates them to obtain a user’s explicit consent before collecting it in the first place, among many other provisions. Most importantly, the GDPR entitles consumers whose data is being used to request access to the data collected about them and to learn precisely how it is being used. This legal practice establishes one of the most crucial legal conceptions of data: it is treated as an extension of the user (akin to a form of inextricable intellectual property) rather than a separable commodity. That is to say, if we consider everything we type into Google, Bing, or ChatGPT an extension of our mind and body, we must have an inherent right to know how that extension is used. What would such a policy mean in practice? Consumer awareness of how companies process their data would not even be the most important effect. The transparency the policy induces would subject corporations to intense scrutiny over whether their data collection and use are ethical. In other words, OpenAI would have to specify which elements of browsing activity it tracks, and if its data collection proved excessive, users would likely pressure the company to limit it to a reasonable extent. The GDPR goes further still. Its right to be forgotten requires every company operating in the EU to delete a user’s data at the user’s request [5]. If consumers change their minds and decide they no longer want a company processing their data, they have an undeniable right to demand its deletion. Needless to say, once corporations are threatened with losing consumers’ data, their main source of revenue, the pressure to process that data ethically and reasonably grows considerably.

Skeptics might claim that there are no mechanisms to enforce such regulations, and that even if there were, enforcement would slow technological innovation. On enforcement, the European Union offers a drastic but effective example: enormous fines for non-compliance, up to 4% of global turnover [6]. This has produced fines as large as $786M for companies that violated general data processing principles [7], and such penalties have proven to be among the most successful means of ensuring compliance with the policy. As for innovation, these barriers to data collection certainly slow down many of big tech’s operations. From a legal standpoint, however, the GDPR primarily imposes transparency and bolsters consumers’ rights. If those two things become a limitation on a company’s ability to collect data (whether through public backlash or consumers withdrawing their data in response to unethical processing), one must ask whether that limitation is a direct consequence of the policy or of the way the company has been collecting data all along.

In the near future, the US may be unwilling to pass such an overarching data protection law, both because it has been reaping the fruits of loose data privacy regulation and because some parts of the GDPR (such as Article 17) are at odds with current US legal precedent (Garcia v. Google [8]). This does not mean, however, that a federal law cannot establish a more thorough definition of data’s place in the legal framework and regulate data processing more strictly. US policymakers could select the best-fitting articles beyond the right of access and the right to be forgotten, such as the rights to be informed, to be notified, and to object. One thing is clear: even a partial version of such a data protection regulation would strengthen companies’ accountability and consumers’ trust. In the long term, both bring significant benefits to each party.

Bibliography

  1. Global Legal Group, “Data Protection Laws and Regulations Report 2022-2023 USA,” International Comparative Legal Guides International Business Reports (Global Legal Group).

  2. Machine Learning Security and Privacy, “Privacy Considerations in Large Language Models,” Google AI Blog.

  3. OpenAI, “Privacy Policy,” https://openai.com/privacy/.

  4. General Data Protection Regulation (GDPR), September 27, 2022, https://gdpr-info.eu/.

  5. General Data Protection Regulation (GDPR), “Art. 17 GDPR – Right to Erasure (‘Right to Be Forgotten’),” June 12, 2017, https://gdpr-info.eu/art-17-gdpr/.

  6. General Data Protection Regulation (GDPR), “Fines / Penalties,” October 22, 2021, https://gdpr-info.eu/issues/fines-penalties/.

  7. “GDPR Enforcement Tracker: List of GDPR Fines,” https://www.enforcementtracker.com/.

  8. “Garcia v. Google, Inc.,” Harvard Law Review, April 8, 2016, https://harvardlawreview.org/2016/04/garcia-v-google-inc/.

Radvilas Pelanis

Radvilas Pelanis is a staff writer for the Harvard Undergraduate Law Review for Spring 2023.
