Law & Art: Copyright in the Time of GenAI
Since the World Wide Web opened to the public in 1993 and Facebook rolled out to college students in 2004 (expanding to the broader public in 2006), the way Americans create, consume, and engage with news and other media has transformed in a matter of decades. The First Amendment to the United States Constitution provides the highest protection to free speech as mediated through print newspapers, magazines, and books, and Reno v. ACLU (1997) extended that degree of protection to Internet communications. So while administrative agencies like the Federal Communications Commission (FCC) control to some degree the content of radio, television, and cable broadcasting, no comparable agency* regulates Internet content in the United States. Instead, the Executive and Judicial branches currently address issues reactively (which, by design, happens quite slowly), while technology advances at an ever-faster pace, particularly since the advent of generative artificial intelligence (GenAI).
Although America’s Founding Fathers could never have dreamed of the Web or the social, political, and economic complexities introduced by an increasingly interconnected world, there are key guidelines and precedents to navigate this uncharted domain. The First Amendment, the law of the land, holds: “Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances.” In that single balanced sentence, the Government established its contract with the People: to speak freely (demonstrating with actions, when desired or necessary), to hold it accountable, and to empower a free press to enforce transparency and accountability on a larger scale. This freedom is further reinforced by the Copyright Clause, Article I, Section 8, Clause 8, which holds: “Congress shall have Power… to promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.” Since 1790, Congress has adopted copyright statutes, originally modeled after British law, continuing through the current Copyright Act of 1976, which was internationally harmonized when the U.S. joined the Berne Convention in 1988 (Calvert, 2020). Taken together, we need look no further than the United States Constitution to interpret the appropriate application of GenAI, especially in the sphere of news media, where the impact on free speech is of the greatest consequence to the integrity and survival of American democracy.
In regard to American copyright law, Justice Oliver Wendell Holmes wrote in Bleistein v. Donaldson Lithographing Co. that “it would be a dangerous undertaking for persons trained only [in] law to constitute themselves final judges of the worth of pictorial illustrations, outside of the narrowest and most obvious limits” (Calvert, 2020). This precedent, set by Holmes in 1903, allows judicial review to focus exclusively on the legal application of copyright, without regard for aesthetic qualities that could otherwise be used to censor content based on individual (or, it may be argued, political) preferences. The precedent has been upheld in recent high-profile cases involving fair use [under copyright law] and AI, providing contemporary examples to follow as The New York Times takes on OpenAI (the GenAI company behind ChatGPT, whose largest investor is Microsoft) in a landmark case that will likely shape the future of AI development, and possibly the regulatory environment for the Internet in the U.S. (Allyn, 2025).
As I write and cite this article, I credit the authors whose copyrighted work was essential in providing background for these ideas, both honoring the original sources and lending legitimacy to my new work. This is the foundation of copyright law, but not its entirety. Current copyright law gives the author (or the owner of the copyright) the sole and exclusive right to reproduce the copyrighted work in any form for any reason, including the specific rights to 1) reproduce the work, 2) prepare derivative works, 3) distribute the work publicly, 4) perform the work publicly, 5) display the work publicly, and 6) perform a sound recording publicly by means of digital audio transmission (Calvert, 2020). Under these rights of Constitutional origin, the consent of the copyright owner must be sought before an original work is reproduced, broadcast, or displayed (including on Internet forums), with exceptions for fair use.
The U.S. Supreme Court evaluated fair use in Google v. Oracle in 2021, after Google used roughly 11,500 lines of Java SE code in its new Android software platform. While the Court found that the Java code was copyrightable, it also observed that “computer programs differ to some extent from many other copyrightable works because computer programs always serve a functional purpose” (Google, 2021). The Court ruled in favor of Google, whose Android platform (using roughly 0.4% of the Java SE Application Programming Interface, or API, code) was found to be fair use under all four statutory factors: the limited portion used (relative to the total API) was transformed to create an entirely new product that did not serve as a substitute for the original work, and could in fact benefit the original copyright holder by finding a new application for its work. This precedent is critical when looking toward the use of copyrighted work in training GenAI and in creating new works with it, where fair use may apply or fail depending on current technological developments.
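The ~0.4% figure can be sanity-checked with simple arithmetic; the total of roughly 2.86 million lines for the Java SE API code is drawn from the Court's opinion. This is a back-of-the-envelope sketch, not a legal measure of "substantiality":

```python
# Back-of-the-envelope check of the ~0.4% figure from Google v. Oracle.
# Both figures are approximations drawn from the Court's opinion.
copied_lines = 11_500        # declaring code Google copied from the Java SE API
total_api_lines = 2_860_000  # approximate total lines of Java SE API code

fraction_used = copied_lines / total_api_lines
print(f"{fraction_used:.1%}")  # prints "0.4%"
```

As the Court's own analysis makes clear, the raw percentage was only one input to the third factor; the *qualitative* importance of the copied declaring code mattered as much as its quantity.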
While GenAI continues presenting new opportunities and challenges in the public and private sectors, the U.S. Supreme Court more recently examined transformative use under copyright law in Andy Warhol Foundation for the Visual Arts v. Goldsmith in 2023. On the surface, this case may seem entirely unrelated to future precedents for AI, but the principles involved are essential in pointing where judicial review (and perhaps future regulatory oversight) may lead. The case considered a photograph of the musician Prince taken by Lynn Goldsmith (the original copyrighted work) that served as the basis for a silkscreen created by Andy Warhol, which was later licensed [by the Andy Warhol Foundation/AWF] to Condé Nast in 2016 for the “Orange Prince” cover of its commemorative magazine. In considering fair use under copyright law, the Court weighed Warhol’s “Orange Prince” against Goldsmith’s photo under the four statutory factors: 1) the purpose and character of the use, 2) the nature of the copyrighted work, 3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and 4) the effect of the use upon the potential market for or value of the copyrighted work. In the end, the Court ruled in Goldsmith’s favor, since AWF licensed [“Orange Prince”] without the photographer’s prior consent, in a magazine where the original work was not credited, and, most importantly, the use shared “substantially the same commercial purpose” as the original (Warhol, 2023).
Perhaps in consideration of Justice Holmes’ much earlier precedent in Bleistein, the Court noted that while “‘Orange Prince’ can be perceived to portray Prince as iconic, whereas Goldsmith's portrayal is photorealistic, that difference must be evaluated in the context of the specific use at issue” (Warhol, 2023). This distinction matters in applying the Warhol precedent to GenAI, as the Court did “not attempt to evaluate the artistic significance” of either the photograph or the silkscreen while assessing fair use under copyright law (Myers, 2023). The Court further qualified its decision by stating that this “does not mean that all of Warhol's derivative works, nor all uses of them, give rise to the same fair use analysis,” reinforcing Holmes’ much earlier precedent in Bleistein (Warhol, 2023).
In transitioning from the Google and Goldsmith precedents to applications for GenAI, note that Justice Kagan, dissenting in Warhol, invoked the longstanding principle that “the more transformative the new work, the less will be the significance of other factors, like commercialism, that may weigh against a finding of fair use” (Myers, 2023). A problem OpenAI's ChatGPT (backed by Microsoft) faces in the case brought by The New York Times is that the tool “makes use of vast amounts of pre-existing content in order to train its algorithms” - material, in this case, that includes deep cuts into the Times’ copyrighted archives (Myers, 2023). In a compelling complaint from The New York Times (NYT), the New York Daily News, and the Center for Investigative Reporting, the plaintiffs claim “GenAI tools can generate output that recites [Times] content verbatim, closely summarizes it, and mimics its expressive style…also wrongly attributing false information to [The Times]” (NYT, 2023). The end result, the complaint states, is to “undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue” - a situation that bears striking resemblance to Warhol v. Goldsmith (NYT, 2023). The Times rather urgently insists that GenAI “creates products that compete with it [and] threatens The Times’s ability to provide that service,” which could do long-term and irreparable damage to freedom of the press, not only for the NYT but for all news media sources [under the First Amendment], if its claim against GenAI is validated (NYT, 2023).
To evaluate fair use for GenAI under copyright law, we must have a general understanding of how Large Language Models (LLMs) like ChatGPT function. The latest GPT models are trained on exceptionally large datasets, “the equivalent of a Microsoft Word document that is over 3.7 billion pages long” (Pope, 2024). On the input side, when a copyrighted work is used for training, the entire work is ingested. Outputs are more difficult to characterize with current technology, as the “‘natural-language’ response produced upon user query” may sometimes include “‘memorized’ parts of source works included in training [or input] data,” producing “near-verbatim reproductions” (Pope, 2024). The “expressive content” produced by GenAI LLMs like ChatGPT presents several challenges for copyrighted material: first, outputs may be so similar as to effectively reproduce or display (at least in part) the original, or to create a derivative work. Second, and most importantly where news media sources are concerned, because current technology does not identify (or link to) the original source work, it is challenging (if not impossible) to determine how much of an original copyrighted work appears in an LLM’s output, except through very manual processes. Based on precedents like Google and Warhol, identifying sources and the percentage of an original work used (relative to the whole) would be required to establish fair use exceptions to copyright law.
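The "very manual processes" mentioned above can be partially automated. One common heuristic for verbatim reuse, offered here purely as an illustrative sketch (not a method any party or court has endorsed), is n-gram overlap: the fraction of a source text's word sequences that reappear word-for-word in a model's output.

```python
def ngrams(text: str, n: int = 8) -> set:
    """Return the set of n-word sequences (n-grams) in a text."""
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_fraction(source: str, output: str, n: int = 8) -> float:
    """Fraction of the source's n-grams that appear verbatim in the output.

    1.0 means every n-word run of the source reappears; 0.0 means none do.
    """
    src_grams = ngrams(source, n)
    if not src_grams:
        return 0.0
    return len(src_grams & ngrams(output, n)) / len(src_grams)
```

Longer n (say, 8-12 words) reduces false positives from common stock phrases, but this measure catches only verbatim reuse; it says nothing about close paraphrase or the stylistic mimicry the Times complaint also alleges, which is precisely why assessing the "amount and substantiality" factor for LLM outputs remains so hard.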
This leads to the final problem of assessing impact: whether LLM outputs like ChatGPT’s create a “rival with competing information” that ultimately displaces original source content, the claim made by the NYT in its recent case against OpenAI and Microsoft (Allyn, 2025). This may be the most important consideration, albeit the most challenging to assess, as economic effects take time to materialize. If history has borne witness to how Facebook revolutionized social interaction, how Netflix and YouTube transformed video consumption, and how Google became a verb, we should be wary of allowing ChatGPT or any LLM to run unchecked by the same copyright law and fair use standards that protect and preserve creators, including America’s free press.
As to what the future of GenAI looks like, “other publishers including Associated Press, News Corp., and Vox Media have reached content-sharing deals with OpenAI,” but consent to sharing copyrighted work (as inputs) does not address the several other concerns with LLM outputs raised by the Times case (Allyn, 2025). The stakes are much higher than copyright law alone when it is inextricably tied to the First Amendment, so while Google and Warhol offer reference points, the New York Times case will likely set a landmark precedent all its own. Meanwhile, the U.S. is a member of the Organisation for Economic Co-operation and Development (OECD), an international body that publishes “AI policy initiatives across 60 participating countries & territories” (Vulcano, 2025). Within the OECD dashboards, Ernst & Young has observed several global trends in AI regulation, with the EU Artificial Intelligence Act (passed in 2024) as the “first major contributor to general AI regulation” (Vulcano, 2025). The EU regulation classifies AI systems into four categories, minimal risk, limited risk, high risk, and unacceptable risk, providing examples of each and varying degrees of legal obligations.
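As a rough sketch of how the EU AI Act's tiering translates into obligations, the mapping below pairs each risk category with an example system and the general flavor of its legal treatment. The examples and obligation summaries are simplified paraphrases for illustration, not authoritative classifications or legal guidance:

```python
# Illustrative summary of the EU AI Act's four risk tiers.
# Examples and obligations are simplified paraphrases, not legal guidance.
EU_AI_ACT_TIERS = {
    "unacceptable risk": {
        "example": "government social scoring of citizens",
        "obligation": "prohibited outright",
    },
    "high risk": {
        "example": "AI used in hiring or credit decisions",
        "obligation": "conformity assessment, logging, human oversight",
    },
    "limited risk": {
        "example": "chatbots",
        "obligation": "transparency: disclose that users are interacting with AI",
    },
    "minimal risk": {
        "example": "spam filters",
        "obligation": "no mandatory obligations",
    },
}

def obligation_for(tier: str) -> str:
    """Look up the headline obligation for a given risk tier."""
    return EU_AI_ACT_TIERS[tier]["obligation"]
```

The design point worth noting is that obligations scale with risk category rather than with technology type, which is the structure a comparable U.S. framework would likely borrow.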
Because the Internet, by its very nature, is a place of global exchange, it behooves the U.S. to seek alignment with international bodies when forming localized regulations. With high stakes for both innovation and constitutional protections, a balanced approach might involve creating a regulatory agency (comparable to the FCC) structured like the EU AI Act’s two-armed “scientific panel of independent experts” and enforcement office (Vulcano, 2025). According to Ernst & Young, many jurisdictions are using “regulatory sandboxes as a tool for the private sector to collaborate with policymakers to develop not only safe and ethical AI systems, but also rules that will support the future development of such systems” (Vulcano, 2025). A framework of lower- to higher-risk AI systems would allow varying degrees of oversight and penalties to be assessed, with a focus on mitigating bias through routine audits of AI models; thorough testing, validation, and fallback requirements; transparency in AI modeling; informed consent from users with regard to data collection; and strict adherence to data privacy and encryption requirements (Vulcano, 2025).
Beyond intellectual property concerns, the security of GenAI systems must be treated as of the highest importance, given their inevitable training on personally identifiable and sensitive information that is already subject to existing regulations. This may be the strongest argument for new regulatory oversight: keeping a “human-in-the-loop” at defined intervals, depending on the categorization of the AI system (e.g., minimal-risk systems may not require such oversight). With endless possibilities for AI, and with risks to intellectual property, personal data security, and, most importantly, democracy itself, there remains far too much at stake to allow progress to continue unchecked without some regulatory guardrails. The most pressing question, while the Times and Microsoft continue gathering evidence, is whether technology will evolve faster than the regulations required to contain any further damage.
*While the FCC has established some regulations applicable to the Internet, the agency regulates interstate and international communications by radio, television, wire, satellite and cable.
UPDATE: Thomson Reuters v. Ross Intelligence, decided February 11, 2025, contains the most recent precedent applying fair use to AI, further reinforcing the discussion above.
Content is attributed exclusively to Hype Girl Media without assistance of AI and may not be reproduced without prior authorization nor associated with unnamed individuals or entities.