A high-profile plaintiff could soon join the growing number of parties who have brought lawsuits against OpenAI, the maker of ChatGPT. After licensing negotiations between OpenAI and The New York Times reportedly broke down last month, the media giant is allegedly considering whether to sue the ChatGPT creator to protect the intellectual property in its reporting. A lawsuit from the Times would undoubtedly bring increased attention to the copyright issues raised by generative AI and has the potential to put the fair use doctrine to the test.
ChatGPT is built on a large language model (LLM) that is trained on vast amounts of Internet text and uses deep-learning techniques to predict the language most likely to follow a given sequence of words, as well as how to weigh the importance of different words, thereby generating human-like responses to natural language inputs. Known as generative artificial intelligence (AI), this technology does more than simply regurgitate the data on which it was trained: it produces new content that did not previously exist in its training data.
The issue of copyright infringement therefore lies in the manner in which ChatGPT was trained: by scraping vast amounts of content from the Internet. Copyright infringement occurs when original expression from a validly copyrighted work is copied without permission. In the case of generative AI, the technology trains on billions of data points, meaning the question of access is theoretically not in doubt. Any of those billions of data points that are the subject of a valid copyright could potentially give rise to a claim of infringement by virtue of the AI training on them without permission.
Already this year, two groups of authors, one led by comedian Sarah Silverman, have sued OpenAI, alleging that the unauthorized inclusion of their books in datasets used to train ChatGPT infringed their copyrighted works. The cases join the ranks of other pending lawsuits against generative AI companies, including Stability AI, the maker of the image generator Stable Diffusion, which similarly allege that the AI models have trained on artists’ works without authorization. In one of the cases involving Stability AI, the presiding judge indicated he was “inclined to dismiss most of Plaintiffs’ claims without prejudice,” in part because the plaintiffs had not identified with any particularity the works that were copied. However, he also noted that an artist with registered copyrights has likely “asserted a cognizable claim of direct infringement against Stability AI for copying her work at the ‘input’ stage.”
This question of the training data input into ChatGPT is the reason that OpenAI and the Times have been negotiating this summer to reach a licensing deal. Under a license arrangement, OpenAI would presumably compensate the Times for the use of its news stories in training ChatGPT. However, absent a deal, the Times could sue OpenAI on grounds similar to those raised in the lawsuits described above.
These copyright lawsuits are likely to turn on the application of the fair use doctrine. The fair use doctrine permits use of a copyrighted work for purposes such as criticism, news reporting, and parody. Courts consider four factors in evaluating whether an alleged infringement constitutes a fair use: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and the effect of the use upon the potential market for or value of the copyrighted work.
In 2015, the Second Circuit affirmed a ruling of no copyright infringement in Authors Guild v. Google, a claim brought by a group of authors against Google, finding that Google’s digital copying of tens of millions of books for its Google Books library was a highly transformative fair use. OpenAI will almost certainly cite this case in defending the copyright lawsuits against it. In a possible suit by the Times, OpenAI would likely make arguments similar to those Google made in the 2015 case, namely, that its copying of news articles to train the ChatGPT tool was transformative and therefore constitutes fair use.
However, recent guidance from the Supreme Court in Warhol Foundation v. Goldsmith suggests that a use sharing the same commercial purpose as the copied work is less likely to be considered transformative, which could have a significant impact in a case brought by the Times. If ChatGPT can provide responses that summarize original reporting done by a Times reporter, then the ChatGPT user would be less likely to visit the Times website to read the article in full. If the Times can argue that OpenAI’s use of its works in ChatGPT creates a replacement for the online newspaper site, then that could weigh heavily against a finding of fair use.
Whether the matter goes to court or the parties agree to a licensing deal, the outcome is bound to have consequences for how other content producers view ChatGPT and how OpenAI will work with copyright holders going forward.
See Tremblay v. OpenAI Inc., U.S. District Court for the Northern District of California, No. 3:23-cv-03223; and Silverman v. OpenAI Inc., U.S. District Court for the Northern District of California, No. 3:23-cv-03416.
See, e.g., Getty Images (US), Inc. v. Stability AI, Inc., U.S. District Court for the District of Delaware, No. 1:23-cv-00135; see also Sarah Andersen, et al., v. Stability AI LTD., et al., U.S. District Court for the Northern District of California, No. 3:23-cv-00201.
Sarah Andersen, et al., v. Stability AI LTD., et al., U.S. District Court for the Northern District of California, No. 3:23-cv-00201, hearing on motions to dismiss held on 7/19/2023.
Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).
Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, No. 21-869 (U.S. May 18, 2023).