The United States Copyright Office recently launched its “Artificial Intelligence Initiative,” set to “examine the copyright law and policy issues raised by artificial intelligence.” The initiative comes as a “direct response” to “striking advances in generative AI technologies and their rapidly growing use by individuals and businesses.”
The Copyright Office has said that its new initiative will assess both “the scope of copyright in works generated using AI tools and the use of copyrighted materials in AI training.” In other words, the Copyright Office is not only taking a hard look at AI-generated or AI-assisted outputs but also plans to increase its scrutiny of AI inputs.
This month, the Copyright Office kicks off a series of four public listening sessions slated to take place through May. These sessions will be followed by informational webinars in the summer, and then, later in 2023, the Office will “publish a notice of inquiry soliciting public comments on a wide range of copyright issues arising from the use of AI.”
One issue, copyrightability and registration, is the subject of an already-published statement of policy in the Federal Register that focuses on providing guidance to those seeking to register AI-generated works. But the statement also offers breadcrumbs about the coming conversation on AI inputs that stakeholders ignore at their peril.
The Copyright Office’s statement observes that AI today “‘train[s]’ on vast quantities of preexisting human-authored works” in order to “use inferences from that training to generate new content.” And it previews that its forthcoming notice of inquiry will address “how the law should apply to the use of copyrighted works in AI training and the resulting treatment of outputs.” Between now and the issuance of that notice, stakeholders from all sectors, including rightsholders and their representatives, can be expected to participate actively.
Indeed, the Copyright Office has invited and hosted commentary on AI inputs over the past several years. A fall 2021 conference co-hosted with the US Patent and Trademark Office (archived at https://www.copyright.gov/events/machine-learning/), for example, elicited warnings about irreparable damage to the market for human-created expressive works; disagreement over the viability of the fair use defense where AI has been trained using a corpus of expressive works; and lengthy parsing of the role and scale of licensing across different AI use cases. That 2021 conference itself built on conversations held the prior year at the Copyright Office, during a February 2020 symposium co-hosted by the World Intellectual Property Organization (archived at https://www.copyright.gov/events/artificial-intelligence/). That symposium brought together AI companies and creators, along with academics, in-house IP and technology leads, and policymakers from around the globe.
This appears to be the year in which the Copyright Office has committed to formalizing its yearslong information gathering. What proposed rulemaking follows remains to be written. The lion’s share of attention will likely stay fixed on debates over whether, and to what extent, human authorship in AI outputs remains a requirement for copyright registration. But any company relying on externally sourced datasets for machine learning should see the Copyright Office’s new initiative as a signal flare for critical guidance to come on AI inputs.
Written by Allison Aviki, Partner and Megan Fitzgerald, Associate, Mayer Brown