A federal judge in San Francisco ruled late on Monday that Anthropic’s use of books without permission to train its artificial intelligence system was legal under U.S. copyright law.
Siding with tech companies on a pivotal question for the AI industry, U.S. District Judge William Alsup said Anthropic made “fair use” of books by writers Andrea Bartz, Charles Graeber and Kirk Wallace Johnson to train its Claude large language model.
Alsup also said, however, that Anthropic’s copying and storage of more than 7 million pirated books in a “central library” infringed the authors’ copyrights and was not fair use. The judge ordered a trial in December to determine how much Anthropic owes the authors for the infringement.
U.S. copyright law allows statutory damages of up to $150,000 per work for willful infringement.
Spokespeople for Anthropic did not immediately respond on Tuesday to a request for comment on the ruling.
The writers filed the proposed class action against Anthropic last year, arguing that the company, which is backed by Amazon and Alphabet, used pirated versions of their books without permission or compensation to teach Claude to respond to human prompts.
The proposed class action is one of several lawsuits brought by authors, news outlets and other copyright owners against companies including OpenAI, Microsoft and Meta Platforms over their AI training.
The doctrine of fair use allows the use of copyrighted works without the copyright owner’s permission in some circumstances.
Fair use is a key legal defense for the tech companies, and Alsup’s decision is the first to address it in the context of generative AI.
AI companies argue their systems make fair use of copyrighted material to create new, transformative content, and that being forced to pay copyright holders for their work could hamstring the burgeoning AI industry.
Anthropic told the court that it made fair use of the books and that U.S. copyright law “not only allows, but encourages” its AI training because it promotes human creativity. The company said its system copied the books to “study Plaintiffs’ writing, extract uncopyrightable information from it, and use what it learned to create revolutionary technology.”
Copyright owners say that AI companies are unlawfully copying their work to generate competing content that threatens their livelihoods.
Alsup agreed with Anthropic on Monday that its training was “exceedingly transformative.”
“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different,” Alsup said.
Alsup also said, however, that Anthropic violated the authors’ rights by saving pirated copies of their books as part of a “central library of all the books in the world” that would not necessarily be used for AI training.
Anthropic and other prominent AI companies including OpenAI and Meta Platforms have been accused of downloading pirated digital copies of millions of books to train their systems.
Anthropic had told Alsup in a court filing that the source of its books was irrelevant to fair use.
“This order doubts that any accused infringer could ever meet its burden of explaining why downloading source copies from pirate sites that it could have purchased or otherwise accessed lawfully was itself reasonably necessary to any subsequent fair use,” Alsup said on Monday.