'Systematic' Infringement

Microsoft, OpenAI GenAI Copying of NYT Content 'Not Fair Use,' Says NYT Complaint

December 28, 2023 by Paul Gluckman|Top News

Microsoft and OpenAI generative AI (GenAI) tools rely on large language models (LLMs) “that were built by copying and using” millions of the New York Times’ copyrighted news articles, in violation of the Copyright Act, the Digital Millennium Copyright Act and other statutes, alleged the New York Times Co.’s complaint Wednesday (docket 1:23-cv-11195) in U.S. District Court for Southern New York in Manhattan.

Through Microsoft’s Bing Chat and OpenAI’s ChatGPT, the two companies “seek to free-ride” on the NYT’s “massive investment in its journalism by using it to build substitutive products without permission or payment,” said the seven-count, 69-page infringement complaint. It seeks a permanent injunction, plus statutory and compensatory damages and restitution. Susman Godfrey and Rothwell Figg are representing the NYT. Bing Chat was recently rebranded as Copilot.

Microsoft and OpenAI “have refused to recognize” U.S. copyright protections in the law and the Constitution, said the complaint. Their GenAI tools can generate output that “recites” NYT content verbatim, “closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples,” it said.

Using valuable intellectual property in these ways without paying for it “has been extremely lucrative” for Microsoft and OpenAI, said the complaint. Microsoft’s deployment of NYT-trained LLMs throughout its product line “helped boost its market capitalization by a trillion dollars in the past year alone,” it said. OpenAI’s release of ChatGPT “has driven its valuation to as high as $90 billion,” it added.

Microsoft and OpenAI didn’t immediately comment. News/Media Alliance President-CEO Danielle Coffey hailed the complaint, saying it demonstrates “the value of quality journalism to AI developers.” Quality journalism and GenAI can complement each other “if approached collaboratively,” but using journalism without permission or payment “is unlawful, and certainly not fair use,” Coffey said.

The New York Times Co. objected after discovering that Microsoft and OpenAI were using NYT content “without permission to develop their models and tools,” said the complaint. For months, the plaintiff tried reaching a negotiated agreement with the defendants, “in accordance with its history of working productively” with large tech platforms “to permit the use of its content in new digital products,” it said.

The company’s goal during these negotiations “was to ensure it received fair value for the use of its content, facilitate the continuation of a healthy news ecosystem, and help develop GenAI technology in a responsible way that benefits society and supports a well-informed public,” said the complaint. The negotiations haven’t led to a resolution, it added.

Previously, Microsoft and OpenAI defended their conduct as protected fair use on grounds that their unlicensed use of copyrighted content to train GenAI models serves a new “transformative” purpose, said the complaint. But there’s nothing transformative about using NYT content without payment to create products that substitute for the NYT “and steal audiences away from it,” it said.

The outputs of GenAI models from Microsoft and OpenAI “compete with and closely mimic the inputs used to train them,” said the complaint. Copying NYT works for that purpose “is not fair use.”

The law doesn’t permit “the kind of systematic and competitive infringement” that Microsoft and OpenAI have committed, said the complaint. “This action seeks to hold them responsible for the billions of dollars in statutory and actual damages that they owe” for unlawful copying and use of the NYT’s “uniquely valuable works,” it said.