Summary
This highly publicised case has been brought by The New York Times against Microsoft and OpenAI in the US District Court Southern District of New York, relating to ChatGPT (including associated offerings), Bing Chat and Microsoft 365 Copilot. It follows a period of months during which the NYT said it attempted to reach a negotiated agreement with Microsoft/OpenAI.
The Complaint raises arguments of large-scale commercial exploitation of NYT content, through the training of the relevant models (including GPT-4 and the next generation GPT-5), noting that the GPT LLMs have also 'memorized' copies of many of the works encoded into their parameters. There are extensive exhibits (69 exhibits, comprising around 2000 pages) attached to the Complaint. Exhibit J in particular contains 100 examples of output from GPT-4 (as a 'small fraction') based on prompts in the form of a short snippet from the beginning of an NYT article. The example outputs are said to recite NYT content verbatim (or near-verbatim), closely summarise it, and mimic its expressive style (and also wrongly attribute false information - hallucinations - to NYT).
The Complaint also focuses on synthetic search applications built on the GPT LLMs which display extensive excerpts or paraphrases of the contents of search results, including NYT content, that may not have been included in the model's training set (noting that this output contains more expressive content from the original article than would be the case in a traditional search result, and omits the hyperlink to the NYT website).
The claims are for direct copyright infringement, vicarious copyright infringement, contributory copyright infringement, DMCA violations, unfair competition by misappropriation, and trade mark dilution.
On 26 February 2024, OpenAI filed a Motion to Dismiss in relation to parts of the claim for direct copyright infringement (re conduct occurring more than 3 years ago), as well as the claims relating to contributory infringement, DMCA violations and state common law misappropriation. In particular, OpenAI alleges that the 'Times paid someone to hack OpenAI's products' and that it took 'tens of thousands of attempts to generate the highly anomalous results' in Exhibit J to the Complaint, including by targeting and exploiting a bug (which OpenAI says it has committed to addressing) in violation of its terms of use. OpenAI goes on to characterise the key dispute in the case as whether it is fair use to use publicly accessible content to train generative AI models to learn about language, grammar and syntax, and to 'understand the facts that constitute humans' collective knowledge'. The New York Times has characterised OpenAI's motion as grandstanding, with an attention-grabbing claim about 'hacking' that is both irrelevant and false.
Microsoft filed its Motion to Dismiss parts of the claim on 4 March 2024, focusing on (1) the allegation that Microsoft is contributorily liable for end-user infringement, (2) the alleged violation of the DMCA's copyright management information provisions and (3) state law misappropriation torts. Drawing an analogy with earlier disruptive technologies, the Motion states that "copyright law is no more an obstacle to the LLM than it was to the VCR (or the player piano, copy machine, personal computer, internet, or search engine)". Its point is that the US Supreme Court has previously rejected liability merely based on offering a multi-use product that could be used to infringe. It further states that Microsoft "looks forward to litigating the issues in this case that are genuinely presented, and to vindicating the important values of progress, learning and the sharing of knowledge".
The Plaintiffs filed an Amended Complaint on 12 August 2024 (the amendments add a further approximately 7 million works to the suit).
The case has been consolidated with The Daily News complaint and also with the claim brought by The Center for Investigative Reporting.
On 26 March 2025, the Court issued an order on the Motions to Dismiss as follows:
- Denied OpenAI's motions to dismiss the direct infringement claims involving conduct occurring more than three years before the complaints were filed
- Denied Defendants' motions to dismiss the contributory copyright infringement claims
- Denied Defendants' motions to dismiss the state and federal trademark dilution claims in the Daily News action
- Granted Defendants' motions to dismiss with prejudice the common law unfair competition by misappropriation claims
- Granted OpenAI's motion to dismiss with prejudice the 'abridgment' claims in the CIR action
- With respect to the DMCA claims:
  - Granted Microsoft's motions to dismiss the section 1202(b)(1) claims against it in all three actions
  - Granted OpenAI's motion to dismiss the section 1202(b)(1) claim against it in The New York Times action
  - Granted Defendants' motions to dismiss the section 1202(b)(3) claims against them in all three actions
  - The DMCA claims listed above were all dismissed without prejudice
  - Denied OpenAI's motions to dismiss the section 1202(b)(1) claims against it in the Daily News and CIR actions
OpenAI had sought to centralize in California 12 actions pending in the Northern District of California and the Southern District of New York (Microsoft supported centralization in either California or New York). The Judicial Panel on Multidistrict Litigation found that the actions involve common questions of fact and that centralization in the Southern District of New York would serve the parties' and witnesses' convenience and promote the just and efficient conduct of the litigation. It further found that differences in the underlying claims and in the material alleged to be infringed did not present a significant obstacle. The following cases are therefore now assigned to Judge Sidney Stein in the Southern District of New York for coordinated or consolidated pretrial proceedings:
Cases transferred from Northern District of California
- Tremblay et al v OpenAI et al
- Silverman et al v OpenAI et al
- Chabon et al v OpenAI et al
- Millette v OpenAI et al
Cases currently in Southern District of New York
- Authors Guild et al v OpenAI et al
- Alter et al v OpenAI et al
- The New York Times Company v Microsoft et al
- Basbanes et al v Microsoft et al
- The Intercept Media v OpenAI et al
- Daily News et al v Microsoft et al
- The Center for Investigative Reporting v OpenAI et al
The opening words of the Complaint stress the importance of independent journalism for democracy, and the threat that the use of the NYT's works to create AI products poses to its ability to provide that service. The Complaint further highlights the role of copyright in protecting the output of news organisations, and their ability to produce high-quality journalism.
The NYT website is noted in the Complaint as being the most highly represented proprietary source of data in the Common Crawl dataset, itself the most highly weighted dataset in GPT-3. Given the previous attempt at negotiations referred to in the Complaint, it will be interesting to see whether the filing of the Complaint leads to more fruitful licence negotiations, or whether the case continues to trial (in which case, it should be tracked alongside the other complaints against OpenAI and Microsoft).
OpenAI's position is that 'training data regurgitation' (or memorisation) and hallucination are 'uncommon and unintended phenomena'. OpenAI says that it is working hard to address memorisation, including through the use of sufficiently diverse datasets. Meanwhile, it points to its partnerships with other media outlets.