Mishcon de Reya

Generative AI – Intellectual property cases and policy tracker

Case tracker

With businesses in various sectors exploring the opportunities arising from the explosion in generative AI tools, it is important to be alive to the potential risks. In particular, the use of generative AI tools raises several issues relating to intellectual property, with potential concerns around infringements of IP rights in the inputs used to train such tools, as well as in output materials. There are also unresolved questions of the extent to which works generated by AI should be protected by IP rights. These issues are before the courts in various jurisdictions, and are also the subject of ongoing policy and regulatory discussions.

In this tracker, we provide insight into the various intellectual property cases relating to generative AI going through the courts (focusing on a series of copyright cases in the US and UK), as well as anticipated policy and legislative developments.

Read more in our Guides to Generative AI & IP and to the use of Generative AI generally.

This page was last updated on 5 November 2024.

Court Cases

5 November 2024

Dow Jones and NYP Holdings v Perplexity AI

Dow Jones & Company, Inc. and NYP Holdings, Inc. v Perplexity AI, Inc.

US

Case 1:24-cv-07984

Complaint: 21 October 2024

Summary

This complaint has been filed in the US District Court Southern District of New York by Dow Jones and NYP Holdings (corporate parent, News Corporation), the publishers of The Wall Street Journal and the New York Post, against Perplexity, which is described in the complaint as a platform that allows users to access up to date news and information by 'skipping the links' to the original publishers' websites. The complaint focuses on both the input stage, and also the outputs of Perplexity's products, arguing that sometimes Perplexity's answers contain full or partial verbatim reproductions of the Plaintiffs' copyrighted articles. The complaint also highlights that Perplexity allegedly generates made-up text in its outputs and attributes that text to the Plaintiffs' publications using Plaintiffs' trade marks, which is argued to be likely to cause dilution by blurring/tarnishment.

The claim is for copyright infringement (arising out of Perplexity's alleged copying of the copyrighted works to create inputs for its RAG Index and to generate outputs to user queries) and false designation of origin and dilution of trade marks.

Impact

This is the latest case brought by a news publisher but is the first case that has been brought against Perplexity.

15 October 2024

Farnsworth v Meta

Christopher Farnsworth v Meta Platforms, Inc.

US

Case 3:24-cv-06893-VC

Complaint: 1 October 2024

Order relating case to Kadrey v Meta: 4 October 2024

Summary

This complaint has been brought in the US District Court Northern District of California San Francisco Division by a fiction author, Christopher Farnsworth, against Meta in relation to its LLaMa tools. These were trained using books, including books from the Books3 section of The Pile dataset, which the Plaintiff argues included his works. The complaint is for copyright infringement.

Impact

The complaint has been related to the Kadrey v Meta proceedings.

15 October 2024

Lehrman v Lovo

Lehrman and Sage, and John Doe v Lovo, Inc.

US

Case: 1:24-cv-03770-JPO 

Amended Complaint filed by Plaintiffs: 25 September 2024

Summary

This complaint has been brought in the US District Court Southern District of New York by two voice actors and by a John Doe Plaintiff (in relation to all plaintiffs, individually, and on behalf of a class of voice actors). The complaint is against AI firm LOVO in relation to the alleged cloning and use of the actors' voices without their permission in LOVO's AI-generated voice technology (Genny). The complaint was filed in May 2024 and has now been amended to incorporate copyright claims, in addition to claims relating to violations of rights of publicity, deceptive business practices, fraud, and breach of contract. The copyright claims are for copyright infringement of the original voice recordings made by the actors and contributory copyright infringement.

Impact

Unauthorised use of performers' likenesses and their voices has been a particularly controversial aspect of genAI technology, and has been a key issue for members of the SAG-AFTRA union in the US (with a strike action by video game voice actors ongoing). There have also been a number of high profile complaints raised by celebrities such as Scarlett Johansson.

10 September 2024

Bartz v Anthropic

Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson v Anthropic PBC

US

Case: 3:24-cv-05417

Complaint: 19 August 2024

Answer to Complaint: 21 October 2024 

Summary

This class action has been brought in the US District Court Northern District of California by three authors of fiction and non-fiction against Anthropic. The claim of copyright infringement relates to Anthropic's Claude model, which is also the subject of proceedings brought by a number of record companies. The Plaintiffs allege that, whilst Anthropic has been 'particularly secretive' about the sources of its training corpus for Claude, it has admitted to using The Pile dataset. The Plaintiffs had sought to have the case related to the existing complaint against Anthropic, but this has been denied by the Court. Anthropic's Answer to the Complaint includes reliance on the defence of fair use, amongst other affirmative defences.

Impact

The complaint notes reports that Claude has been used to generate cheap book content, including a report that one man had "written" (their use of quotation marks) 97 books in less than a year using Claude (as well as ChatGPT).

10 September 2024

Millette v OpenAI

David Millette v OpenAI, Inc., OpenAI, L.P., OpenAI OPCO, L.L.C., OpenAI GP, L.L.C., OpenAI Startup Fund I, L.P., OpenAI Startup Fund GP I, L.L.C., and OpenAI Startup Fund Management, LLC

US

Case: 5:24-cv-04710 

Complaint: 2 August 2024

Motion to Dismiss filed by OpenAI: 4 September 2024

First Amended Complaint: 18 October 2024

 

 

Summary

This class action complaint has been brought in the US District Court Northern District of California against OpenAI (there are separate claims against Google/YouTube and Nvidia – the three cases have been related). The Plaintiff is a YouTube user and video creator and the complaint relates to the "surreptitious, non-consensual transcription of millions of YouTube users' videos" to train the Defendants' AI software products. The complaint refers to a New York Times report that claimed Whisper (OpenAI's automatic speech recognition system, released in 2022) is capable of transcribing audio from YouTube videos, and that an OpenAI team had transcribed more than one million hours of videos from YouTube. The claim is for unjust enrichment and unfair competition.

OpenAI has filed a Notice of Motion to Dismiss the complaint on both counts. It argues that the complaint is a 'carbon copy' of pleadings filed in other actions and that state law claims of unfair competition and unjust enrichment have been addressed in a number of judicial opinions in the ongoing cases – to the effect that the use of copyrighted materials to train AI models is exclusively governed by federal copyright law, and that state law claims are pre-empted by the Copyright Act. OpenAI has also applied for the three cases to be related.

On 18 October 2024, the complaint was amended to add a new plaintiff and to add claims for breach of the Massachusetts Unfair and Deceptive Business Practices Act and for direct copyright infringement.

Impact

It is notable that the complaint, as initially filed, did not include a claim of copyright infringement. It is assumed (as asserted by OpenAI) that this is because some of the works in issue in this case have not been registered.

10 September 2024

Millette v Google

David Millette v Google LLC, YouTube Inc., and Alphabet Inc.

US

Case: 5:24-cv-04708 

Complaint: 2 August 2024

 

Summary

This class action complaint has been brought in the US District Court Northern District of California against Google/YouTube (there are separate claims against OpenAI and Nvidia – the three cases have been related) concerning Google's Gemini products. The Plaintiff is a YouTube user and video creator. The complaint relates to the "surreptitious, non-consensual transcription of millions of YouTube users' videos" to train the Defendants' AI software products. The complaint refers to a New York Times article that reported that Google had transcribed YouTube videos to harvest text for its language models, having changed its terms of service in 2023. The claim is for unjust enrichment and unfair competition.

Impact

This is the second case brought in relation to YouTube transcripts. As with the other two cases, there is no claim of copyright infringement.  

10 September 2024

Millette v Nvidia

David Millette v Nvidia Corporation

US

Case: 5:24-cv-05157

Complaint: 14 August 2024

 

Summary

This class action complaint has been brought in the US District Court Northern District of California against Nvidia (there are separate claims against Google/YouTube and OpenAI – the three cases have been related) concerning the training of Nvidia's Cosmos AI software.  The Plaintiff is a YouTube user and video creator and the complaint relates to the "surreptitious, non-consensual transcription of millions of YouTube users' videos" to train the Defendants' AI software products in violation of YouTube's terms of service and at the expense of video creators.

Impact

This is the third case brought in relation to YouTube transcripts. As with the other two cases, there is no claim of copyright infringement.  

Summary

This class action was brought in the US District Court Northern District of California by an initially anonymised group (comprising an author/journalist, as well as users of Gmail/Google search engines etc including some minors, and users of social media services) against Alphabet Inc, Google Deepmind and Google LLC, in July 2023 in relation to the training of Bard (now Gemini) and other Google AI products. The claim is now proceeding only against Google LLC. The original complaint included a number of claims, including violation of competition laws, negligence, invasion of privacy, intrusion upon seclusion, larceny/receipt of stolen property, conversion, unjust enrichment, direct copyright infringement, vicarious copyright infringement and violation of the DMCA.

Following the Defendants' Motion to Dismiss the Complaint, the Plaintiffs filed an Amended Complaint in which they made a number of changes to the complaint, including adding new causes of action. In relation to the copyright claims, they removed the vicarious copyright infringement and DMCA claims and revised the direct infringement claim to allege that "Bard's outputs were necessarily derivative" of the Plaintiffs' works (including the work of the author Jill Leovy) used to train the model. Google filed a Motion to Dismiss arguing that the complaint was a "shotgun pleading" or, in the alternative, that all claims should be dismissed other than the claim of direct infringement in relation to the work of Leovy (except to the extent that it was based on the argument that every output was a derivative infringing work).

On 6 June 2024, the Court granted Google's motion to dismiss with leave to amend in light of concerns expressed by the Judge in their order dismissing the complaint in Cousart v OpenAI.

On 27 June 2024, the Plaintiffs filed their Second Amended Complaint, which comprises solely a claim for direct copyright infringement. The Complaint describes how Gemini was initially built on the LaMDA LLM, with some of the data used to train LaMDA coming from the C4 dataset, which contains copyrighted materials.

The case has been consolidated with Zhang v Google.

Impact

This is one of a number of cases against Google relating to its Gemini (Bard) and other AI products (see Zhang v Google – the cases have been consolidated).

16 July 2024

Robert Kneschke v Laion

Robert Kneschke v Laion

Germany

Decision of District Court of Hamburg: 27 September 2024 (German text)

 

 

Summary

In this case, the District Court of Hamburg in Germany was asked to consider a claim brought by photographer Robert Kneschke against LAION for infringement arising out of the use of his images during the creation of its LAION 5B dataset of image-text pairs, which is made available free of charge (LAION is a not for profit organisation). The images had been downloaded from Shutterstock, which had terms and conditions prohibiting scraping etc. The claim specifically does not cover further acts of training or development of AI models using the data set (by companies such as Stability AI, for example).

The Court delivered its decision on 27 September 2024. It found that there was an infringement of the Plaintiff's copyright work by reproduction in the creation of the dataset. The Defendant was not entitled to rely upon the defence of temporary reproduction as the act of reproduction was not transient or incidental. However, as a research organisation, the Defendant could rely upon the exception for text and data mining for non-commercial scientific research purposes (as provided for in Article 3 of the Digital Single Market (DSM) Copyright Directive, and implemented in German law) in relation to its acts of scraping and analysis in the creation of the data set. The data set had been published free of charge and made available to researchers in the field of artificial neural networks. It was irrelevant in the assessment of the creation of the data set that it was also used by commercial companies for training and further developing their AI systems.

The Court therefore did not need to decide whether the Defendant could also rely on the general text and data mining exception provided for in Article 4 of the DSM Copyright Directive. Unlike the exception in Art.3, a rights holder can opt out of the TDM exception in Art.4 provided that its reservation of rights is in a machine-readable format. Whilst the Court did not need to decide on this issue, it suggested that a reservation of rights written solely in 'natural language' would be 'machine understandable', but that this would need to be assessed depending on the technical development at the relevant time of use of the work.

Impact

As the first decision dealing with TDM exceptions and the temporary copying exception in relation to AI, this is an important case, albeit of limited scope, as the case focuses on the creation of the data set by LAION, and not on its subsequent use by AI tool developers to train their models.

1 July 2024

The Center for Investigative Reporting v OpenAI and Microsoft

The Center for Investigative Reporting, Inc., v OpenAI, Inc., OpenAI GP, LLC, OpenAI, LLC, OpenAI Opco LLC, OpenAI Global LLC, OAI Corporation, LLC, OpenAI Holdings, LLC, and Microsoft Corporation

US

Case: 1:24-cv-04872

Complaint: 27 June 2024

Motion to Dismiss Counts III, VI and VII filed by Microsoft: 3 September 2024 (and Memorandum of Law in Support)

Motion to Dismiss Counts III, VI and VII filed by OpenAI: 3 September 2024 (and Memorandum of Law in Support)

First Amended Complaint: 24 September 2024

Motion to consolidate case with NYT and Daily News cases filed by OpenAI: 4 October 2024

Motion to Dismiss Amended Complaint: 15 October 2024 (and Memorandum of Law in Support) filed by Microsoft

Motion to Dismiss Amended Complaint: 15 October 2024 (and Memorandum of Law in Support) filed by OpenAI

Opposition to Defendants' Motion to Consolidate filed by CIR: 18 October 2024

Reply in support of Defendants' Joint Motion to Consolidate: 25 October 2024 

Order granting consolidation: 31 October 2024

 

Summary

The Center for Investigative Reporting (CIR) has brought a complaint against OpenAI and Microsoft in the US District Court Southern District of New York. The CIR, founded in 1976, describes itself as the oldest nonprofit newsroom in the US, reporting investigative stories about under-represented voices (its brands are Mother Jones, Reveal and CIR Studios). It alleges that tens of thousands of its articles have been copied as part of the training process of the Defendants' products, and that they memorize/regurgitate material or abridge it unlawfully.

The complaint alleges direct copyright infringement, contributory copyright infringement, and DMCA violations. CIR seeks actual damages and profits, or statutory damages of no less than $750 per infringed work, and $2,500 per DMCA violation.

OpenAI and Microsoft have filed Motions to Dismiss various of the claims. These include the claims under the DMCA alleging that Microsoft removed copyright management information (CMI) from CIR's works or distributed works with the CMI removed. They have also filed to dismiss the claim for contributory copyright infringement. In its Motion to Dismiss, OpenAI also seeks to dismiss the count of copyright infringement to the extent it relies upon CIR's 'novel' claim relating to 'abridgments' of CIR's copyrighted works. It argues that this claim should fail as, to constitute an infringing derivative work, an 'abridgment' must do more than just recite facts about an existing work, i.e., it would have to reprise the original's protected expression.

The case has been consolidated with the other newspaper claims, brought by The New York Times and Daily News.

Impact

This is a further complaint brought by a news organisation, in addition to the complaints brought by The New York Times and by a range of local/regional publications. The cases have been  consolidated.

CIR notes that the Defendants greatly benefit from its distinct voice in the marketplace as an investigative news outlet  – if the Defendants were limited to a more homogenous dataset, their LLMs would be "stunted in growth and power".

27 June 2024

UMG Recordings v Uncharted Labs d/b/a Udio.com

UMG Recordings, Inc., Capitol Records, LLC, Sony Music Entertainment, Arista Music, Arista Records LLC, Atlantic Recording Corporation, Rhino Entertainment Company, Warner Music Inc., Warner Music International Services Limited, Warner Records Inc., Warner Records LLC, and Warner Records/Sire Ventures LLC v Uncharted Labs, Inc., d/b/a/ Udio.com and John Does 1-10

US

Case: 1:24-cv-04777

Complaint: 24 June 2024

Answer to Complaint: 1 August 2024

 

Summary

This action has been brought in the US District Court for the Southern District of New York by a group of major record companies against the company behind Udio, a generative AI service launched in April 2024 by a team of former researchers from Google Deepmind. Udio allows users to create digital music files based on text prompts or audio files. As with the complaint against Suno (see below), the Plaintiffs rely on tests comprising targeted prompts including the characteristics of popular sound recordings – such as the decade of release, the topic, genre and descriptions of the artist. They allege that using these prompts caused Udio's product to generate music files strongly resembling copyrighted recordings. For example, using the prompt "my tempting 1964 girl smokey sing hitsville soul pop" and excerpting lyrics from the band The Temptations led to Udio generating a digital music file called "Sunshine Melody" which would allegedly be instantly recognised as resembling the song "My Girl".

The claim is for direct copyright infringement.

In its Answer to the Complaint, Udio highlights the fact that the Plaintiffs do not allege that outputs generated by Udio infringe copyright. Whilst it accepts that the "many recordings that Udio was trained on presumably included recordings whose rights are owned by the Plaintiffs", it argues that copies used in the training process, given that they are "never seen or heard by anyone", are not infringing. This is because it is argued to be "quintessential fair use" to copy the Plaintiffs' works as part of the process of developing a new technology in the service of creating an ultimately non-infringing new product. Udio further argues that the Plaintiffs, comprising major labels, have an aversion to competition but that "no one owns musical styles".

Impact

Whilst there is already a case relating to song lyrics (the claim against Anthropic's Claude), this (and the Suno claim) are the first claims to have been brought in relation to copyrighted sound recordings.  The Recording Industry Association of America (RIAA) has issued a press release in relation to the claims brought against both Udio and Suno. Noting that the music community has embraced AI, the RIAA argues that unlicensed services set back "the promise of genuinely innovative AI for us all".  Both complaints seek to deal head on with the likely claim of fair use:  “[The services] cannot avoid liability for [their] willful copyright infringement by claiming fair use. The doctrine of fair use promotes human expression by permitting the unlicensed use of copyrighted works in certain, limited circumstances, but [the services] offe[r] imitative machine-generated music—not human creativity or expression.”

In response to the Answers to the Complaint filed by Udio and Suno, the RIAA issued a statement on X highlighting the "major concession" in relation to "massive unlicensed copying of artists' recordings" and rejecting the reliance on fair use as a defence.

27 June 2024

UMG Recordings v Suno

UMG Recordings, Inc., Capitol Records, LLC, Sony Music Entertainment,  Atlantic Recording Corporation, Atlantic Records Group LLC, Rhino Entertainment Company, The All Blacks U.S.A., Inc., Warner Music International Services Limited, and Warner Records Inc., v Suno, Inc. and John Does 1-10.

US

Case: 1:24-cv-11611

Complaint: 24 June 2024

Answer to Complaint: 1 August 2024

Summary

This action has been brought in the US District Court for the District of Massachusetts by a group of major record companies against the company behind Suno, a generative AI service launched in July 2023. Suno allows users to create digital music files based on text prompts. As with the complaint against Udio (see above), the Plaintiffs rely on tests comprising targeted prompts including the characteristics of popular sound recordings – such as the decade of release, the topic, genre and descriptions of the artist. They allege that using these prompts caused Suno's product to generate music files strongly resembling copyrighted recordings. For example, Suno's service has generated 29 different outputs that contain the style of Chuck Berry's "Johnny B. Goode". Using the prompt "1950s rock and roll, rhythm & blues, 12 bar blues, rockabilly, energetic male vocalist, singer guitarist" and the lyrics from the original, one output titled "Deep down in Louisiana close to New Orle" replicates the highly distinctive rhythm of the original's chorus, and uses the same melodic shape on the phrases "go Johnny, go, go".

The claim is for direct copyright infringement.

As with the Udio Complaint, in its Answer to the Complaint, Suno highlights the fact that the Plaintiffs do not allege that outputs generated by Suno infringe copyright. Whilst it notes that "it is no secret that the tens of millions of recordings that Suno's model was trained on presumably included recordings whose rights are owned by the Plaintiffs in this case", it also argues that copies used in the training process, given that they are "never seen or heard by anyone", are not infringing. This is because it is argued to be "quintessential fair use" to use a back-end technological process, invisible to the public, in creating "an ultimately non-infringing new product". Suno also argues that the Plaintiffs, comprising major labels, have an aversion to competition but that "no one owns musical styles".

Impact

Whilst there is already a case relating to song lyrics (the claim against Anthropic's Claude), this (and the Udio claim) are the first claims to have been brought in relation to copyrighted sound recordings.  The Recording Industry Association of America (RIAA) has issued a press release in relation to the claims brought against both Udio and Suno. Noting that the music community has embraced AI, the RIAA argues that unlicensed services set back "the promise of genuinely innovative AI for us all".  Both complaints seek to deal head on with the likely claim of fair use:  “[The services] cannot avoid liability for [their] willful copyright infringement by claiming fair use. The doctrine of fair use promotes human expression by permitting the unlicensed use of copyrighted works in certain, limited circumstances, but [the services] offe[r] imitative machine-generated music—not human creativity or expression.”

In response to the Answers to the Complaint filed by Udio and Suno, the RIAA issued a statement on X highlighting the "major concession" in relation to "massive unlicensed copying of artists' recordings" and rejecting the reliance on fair use as a defence. In relation to the argument that the "apparent attempts to misuse the tool to generate renditions of pre-existing songs" are "unrepresentative of what real people do with Suno", the RIAA notes that in a presentation to venture capitalists, Suno's co-founder was shown on video using "Hendrix" as a prompt.

Summary

In May 2020, Thomson Reuters and West Publishing Corporation (the Plaintiffs) filed a claim for copyright infringement against ROSS Intelligence Inc. (ROSS). In their claim, the Plaintiffs allege that ROSS “illicitly and surreptitiously” used a third-party Westlaw licensee, LegalEase Solutions, which in turn hired a subcontractor, Morae Global, to access and copy the Plaintiffs’ proprietary content on the Westlaw database. It is alleged that ROSS used the content to train its machine learning model to create a competing product.

The Plaintiffs are seeking injunctive relief and damages for the losses that they have suffered as a result of ROSS’ direct, contributory, and vicarious copyright infringement and intentional and tortious interference with contractual relations.

In the Memorandum Opinion issued in September 2023, Judge Stephanos Bibas denied the Plaintiffs' and ROSS's cross-motions for summary judgment, finding that only a jury can evaluate the four factors required for the fair-use defence to copyright infringement. These four factors are: (1) the purpose and character of the use, (2) the nature of the copyrighted work, (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and (4) the effect of the use upon the potential market for the copyrighted work.

The trial was due to begin before a jury on 23 August 2024. However, the day before the hearing, District of Delaware Judge Stephanos Bibas decided to invite the parties to renew their motions and cross-motions for summary judgment (which will therefore consider issues relating to copyrightability, validity and infringement; and the fair use defence, alongside potentially other issues).  

On 1 October 2024, Thomson Reuters filed renewed motions for partial summary judgment on fair use, direct copyright infringement and ROSS' defences of merger, scenes à faire, innocent infringement and copyright misuse. ROSS, meanwhile, has filed motions for summary judgment on its affirmative defence of fair use and as to Thomson Reuters' copyright claims.

Oral argument on the renewed summary judgment motions will be heard on 5 and 6 December 2024, with a trial date set for 12 May 2025 if the case proceeds (although it may be held earlier). 

Impact

If this case had gone to trial as scheduled, it would have been one of the first to test whether copyright owners can prevent businesses using copyrighted works for the purpose of training machine learning models for AI tools.

9 May 2024

Makkai v Databricks, Inc

Rebecca Makkai and Jason Reynolds v Databricks, Inc., and Mosaic ML, Inc.

US

Case: 4:24-cv-02653

Complaint: 2 May 2024

Answer to Complaint by Databricks, Inc. and Mosaic ML, Inc.: 29 May 2024

Summary

This class action complaint has been issued in the US District Court Northern District of California by two authors (Rebecca Makkai and Jason Reynolds) against MosaicML and its parent company Databricks.  Makkai owns registered copyrights in a number of books including The Hundred Year House, while Reynolds owns registered copyrights in books including As Brave as You.

The plaintiffs allege that their copyright works were included in the training dataset for MosaicML Pretrained Transformer (MPT), a series of large language models created by MosaicML and distributed by Databricks (including MPT-7B launched in May 2023, and MPT-30B launched in June 2023). MosaicML has noted that a large quantity of data in the MPT training datasets comes from a component dataset called "RedPajama – Books". The complaint asserts that the RedPajama dataset is hosted on the Hugging Face website and that its Books component is a copy of the Books3 dataset, which is itself a component of The Pile and is derived from the Bibliotik shadow library comprising approximately 196,640 books. The complaint against MosaicML is for direct copyright infringement. The complaint against Databricks is for vicarious infringement (Databricks having acquired MosaicML in July 2023).

Impact

This is a further case brought by authors in relation to the use of their copyright works in training datasets for AI models (see also below O'Nan v Databricks and MosaicML). The cases have now been related.

9 May 2024

Dubus v Nvidia

Andre Dubus III and Susan Orlean v Nvidia Corporation

US

Case: 4:24-cv-02655

Complaint: 2 May 2024

Order relating case to Nazemian v Nvidia: 29 May 2024 

Answer to Complaint by Nvidia: 1 July 2024

Summary

This class action complaint has been issued in the US District Court Northern District of California by two authors owning registered copyrights in certain books that were alleged to be included in the training dataset Nvidia used to train its NeMo Megatron models, released in September 2022. The complaint alleges that each of the NeMo Megatron models is hosted on a website called Hugging Face and each has a model card that provides information about the model, including its training dataset. For each of the NeMo Megatron models, the model card states that "the model was trained on 'The Pile' dataset prepared by Eleuther AI" (which includes the Books3 dataset, derived from the Bibliotik shadow library). The complaint is for direct copyright infringement.

Impact

This is a further case brought by authors in relation to the use of their copyright works in training datasets for AI models (see also below Nazemian v Nvidia).

1 May 2024

Daily News v Microsoft and OpenAI

Daily News, L.P., Chicago Tribune Company, LLC, Orlando Sentinel Communications Company, LLC, Sun-Sentinel Company, LLC, San Jose Mercury-News, LLC, DP Media Network, LLC, ORB Publishing, LLC, and Northwest Publications, LLC v  Microsoft Corporation, OpenAI, Inc., OpenAI LP, OpenAI GP, LLC, OpenAI, LLC, OpenAI Opco, LLC, OpenAI Global, LLC, OAI Corporation, LLC and OpenAI Holdings LLC

US

Case: 1:24-cv-03285

Complaint: 30 April 2024

Motion to dismiss filed by Microsoft: 11 June 2024

Motion to dismiss filed by OpenAI: 11 June 2024

Motion to consolidate with NYT action filed by OpenAI: 13 June 2024

Brief re Motion to consolidate filed by Microsoft: 14 June 2024

Memorandum of law in opposition re Microsoft's motion to dismiss filed by Plaintiffs: 25 June 2024

Memorandum of law in opposition re OpenAI's motion to dismiss filed by Plaintiffs: 25 June 2024

Response to Motion to Consolidate: 27 June 2024

Reply Memorandum of Law in Support re Motion to Dismiss filed by Microsoft: 2 July 2024

Reply Memorandum of Law in Support re Motion to Dismiss filed by OpenAI: 2 July 2024

Reply Memorandum of Law in Support re Motion to Consolidate filed by Microsoft: 3 July 2024

Reply Memorandum of Law in Support re Motion to Consolidate filed by OpenAI: 3 July 2024 

Motion to consolidate case with Center for Investigative Reporting filed by Defendants: 4 October 2024

Response to Motion to Consolidate filed by The New York Times and Daily News et al: 18 October 2024

Order granting consolidation: 31 October 2024

Summary

This complaint has been issued in the US District Court Southern District of New York by a number of regional and local newspapers (such as the New York Daily News and Chicago Tribune) and their publishers against OpenAI and Microsoft.

As with the complaint brought by The New York Times, examples are given of the GPT LLMs having 'memorised' copies of training data, as well as alleged hallucinations.  The complaint is for direct copyright infringement, vicarious copyright infringement, contributory copyright infringement (including in relation to end users, to the extent end users are liable as direct infringers), removal of copyright management information, common law unfair competition by misappropriation, trade mark dilution (in branding outputs generated by OpenAI's GPT-based products), and dilution and injury to business reputation.  

OpenAI and Microsoft have filed Motions to dismiss the ancillary claims (but not the core issue of whether using copyrighted content to train a generative AI model is fair use). The case has been consolidated with The New York Times complaint; the cases have also been consolidated with the claim brought by The Center for Investigative Reporting.

Impact

Describing themselves as a 'rare breed in America' in terms of providing local news coverage, the Plaintiffs cite the new threat posed to them by GenAI products.  But they also stress that this is not a battle between new and old technology but one that is based on alleged use of copyrighted newspaper content, without their consent and without what they see as fair payment.  The case should be tracked alongside the complaints brought by The New York Times and The Center for Investigative Reporting (the cases have been consolidated).

30 April 2024

Zhang v Google LLC

Jingna Zhang, Sarah Andersen, Hope Larson and Jessica Fink v Google LLC and Alphabet Inc.

US

Case: 5:24-cv-02531

Complaint against Alphabet and Google: 26 April 2024

Motion to dismiss complaint filed by Alphabet Inc, Google LLC: 20 June 2024

Opposition to Motion to Dismiss: 18 July 2024

Order relating case to J.L. v Alphabet Inc: 23 July 2024

Reply in Support of Motion to Dismiss Complaint: 1 August 2024

Motion to consolidate with Leovy filed by Google: 4 October 2024

Order granting consolidation: 28 October 2024

Summary

This class action complaint has been brought by a number of visual artists against Google (and its parent company Alphabet) in relation to its text-to-image diffusion models Imagen (announced in May 2022 but not immediately released to the public), Imagen 2 (released in December 2023) and multi-modal models trained on both images and text (such as Google Gemini).  The complaint is (only) for direct copyright infringement against Google and vicarious copyright infringement against Alphabet.  The complaint is based on an argument that the key source of Google's training data is the LAION image datasets.

The Defendants have filed a Motion to Dismiss in relation to works not named in the complaint, or not validly registered; the copyright infringement claim based on the theory that the Defendants' AI models are an infringing derivative work; and the vicarious infringement claim against Alphabet in its entirety.

The case has been consolidated with Leovy v Google.

Impact

This is the latest claim brought by visual artists, and includes as one of the Plaintiffs Sarah Andersen, who is also a Plaintiff in the action against Stability AI and other providers of text-to-image models.

20 March 2024

Nazemian v Nvidia

Abdi Nazemian, Brian Keene and Stewart O'Nan v Nvidia Corporation

US

Case: 5:24-cv-01454

Complaint: 8 March 2024

Answer to Complaint by Nvidia: 24 May 2024

Order relating case to Dubus v Nvidia: 29 May 2024

Summary

In this class action complaint filed by three authors against Nvidia in the US District Court Northern District of California San Francisco Division, the Plaintiffs have brought a claim of direct copyright infringement against Nvidia relating to its NeMo Megatron LLM series released in September 2022.

The complaint alleges that the Plaintiffs' registered copyright works were included in the training dataset used by Nvidia to develop its models. Each of the models is hosted on a website called Hugging Face, with a model card that provides information about the model, including its training dataset, in which it is stated that the model was trained on 'The Pile' dataset prepared by EleutherAI (the complaint therefore alleges that the LLM series was trained on one or more of the Plaintiffs' works).

Impact

This is another case relating to 'The Pile', one component of which is alleged to be a collection of books called Books3, derived from the Bibliotik 'shadow library' website. According to the complaint, the Books3 dataset was removed from Hugging Face in October 2023.

19 March 2024

O'Nan v Databricks

Stewart O'Nan, Abdi Nazemian and Brian Keene v Databricks, Inc., and MosaicML, Inc.

US

Case 3:24-cv-01451

Complaint: 8 March 2024

Answer to Complaint: 2 May 2024

Order relating case: 13 May 2024

Summary

In this class action filed by three authors against MosaicML (and its parent company Databricks) in the US District Court Northern District of California San Francisco Division, the Plaintiffs have brought a claim of direct copyright infringement relating to the training of MosaicML's Pretrained Transformer (MPT) models including MPT-7B and MPT-30B.  The complaint alleges that the MPTs were trained on a large quantity of data taken from a component dataset called 'RedPajama – Books', which was hosted on Hugging Face and whose 'Books' component is a copy of the "Books3 dataset", which is itself a component of The Pile dataset.  The complaint also alleges vicarious infringement against Databricks.

Impact

This is another case relating to 'The Pile', one component of which is alleged to be a collection of books called Books3, derived from the Bibliotik 'shadow library' website. The case has now been related to the Makkai claim against Databricks.

19 February 2024

In re ChatGPT Litigation: Tremblay v OpenAI (consolidated with Silverman v OpenAI and Chabon v OpenAI)

(1) Paul Tremblay & (2) Mona Awad v (1) OpenAI, Inc.; (2) OpenAI, L.P.; (3) OpenAI Gp, L.L.C., (4) OpenAI Opco, L.L.C. (5) OpenAI Startup Fund Gp I, L.L.C.; (6) OpenAI Startup Fund I, L.P.;(7) OpenAI Startup Fund Management, LLC 

US

Case 3:23-cv-03223

Complaint: 28 June 2023

Motion to dismiss by OpenAI: 28 August 2023

Opposition/Response to Motion to Dismiss: 27 September 2023

Reply re Motion to Dismiss: 11 October 2023

Order consolidating related cases: 9 November 2023

Order by Judge Araceli Martinez-Olguin granting in part and denying in part Motion to Dismiss: 12 February 2024

Motion to Intervene, enjoin Defendants and their Counsel from proceeding in substantially similar cases in the Southern District of New York: 8 February 2024

Defendants' Opposition/Response re Motion to Intervene, Enjoin Defendants and their Counsel: 22 February 2024  

Plaintiffs' Reply re Motion to Intervene, Enjoin Defendants and their Counsel: 29 February 2024

First Consolidated Amended Complaint against All Defendants: 13 March 2024

Motion to Dismiss First Consolidated Amended Complaint filed by OpenAI: 27 March 2024

Opposition/Response re Motion to Dismiss First Amended Complaint filed by Plaintiffs: 10 April 2024

Reply re Motion to Dismiss First Consolidated Amended Complaint filed by OpenAI: 17 April 2024

Answer to Amended Complaint by OpenAI: 27 August 2024

Order of Magistrate Judge Robert M Illman: 24 September 2024

Summary

Three cases against OpenAI have been consolidated (Tremblay v OpenAI, Silverman v OpenAI and Chabon v OpenAI). 

This class action claim has been brought by two authors as individual and representative Plaintiffs against OpenAI relating to its ChatGPT large language model (LLM). The claim has been brought in the US District Court for the Northern District of California (Mona Awad voluntarily applied for the dismissal of their claim on 11 August 2023).

The Plaintiffs allege that, during the training process of its LLMs, OpenAI copied "at least Plaintiff Tremblay’s book The Cabin at the End of the World; and Plaintiff Awad’s books 13 Ways of Looking at a Fat Girl and Bunny" without their permission. Further, they argue that "because the OpenAI Language Models cannot function without the expressive information extracted from Plaintiffs’ works (and others) and retained inside them, the OpenAI Language Models are themselves infringing derivative works, made without Plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act". The complaint also notes that, when prompted, ChatGPT generates summaries of the Plaintiffs' works.

Of particular relevance in this case are the datasets which OpenAI used in training its GPT models (with OpenAI having confirmed it had used datasets called Books1 and Books2, though it has not revealed the contents of those datasets).

In addition to direct and vicarious copyright infringement, the class action alleges violations of the Digital Millennium Copyright Act, unjust enrichment, violations of the California and common law unfair competition laws, and negligence.

On 28 August 2023, OpenAI filed a Motion to Dismiss a number (but not all) of the Plaintiffs' claims. In particular, the Motion to Dismiss does not relate to the direct copyright claim, where OpenAI relies on a defence of fair use.

OpenAI's motion to dismiss was heard on 7 December 2023. In an order of 12 February 2024, the Court dismissed a number of claims in the Complaint, but with leave to amend in relation to the claim to vicarious infringement and the copyright management information (CMI) claim (the claim to direct infringement was not included in the motion to dismiss).  

The First Consolidated Amended Complaint filed by the Plaintiffs alleges direct infringement and unfair competition. OpenAI has filed its Answer to the complaint, and also applied to dismiss the unfair competition claim.

The Court has put in place an inspection protocol in relation to OpenAI's training data which will be made available for inspection at OpenAI's San Francisco offices or other secure nearby location, subject to certain strict protocols and conditions. This is seen as particularly significant as it will be the first time that this information has been made available for inspection.  

Impact

The case has been consolidated with the Silverman and Chabon actions against OpenAI.  OpenAI has applied to dismiss a number of the claims. 

As OpenAI puts it in its Reply document, "the issue at the heart of this litigation is whether training artificial intelligence to understand human knowledge violates copyright law. It is on that question that the parties fundamentally disagree, and on which the future of artificial intelligence may turn".

18 February 2024

Silverman & ors v OpenAI (consolidated with Tremblay v OpenAI and Chabon v OpenAI)

(1) Sarah Silverman, (2) Christopher Golden & (3) Richard Kadrey v (1) OpenAI, Inc.; (2) OpenAI, L.P.; (3) OpenAI Gp, L.L.C., (4) OpenAI Opco, L.L.C. (5) OpenAI Startup Fund Gp I, L.L.C.; (6) OpenAI Startup Fund I, L.P.;(7) OpenAI Startup Fund Management, LLC 

US

Case 3:23-cv-03416

Complaint: 7 July 2023

Motion to Dismiss by OpenAI: 28 August 2023

Plaintiffs' Opposition to OpenAI's Motion to dismiss: 27 September 2023

OpenAI's Reply re Motion to dismiss: 11 October 2023

Order consolidating related cases: 9 November 2023

Order by Judge Araceli Martinez-Olguin granting in part and denying in part Motion to Dismiss: 12 February 2024

Summary 

This case has now been consolidated with Tremblay v OpenAI – see Tremblay entry for future updates.

In related proceedings to the complaint filed by Tremblay and Awad, the comedian Sarah Silverman, and other Plaintiffs as individual and representative plaintiffs have also brought proceedings against OpenAI relating to ChatGPT in the US District Court for the Northern District of California.

The Plaintiffs allege that, during the training process of its LLMs, OpenAI copied "at least Plaintiff Silverman’s book The Bedwetter; Plaintiff Golden’s book Ararat; and Plaintiff Kadrey’s book Sandman Slim" without Plaintiffs' permission. Further, it is argued that "because the OpenAI Language Models cannot function without the expressive information extracted from Plaintiffs’ works (and others) and retained inside them, the OpenAI Language Models are themselves infringing derivative works, made without Plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act".

In addition to direct and vicarious copyright infringement, the class action alleges violations of the DMCA, unjust enrichment, violations of the California and common law unfair competition laws, and negligence.

On 28 August 2023, OpenAI filed a Motion to Dismiss a number (but not all) of the Plaintiffs' claims. In particular, the Motion to Dismiss does not relate to the direct copyright claim, where OpenAI relies on a defence of fair use.

OpenAI's motion to dismiss was heard on 7 December 2023. In an order of 12 February 2024, the Court dismissed a number of claims in the Complaint, but with leave to amend in relation to the claim to vicarious infringement and the copyright management information (CMI) claim (the claim to direct infringement was not included in the motion to dismiss).  

Impact

The complaint has now been consolidated with the Tremblay and Chabon actions against OpenAI (see Tremblay action for further updates).

16 February 2024

Chabon & ors v OpenAI (consolidated with Tremblay v OpenAI and Silverman v OpenAI)

(1) Michael Chabon (2) David Henry Hwang (3) Matthew Klam (4) Rachel Louise Snyder (5) Ayelet Waldman v (1) OpenAI, Inc. (2) OpenAI, L.P. (3) OpenAI Opco, L.L.C. (4) OpenAI GP, L.L.C. (5) OpenAI Startup Fund Gp I, L.L.C. (6) OpenAI Startup Fund I, L.P. (7) OpenAI Startup Fund Management, LLC

US

Case 3:23-cv-04625

Amended Complaint: 5 October 2023

Order consolidating related cases: 9 November 2023

Summary

This case has now been consolidated with Tremblay v OpenAI – see Tremblay entry for future updates.

A third set of proceedings has been brought against OpenAI in the US District Court for the Northern District of California.  This claim has been brought by a group of authors, playwrights and screenwriters (on both an individual and representative basis), including Pulitzer Prize winning author for fiction, Michael Chabon.

As with the other claims against OpenAI, the claims include direct and vicarious copyright infringement, violations of the DMCA, violations of California unfair competition law, negligence and unjust enrichment.  

Impact

The claims against OpenAI have now been consolidated (see Tremblay action for further updates).

15 February 2024

Authors Guild & ors v OpenAI

(1) Authors Guild (2) David Baldacci (3) Mary Bly (4) Michael Connelly (5) Sylvia Day (6) Jonathan Franzen (7) John Grisham (8) Elin Hilderbrand (9) Christina Baker Kline (10) Maya Shanbhag Lang (11) Victor Lavalle (12) George R.R. Martin (13) Jodi Picoult (14) Douglas Preston (15) Roxana Robinson (16) George Saunders (17) Scott Turow (18) Rachel Vail v (1) OpenAI, Inc. (2) OpenAI, L.P. (3) OpenAI Gp, LLC (4) OpenAI Opco LLC (5) OpenAI Global LLC (6) OAI Corporation LLC (7) OpenAI Holdings LLC, (8) OpenAI Startup Fund I, L.P. (9) OpenAI Startup Fund GP I, LLC (10) OpenAI Startup Fund Management, LLC

US

Case 1:23-cv-8292 

Complaint: 19 September 2023

Amended Complaint: 5 December 2023

Amended Complaint (consolidated with Alter action): 5 February 2024

Motion to Intervene, and Dismiss, Stay or Transfer: 12 February 2024

Answer to First Consolidated Class Action Complaint by Microsoft: 16 February 2024  

Answer to First Consolidated Class Action Complaint by OpenAI: 16 February 2024  

Opposition to Motion to Intervene and Dismiss, Stay or Transfer by Microsoft: 26 February 2024  

Position Statement regarding Motion to Intervene and Dismiss, Stay or Transfer by OpenAI: 26 February 2024

Author Plaintiffs' Response to Motion to Intervene and Dismiss, Stay or Transfer: 26 February 2024

Reply to Response to Motion re Motion to Intervene and Dismiss, Stay or Transfer: 4 March 2024

Opinion & Order denying California Plaintiff's motions to intervene for purpose of transferring, staying or dismissing the New York actions: 1 April 2024

Notice of Interlocutory Appeal filed by California Plaintiffs: 15 April 2024

Order striking class allegations in the Basbanes complaint: 30 September 2024  

Order granting voluntary dismissal of appeal: 4 October 2024

Summary

This case has been consolidated with Alter v OpenAI. 

Following other class actions brought by authors against OpenAI, this case is particularly significant for a number of reasons. First, the plaintiffs include The Authors Guild, alongside 17 well-known Authors Guild members such as John Grisham, Jodi Picoult, Jonathan Franzen, George R.R. Martin, David Baldacci and Scott Turow. Secondly, unlike the other claims, this one has been brought in the Southern District of New York. Thirdly, whilst there is overlap in relation to the claims (in relation to direct copyright infringement, vicarious copyright infringement, contributory copyright infringement), other claims that have featured in the other cases against OpenAI have not been included.

On 5 February 2024, the Plaintiffs in this action, and in the Alter action, filed a consolidated class action complaint.  The Plaintiffs in the ChatGPT litigation have filed a Motion for this case, and others filed in the Southern District of New York, to be dismissed, or stayed/transferred to the Northern District of California but this application has been rejected.

Impact

The complaint tackles the question of 'fair use' head on, noting that there is "nothing fair" about what OpenAI has done, adding that its "unauthorized use of Plaintiffs' copyrighted works thus presents a straightforward infringement case applying well-established law to well-recognized copyright harms".  Whilst the other cases may be expected to settle, given that this case involves The Authors Guild, that seems much less likely here.

14 February 2024

Alter v OpenAI and Microsoft

Jonathan Alter, Kai Bird, Taylor Branch, Rich Cohen, Eugene Linden, Daniel Okrent, Julian Sancton, Hampton Sides, Stacy Schiff, James Shapiro, Jia Tolentino, and Simon Winchester v OpenAI, Inc., OpenAI GP, LLC, OpenAI, LLC, OpenAI Opco LLC, OpenAI Global LLC, OAI Corporation, LLC, OpenAI Holdings, LLC, and Microsoft Corporation

US

Case: 1:23-cv-10211

Complaint: 21 November 2023

Amended Complaint: 19 December 2023

Amended Complaint (consolidated with Authors Guild action): 5 February 2024

Motion to Intervene, and Dismiss, Stay or Transfer: 12 February 2024

Order striking class allegations in the Basbanes complaint: 30 September 2024

Order granting voluntary dismissal of appeal: 4 October 2024

Summary

This case has now been consolidated with Authors Guild v OpenAI (see Authors Guild entry for further updates).

This complaint is brought by a number of authors, on their own behalf and on behalf of a class against OpenAI and Microsoft, in the US District Court Southern District of New York. The claim is for infringement in the training of OpenAI and Microsoft's GPT models, as well as for contributory infringement by certain of the defendants.

On 5 February 2024, the Plaintiffs in this action, and in the Authors Guild action, filed a consolidated class action complaint.

Impact

The initial complaint's opening paragraph stated that "the basis of the OpenAI platform is nothing less than the rampant theft of copyrighted works".  The complaint also noted that, when ChatGPT was asked whether one of the authors' works had been included in its training data, it answered "Yes, Julian Sancton's book 'Madhouse at the End of the Earth' is included in my training data". This is the first class action author complaint against OpenAI that also cites Microsoft as a defendant.

13 February 2024

Basbanes & Ngagoyeanes v Microsoft and OpenAI

Nicholas A. Basbanes and Nicholas Ngagoyeanes (professionally known as Nicholas Gage) v Microsoft Corporation, OpenAI, Inc., OpenAI GP, L.L.C., OpenAI Holdings, LLC, OAI Corporation, LLC, OpenAI Global, LLC, OpenAI, L.L.C., and OpenAI OpCo, LLC

US

Case 1:24-cv-00084 

Complaint: 5 January 2024

Motion to consolidate cases (with Authors Guild and Alter actions): 22 January 2024

Motion to Intervene, and Dismiss, Stay or Transfer: 12 February 2024

Opinion & Order denying California Plaintiff's motions to intervene for purpose of transferring, staying or dismissing the New York actions: 1 April 2024

Order striking class allegations in the Basbanes complaint: 30 September 2024

Order granting voluntary dismissal of appeal: 4 October 2024

Summary

This class action complaint has been brought by two non-fiction authors / journalists against Microsoft and OpenAI in the US District Court Southern District of New York.  The complaint makes reference to that of the New York Times and is for direct copyright infringement, vicarious copyright infringement, and contributory copyright infringement.

Whilst the case was consolidated with the Authors Guild and Alter matters, it has now been stayed as the Plaintiffs have withdrawn the class allegations.

Impact

As the first AI infringement case issued in 2024, this case is the latest in a string of actions by authors against Microsoft and OpenAI, as well as the high profile complaint brought by The New York Times.  The complaint uses strong language describing the defendants as "no different than any other thief".

Legislative and policy developments

15 April 2024

USCO Notice of inquiry

US

Notice of inquiry and request for comments: 30 August 2023 (deadline for comments: extended to 6 December 2023)

Copyright and AI Report, Part 1: Digital Replicas: July 2024

Summary

As part of its study of the copyright law and policy issues raised by AI systems, the USCO sought written comments from stakeholders on a number of questions. It had received over 10,000 comments by December 2023. The questions cover the following areas:

  1. The use of copyrighted works to train AI models – the USCO notes that there is disagreement about whether or when the use of copyrighted works to develop datasets is infringing. It therefore seeks information about the collection and curation of AI datasets, how they are used to train AI models, the sources of materials and whether permission by / compensation for copyright owners should be required.
  2. The copyrightability of material generated using AI systems – the USCO seeks comment on the proper scope of copyright protection for material created using generative AI. It believes that the law in the US is clear that protection is limited to works of human authorship but notes that there are questions over where and how to draw the line between human creation and AI-generated content. For example, a human's use of a generative AI tool could include sufficient control over the technology – e.g., through selection of training materials, and multiple iterations of prompts – to potentially result in output that is human-authored. The USCO notes that it is working separately to update its registration guidance on works that include AI-generated materials.
  3. Potential liability for infringing works generated using AI systems – the USCO is interested to hear how copyright liability principles could apply to material created by generative AI systems.  For example, if an output is found to be substantially similar to a copyrighted work that was part of the training dataset, and the use does not qualify as fair use, how should liability be apportioned between the user and the developer?
  4. Issues related to copyright – lastly, as a related issue, the USCO is also interested to hear about issues relating to AI-generated materials that feature the name or likeness, including vocal likeness, of a particular person; and also in relation to AI systems that produce visual works 'in the style' of a specific artist.

In July 2024, the USCO published Part 1 of its Report on Copyright and Artificial Intelligence, focusing on Digital Replicas (also called 'deepfakes').  Based on the input received, the USCO has concluded that a new federal law is needed to deal with unauthorised digital replicas, as existing laws do not provide sufficient legal redress. This would cover all individuals, not just celebrities. However, whilst the paper also notes that creators have concerns over AI outputs that deliberately imitate an artist's style, it does not recommend including style in the coverage of the new legislation at this time.    

Separately, a No Fakes Bill (Nurture Originals, Foster Art and Keep Entertainment Safe Bill) has also been proposed in the US Senate. The No Fakes Bill also proposes to enact federal protection for the voice and visual likeness of individuals. The Bill is endorsed by a number of associations representing performers and rights holders, and from within the creative community.

Impact

The issues raised in the Notice are wide-ranging and some are before the Courts for determination. One key issue to resolve is whether the use of AI in generating works could be regarded as akin to using a tool such as a typewriter to create a manuscript. Using a typewriter does not result in the manuscript being uncopyrightable, in the same way that using Photoshop does not result in a photo taken by a photographer being uncopyrightable. This is the approach that GitHub takes in respect of its Copilot service (for example) where it notes that "Copilot is a tool, like a compiler or pen" and, as a result, its position is that the code produced by GitHub Copilot should belong to the individual who used the tool. However, again, the legal position as to authorship/ownership is not so clear-cut. Whilst GitHub has no interest in owning Copilot-generated source code that is incorporated into a developer's works, it is not clear whether Copilot's terms of use effectively assign IP rights to the developer. It is also not clear whether the use of extensive and carefully worded prompts could ever result in someone being able to claim copyright in the material generated by an AI tool, on the basis that they, as author, have ultimate creative control over the work. The USCO had previously considered this in its Statement of Practice. These are just a few issues on which clarity is needed.

11 April 2024

The Generative AI Copyright Disclosure Bill

US

Introduced by Representative Adam Schiff: 9 April 2024

Summary

Introduced by Democratic Representative Adam Schiff, The Generative AI Copyright Disclosure Act would require a notice to be submitted to the Register of Copyrights prior to a new generative AI system being released, providing information on all copyrighted works used in building or altering the training dataset. It would also apply retroactively to existing genAI systems.

Impact

The Bill has attracted widespread support from across the creative community including from industry associations and Unions such as the Recording Industry Association of America, Copyright Clearance Center, Directors Guild of America, Authors Guild, National Association of Voice Actors, Concept Art Association, Professional Photographers of America, Screen Actors Guild-American Federation of Television and Radio Artists, Writers Guild of America West, Writers Guild of America East, American Society of Composers, Authors and Publishers, American Society for Collective Rights Licensing, International Alliance of Theatrical Stage Employees, Society of Composers and Lyricists, National Music Publishers Association, Recording Academy, Nashville Songwriters Association International, Songwriters of North America, Black Music Action Coalition, Music Artist Coalition, Human Artistry Campaign, and the American Association of Independent Music.

12 February 2024

UK approach to text and data mining

UK

UKIPO Code of Practice: On 6 February 2024, the UK Government confirmed it had not been possible to reach an agreement on a voluntary Code of Practice

Summary

In 2021, the UK Intellectual Property Office (UKIPO) consulted on potential changes to the UK's IP framework as a result of AI developments (importantly, this was before the increased levels of interest following the launch of ChatGPT etc).

In particular, a number of policy options were considered relating to the making of copies for the purposes of text and data mining (TDM), a crucial tool in the development and training of AI tools. Currently, an exception is in place under UK copyright law to allow copying for the purposes of TDM, but only where it is for the purpose of non-commercial research, and only where the researcher has lawful access to the works.

Alongside retaining the current exception, or simply improving the licensing environment for relevant works, the consultation sought views on three alternative options:

  • Extend the TDM exception to cover commercial research.  
  • Adopt a TDM exception for any use, with a right-holder opt-out – modelled on the recent TDM exception introduced in the EU. This would provide rights holders with the right to opt-out individual works, sets of works, or all of their works if they do not want them to be mined.
  • Adopt a TDM exception for any use, with no right-holder opt-out – similar to an exception in Japan for information analysis, and also in Singapore.

In June 2022, the UKIPO published the then Government’s response to the consultation, which was in favour of the widest and most liberal of the options under discussion, i.e., a TDM exception for any use, with no right-holder opt-out. Specifically, it was noted that the widening of the exception would ensure that the UK's copyright laws were "among the most innovation-friendly in the world", allowing "all users of data mining technology [to] benefit, with rights holders having safeguards to protect their content". The main safeguard identified for rights holders was the requirement for lawful access.

Following widespread criticism, however, in particular relating to concerns from the creative industries, the then Minister for Science, Research and Innovation confirmed in February 2023 that the proposals would not proceed.

However, following the Sir Patrick Vallance Pro-Innovation Regulation of Technologies Review on Digital Technologies, which called upon the Government to announce a clear policy position, the Conservative Government's response confirmed that it had asked the UKIPO to produce a code of practice. The code of practice was intended to provide balanced and pragmatic guidance to AI firms to access copyright-protected works as an input to their models, whilst ensuring protections are in place on generated outputs to support right holders such as labelling. The Government suggested that an AI firm that committed to the code of practice could expect to have a reasonable licence offered by a rights holder. If a code of practice could not be agreed or adopted, however, legislation may have to be implemented.

In an interim report on governance of AI by the House of Commons Science, Innovation and Technology Committee (dated 31 August 2023), 'the Intellectual Property and Copyright Challenge' was identified as one of the 12 challenges of AI governance. Representatives of the creative industries reported to the Committee that they hoped to reach a mutually beneficial solution with the AI sector, potentially in the form of a licensing framework. Meanwhile, in its report on Connected tech: AI and creative technology (dated 30 August 2023), the House of Commons Culture, Media and Sport Committee welcomed the former Government's rowing back from a broad TDM exception, suggesting that it should proactively support small AI developers, in particular, who may find it difficult to acquire licences, by considering how licensing schemes can be introduced for technical material and how mutually beneficial arrangements can be agreed with rights management organisations and creative industry bodies. Further, it stressed to the Government that it "must work to regain the trust of the creative industries following its abortive attempt to introduce a broad text and data mining exception".

In its response to the House of Commons Culture, Media and Sport Committee's report on AI and the creative industries, the former Government confirmed that it was not proceeding with a wide text and data mining exception and reiterated its commitment to developing a code of practice to "enable the AI and creative sectors to grow in partnership". 

In its report on 'Large Language Models and Generative AI' (published 2 February 2024), the House of Lords Communications and Digital Committee noted that the voluntary IPO-led process was welcome and valuable but that debate could not continue indefinitely: if the process remained unresolved by Spring 2024, the Government must set out options and prepare to resolve the dispute definitively, including legislative change if necessary. However, following reports in The Financial Times that the code of practice had been shelved, the Government confirmed this in its response to the AI White Paper consultation published on 6 February 2024.

Impact

Following the change of Government, monitor closely for the new Government's proposals in relation to AI, both generally and in relation to the treatment of copyright works. Whilst the King's Speech made reference to the Government intending to "…seek to establish the appropriate legislation to place requirements on those working to develop the most powerful artificial intelligence models", no further information has yet been provided. Given the reference to 'appropriate legislation', we anticipate that there will be further consideration of this issue, and the Government has indicated that it expects to resolve the issue by the end of the year.

8 February 2024

UK approach to copyright protection of computer-generated works

UK

Monitor for developments

Summary

In contrast to the approach adopted in most other countries, copyright is available in the UK to protect computer-generated works (CGWs) where there is no human creator. The author of such a work is deemed to be the person by whom the necessary arrangements for the creation of the work were undertaken, and protection lasts for 50 years from the date when the work was made.

How this applies in relation to content created with generative AI is currently untested in the UK. In its consultation in 2021, the Government sought to understand whether the current law strikes the right balance in terms of incentivising and rewarding investment in AI creativity.

Some have criticised the UK provision for being unclear and contradictory – a work, including a CGW, must be original to be protected by copyright, but the test for originality is defined by reference to human authors, and by reference to human traits such as whether it reflects their 'free and expressive choices' and whether it contains their 'stamp of personality'. 

From an economic perspective, meanwhile, it has been argued that providing copyright protection for CGWs is excessive because the incentive argument for copyright does not apply to computers. Further, some argue from a philosophical viewpoint that copyright should be available to protect only human creations, and that granting protection for CGWs devalues the worth of human creativity.

The consultation proposed the following three policy options, with the Government ultimately deciding to adopt the first option of making no change to the existing law at present:

  • Retain the current scheme of protection for CGWs
  • Remove protection for CGWs
  • Introduce a new right of protection for CGWs, with a reduced scope and duration

Impact

Having consulted, the Government decided to make no changes to the law providing copyright protection for CGWs where there is no human author, but said that this was an area that it would keep under review. In particular, it noted that the use of AI in the creation of these works was still in its infancy, and therefore the impact of the law, and any changes to it, could not yet be fully evaluated.

In view of recent developments, it is clear that this policy approach may need to be revisited sooner rather than later.

We discussed this and the comparison with the approach in the US in our article here (and see further below).

Summary

On 12 July 2024, the EU AI Act was published in the Official Journal of the EU. Now that it has been published, the compliance deadlines can be calculated as set out below.

In relation to copyright, the Act contains provisions relating to obligations on general-purpose AI systems around compliance with EU copyright law (including relating to text and data mining and opt-outs under the EU Digital Single Market Copyright Directive) and transparency around content used to train such models (in the form of sufficiently detailed summaries, which will be by reference to a form template to be published by the proposed AI Office). There is also a requirement that certain AI-generated content (essentially 'deep fakes') be labelled as such.

Impact

The Act will enter into force 20 days after publication in the Official Journal (i.e., on 1 August 2024), and be fully applicable 24 months after its entry into force, i.e., on 2 August 2026 (though certain provisions will be applicable sooner, and others at 36 months). There are staggered dates for when different parts of the Act will take effect:

  • 6 months after coming into force, provisions concerning banned AI practices take effect (i.e. 2 February 2025)
  • 1 year after coming into force, provisions on penalties, confidentiality obligations and general-purpose AI take effect (i.e. 2 August 2025)
  • 2 years after coming into force, the remaining provisions take effect (i.e. 2 August 2026)
  • 3 years after coming into force, obligations for high-risk AI systems forming a product (or safety component of a product) regulated by EU product safety legislation apply (i.e. 2 August 2027)
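
The staggered dates above can be reproduced with simple date arithmetic. The following is a minimal sketch only: the Act itself fixes the application dates, and the "day after the anniversary" rule used here is an assumption chosen because it matches the dates listed in this tracker. The names `add_months`, `application_date` and `MILESTONES` are hypothetical helpers for illustration.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Return the date `months` calendar months after `d`,
    clamping the day to the last day of the target month."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Publication in the Official Journal, as stated in the text.
PUBLICATION = date(2024, 7, 12)

# Entry into force: 20 days after publication.
ENTRY_INTO_FORCE = PUBLICATION + timedelta(days=20)

# Months after entry into force for each group of provisions (from the list above).
MILESTONES = {
    "Banned AI practices": 6,
    "Penalties, confidentiality and general-purpose AI": 12,
    "Remaining provisions": 24,
    "High-risk AI systems in regulated products": 36,
}


def application_date(months_after_entry: int) -> date:
    # Assumption: provisions apply from the day after the N-month
    # anniversary of entry into force, which matches the listed dates.
    return add_months(ENTRY_INTO_FORCE, months_after_entry) + timedelta(days=1)


deadlines = {label: application_date(m) for label, m in MILESTONES.items()}

for label, when in deadlines.items():
    print(f"{when.isoformat()}  {label}")
```

For any real compliance planning, the dates should be taken from the text of the Act rather than from a calculation of this kind.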

3 January 2024

USCO Statement of Policy

US

USCO Statement of Policy: 10 March 2023

Summary

In March 2023, the US Copyright Office published a Statement of Policy setting out its approach to registration of works containing material generated by AI.

The guidance states that only the human-created parts of a generative AI work are protected by copyright. Accordingly, only where a human author arranges AI-generated material in a sufficiently creative way that ‘the resulting work as a whole constitutes an original work of authorship’ or modifies AI-generated content ‘to such a degree that the modifications meet the standard for copyright protection,’ will the human-authored aspects of such works be potentially protected by copyright.

This statement follows a decision by the USCO on copyright registration for Zarya of the Dawn ('the Work'), an 18-page graphic novel featuring text alongside images created using the AI platform Midjourney. Originally, the USCO issued a copyright registration for the graphic novel before undertaking investigations which showed that the artist had used Midjourney to create the images. Following this investigation (which included viewing the artist’s social media), the USCO cancelled the original certificate and issued a new one covering only the text as well as the selection, coordination, and arrangement of the Work’s written and visual elements. In reaching this conclusion, the USCO deemed that the artist’s editing of some of the images was not sufficiently creative to be entitled to copyright as a derivative work.

Impact

The boundaries drawn by the USCO in relation to works created by generative AI confirm there are challenges for those that wish to obtain protection for such works. Developments should continue to be tracked, including in relation to ongoing litigation (see above).
