AI v. IP: Potential Fiasco Looming with New Wave of Litigation Against AI Platforms
Numerous new actions have been brought against AI platforms within the past few months. Central to the complaints are allegations of copyright infringement arising both from the training of the AI models involved and from the content those models generate.
Claimed Clashes with Copyrights (and More)
AI models require vast amounts of information for effective training. Such training typically involves processing and learning patterns from extensive datasets, which may include material subject to third-party copyrights, such as text, images, and audio works. As AI platforms collect and use this data, legal questions arise concerning potential IP infringement stemming not only from the models’ use and generated content, but also from the collection of the training data itself.
AI Large Language Models
The latest batch of lawsuits comprises three putative class action complaints filed by authors. Two of these complaints were filed against OpenAI LP (owner of ChatGPT) and related entities, while a third was filed against Meta.[1] All three disputes involve large language models (LLMs), a specific type of AI model focused on natural language processing and generation. While these are among the first actions brought by authors of books, they are not the first lawsuits targeting generative AI models.
In these actions involving LLMs, the authors essentially allege that defendants OpenAI and Meta infringed the authors’ works by copying their books without permission to train the subject LLMs. The allegations are further complicated by the authors’ contention that, because the LLMs purportedly cannot function without the expressive information extracted from the authors’ (as well as others’) works, which is also allegedly retained within the models, both the output generated by the LLMs and the LLMs themselves are infringing derivative works.
AI Image Models
In a putative class action brought by artists against Stability AI (US and UK entities), Midjourney, and DeviantArt, involving the text-to-image AI model Stable Diffusion, the plaintiffs allege that the defendants used their copyrighted images without permission to train the AI model.[2] The plaintiffs further allege that the AI model stored copies of the images, thereby infringing their copyrighted works, and that the generated output images are also necessarily infringing derivative works. Separately, in a complaint filed by Getty Images (US), Inc. against Stability AI (US and UK entities), Getty alleges that, beyond the more than 12 million images (along with associated text and metadata) infringed upon in the AI model’s training and generated output, some output images contain distorted versions of Getty’s watermark.[3] The appearance of the watermark is claimed to falsely imply an association with Getty and to otherwise implicate trademark infringement and related causes of action, such as alleged dilution and tarnishment (e.g., as a result of the “bizarre” or “grotesque” images sometimes produced by Stable Diffusion).
AI Software Development Models
Anonymous software code owners have also filed class actions (since consolidated) against GitHub, Microsoft, and OpenAI involving AI models for software code development, namely OpenAI’s Codex and GitHub’s Copilot.[4] The anonymous plaintiffs claim that the AI models were trained on a large corpus of software-related data, including copyrighted materials offered under specific open-source licenses. The coders further allege that the training and operation of the AI models not only violate their IP rights, but also breach the terms of the applicable software license agreements.
In all of the aforementioned cases, plaintiffs also accuse the defendants of removing copyright management and attribution information from their works, in alleged violation of the Digital Millennium Copyright Act.
Navigating Challenges and Legal Implications
Governmental bodies in the United States and abroad appear to recognize the risks and are seeking ways to regulate AI. The European Union (EU), for instance, has proposed its own risk-based Artificial Intelligence Act, known as the “EU AI Act,” which may be enacted by year’s end. Further, several US federal financial regulators have released requests for information on industry use of AI, and several agencies, including the Federal Trade Commission (FTC), have expressed their commitment to enforcing their respective laws and regulations to promote responsible innovation in automated systems.
As one might expect, regulation is fraught with challenges. For example, despite the US Copyright Office’s release this past March of its official position that only “human-authored” works are eligible for copyright protection, many questions remain unresolved, including international implications. That said, there are currently no US federal laws that specifically regulate AI or its applications; however, activity at the White House and the series of recent hearings held by the US House of Representatives’ Committee on Science, Space, and Technology regarding AI suggest that change may be on the horizon.
While the apparent absence of specific regulations does not mean there is no legal authority for addressing challenges, the emergence of AI technology raises difficult questions concerning the legality of both the training and the output of generative AI models. Whether AI-generated content ultimately will be found infringing or fair use under current law is yet to be determined and likely will be case-specific. Meanwhile, companies developing AI-based products, as well as users of such products, should consider the evident interplay of IP rights in the context of both AI model training and output generation.
Next Steps
Understanding the interplay of IP rights, particularly copyright laws, may be crucial in determining legal implications. For example, an examination of fair use, along with applicable licensing agreements or other terms of use, may shed light on possible risks associated with AI model training. Further, caution should be exercised as to the source, method of gathering, and possible storage of training data, as these elements may prove decisive in determining liability. Lastly, the concepts of fair use and transformative use may become pivotal when AI tools are used to generate content. An analysis of these and other considerations may empower AI developers and users alike to make informed decisions and reduce the risk of potential liability. IP owners, for their part, are well advised to strategically evaluate how they are protecting their IP and to regularly review how third-party platforms and AI-generated content may be incorporating it, so that they can respond appropriately.
Our team will be monitoring the evolving legal issues surrounding generative AI, as well as significant regulatory developments and legislation concerning AI technology, for both IP owners and users. Please contact your ArentFox Schiff LLP attorney or any of the authors with questions or concerns.
[1] See Tremblay et al. v. OpenAI, Inc. et al., No. 4:23-cv-03223 (N.D. Cal. filed June 28, 2023); Silverman et al. v. OpenAI, Inc. et al., No. 4:23-cv-03416 (N.D. Cal. filed July 7, 2023); Kadrey et al. v. Meta Platforms, Inc., No. 3:23-cv-03417 (N.D. Cal. filed July 7, 2023).
[2] See Andersen et al. v. Stability AI Ltd. et al., No. 3:23-cv-00201 (N.D. Cal. filed January 13, 2023).
[3] See Getty Images (US), Inc. v. Stability AI, Inc. et al., No. 1:23-cv-00135 (D. Del. filed February 3, 2023).
[4] See Doe 1 et al. v. GitHub, Inc. et al., No. 4:22-cv-06823 (N.D. Cal. filed November 3, 2022) (lead/operative case); Doe 3 v. GitHub, Inc., No. 4:22-cv-07074 (N.D. Cal. filed November 10, 2022) (consolidated).