Research Guides: AI Literacy in the Age of ChatGPT: Copyright issues

Copyright issues

The output of generative AI

Can you copyright something you made with AI?
OpenAI says:
"... you own the output you create with ChatGPT, including the right to reprint, sell, and merchandise – regardless of whether output was generated through a free or paid plan."

The U.S. Copyright Office says:
The term “author” ... excludes non-humans.

But, if you select or arrange AI-generated material in a sufficiently creative way... In these cases, copyright will only protect the human-authored aspects of the work. For an example, see this story of a comic book. The U.S. Copyright Office determined that the selection and arrangement of the images IS copyrightable, but not the images themselves (made with generative AI).

In other countries, different rulings may apply. See:
Chinese Court’s Landmark Ruling: AI Images Can be Copyrighted

The input to generative AI (training data)
Should it be considered fair use? This is widely debated.

Argument A. No, it's a copyright violation
Copyright law is AI's 2024 battlefield - "Copyright owners have been lining up to take whacks at generative AI like a giant piñata woven out of their works. 2024 is likely to be the year we find out whether there is money inside," James Grimmelmann, professor of digital and information law at Cornell, tells Axios. "Every time a new technology comes out that makes copying or creation easier, there's a struggle over how to apply copyright law to it."

This will affect not only OpenAI but also Google, Microsoft, and Meta, since they all use similar methods to train their models.

Argument B. Yes, it's fair use

Thoughts from the Association of Research Libraries
Training Generative AI Models on Copyrighted Works Is Fair Use

Thoughts from Creative Commons:
Fair Use: Training Generative AI - Stephen Wolfson
Better Sharing for Generative AI - Catherine Stihler

Thoughts from UC Berkeley Library Copyright Office
UC Berkeley Library to Copyright Office: Protect fair uses in AI training for research and education

Thoughts from the Electronic Frontier Foundation (EFF):
AI Art Generators and the Online Image Market - Katharine Trendacosta and Cory Doctorow
How We Think About Copyright and AI Art - Kit Walsh
“Done right, copyright law is supposed to encourage new creativity. Stretching it to outlaw tools like AI image generators—or to effectively put them in the exclusive hands of powerful economic actors who already use that economic muscle to squeeze creators—would have the opposite effect.”

Other countries

Japan Goes All In: Copyright Doesn't Apply To AI Training
The Israel Ministry of Justice has issued an opinion: the use of copyrighted materials in the machine learning context is permitted under existing Israeli copyright law.

Several corporations have offered to pay legal bills of users of their tools
Adobe, Google, Microsoft, and Anthropic (for Claude) have offered to pay any legal bills from lawsuits against users of their tools.

Legal opinions

It will be difficult for ordinary users to claim compensation for the use of their data in training, according to IP lawyers like Katherine Gardner
“When you put content on a social media site or any site, you’re generally granting a very broad license to the site to be able to use your content in any way,” Gardner said. “It’s going to be very difficult for the ordinary end user to claim that they are entitled to any sort of payment or compensation for use of their data as part of the training.”

Rebecca Tushnet studies and teaches copyright and trademark law as the Frank Stanton Professor of the First Amendment at Harvard Law School. In this interview in the Harvard Gazette, she talks about some of the broader legal issues around emerging tech.

Prof. Matthew Sag Testimony on Copyright and AI (PDF), Testimony before the U.S. Senate Committee on the Judiciary, Subcommittee on Intellectual Property, July 12, 2023.

Letter to the U.S. Copyright Office from the Library Copyright Alliance (The American Library Association and The Association of Research Libraries). Oct. 31, 2023. (download the PDF) Supports the idea that training data for generative AI should be considered fair use.

It's difficult to remove data from an AI once it's been trained.

According to Google, "Fully erasing the influence of the data requested to be deleted is challenging since, aside from simply deleting it from databases where it’s stored, it also requires erasing the influence of that data on other artifacts such as trained machine learning models."

To address this, Google has announced a Machine Unlearning Challenge, a competition for researchers to foster novel solutions to this problem.

Copyright is only one lens through which to consider generative AI

Creative Commons defends better sharing and the commons in WIPO conversation on generative AI

“In particular, since all creativity builds on the past, copyright needs to continue to leave room for people to study, analyze and learn from previous works to create new ones, including by analyzing past works using automated means.

Mr. Chair, copyright is only one lens through which to consider generative AI. Copyright is a rather blunt tool that often leads to black-and-white solutions that fall short of harnessing all the diverse possibilities that generative AI offers for human creativity. Copyright is not a social safety net, an ethical framework, or a community governance mechanism — and yet we know that regulating generative AI needs to account for these important considerations if we want to support our large community of creators who want to contribute to enriching a commons that truly reflects the world’s diversity of creative expressions.”

Is A.I. the Death of I.P.?
Generative A.I. is the latest in a long line of innovations to put pressure on our already dysfunctional copyright system.
The New Yorker, Louis Menand, January 15, 2024
Interesting review of the book, Who Owns This Sentence?: A History of Copyrights and Wrongs by Bellos and Montagu
