
tl;dr
Reddit has filed a lawsuit against AI company Anthropic, accusing it of scraping Reddit content without permission to train its Claude AI model. The complaint alleges Anthropic violated Reddit’s user agreement and accessed Reddit servers over 100,000 times even after publicly claiming to have stopped in July 2024. Reddit seeks damages and a court order prohibiting Anthropic from using Reddit-derived data in any products or licensing such data for profit.
The lawsuit portrays Anthropic as having "two faces": a public one promoting responsibility and legal respect, and a private one disregarding rules to profit from Reddit content. This case highlights ongoing disputes about the use of copyrighted and user-generated materials in training large language models (LLMs).
Since the rise of AI tools like OpenAI’s ChatGPT, legal battles have intensified. Multiple lawsuits target AI companies, including a notable 2023 case by The New York Times against OpenAI and Microsoft. Plaintiffs often include artists, authors, and record labels, who argue that their creative works were exploited without consent or compensation.
Anthropic faces additional lawsuits over the use of copyrighted song lyrics and pirated books as training data. Cultural concerns also run high: artists are outraged by AI models that mimic their styles, as in the craze for replicating Studio Ghibli’s animation style, which sparked fears of copyright infringement and loss of artistic income.
OpenAI has admitted to using copyrighted materials in training, claiming that such use is lawful and essential to developing advanced AI systems. Yet proposals to relax copyright laws for LLM training, such as the recent UK initiative, have met resistance from prominent figures, including Elton John.
While Reddit vocally opposes unauthorized use of its content, it supports AI training with proper compensation. The company has licensed its user data to major tech players like OpenAI, Google, Sprinklr, and Cision, reflecting a pragmatic approach to the evolving AI landscape.