Reddit has filed a lawsuit against Perplexity and three data scraping companies for unauthorized access to its user-generated content. The complaint alleges that Perplexity used scraped data to power its AI “answer engine” without permission. Reddit is asking the court to halt these activities, which it calls unlawful and harmful to its platform.
Reddit accuses Perplexity and the three scraping companies, Oxylabs, SerpApi, and AWMProxy, of bypassing technical protections to scrape its content. The lawsuit compares their actions to bank robbery, describing the scrapers as breaking into “armored trucks” of data. Reddit claims these companies masked their identities and used deceptive tactics to collect content through Google.
According to Reddit, Perplexity is a customer of at least one of these scraping providers. The platform says Perplexity pursued data “at any cost” instead of negotiating a license, as other firms have done. In Reddit’s words, “Perplexity will apparently do anything to get the Reddit data it desperately needs.”
Reddit argues that Perplexity continued scraping even after receiving a cease-and-desist letter in May 2024. According to the lawsuit, Reddit observed more of its content appearing in Perplexity’s responses after the warning. Reddit also created a trap post visible only to Google and alleges that Perplexity quickly indexed and used it.
The lawsuit alleges that Perplexity obtained Reddit content by scraping Google’s search results using proxy services. Reddit states this practice circumvents its robots.txt protections and violates its terms of service. The platform highlights that this method constitutes “industrial-scale” data laundering by bad actors.
Reddit’s legal filing names three co-defendants alongside Perplexity: Oxylabs, a Lithuanian firm; SerpApi; and AWMProxy, a former Russian botnet. Reddit’s chief legal officer, Ben Lee, said, “Defendants…are textbook examples of this illegal behavior.” He emphasized their use of identity-masking tactics to gather Reddit’s content illegally.
The complaint argues that Perplexity knowingly benefited from these scraping services instead of seeking a legitimate data partnership. Reddit underlines its past licensing agreements with OpenAI and Google as examples of proper engagement. It maintains that others, like Perplexity, are cutting corners to access valuable training data.
Reddit positions its user-generated discussions as critical human content for training modern AI systems. It says this content is dynamic, ranked by people, and valuable for developing accurate AI models. Reddit emphasizes it must be compensated for this data as usage increases across the tech industry.
The company previously altered its API access terms in 2023 to charge for commercial use of its data. That decision sparked widespread protests but ultimately reinforced Reddit’s push for fair compensation. Now, Reddit argues this lawsuit is a necessary step to protect its content and rights.
Perplexity responded through a spokesperson, stating it had not received the complaint but plans to defend access to public knowledge. The company said, “Our approach remains principled and responsible as we provide factual answers with accurate AI.” It added that it will oppose threats against openness and public interest.
Reddit has previously filed similar complaints, including legal action against Anthropic over content scraping issues. This new case highlights the growing tension between content platforms and AI companies seeking access to data.
The post Reddit Files Lawsuit Against Perplexity Over Unlawful Data Scraping appeared first on Blockonomi.