tl;dr

Perplexity AI's web crawlers ignored explicit blocks from tens of thousands of websites, leading Cloudflare to delist Perplexity from its verified bot program and block its deceptive scraping practices. Perplexity used stealth tactics like generic browser user-agents, rotating IP addresses, and auto...

Perplexity AI's web crawlers continued accessing content from tens of thousands of websites despite explicit blocks, according to Cloudflare. In response, Cloudflare delisted Perplexity from its verified bot program and began blocking what it deemed deceptive scraping practices. Founded in 2022 by AI and tech industry veterans, Perplexity recently raised $100 million at an $18 billion valuation.

The conflict escalated after Cloudflare customers reported that Perplexity was ignoring robots.txt directives and firewall rules designed to block its declared crawlers. Cloudflare engineers confirmed that when Perplexity’s declared crawlers were blocked, the company switched to stealth tactics, including a generic browser user-agent that impersonates Google Chrome on macOS.
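To illustrate why a generic browser user-agent slips past a robots.txt rule aimed at a declared crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `PerplexityBot` token, the Chrome-on-macOS string, and the `example.com` URL are illustrative assumptions, not taken from the article or from any site's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one declared crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative user-agent strings (assumptions for this sketch).
DECLARED_UA = "PerplexityBot"
GENERIC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str = "https://example.com/article") -> bool:
    """Return True if the sample robots.txt permits this user-agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print("declared crawler allowed:", is_allowed(DECLARED_UA))
    print("generic browser UA allowed:", is_allowed(GENERIC_UA))
```

Because robots.txt matches on the name a client chooses to announce, the rule only binds crawlers that identify themselves honestly. That is why the reported switch to a browser-like user-agent matters.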

These undeclared crawlers used evasion techniques such as rotating through IP addresses outside Perplexity’s official range and switching between autonomous system numbers (ASNs) to bypass blocks. Perplexity’s declared crawlers generate 20-25 million requests a day, while the stealth crawlers add another 3-6 million, spread across tens of thousands of domains.
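The limits of IP-based blocking can be sketched with Python's standard `ipaddress` module. The ranges below are reserved documentation blocks standing in for a crawler's published range. They are placeholders, not Perplexity's real addresses, and `from_declared_range` is a hypothetical helper name.

```python
import ipaddress

# Placeholder "published" crawler ranges (RFC 5737 documentation blocks).
# These are assumptions for illustration, not any company's real IP list.
DECLARED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def from_declared_range(ip: str) -> bool:
    """Return True if ip falls inside one of the published crawler ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in DECLARED_RANGES)


if __name__ == "__main__":
    # A request from inside the published range is easy to recognize and block.
    print(from_declared_range("192.0.2.10"))
    # A request from an unlisted address passes this check untouched.
    print(from_declared_range("203.0.113.7"))
```

A filter of this kind only catches traffic that comes from the addresses a crawler has disclosed. Requests sent from unlisted addresses or other networks need a different signal, such as behavioral or fingerprint-based detection.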

Cloudflare CEO Matthew Prince called AI companies' extraction of web content unsustainable, pointing to a sharp decline in search referral traffic as users increasingly rely on AI summaries. He said AI companies crawl vastly more pages per visitor they send back than traditional search engines do, with the visitor-to-crawl ratios of OpenAI and Anthropic deteriorating.

In response, Cloudflare launched "Content Independence Day," blocking AI crawlers by default on new domains and enabling more than a million websites, including major publishers such as The Associated Press and BuzzFeed, to block unwanted crawlers. Cloudflare insists that crawlers be transparent, serve a clear purpose, and respect website directives.

In contrast to Perplexity’s tactics, Cloudflare praised OpenAI for respecting robots.txt rules and halting its crawling when blocked. To combat deceptive crawling, Cloudflare has made signature-based blocks for the stealth crawlers available to all customers. It is also developing tools such as "AI Labyrinth," which traps non-compliant bots, and a "pay-per-crawl" marketplace that lets publishers charge AI companies for access to their content.

Disclaimer

The opinions expressed by the writers at Grow My Bag are their own and do not reflect the official stance of Grow My Bag. The content provided on our site is not intended as investment advice, and Grow My Bag is not an investment advisor. We do not endorse buying or selling any cryptocurrencies or digital assets mentioned in our articles. High-risk investments in Bitcoin, cryptocurrencies, and digital assets require thorough due diligence, and all transfers and trades made are at your own risk. Grow My Bag is not responsible for any potential losses and participates in affiliate marketing.
 5 Aug 25