Additional commentary by Emily Lyons
Welcome back to another edition of AI Pulse, our blog series where we share AI bot traffic insights observed across the Akamai network. If you missed any of the previous blog posts in the AI Pulse series, be sure to check them out.
For our final post of the year, we’re digging into two often overlooked topics: how AI bots actually surface content, and the persistent challenge of evasive scrapers. (And don’t worry; the AI Pulse series will be back in 2026.)
If your content is online, AI will likely try to find it
Many businesses still believe their content is unique: different enough, or gated enough, to be excluded from AI systems. Some think that blocking a crawler or updating robots.txt is enough to keep their content out of AI's reach. Others simply trust that AI bots will respect the rules placed in front of them.
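For reference, a robots.txt-based attempt usually looks something like the sketch below. The user-agent tokens shown (GPTBot, ClaudeBot, PerplexityBot) are ones these vendors publicly document for their crawlers, but the list changes over time, and a robots.txt file only governs bots that choose to honor it.

# Illustrative robots.txt entries; vendor user-agent tokens change over time,
# so treat these as examples rather than a complete list.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /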
What we see across the Akamai network tells a more nuanced story. AI bot traffic is present, but it is still a relatively small slice of overall automated activity. The larger and more persistent pressure comes from evasive scrapers, which have been probing content for years and continue to adapt as new controls are introduced.
When AI bots are blocked, the demand for the underlying data does not disappear. Demand often shifts, which is when scraper activity begins to rise. Today, evasive scrapers generate nearly three times more traffic than AI bots and are the dominant force behind automated content collection (Figure 1).
Not every AI vendor performs all of its own scraping
A common misconception is that all AI vendors gather data the same way or scrape at the same scale. In reality, each company has a very different approach, depending on the products they offer and the signals they need.
Figure 2 illustrates three of the most common AI platforms:
OpenAI
Anthropic
Perplexity
OpenAI
OpenAI is the most multifaceted of the three. It operates training crawlers, search crawlers, fetchers, and agent traffic, creating a broad impact across the entire lifecycle of AI systems.
Anthropic
Anthropic’s approach is much narrower. Its activity is primarily tied to training, with only a small amount of traffic showing up outside of that use case.
Perplexity
Perplexity behaves differently from both. Its traffic is largely linked to search and retrieval, and it does not appear to be focused on training through direct scraping.
AI vendors are not uniform, and many rely on signals or data sources that they do not gather themselves. That gap in direct scraping is one of the reasons evasive scrapers continue to play such a large role in how information gets collected and circulated online.
Blocking an AI platform doesn’t guarantee your content stays out of its models; the platform can still reach that content through indirect sources.
What happens when a business actually mitigates AI bots
Figure 3 illustrates a single media business’s AI bot traffic when mitigation is put in place.
After the business enabled the tarpit mitigation action on July 14, 2025, the high volume of AI training crawlers that they had been seeing dropped sharply. Instead of shifting into more aggressive behavior, the AI bots simply began to fade out over time.
This pattern shows that when a business takes a deliberate step to slow down AI training crawlers, those bots do not immediately pivot into more evasive techniques. They reduce their activity and eventually go away.
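For readers unfamiliar with the term, a tarpit does not reject a crawler's request outright; it accepts the connection and then serves the response extremely slowly, so the bot burns time and resources for very little content. The sketch below is a minimal, generic illustration of that idea in Python (it is not Akamai's implementation, and the delay values and page body are hypothetical).

import time
from wsgiref.simple_server import make_server

# Minimal tarpit illustration: drip the response out one byte at a time
# so an automated client spends a long time on each request.
def tarpit_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    def slow_body():
        for byte in b"<html><body>Loading...</body></html>":
            time.sleep(2)  # hypothetical per-byte delay
            yield bytes([byte])
    return slow_body()

if __name__ == "__main__":
    # Serve the tarpit locally for demonstration purposes only.
    with make_server("", 8080, tarpit_app) as server:
        server.serve_forever()

In practice, a control like this would be applied only to traffic already classified as an AI training crawler, so legitimate visitors never see the delay.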
Not every AI vendor scrapes aggressively; many respond to friction when it is introduced and back off. But the bigger question remains: Where do they turn next? If your content is valuable, AI systems will still find a path to it, whether through redistributors, aggregators, competitors, or anyone else who didn’t apply the same controls.
Some AI bots back off quickly, others take their time
When that media business enabled a tarpit for AI training crawlers, most AI vendors reduced their activity fairly quickly. The drop in traffic was clear and steady. But not all vendors behave identically.
As Figure 4 reflects, ByteDance and Meta took noticeably longer to adjust. Their crawlers continued to show up well after the others had backed off, suggesting that response times vary widely across vendors and that some systems take longer to recognize or react to new controls.
This variation lends credence to the idea that AI bots are not monolithic. Each vendor has its own approach, its own signals, and its own cadence for responding when a business starts to limit access.
AI training crawler traffic continues to grow across the media industry
Across the media industry more broadly, AI training crawler traffic continues to grow (Figure 5).
AI training crawler traffic now far exceeds AI fetcher, AI search crawler, and AI agent traffic, maintaining a steady climb throughout the year. Even when individual vendors reduce their direct scraping in response to mitigation, the aggregate volume across the segment keeps trending upward.
This trend reinforces that the demand for high-value media content is growing and that the training layer behind AI models is becoming more active, not less active.
The so what
For media organizations, the takeaway is simple: AI bots are only one part of what is surfacing your work, and most of the real pressure still comes from evasive scrapers.
Even when training crawlers pull back, demand for your content does not disappear. The demand shifts, and scraper activity continues to fill the gaps. This is why effective protection needs to account for both.
AI bot controls help guide how models access your site — and strong scraper defense with tools like Akamai Content Protector helps limit the indirect paths that AI systems rely on.
The organizations that stay ahead will be the ones that understand how their content is being discovered and put guardrails in place that preserve both access and value as AI continues to evolve.
Learn more
To learn more about how to gain control over evasive scrapers and manage AI bots, contact an expert.