Rate Limiting AI Agents Without Blocking Them Entirely
AI agents can be aggressive crawlers. An agent researching a product might hit dozens of pages on your site in rapid succession, and that looks a lot like a DDoS attack from your server's perspective. The challenge is protecting your infrastructure without locking out the agents your customers are sending.
The problem with blanket rate limiting
Most rate limiting works by counting requests per IP or per session within a time window. Exceed the threshold and you get blocked, usually with a 403 or a connection reset. This works well for stopping malicious bots, but AI agents often trigger the same limits for legitimate reasons.
An agent comparing prices across your product catalogue might request 50 pages in a minute. A traditional bot detector flags that as suspicious. But from the user's perspective, they asked their agent to "find the cheapest option," and the agent is doing exactly what was asked.
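To make the mechanism concrete, here is a minimal sketch of a fixed-window counter keyed by client IP. The class name `FixedWindowLimiter`, its parameters, and the in-memory dictionary are assumptions for illustration; production limiters usually live in a proxy or a shared store and often use sliding windows or token buckets instead.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Counts requests per client key within fixed time windows (illustrative)."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # (key, window index) -> request count; old windows are never pruned here
        self.counts = defaultdict(int)

    def allow(self, key):
        """Return True if this request still fits in the current window."""
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True
```

With a limit of 30 requests per 60 seconds, an agent that requests 50 pages in the same minute has its last 20 requests refused, even though every one of them was legitimate.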
Tools for managing agent traffic
robots.txt Crawl-Delay
The Crawl-delay directive in robots.txt asks well-behaved crawlers to wait between requests. Not all agents respect it (it is not part of the official robots.txt specification, RFC 9309, and Googlebot ignores it), but many do.
User-agent: *
Crawl-delay: 2

User-agent: GPTBot
Crawl-delay: 1

User-agent: ClaudeBot
Crawl-delay: 1
This tells general crawlers to wait 2 seconds between requests, while giving known AI agents a shorter delay. A crawler that matches a named group follows that group's rules instead of the * group. You can read more about configuring robots.txt and sitemaps for AI agents to fine-tune this.
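One way to check how a compliant crawler would read these rules is Python's standard `urllib.robotparser`, which exposes a `crawl_delay()` lookup. The sketch below parses the example file from a string; the helper name `crawl_delay_for` and the `ExampleBot` agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt example from this section, held in a string for testing.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2

User-agent: GPTBot
Crawl-delay: 1

User-agent: ClaudeBot
Crawl-delay: 1
"""


def crawl_delay_for(user_agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay in seconds that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "ExampleBot"):
        print(agent, crawl_delay_for(agent))
```

Agents without a named group fall back to the `*` rule. This only shows what a parser that honours the directive would see; it cannot tell you whether a given crawler actually obeys it.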
HTTP 429 with Retry-After
When an agent exceeds your rate limit, respond with HTTP 429 (Too Many Requests) and include a Retry-After header. Well-built agents will respect this and back off.
HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{
  "error": "Rate limit exceeded",
  "retry_after": 30,
  "message": "Please wait 30 seconds before making another request."
}
The key detail is the Retry-After header. Without it, the agent might immediately retry, making the problem worse. A clear retry interval gives the agent a concrete instruction it can follow.
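As a rough illustration, the response above could be assembled by a small helper like the one below. The function name `too_many_requests` and its return shape (status, headers, body) are assumptions; adapt them to whatever your web framework expects.

```python
import json


def too_many_requests(retry_after_seconds):
    """Build the status code, headers, and JSON body for a 429 response."""
    body = json.dumps({
        "error": "Rate limit exceeded",
        "retry_after": retry_after_seconds,
        "message": (
            f"Please wait {retry_after_seconds} seconds "
            "before making another request."
        ),
    })
    headers = {
        # Retry-After as a delay in whole seconds; an HTTP date is also valid.
        "Retry-After": str(retry_after_seconds),
        "Content-Type": "application/json",
    }
    return 429, headers, body
```

Keeping the header and the JSON body in one place ensures the two retry values cannot drift apart.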
Response headers for rate limit transparency
Include rate limit information in every response, not just when the limit is exceeded:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1714003200
Agents that read these headers can throttle themselves before hitting the limit. This is better for everyone: the agent gets its data, and your server stays healthy.
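A minimal sketch of how these headers might be derived from limiter state follows. It assumes a fixed window and that X-RateLimit-Reset carries a Unix timestamp, as in the example above; these headers are a convention rather than a standard, and some services send seconds remaining instead. The function name and parameters are hypothetical.

```python
import math


def rate_limit_headers(limit, used, window_start, window_seconds):
    """Return X-RateLimit-* headers describing the client's current window."""
    remaining = max(0, limit - used)  # never report a negative allowance
    reset_at = math.ceil(window_start + window_seconds)  # Unix timestamp
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_at),
    }
```

Attaching the result to every response, including successful ones, is what lets an agent slow down before it ever receives a 429.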
Distinguishing agents from attacks
Not all rapid-fire traffic is malicious. Here are signals that help differentiate legitimate agent traffic from attacks:
User-Agent strings. Most major AI agents identify themselves: GPTBot, ClaudeBot, PerplexityBot, and others. These strings are easy to spoof, so treat them as a hint rather than proof, even though many attackers do not bother impersonating AI agents.
Request patterns. Agents tend to follow links logically, moving through navigation, pagination, and related pages. Attacks tend to hit random URLs or hammer the same endpoint repeatedly.
Referrer and session behaviour. An agent researching a topic will typically land on one page and follow internal links. An attack often arrives without referrers and with no coherent browsing pattern.
A tiered approach
Rather than a single rate limit for all traffic, consider tiers:
- Verified AI agents (confirmed by reverse DNS or IP verification): 120 requests per minute
- Identified but unverified agents (recognised User-Agent, source IP not confirmed): 60 requests per minute
- Anonymous traffic: 30 requests per minute
- Suspicious patterns: 10 requests per minute, with CAPTCHA challenges after repeated violations
This approach lets legitimate agents work efficiently while still protecting against abuse. The verification step matters. Several operators, including OpenAI and Google, publish the IP ranges their crawlers use; check each vendor's documentation, because not every operator does. Cross-referencing the User-Agent string with the source IP confirms the agent is genuine.
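The tiers and the IP cross-check can be sketched as follows, using Python's standard `ipaddress` module. Everything named here is an assumption for illustration: the `PUBLISHED_RANGES` table uses reserved documentation ranges rather than real crawler ranges, `ExampleBot` is a made-up agent, and the `suspicious` flag stands in for whatever pattern detection you run separately.

```python
import ipaddress

# Requests per minute for each tier, taken from the list above.
TIER_LIMITS = {"verified": 120, "identified": 60, "anonymous": 30, "suspicious": 10}

# Placeholder data: these CIDR blocks are documentation ranges (RFC 5737),
# NOT real crawler ranges. Load the vendor-published lists in practice.
PUBLISHED_RANGES = {
    "GPTBot": [ipaddress.ip_network("192.0.2.0/24")],
    "ExampleBot": [ipaddress.ip_network("198.51.100.0/24")],
}


def classify(user_agent, ip, suspicious=False):
    """Return the tier name for a request."""
    if suspicious:
        return "suspicious"
    address = ipaddress.ip_address(ip)
    for token, networks in PUBLISHED_RANGES.items():
        if token.lower() in user_agent.lower():
            # The User-Agent claims a known agent; verify it by source IP.
            if any(address in network for network in networks):
                return "verified"
            return "identified"
    return "anonymous"


def limit_for(user_agent, ip, suspicious=False):
    """Return the requests-per-minute limit for a request."""
    return TIER_LIMITS[classify(user_agent, ip, suspicious)]
```

A request claiming to be GPTBot from outside the listed ranges still gets the 60-per-minute tier here rather than a block, which keeps a stale range list from locking out a genuine agent.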
What happens when you get it wrong
Block agents too aggressively and your site becomes invisible to AI-powered search and shopping tools. Your competitors' products show up in agent responses; yours do not. Users who rely on agents to find information will simply not see your content.
Limit too loosely and a rogue crawler can take down your site. The sweet spot is being explicit about your limits, returning proper HTTP status codes, and giving agents the information they need to play by your rules.