ClaudeBot
ClaudeBot is Anthropic's web crawler that collects content used to train its Claude models, identified by the ClaudeBot user-agent and controllable via robots.txt.
ClaudeBot is Anthropic's crawler for gathering training data. It fetches public
pages to help train the Claude family of models, identifies itself with the
ClaudeBot user-agent, and respects robots.txt directives.
Anthropic uses separate user-agents for user-initiated live fetches, so allowing or
blocking ClaudeBot governs training use specifically.
As with other AI crawlers, ClaudeBot reads the HTML your server returns; content that only appears after JavaScript runs in a browser is effectively absent to it. Serving your content as crawlable, server-rendered text is the access pillar at work. The decision to allow ClaudeBot is a content-rights choice rather than a direct AEO lever: it affects whether your material informs the model, not whether a specific answer cites you today.
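To make the server-rendered versus JavaScript-rendered distinction concrete, here is a minimal Python sketch that extracts the text present in a raw HTML response without executing any scripts. It is only a rough approximation of what a non-rendering crawler can read, not a description of ClaudeBot's actual pipeline. The names `VisibleTextParser`, `text_in_raw_html`, and both sample pages are hypothetical, introduced for illustration.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text found in raw HTML, ignoring <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_in_raw_html(html: str) -> str:
    """Return the text available without running any JavaScript."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page whose content is already in the server's response.
SERVER_RENDERED = (
    "<html><body><h1>Pricing</h1><p>Plans start at $10.</p></body></html>"
)

# Hypothetical page whose content is injected by a script in the browser.
CLIENT_RENDERED = """<html><body><div id="root"></div>
<script>document.getElementById("root").innerHTML = "<h1>Pricing</h1>";</script>
</body></html>"""

if __name__ == "__main__":
    print(repr(text_in_raw_html(SERVER_RENDERED)))
    print(repr(text_in_raw_html(CLIENT_RENDERED)))
```

In this sketch the server-rendered page yields its heading and paragraph text, while the client-rendered page yields an empty string because its content exists only inside a script that never runs.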
Example. A publisher that wants to keep its archives out of model training adds
a User-agent: ClaudeBot / Disallow: / block to robots.txt, while a site
pursuing maximum reach leaves it allowed alongside the other AI crawlers.
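One way to check the effect of such a block before deploying it is to parse the rules locally. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, the `SomeOtherBot` agent name, and the example.com URL are all assumptions made for illustration. Python's parser is one interpretation of the robots.txt rules, so individual crawlers may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block ClaudeBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/archive/2019/story.html"
    print("ClaudeBot allowed:", is_allowed(ROBOTS_TXT, "ClaudeBot", url))
    print("SomeOtherBot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```

With these rules, the check reports that ClaudeBot is disallowed for the archive URL while the other agent falls through to the wildcard group and is allowed. A site pursuing maximum reach would simply omit the ClaudeBot group.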
Relevant pillar
Related terms
- GPTBot: OpenAI's web crawler that gathers content to train its models, identified by the GPTBot user-agent and controllable through your robots.txt file.
- PerplexityBot: Perplexity's web crawler that indexes pages so they can be retrieved and cited in Perplexity's answers, identified by the PerplexityBot user-agent.
- robots.txt: a plain text file at the root of your domain that tells crawlers which user-agents may access which parts of your site, and is how you allow or block AI crawlers.