PerplexityBot
PerplexityBot is Perplexity's web crawler that indexes pages so they can be retrieved and cited in Perplexity's answers, identified by the PerplexityBot user-agent.
PerplexityBot is the crawler that lets Perplexity find and cite your pages.
Unlike pure training crawlers, its job is indexing for retrieval — it gathers
content so Perplexity's answer engine can surface and link your page as a source.
It identifies itself with the PerplexityBot user-agent and checks
robots.txt.
Because PerplexityBot feeds a citation-driven engine, allowing it is squarely an AEO move: block it and you remove yourself from Perplexity's pool of citable sources. As with all AI crawlers, it needs your content available as crawlable, server-rendered HTML (the access pillar), since content that only appears after JavaScript runs in a browser won't be seen. This is the difference between citation crawlers, which you generally want to allow, and training crawlers, where allowing is optional.
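One rough way to test the server-rendered requirement is to check whether a key phrase appears in the raw HTML your server returns, ignoring anything that lives only inside script or style tags. The sketch below uses only the Python standard library; the helper name `phrase_in_raw_html` and the sample pages are hypothetical, and it assumes the crawler reads the HTML response without executing JavaScript.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def phrase_in_raw_html(html: str, phrase: str) -> bool:
    """Return True if `phrase` is present in the markup's text without running JS."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace so line breaks in the markup do not hide a match.
    text = " ".join(" ".join(parser.chunks).split())
    return phrase.lower() in text.lower()


# Hypothetical pages: one server-rendered, one that builds its content in the browser.
SERVER_RENDERED = "<html><body><h1>Pricing guide</h1><p>Plans start at $10.</p></body></html>"
CLIENT_ONLY = (
    '<html><body><div id="root"></div>'
    '<script>render("Plans start at $10.")</script></body></html>'
)

if __name__ == "__main__":
    print(phrase_in_raw_html(SERVER_RENDERED, "Plans start at $10"))
    print(phrase_in_raw_html(CLIENT_ONLY, "Plans start at $10"))
```

In practice you would feed this the response body fetched from your own URL. If the phrase is missing from the raw HTML, the content is probably being injected client-side.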
Example. If your robots.txt accidentally blocks PerplexityBot, your pages simply stop appearing as sources in Perplexity answers, no matter how good they are — a silent, self-inflicted loss of visibility.
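To catch this kind of accidental block before it costs you citations, you can test your robots.txt rules against the PerplexityBot user-agent. The following is a minimal sketch using Python's standard `urllib.robotparser`. The helper `is_allowed`, the sample rule sets, and the example.com URL are illustrative assumptions, and Python's parser may not match every detail of how PerplexityBot itself interprets rules.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets for illustration.
EXPLICIT_BLOCK = """\
User-agent: PerplexityBot
Disallow: /
"""

# A blanket rule meant for other bots also catches PerplexityBot.
ACCIDENTAL_BLOCK = """\
User-agent: *
Disallow: /
"""

ALLOWED = """\
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    page = "https://example.com/guide"
    for name, rules in [
        ("explicit block", EXPLICIT_BLOCK),
        ("accidental block", ACCIDENTAL_BLOCK),
        ("allowed", ALLOWED),
    ]:
        print(name, is_allowed(rules, "PerplexityBot", page))
```

The second rule set is the easy one to miss: nothing in it names PerplexityBot, yet the wildcard rule still shuts it out.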
Related terms
- GPTBot: OpenAI's web crawler that gathers content to train its models, identified by the GPTBot user-agent and controllable through your robots.txt file.
- ClaudeBot: Anthropic's web crawler that collects content used to train its Claude models, identified by the ClaudeBot user-agent and controllable via robots.txt.
- robots.txt: A plain text file at the root of your domain that tells crawlers which user-agents may access which parts of your site; it is how you allow or block AI crawlers.