robots.txt Generator for AI Crawlers
Toggle each AI crawler — GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot, Google-Extended, and more — to allow or block, add your sitemap, and copy a ready-to-use robots.txt. Each crawler is labelled by what it actually does, so you allow the bots that get you cited.
Toggle each AI crawler to allow or block, add your sitemap, and copy a ready-to-use robots.txt — generated in your browser. AI engines can only cite pages their crawlers are allowed to fetch, so the goal for visibility is simple: allow the bots that get you cited, and decide about the rest on purpose.
Quick answer
Allow the citation crawlers (OAI-SearchBot, PerplexityBot, Bingbot) so engines can quote you. Blocking training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) is a rights choice that won't change your citations. Each bot below is labelled by what it actually does.
robots.txt generator for AI crawlers
Toggle each crawler to allow or block it. To be cited, you want the Citations bots allowed. Blocking Training bots is a rights choice: it won't improve or remove your citations.
# robots.txt - generated by The AEO Canon
# https://www.aeocanon.com/tools/robots-txt-generator-ai-crawlers

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Applebot-Extended
Allow: /

User-agent: CCBot
Allow: /

User-agent: Bytespider
Allow: /
Place the file at the root of your domain (example.com/robots.txt). robots.txt is a public, voluntary standard — reputable crawlers honor it, but it isn't access control. Note: Google AI Overviews use Googlebot, which has no separate AI opt-out.
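The per-crawler toggling described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the generator's actual code: the function name `build_robots_txt`, the `choices` mapping, and the sitemap URL are all assumptions made for the example. The result is parsed back with the standard library's `urllib.robotparser` as a quick sanity check.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(choices, sitemap=None):
    """Return robots.txt text with one group per crawler.

    `choices` (hypothetical name) maps a user-agent token to
    True (allow everything) or False (block everything).
    """
    groups = []
    for agent, is_allowed in choices.items():
        rule = "Allow: /" if is_allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    # A blank line separates groups, as in the generated file above.
    text = "\n\n".join(groups)
    if sitemap:
        text += f"\n\nSitemap: {sitemap}"
    return text + "\n"


# Example toggles: citation crawlers allowed, two training crawlers blocked.
choices = {
    "OAI-SearchBot": True,
    "PerplexityBot": True,
    "Bingbot": True,
    "GPTBot": False,
    "CCBot": False,
}
robots_txt = build_robots_txt(choices, sitemap="https://example.com/sitemap.xml")

# Parse the generated text back to confirm it says what was intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(robots_txt)
```

Parsing the output back is a cheap way to catch a mistyped token before the file is uploaded, though it only shows how Python's parser reads the file, not how any particular crawler does.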
Allow the right bots — and know what each one does
The single most common mistake is silently blocking the crawlers you want, usually
with a blanket Disallow: / or a restrictive default that catches AI user-agents.
The fix is to be explicit. Allow the search and live-fetch bots that drive
citations; only block training bots if you have a deliberate
reason to. And remember the Google nuance: blocking
Google-Extended does not remove you from AI
Overviews, because those are served by
Googlebot, which has no separate AI opt-out. This is the Access
pillar — the price of admission, with no partial credit. For the
syntax rules themselves, see Google's robots.txt
documentation.
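To make the blanket-block mistake concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The two robots.txt snippets and the helper name `is_allowed` are made up for illustration, and Python's parser is only a stand-in: each real crawler applies its own matching logic, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, agent, url="https://example.com/article"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A restrictive default: every crawler without its own group is blocked.
BLANKET = """\
User-agent: *
Disallow: /
"""

# The same default, plus explicit groups for two citation crawlers.
EXPLICIT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /
"""

for label, text in (("blanket", BLANKET), ("explicit", EXPLICIT)):
    for agent in ("OAI-SearchBot", "PerplexityBot", "GPTBot"):
        print(label, agent, is_allowed(text, agent))
```

In the blanket file the citation crawlers fall through to the `*` group and are blocked along with everything else. Once they have groups of their own, the catch-all rule no longer applies to them.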
Where to go next
For the full, copy-paste reference block and the one mistake that silently blocks every AI crawler, read how to allow AI crawlers in robots.txt. Then confirm it actually worked with how to check AI crawlers can read your site, and make sure your content isn't hidden behind JavaScript in why JavaScript breaks AI citation. Access is pillar one of The AEO Canon.
How we review
This guide is compiled from each vendor’s own documentation and current independent testing, and was last verified in 2026; we re-check quarterly. Pricing and features in this space change fast, so confirm current details on the vendor’s site before buying. We don’t earn affiliate commissions on the tools we cover, and we don’t accept payment for placement.
Frequently asked questions
- Should I allow AI crawlers in robots.txt?
- To be cited in AI answers, yes — allow the crawlers that power citations, such as OAI-SearchBot (ChatGPT), PerplexityBot, and Bingbot (Copilot). If a crawler can't fetch your page, that engine can't quote it. Blocking training crawlers like GPTBot or ClaudeBot is a separate rights decision and won't improve or remove your citations; this generator labels each bot so you can choose deliberately.
- What's the difference between AI search crawlers and training crawlers?
- Search crawlers (OAI-SearchBot, PerplexityBot, Bingbot) fetch pages so an engine can cite them in live answers — these are the ones that drive your AI visibility. Training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) collect content to train models. Blocking training crawlers restricts how your content trains AI but does not remove you from that engine's search citations.
- Does blocking Google-Extended remove me from AI Overviews?
- No. Google-Extended only controls whether your content is used to train Gemini and Vertex AI. It has no effect on Google Search or AI Overviews, which are served by Googlebot — and Google offers no separate opt-out from AI Overviews short of blocking Googlebot, which would remove you from Search entirely. Blocking Google-Extended is purely a training choice.
- Where do I put the robots.txt file?
- At the root of your domain, reachable at example.com/robots.txt. It must be a plain text file served from the top level — a robots.txt in a subfolder is ignored. Remember robots.txt is a public, voluntary standard that reputable crawlers honor; it signals intent but is not access control or security.
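Two of the answers above lend themselves to a short Python sketch: robots.txt always lives at the root of the host, and a group for Google-Extended says nothing about Googlebot. The helper name `robots_url` and the sample file are assumptions for illustration. The check uses the standard `urllib.robotparser`, which only shows how the groups match by token; it does not reproduce Google's own behavior.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url):
    """Return the root-level robots.txt URL for the site hosting `page_url`."""
    parts = urlsplit(page_url)
    # Drop the page path, query, and fragment; robots.txt sits at the top level.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical file that blocks two training crawlers and nothing else.
TRAINING_BLOCKED = """\
User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(TRAINING_BLOCKED.splitlines())

page = "https://example.com/blog/post"
print(robots_url(page))
for agent in ("Google-Extended", "GPTBot", "Googlebot", "OAI-SearchBot"):
    print(agent, parser.can_fetch(agent, page))
```

Because no group names Googlebot or OAI-SearchBot and there is no `*` group, both remain free to fetch the page while the two training tokens are blocked.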