Should I allow AI crawlers in robots.txt?

To be cited in AI answers, yes — allow the crawlers that power citations, such as OAI-SearchBot (ChatGPT), PerplexityBot, and Bingbot (Copilot). If a crawler can't fetch your page, that engine can't quote it. Blocking training crawlers like GPTBot or ClaudeBot is a separate rights decision and won't improve or remove your citations; this generator labels each bot so you can choose deliberately.

What's the difference between AI search crawlers and training crawlers?

Search crawlers (OAI-SearchBot, PerplexityBot, Bingbot) fetch pages so an engine can cite them in live answers — these are the ones that drive your AI visibility. Training crawlers (GPTBot, ClaudeBot, CCBot, Google-Extended) collect content to train models. Blocking training crawlers restricts how your content trains AI but does not remove you from that engine's search citations.
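As a rough sketch of that split, a robots.txt along these lines would admit the search crawlers named above and turn away the training crawlers. The bot names are taken from the answer above; blocking every training crawler is only one possible choice, assumed here for illustration, and each vendor's documentation is the place to confirm its current user-agent token.

```
# Search crawlers: allowed so they can fetch and cite pages
User-agent: OAI-SearchBot
User-agent: PerplexityBot
User-agent: Bingbot
Allow: /

# Training crawlers: blocked here as an illustrative choice, not a recommendation
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /
```

Crawlers that match none of these groups are not restricted by this file.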

Does blocking Google-Extended remove me from AI Overviews?

No. Google-Extended only controls whether your content is used to train Gemini and Vertex AI. It has no effect on Google Search or AI Overviews, which are served by Googlebot. Google offers no separate robots.txt token for AI Overviews, so the only robots.txt way to keep pages out of them is to block Googlebot, which would also stop Google Search from crawling your site. Blocking Google-Extended is purely a training choice.
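A minimal sketch of that training-only opt-out, assuming you want Google Search crawling left exactly as it is, might look like this. The explicit Googlebot group is optional and is included only to make the intent visible.

```
# Opt out of Google's AI training use only
User-agent: Google-Extended
Disallow: /

# Googlebot stays allowed, so Search and AI Overviews are unaffected
User-agent: Googlebot
Allow: /
```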

Where do I put the robots.txt file?

At the root of your domain, reachable at example.com/robots.txt. It must be a plain text file served from the top level; a robots.txt in a subfolder is ignored. Remember that robots.txt is a public, voluntary standard that reputable crawlers honor: it signals intent but is not access control or security.

robots.txt Generator for AI Crawlers

How we review

This guide is compiled from each vendor’s own documentation and current independent testing, and was last verified in 2026; we re-check quarterly. Pricing and features in this space change fast, so confirm current details on the vendor’s site before buying. We don’t earn affiliate commissions on the tools we cover, and we don’t accept payment for placement.

Allow the right bots — and know what each one does
