Question 1

What is a robots.txt file and where does it go?

Accepted Answer

robots.txt is a plain-text file that tells crawlers which parts of your site they may request. It must live at the root of the host it governs (for example, https://example.com/robots.txt) and applies to that host, protocol, and port only, so each subdomain needs its own file. A file at a subpath such as /blog/robots.txt is ignored. Generate the file here, then upload it to your web root or have your framework serve it at /robots.txt.
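Because the file is tied to one protocol and host, the robots.txt address that governs any page can be derived from that page's URL. The following is a minimal Python sketch using only the standard library; the function name robots_url is illustrative and not part of the generator.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=1"))  # https://example.com/robots.txt
print(robots_url("https://shop.example.com/cart"))       # https://shop.example.com/robots.txt
```

The second call shows that a subdomain resolves to its own file, separate from the one on example.com.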

Question 2

How do I block ChatGPT, Claude, and other AI crawlers?

Accepted Answer

Add a group that disallows the AI user-agents: GPTBot (OpenAI training), ClaudeBot (Anthropic training), Google-Extended (Gemini training), CCBot (Common Crawl), and others. Choose “Block AI training crawlers” to stop training bots while keeping AI search and on-demand fetch bots, or “Block all AI crawlers” to disallow every AI user-agent. The generator writes the correct User-agent groups for you.
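As an illustration, a group of this kind can be checked with Python's standard-library parser before you deploy it. The file text below is an assumed example of a "Block AI training crawlers" result, shortened to four user-agents; the generator's exact output may differ.

```python
from urllib.robotparser import RobotFileParser

# Assumed example output; several User-agent lines share one Disallow rule.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for agent in ("GPTBot", "ClaudeBot", "OAI-SearchBot", "Googlebot"):
    allowed = parser.can_fetch(agent, "https://example.com/articles/")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

GPTBot and ClaudeBot come back blocked, while OAI-SearchBot and Googlebot fall through to the catch-all group and stay allowed.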

Question 3

Does robots.txt actually stop AI from using my content?

Accepted Answer

Only for crawlers that choose to obey it. robots.txt is a voluntary standard: well-behaved bots (Googlebot, Bingbot, GPTBot, ClaudeBot) respect it, but nothing enforces it. It cannot stop a crawler that ignores the rules, and it does not remove content that has already been indexed or used for training. For stronger control, block user-agents at the server or CDN; for guaranteed control, gate content behind authentication, since a crawler can send a false user-agent.
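Blocking at the server can be sketched as a small WSGI middleware that refuses requests by User-Agent header. This is an assumption-laden example: the token list and the names is_blocked and block_listed_crawlers are hypothetical, and a real deployment would more likely use the web server's or CDN's own rule syntax.

```python
# Hypothetical token list; match it to the crawlers you want to refuse.
BLOCKED_TOKENS = ("gptbot", "claudebot", "ccbot", "bytespider")


def is_blocked(user_agent: str) -> bool:
    """True if the User-Agent header contains a blocked crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


def block_listed_crawlers(app):
    """Wrap a WSGI app so blocked crawlers get 403 before the app runs."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Wrapping an existing WSGI callable, for example application = block_listed_crawlers(application), is enough to apply it. The check only catches crawlers that identify themselves honestly.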

Question 4

Is robots.txt a security feature?

Accepted Answer

No. robots.txt is a crawler instruction, not access control. Listing a path under Disallow tells compliant crawlers to skip it, but the path is still publicly reachable — and the robots.txt file itself is public, so you are effectively publishing a list of the directories you consider sensitive. Protect private content with authentication or server-side rules, never with Disallow.

Question 5

What should a WordPress robots.txt contain?

Accepted Answer

Disallow /wp-admin/ but explicitly Allow /wp-admin/admin-ajax.php, because themes and plugins call admin-ajax.php on the front end and blocking it can break functionality and rendering. You generally do not need to block /wp-includes/ on modern WordPress. Pick the WordPress template here to get those rules.
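The two WordPress rules can be written out and checked with Python's standard-library parser. This is a sketch, assuming a site at https://example.com; the helper name allowed is illustrative.

```python
from urllib.robotparser import RobotFileParser

WORDPRESS_ROBOTS = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(WORDPRESS_ROBOTS.splitlines())


def allowed(path: str) -> bool:
    """Check a path on the assumed example.com site for Googlebot."""
    return parser.can_fetch("Googlebot", "https://example.com" + path)


print(allowed("/wp-admin/admin-ajax.php"))  # True
print(allowed("/wp-admin/options.php"))     # False
print(allowed("/hello-world/"))             # True
```

Listing Allow before Disallow is a deliberate choice. Google applies the most specific (longest) matching rule, so order does not matter there, but parsers that apply the first matching rule (Python's urllib.robotparser is one) only reach the exception if it comes first.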

Question 6

Why does the staging option block everything?

Accepted Answer

A staging, preview, or development site should never be indexed — duplicate content competing with production hurts SEO, and you rarely want unfinished pages public. The staging template emits User-agent: * with Disallow: / and omits the sitemap. Swap to your production rules (or remove the blanket block) before launch.
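One way to avoid shipping the blanket block to production is to choose the file by environment at serve time. The sketch below assumes a hypothetical robots_for helper and an environment name supplied by your deployment; neither is part of the generator.

```python
from urllib.robotparser import RobotFileParser

# The staging template: block every crawler from every path, no sitemap.
STAGING_ROBOTS = "User-agent: *\nDisallow: /\n"


def robots_for(environment: str, production_rules: str) -> str:
    """Serve production rules only in production; block everything elsewhere."""
    return production_rules if environment == "production" else STAGING_ROBOTS


parser = RobotFileParser()
parser.parse(robots_for("staging", "").splitlines())
print(parser.can_fetch("Googlebot", "https://staging.example.com/"))  # False
```

Defaulting to the blocked file for any environment name other than "production" means a misconfigured preview deployment fails closed rather than open.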

Question 7

Does Crawl-delay work for Google?

Accepted Answer

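No. Googlebot ignores Crawl-delay entirely. Google retired the crawl-rate setting in Search Console in early 2024; Googlebot now adjusts its rate automatically, and returning 503 or 429 responses slows it temporarily. Bing honours Crawl-delay, so the directive is still useful there. Yandex announced in 2018 that it no longer follows Crawl-delay and offers a crawl-rate setting in Yandex Webmaster instead. The generator adds the directive when you set a value, and the validator reminds you of the Google caveat.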
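Since the directive only matters to crawlers that read it, one option is to scope it to a named group. The sketch below assumes a 5-second delay for bingbot and uses Python's standard-library parser to show which user-agents see a value; it says nothing about how any search engine actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Assumed example. A crawler that matches the bingbot group uses that group
# instead of the catch-all "*" group, so repeat any Disallow rules it needs.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 5

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.crawl_delay("bingbot"))    # 5
print(parser.crawl_delay("Googlebot"))  # None
```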

Question 8

Do I need to list my sitemap in robots.txt?

Accepted Answer

It is optional but recommended. A Sitemap: line gives every crawler an absolute URL to your sitemap without you having to submit it in each search engine's tools. You can list more than one Sitemap line. Submitting the sitemap in Search Console as well does no harm.
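A file with more than one Sitemap line might look like the example below. The URLs are placeholders, and the check uses site_maps() from Python's standard-library parser (available from Python 3.8) to confirm that both lines are picked up.

```python
from urllib.robotparser import RobotFileParser

# Placeholder URLs. Sitemap lines take absolute URLs and sit outside any group.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for url in parser.site_maps() or []:
    print(url)
```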

Question 9

Does this generator send my configuration to a server?

Accepted Answer

No. The file is assembled by JavaScript in your browser — nothing is uploaded, logged, or stored. You can build the file offline once the page has loaded and download it directly.

Site type	Default rules	Why
Blog / content site	Allow: /	Crawl everything — content sites want maximum indexing.
SaaS app	Disallow: /dashboard/ Disallow: /settings/ Disallow: /account/ Disallow: /api/	Keep the marketing site indexable, hide the authenticated app and API.
Ecommerce	Disallow: /cart/ Disallow: /checkout/ Disallow: /account/ Disallow: /*?add-to-cart=	Index products and categories; hide cart, checkout, and account flows.
Documentation	Allow: /	Index all docs so search and AI answer engines can cite them.
WordPress	Allow: /wp-admin/admin-ajax.php Disallow: /wp-admin/	Block wp-admin but keep admin-ajax.php, which themes and plugins need.
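The ecommerce rule Disallow: /*?add-to-cart= relies on wildcard matching, where * stands for any run of characters and a trailing $ anchors the end of the URL. The following is a minimal sketch of that matching, assuming the conventions described in RFC 9309; the function names are illustrative, and real crawlers may differ in details such as percent-encoding.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each * into "any run of characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """True if the rule pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


print(rule_matches("/*?add-to-cart=", "/product/shirt?add-to-cart=42"))  # True
print(rule_matches("/*?add-to-cart=", "/product/shirt"))                 # False
print(rule_matches("/cart/", "/cart/items"))                             # True
print(rule_matches("/*.pdf$", "/files/guide.pdf?download=1"))            # False
```

Without a trailing $, a pattern is a prefix match, which is why /cart/ also covers /cart/items.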

User-agent	Operator	Purpose
GPTBot	OpenAI	training
OAI-SearchBot	OpenAI	search
ChatGPT-User	OpenAI	assistant (on-demand fetch)
ClaudeBot	Anthropic	training
Claude-User	Anthropic	assistant (on-demand fetch)
Claude-SearchBot	Anthropic	search
anthropic-ai	Anthropic	training (legacy token)
Google-Extended	Google	training (Gemini)
PerplexityBot	Perplexity	search
Perplexity-User	Perplexity	assistant (on-demand fetch)
CCBot	Common Crawl	training
Meta-ExternalAgent	Meta	training
Amazonbot	Amazon	training
Applebot-Extended	Apple	training
Bytespider	ByteDance	training
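The purposes in the table (training, search, assistant) are what distinguish blocking only training crawlers from blocking every AI crawler. A rough sketch of how such a group could be assembled from the table follows; the dictionary and the disallow_group function are hypothetical and do not describe the generator's actual code.

```python
# Purpose labels copied from the table above.
AI_CRAWLERS = {
    "GPTBot": "training",
    "OAI-SearchBot": "search",
    "ChatGPT-User": "assistant",
    "ClaudeBot": "training",
    "Claude-User": "assistant",
    "Claude-SearchBot": "search",
    "anthropic-ai": "training",
    "Google-Extended": "training",
    "PerplexityBot": "search",
    "Perplexity-User": "assistant",
    "CCBot": "training",
    "Meta-ExternalAgent": "training",
    "Amazonbot": "training",
    "Applebot-Extended": "training",
    "Bytespider": "training",
}


def disallow_group(purposes) -> str:
    """Build one robots.txt group blocking every crawler with a listed purpose."""
    agents = [ua for ua, purpose in AI_CRAWLERS.items() if purpose in purposes]
    if not agents:
        return ""  # A Disallow line with no User-agent would be invalid.
    lines = [f"User-agent: {ua}" for ua in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


# Training crawlers only; search and on-demand fetch bots stay unlisted.
print(disallow_group({"training"}))
```

Passing {"training", "search", "assistant"} instead yields a group that covers all fifteen user-agents.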

Robots.txt Generator

Site-type templates

AI crawler reference

What robots.txt does — and does not — do

Wildcards and pattern matching

Frequently asked
