Robots.txt generator, live compare, path tester and WordPress crawl rule planner
Generate a clean robots.txt draft for WordPress, public sites, ecommerce, staging or AI crawler rules. Test important paths against the draft, fetch the current live robots.txt file, compare generated and live lines, and copy an audit-friendly output with sitemap declarations and crawl warnings.
Robots.txt controls crawling. It does not keep content secret, and it does not decide whether a URL is indexed. After publishing, test live behavior at the root of the exact host.
A robots.txt generator should produce a file you can audit
A robots.txt file is small, but it can change how crawlers discover a site. A single Disallow: / on a public domain can block every path, while a missing sitemap line can make discovery slower after a migration. For WordPress sites, the goal is usually simple: protect admin paths, allow the AJAX endpoint crawlers may need, leave public content crawlable, and point search engines to the sitemap.
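The WordPress goal described above can be sketched as a small template function. This is a minimal illustration, not the generator's actual code: the function name `wordpress_robots` and the sitemap URL are assumptions made for the example.

```python
def wordpress_robots(sitemap_url: str) -> str:
    """Return a minimal WordPress robots.txt draft as text.

    Hypothetical helper: blocks the admin area, keeps the AJAX
    endpoint reachable, and declares one absolute sitemap URL.
    """
    lines = [
        "User-agent: *",
        "Disallow: /wp-admin/",
        "Allow: /wp-admin/admin-ajax.php",
        "",
        f"Sitemap: {sitemap_url}",
    ]
    return "\n".join(lines) + "\n"


# Example sitemap URL is a placeholder; use the one your site publishes.
print(wordpress_robots("https://example.com/sitemap.xml"))
```

Everything not listed stays crawlable, so public content needs no explicit Allow line.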
This robots.txt generator is built for that practical workflow. It creates templates for WordPress, generic public sites, ecommerce sections, staging blocks and AI crawler controls. It tests important paths against the generated rules, warns about common crawl mistakes, fetches the current live robots.txt file for comparison, and shows what changed before you upload a new file or edit SEO plugin settings.
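Path testing of this kind can be approximated in a few lines. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie); it is not the tool's own implementation, and the names `parse_rules`, `pattern_to_regex` and `is_allowed` are hypothetical. As a simplification it only reads groups that name the requested agent exactly and does not fall back to the `*` group.

```python
import re


def parse_rules(robots_text: str, agent: str = "*") -> list[tuple[str, str]]:
    """Collect (directive, pattern) pairs from groups that name `agent`."""
    rules = []
    group_agents = []
    in_rules = False
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                group_agents, in_rules = [], False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if agent.lower() in group_agents and value:
                rules.append((field, value))
    return rules


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a path pattern: * is a wildcard, a trailing $ anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best = None
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), directive == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]


draft = "User-agent: *\nDisallow: /wp-admin/\nAllow: /wp-admin/admin-ajax.php\n"
rules = parse_rules(draft)
print(is_allowed(rules, "/wp-admin/admin-ajax.php"))  # True
print(is_allowed(rules, "/wp-admin/options.php"))     # False
```

Individual crawlers may resolve conflicts differently, so treat the result as a pre-publish check rather than a guarantee of live behavior.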
How to use generated robots.txt safely
Upload the file at the root of the host, for example https://example.com/robots.txt, or configure it through the SEO plugin or hosting layer that controls your virtual robots output. Then fetch the live file and test public pages, admin areas, search pages, sitemap URLs and any URL that Search Console reports as blocked. Robots rules are host-specific, so check both www and non-www when both resolve.
- Use robots.txt for crawl access, not for private content.
- Use noindex on fetchable pages when the goal is index removal.
- Keep sitemap lines absolute and current.
- Do not block CSS or JavaScript needed to render public pages.
- Compare the live file after cache, plugin or hosting changes.
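The fetch-and-compare step can be sketched with the Python standard library. This is an illustration under assumptions, not the tool's code: `fetch_live` and `compare` are hypothetical names, and the fetch assumes the host serves robots.txt as UTF-8 text without redirects that need special handling.

```python
import difflib
import urllib.request


def fetch_live(host_url: str, timeout: float = 10.0) -> str:
    """Download robots.txt from the root of the given host."""
    url = host_url.rstrip("/") + "/robots.txt"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def compare(generated: str, live: str) -> list[str]:
    """Return unified-diff lines showing how the draft differs from the live file."""
    return list(
        difflib.unified_diff(
            live.splitlines(),
            generated.splitlines(),
            fromfile="live",
            tofile="generated",
            lineterm="",
        )
    )


# Offline example; in practice `live` would come from fetch_live("https://example.com").
live = "User-agent: *\nDisallow: /\n"
generated = "User-agent: *\nDisallow: /wp-admin/\n"
for row in compare(generated, live):
    print(row)
```

Lines starting with `-` exist only in the live file and lines starting with `+` exist only in the draft, which makes a leftover staging block easy to spot.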
Common robots.txt mistakes
Blocking the whole site on production is the classic mistake, especially when a staging Disallow: / rule survives a launch. Blocking /wp-content/ can stop crawlers from fetching the CSS, JavaScript and images they need to render pages. Blocking pages that should carry a noindex signal keeps crawlers from seeing that signal. Listing private folder names in robots.txt exposes them, because the file is public. The safest file is intentional, short and checked against real paths.
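Several of these mistakes can be caught with a simple lint pass before publishing. The sketch below is one possible check, not the tool's actual warning logic; `crawl_warnings` and its message strings are hypothetical, and it covers only a site-wide block, a /wp-content/ block, and missing or relative Sitemap lines.

```python
def crawl_warnings(robots_text: str) -> list[str]:
    """Return human-readable warnings for a few common robots.txt mistakes."""
    warnings = []
    agents = []
    in_rules = False
    has_sitemap = False
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field == "disallow":
            in_rules = True
            if value == "/" and "*" in agents:
                warnings.append("Disallow: / blocks every path for all crawlers")
            if value.startswith("/wp-content"):
                warnings.append("Blocking /wp-content/ may hide CSS, JavaScript and images")
        elif field == "allow":
            in_rules = True
        elif field == "sitemap":
            has_sitemap = True
            if not value.lower().startswith(("http://", "https://")):
                warnings.append("Sitemap URL is not absolute: " + value)
    if not has_sitemap:
        warnings.append("No Sitemap line found")
    return warnings


for message in crawl_warnings("User-agent: *\nDisallow: /\n"):
    print(message)
```

A site-wide block is legitimate on staging hosts, so a warning here is a prompt to confirm intent rather than an error.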
Common questions
Does robots.txt guarantee that a URL will not be indexed?
No. It controls crawling. For index removal, use noindex on a fetchable page, redirects, canonical cleanup or correct status codes depending on the goal.
Should every WordPress site block wp-admin?
Most public WordPress sites block /wp-admin/ and allow /wp-admin/admin-ajax.php. Test your theme and plugins after changes.
Can AI crawler rules go in robots.txt?
Some AI crawlers read robots.txt user-agent groups. Treat those rules as a public preference signal, not a security boundary.
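As a rough sketch, per-crawler groups can be generated from a list of user-agent tokens. The helper name `ai_crawler_groups` is hypothetical, and the tokens in the example (GPTBot, CCBot) are illustrations only; check each crawler's own documentation for the token it currently honors.

```python
def ai_crawler_groups(agents: list[str], disallow: str = "/") -> str:
    """Build one robots.txt group per user-agent token, each with a single Disallow rule."""
    blocks = [f"User-agent: {agent}\nDisallow: {disallow}" for agent in agents]
    return "\n\n".join(blocks) + "\n"


# Example tokens only; they express a preference and enforce nothing.
print(ai_crawler_groups(["GPTBot", "CCBot"]))
```

Writing a separate group per token keeps each rule easy to audit and to remove later.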
What should a basic robots.txt contain?
A User-agent line, one or more Disallow or Allow rules, and a Sitemap line pointing to your XML sitemap. For most sites, allowing everything plus the sitemap line is the right default.
Does disallowing a path hide it from Google?
No. It blocks crawling, but the URL can still be indexed without a snippet if other pages link to it. To keep a page out of the index, allow crawling and use a noindex directive instead.
Where do I put robots.txt?
At the root of each host, exactly at /robots.txt. Files in subfolders are ignored, and every subdomain needs its own robots.txt.
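To make the host rule concrete, the sketch below derives the robots.txt location that applies to any page URL. The function name `robots_url` and the example hosts are assumptions for illustration.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page's scheme and host."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# Different hosts map to different files, even on the same domain.
print(robots_url("https://shop.example.com/category/shoes?page=2"))
print(robots_url("https://www.example.com/blog/post/"))
```

The path and query string are discarded, which reflects that only the file at the host root is consulted.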