
Robots.txt Generator

Build comprehensive robots.txt rules with presets for WordPress, e-commerce, and AI bot blocking.

By default, the generator emits the most permissive configuration:

  User-agent: *
  Disallow:

A rule with an empty Disallow allows everything, the same as Allow: /. The output panel also warns when no sitemap URL is specified, since adding one helps crawlers discover your content.
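
Selecting the AI bot blocking preset, for example, might produce rules along these lines (GPTBot, CCBot, and Google-Extended are published crawler tokens for OpenAI, Common Crawl, and Google's AI-training control; the exact list a preset emits is illustrative here):

  # Block common AI training crawlers
  User-agent: GPTBot
  Disallow: /

  User-agent: CCBot
  Disallow: /

  User-agent: Google-Extended
  Disallow: /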

Robots.txt best practices

  • Always include a Sitemap directive to help crawlers discover your content.
  • Use Allow rules to override broader Disallow rules for specific paths.
  • Wildcards (*) in paths match any string; a dollar sign ($) matches the end of the URL.
  • robots.txt is publicly accessible, so never use it to hide sensitive URLs.
  • Make each user-agent section complete: bots follow only the most specific matching section.
  • Test your robots.txt with Google Search Console before deploying.
  • Use Crawl-delay cautiously: Googlebot ignores it, though some other bots respect it.
  • Block faceted navigation and search result pages to save crawl budget (see the example after this list).
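
A minimal sketch pulling several of these practices together; every path and the sitemap URL below are placeholders:

  User-agent: *
  # Block internal search results and faceted navigation to save crawl budget
  Disallow: /search/
  Disallow: /*?sort=
  # $ anchors the pattern to the end of the URL, so this matches only PDFs
  Disallow: /*.pdf$
  # A longer, more specific Allow overrides the broader Disallow for this one file
  Allow: /downloads/catalog.pdf

  Sitemap: https://example.com/sitemap.xml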

About this tool

  1. Select user agents

     Choose which crawlers to configure rules for: Googlebot, Bingbot, or all bots with the wildcard agent.

  2. Define allow and disallow rules

     Add directory paths to allow or block for each user agent using the guided form inputs.

  3. Add sitemap reference

     Enter your sitemap URL to include a Sitemap directive that helps crawlers discover your content.

  4. Export robots.txt

     Copy the generated robots.txt content or download it as a file to place in your site root (see the sample output after these steps).
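
Following the four steps with the WordPress preset, for instance, might yield output like this (the admin-ajax.php exception is the common WordPress convention; the sitemap URL is a placeholder):

  User-agent: *
  Disallow: /wp-admin/
  # Allow overrides the Disallow above so AJAX endpoints keep working
  Allow: /wp-admin/admin-ajax.php

  Sitemap: https://example.com/sitemap.xml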

Tips

  • Always test your robots.txt rules with the robots-txt-validator tool before deploying to avoid accidentally blocking important pages.
  • Use "Disallow: /admin/" to block admin areas and "Disallow: /api/" to prevent crawling of API endpoints.
  • Add a Crawl-delay directive for aggressive bots, but note that Googlebot ignores it; configure crawl rate in Search Console instead (see the sketch after this list).
  • Remember that robots.txt is publicly accessible; never rely on it to hide sensitive content. Use authentication or noindex instead.
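
A sketch combining these tips; ExampleBot is a hypothetical stand-in for whichever aggressive crawler you want to slow down:

  User-agent: *
  Disallow: /admin/
  Disallow: /api/

  # Hypothetical bot name; Crawl-delay is non-standard and ignored by Googlebot
  User-agent: ExampleBot
  Crawl-delay: 10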

Key features

  • Guided form interface, so there is no need to memorize robots.txt syntax
  • Support for multiple user-agent blocks with separate rules
  • Automatic Sitemap directive insertion
  • Syntax validation and real-time preview of the output
  • Common preset templates for WordPress, e-commerce, and SPA sites

Use cases

  • Set up crawl rules for a new website before launch
  • Block search engines from indexing staging or development environments (see the sketch after this list)
  • Prevent crawling of duplicate content like print-friendly pages or internal search results
  • Create user-agent-specific rules to manage crawl budget on large sites
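
Blocking a staging environment entirely takes just one rule; a minimal sketch, best paired with authentication since robots.txt is only advisory:

  User-agent: *
  Disallow: /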

FAQ

Where does the robots.txt file need to live?
It must be in the root directory of your domain (e.g., https://example.com/robots.txt). It will not work in subdirectories.

Does blocking a page in robots.txt remove it from search results?
No. It prevents crawling but not indexing. If a page is already indexed, use a noindex meta tag or request removal in Search Console.

Can I block specific file types?
Yes. Use patterns like "Disallow: /*.pdf$" to block PDF files from being crawled by compliant bots.
