Free robots.txt analyzer

Is your robots.txt blocking AI crawlers?

Check which AI bots can access your site. Get an AI-readiness grade and a recommended robots.txt.

Free. No signup. Checks 16 AI crawlers in 2 seconds.

Check if AI crawlers can access your site

We'll fetch your robots.txt and check it against 16 major AI crawlers including GPTBot, ClaudeBot, and PerplexityBot.

No robots.txt = missed AI signals
Blocking GPTBot = invisible to ChatGPT
Wildcard Disallow: / blocks everything
No Sitemap directive = missed pages

Comprehensive analysis

What We Check

We fetch your robots.txt and analyze every rule against 16 major AI crawlers.

16 AI Crawlers

GPTBot, ClaudeBot, Google-Extended, PerplexityBot, Applebot, Meta AI, and 10 more

Rule Parsing

User-agent groups, Allow/Disallow paths, wildcard inheritance, and override logic

AI-Readiness Grade

A-F grade and 0-100 score based on how accessible your site is to AI crawlers

Sitemap Detection

Checks for Sitemap: directives that help AI crawlers discover all your pages

Crawl-Delay Analysis

Detects excessive Crawl-delay values that slow down AI crawler access

Recommended robots.txt

Generates an AI-optimized robots.txt with explicit Allow rules for all AI crawlers

How It Works

Three steps to check your robots.txt.

1

Enter your URL

Paste any website URL. We automatically find and fetch the robots.txt file.

2

Get your report

See which AI crawlers are blocked, your AI-readiness grade, and specific issues to fix.

3

Copy & deploy

Copy the recommended robots.txt and deploy it to your site root for instant improvement.

Why Your Robots.txt Matters for AI Visibility

Your robots.txt file is the first thing AI crawlers check before accessing your site. It sits at yoursite.com/robots.txt and acts as a gatekeeper, telling bots what they can and cannot access. For traditional search engines, this has been standard practice since the 1990s. But in the age of AI assistants, robots.txt has taken on an entirely new significance.
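
As a quick illustration (the domain and paths here are placeholders, not recommendations for your site), a simple robots.txt looks like this:

    # https://yoursite.com/robots.txt
    User-agent: *
    Allow: /
    Disallow: /admin/
    Sitemap: https://yoursite.com/sitemap.xml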

The AI Crawler Landscape in 2026

At least 16 major AI crawlers are now actively crawling the web. OpenAI uses GPTBot and ChatGPT-User to power ChatGPT responses. Anthropic sends ClaudeBot to gather context for Claude. Google deploys Google-Extended for Gemini and AI Overviews. Perplexity, Meta, Apple, ByteDance, Cohere, and You.com all have their own crawlers.

Each of these bots checks your robots.txt before crawling. If your file contains User-agent: GPTBot / Disallow: /, ChatGPT will never see your content. Similarly, a blanket User-agent: * / Disallow: / blocks every crawler that doesn't have a specific override.
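
For example, these two patterns have very different effects (illustrative snippets, not recommendations):

    # Blocks only OpenAI's GPTBot - your content becomes invisible to ChatGPT
    User-agent: GPTBot
    Disallow: /

    # Blocks every crawler that has no group of its own
    User-agent: *
    Disallow: /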

Training Crawlers vs. Browsing Crawlers

Not all AI crawlers serve the same purpose. Training crawlers like CCBot (Common Crawl) and some configurations of GPTBot collect data to train or fine-tune AI models. Browsing crawlers like ChatGPT-User and OAI-SearchBot fetch pages in real time when a user asks a question.

Some site owners block training crawlers for copyright reasons while allowing browsing crawlers. This is a valid approach, and your robots.txt gives you granular control over which bots access which paths.
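
One illustrative way to express that split, assuming you want to opt out of training-data collection while staying visible in real-time answers:

    # Opt out of training/archive crawlers
    User-agent: CCBot
    Disallow: /

    # Allow real-time browsing and AI search crawlers
    User-agent: ChatGPT-User
    Allow: /

    User-agent: OAI-SearchBot
    Allow: /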

Robots.txt Best Practices for AI

  • Explicitly allow AI crawlers. Don't rely on “not mentioned” status. Add a User-agent block with Allow: / for each AI bot you want to permit, as in the sketch after this list.
  • Include a Sitemap directive. This helps AI crawlers discover all your pages efficiently, not just the ones linked from your homepage.
  • Block only what's private. Use targeted Disallow rules for admin panels, APIs, and internal tools rather than blanket blocks.
  • Avoid excessive Crawl-delay. AI crawlers are generally respectful of server load. A Crawl-delay over 10 seconds can significantly reduce your crawl coverage.
  • Check regularly. AI crawlers are added frequently. Your robots.txt from 2024 may not account for bots launched in 2025-2026.
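
Putting those practices together, an AI-friendly robots.txt might look like the sketch below. The crawler names are real; the sitemap URL and the /admin/ and /api/ paths are placeholders to replace with your own:

    User-agent: GPTBot
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: Google-Extended
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    # Everyone else: allow public pages, keep internal tools private
    User-agent: *
    Allow: /
    Disallow: /admin/
    Disallow: /api/

    Sitemap: https://yoursite.com/sitemap.xml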

Beyond Robots.txt: The Full AI Visibility Picture

Robots.txt is just one of 25+ signals that determine whether AI assistants recommend your product. Other critical factors include structured data (JSON-LD), llms.txt files, content quality, FAQ schema, social presence, EEAT signals, and answer-first content formatting. Use our free AI Visibility Audit to check all 25+ signals.

Frequently Asked Questions

Common questions about robots.txt and AI crawlers.

What is a robots.txt file?
A robots.txt file is a plain-text file at the root of your website (e.g. yoursite.com/robots.txt) that tells web crawlers which pages they can and cannot access. It uses directives like User-agent, Disallow, and Allow to control crawler behavior. Every major search engine and AI assistant respects robots.txt.
How does robots.txt affect AI crawlers like GPTBot and ClaudeBot?
AI crawlers like GPTBot (ChatGPT), ClaudeBot (Claude), and PerplexityBot follow robots.txt rules just like search engines. If your robots.txt blocks these bots, your content won't be used in AI-generated answers. This means you could be invisible to users asking AI assistants for product recommendations.
Should I block AI crawlers in my robots.txt?
For most businesses, blocking AI crawlers means missing out on a growing discovery channel. When someone asks ChatGPT or Perplexity for product recommendations, blocked sites won't appear. However, some publishers block AI training crawlers like CCBot while allowing browsing and search crawlers like ChatGPT-User and OAI-SearchBot.
What happens if I don't have a robots.txt file?
Without a robots.txt file, all crawlers (including AI bots) can access all public pages on your site. While this means AI can crawl your content, you miss the chance to explicitly signal access, add a Sitemap directive for better crawl efficiency, or exclude private sections like /admin or /api.
How do I check if my site blocks AI bots?
Enter your URL in the analyzer above. It fetches your robots.txt and checks it against 16 AI crawlers including GPTBot, ClaudeBot, Google-Extended, and PerplexityBot. You'll see which bots are blocked, allowed, or not mentioned, plus an AI-readiness grade and specific fix recommendations.
What is the difference between Disallow and Allow in robots.txt?
Disallow tells a crawler not to access a specific path (e.g. 'Disallow: /admin'). Allow explicitly permits access to a path, which is especially useful for overriding a broader Disallow rule. For example, you might set Disallow: / under User-agent: * but give GPTBot its own group with Allow: /, so GPTBot follows its more specific rules instead of the blanket block.
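Spelled out as an illustrative file, that looks like this:

    # Every crawler without its own group is blocked
    User-agent: *
    Disallow: /

    # GPTBot matches this more specific group and can crawl the whole site
    User-agent: GPTBot
    Allow: /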
Does blocking GPTBot prevent my site from appearing in ChatGPT?
Yes. OpenAI uses GPTBot to crawl web content that informs ChatGPT's responses. If GPTBot is blocked in your robots.txt, your content won't be used in ChatGPT's training data, and blocking ChatGPT-User and OAI-SearchBot also keeps your pages out of its live browsing and search results. Similarly, blocking ClaudeBot affects Claude, and blocking PerplexityBot affects Perplexity.
How do I allow specific AI crawlers in my robots.txt?
Add a User-agent block for each AI crawler you want to allow. For example: 'User-agent: GPTBot' followed by 'Allow: /' on the next line. Repeat for ClaudeBot, Google-Extended, PerplexityBot, and others. Our analyzer generates a recommended robots.txt with all AI crawlers explicitly allowed.
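A compact sketch with three of the crawlers named above (repeat the pattern for each bot you want to permit):

    User-agent: GPTBot
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /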
What is Crawl-delay and should I use it?
Crawl-delay is a robots.txt directive that tells crawlers to wait a specified number of seconds between requests. While it can reduce server load, excessive crawl-delay (over 10 seconds) significantly slows indexing. Most AI crawlers are already respectful of server resources, so Crawl-delay is rarely needed.
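If you do decide to use it, keep the value low; for example (the 5-second figure is purely illustrative):

    User-agent: *
    Crawl-delay: 5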
How does this tool differ from the full AI Visibility Audit?
This tool checks one specific signal: your robots.txt configuration for AI crawlers. The full AI Visibility Audit checks 25+ signals including structured data (JSON-LD), llms.txt, content quality, EEAT signals, social presence, FAQ schema, and more. Think of this as a quick check and the full audit as the comprehensive assessment.
This checks 1 of 25+ signals

Robots.txt is just the beginning.

Your robots.txt controls crawl access, but AI visibility depends on 25+ signals including structured data, llms.txt, content quality, EEAT, and more. Get the full picture.

Run Free AI Visibility Audit