AI Bot Analyzer

See which AI crawlers can access your site and manage training opt-out

Known AI Bots Database

Frequently Asked Questions

How many AI bots does this tool track?

We track 16 known AI crawler bots from companies including OpenAI, Anthropic, Google, Meta, Apple, Perplexity, and ByteDance.

How do I block GPTBot?

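Add "User-agent: GPTBot" followed by "Disallow: /" on the next line of your robots.txt file. This tells GPTBot, OpenAI's training-data crawler, not to crawl any page on your site. OpenAI's other user agents, such as OAI-SearchBot, need their own rules.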
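As a quick sanity check, the rule can be tested locally before it is deployed. The sketch below uses Python's standard-library robots.txt parser. The helper name `is_allowed`, the `example.com` URL, and the `SomeOtherBot` agent are illustrative assumptions, not part of this tool.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule described above, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and SomeOtherBot are placeholders for illustration only.
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/page")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page")

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

The rule names only GPTBot, so agents with no matching group stay unrestricted.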

What is the difference between crawl and train bots?

Crawl-only bots fetch your content so it can appear in search results or AI-generated answers. Training bots collect your content to train AI models. Some bots do both.

How often is the bot database updated?

Quarterly. When the database is older than 90 days, a warning badge is displayed. Community contributions via GitHub are welcome.
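The 90-day staleness rule could be expressed roughly as follows. This is a minimal sketch, not the tool's actual implementation: the function name `is_stale` and the example dates are assumptions chosen for illustration.

```python
from datetime import date

# Threshold taken from the answer above; the rest is a hypothetical sketch.
STALE_AFTER_DAYS = 90


def is_stale(last_updated: date, today: date) -> bool:
    """Return True when the database is older than STALE_AFTER_DAYS."""
    return (today - last_updated).days > STALE_AFTER_DAYS


# Illustrative dates only: 90 days old is still fresh, 91 days is stale.
fresh = is_stale(date(2025, 1, 1), date(2025, 4, 1))
stale = is_stale(date(2025, 1, 1), date(2025, 4, 2))

print("show warning badge:", stale)
```

The comparison is strict because the text says "older than 90 days", so a database exactly 90 days old does not trigger the badge in this sketch.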

Can I opt out of AI training only?

Yes. You can block training bots (CCBot, Bytespider, Google-Extended) while allowing search bots (OAI-SearchBot, PerplexityBot) for AI search visibility.
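One way to produce such a training-only opt-out is to generate one robots.txt group per bot and then verify the result. The sketch below uses the bot names from the answer above. The helper `build_robots_txt` and the `example.com` URL are hypothetical, and the check relies on Python's standard-library parser, which may not match every crawler's matching rules exactly.

```python
from urllib.robotparser import RobotFileParser

# Bot names as listed in the answer above.
TRAINING_BOTS = ["CCBot", "Bytespider", "Google-Extended"]
SEARCH_BOTS = ["OAI-SearchBot", "PerplexityBot"]


def build_robots_txt(blocked: list[str], allowed: list[str]) -> str:
    """Build robots.txt text with one group per user agent."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in blocked]
    groups += [f"User-agent: {agent}\nAllow: /" for agent in allowed]
    # A blank line separates groups, as is conventional in robots.txt.
    return "\n\n".join(groups) + "\n"


robots_txt = build_robots_txt(TRAINING_BOTS, SEARCH_BOTS)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# example.com is a placeholder URL for illustration.
results = {
    agent: parser.can_fetch(agent, "https://example.com/")
    for agent in TRAINING_BOTS + SEARCH_BOTS + ["Googlebot"]
}

print(robots_txt)
for agent, ok in results.items():
    print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

Googlebot is included in the check only to confirm that none of the generated groups apply to it.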

Does blocking bots affect my Google ranking?

No. Blocking AI-specific bots does not affect your rankings in traditional Google Search. Your SEO is affected only if you block Googlebot itself.

What is Google-Extended?

Google-Extended is a robots.txt product token, not a separate crawler. It controls whether content Google has crawled from your site may be used to train Google's Gemini AI models. Blocking it does not affect Google Search.

How do I add a new bot to your database?

Open a GitHub issue with the bot's user-agent string, company name, and documentation link. We verify the submission and include it in the next quarterly update.

Is ClaudeBot the same as Claude?

No. ClaudeBot is Anthropic's web crawler, while Claude is Anthropic's AI assistant. Blocking ClaudeBot tells Anthropic not to crawl your site for training data.

Are there any bots I cannot block?

Yes. robots.txt is a voluntary standard, so rogue crawlers can simply ignore it. Our Dark AI Crawler tool identifies these non-compliant bots.