robots.txt
robots.txt is a plain-text file served at the root of your domain (e.g. example.com/robots.txt) that tells web crawlers which URLs they may access. For GEO, it is crucial that this file allows AI bots to reach your content.
Why It Matters for GEO
Many sites block AI bots by default. A GEO-optimized robots.txt explicitly allows GPTBot, ClaudeBot, anthropic-ai, PerplexityBot, and Google-Extended.
Your robots.txt is the gatekeeper between your content and AI citation engines. If it blocks AI bots, intentionally or by accident, your site is invisible to ChatGPT's web search, Claude's browsing, and Perplexity's live retrieval system. Businesses that audit their robots.txt often discover they have been inadvertently blocking all non-Google crawlers for months, losing citation opportunities the entire time.
GEO-Optimized robots.txt
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
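You can check that a rule set like the one above behaves as intended before deploying it. A minimal sketch using only the Python standard library's robots.txt parser (the test URL example.com/guides/ is a placeholder):

```python
# Sketch: verify that each AI user agent is allowed by the rules above.
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# Each bot matches its own group, so the content URL is allowed.
for bot in ("GPTBot", "ClaudeBot", "anthropic-ai",
            "PerplexityBot", "Google-Extended"):
    allowed = parser.can_fetch(bot, "https://example.com/guides/")
    print(f"{bot}: {'allowed' if allowed else 'BLOCKED'}")
```

Running this against your drafted rules catches typos in user-agent names before they cost you crawl access.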
Practical Example
A technology distributor's robots.txt was set up by their web agency two years ago. It contains a reasonable-looking block (User-agent: * with Disallow: /api/ and Disallow: /admin/), but their CMS also added a default "maintenance mode" rule (User-agent: * with Disallow: /) that was never removed. Every AI crawler is blocked. After fixing the file and explicitly allowing the GEO-relevant bots, their product pages and guides start appearing in Perplexity results within weeks. They attribute 120 new B2B leads in the following quarter to traffic from AI citations.
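The failure mode above can be reproduced in a few lines: a site-wide Disallow in the wildcard group applies to any crawler that has no group of its own, which here means every AI bot. An illustrative sketch (URLs are placeholders):

```python
# Sketch: a stray site-wide Disallow in the wildcard group blocks AI bots
# that are not explicitly listed anywhere in the file.
from urllib.robotparser import RobotFileParser

BROKEN_RULES = """\
User-agent: *
Disallow: /api/
Disallow: /admin/
Disallow: /
"""

parser = RobotFileParser()
parser.parse(BROKEN_RULES.splitlines())

# GPTBot and PerplexityBot have no group of their own,
# so the wildcard group's Disallow: / applies to them.
print(parser.can_fetch("GPTBot", "https://example.com/products/widget"))   # False
print(parser.can_fetch("PerplexityBot", "https://example.com/guides/geo")) # False
```

This is why auditing the whole file matters: the /api/ and /admin/ rules look harmless, but the trailing Disallow: / silently overrides everything for unlisted crawlers.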
Common Mistakes
- Wildcard Disallow rules: User-agent: * Disallow: / blocks everything, including AI bots. Always audit your full robots.txt for unintended global rules.
- Not listing all AI user agents separately: some sites allow GPTBot but forget ClaudeBot and anthropic-ai (Anthropic uses both). Each major AI platform requires its own explicit rule.
- Blocking by URL pattern: disallowing /content/ or /guides/ to block scrapers may also block the exact pages AI engines most want to index. Review pattern rules carefully.
- Never auditing after CMS updates: many CMS platforms and plugins modify robots.txt automatically. Schedule a monthly check to ensure AI bot permissions have not been silently revoked.
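The monthly check suggested above can be scripted. A sketch of an audit helper (the audit function, bot list, and test URL are illustrative choices, not a standard API): it takes the text of a robots.txt file and reports, per AI user agent, whether a representative content URL is fetchable. For a live check, RobotFileParser's set_url() and read() methods fetch the file over HTTP instead.

```python
# Sketch: report which AI user agents may fetch a representative content URL,
# given the text of a site's robots.txt.
from urllib.robotparser import RobotFileParser

AI_BOTS = ("GPTBot", "ClaudeBot", "anthropic-ai",
           "PerplexityBot", "Google-Extended")

def audit(robots_txt: str,
          test_url: str = "https://example.com/guides/") -> dict:
    """Return {bot_name: allowed} for each AI user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, test_url) for bot in AI_BOTS}

# Example: a file that allows GPTBot but forgets every other AI bot,
# which the wildcard Disallow then blocks.
sample = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /
"""

report = audit(sample)
for bot, ok in report.items():
    print(f"{bot}: {'allowed' if ok else 'BLOCKED'}")
```

Run against the sample, only GPTBot comes back allowed, which is exactly the "allowed one bot, forgot the rest" mistake described above.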