
AI crawler permissions

Explicit allow/disallow rules for GPTBot, ClaudeBot, PerplexityBot, and friends. Default-deny means missing AI citations; default-allow means free training data.

Authority: Per-vendor
Version: robots.txt convention
Jurisdiction: Global
Source: developers.openai.com
Last reviewed: 2026-04-28
Last verified: pending

What it is

User-agent–specific rules in robots.txt that grant or deny access to known AI crawler bots: OpenAI's GPTBot, Anthropic's ClaudeBot, Common Crawl's CCBot, Google's Google-Extended, and others.
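For a site that wants to block AI training collection while staying visible to citation-oriented bots, an explicit rule set might look like the following. The bot names are real published user agents; the policy choices themselves are only illustrative:

```text
# Deny crawlers that collect training data
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# Google-Extended is a control token, not a separate crawler;
# disallowing it opts content out of Google AI training use.
User-agent: Google-Extended
Disallow: /

# Allow a bot that drives AI citations
User-agent: PerplexityBot
Allow: /

# Everyone else: normal crawling
User-agent: *
Allow: /
```

Note that Google-Extended never fetches pages itself; Googlebot honors it as a preference signal, so disallowing it does not affect ordinary search indexing.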

Why it matters

A robots.txt that never mentions AI crawlers is ambiguous in 2026: some bots treat silence as permission, others do not. Be explicit, and decide deliberately whether you want your content in the AI corpus.
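One way to see how a given robots.txt reads to a specific bot is Python's standard-library parser. A minimal sketch, parsing an example policy from a string (the URL and rules are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Example policy: deny GPTBot everywhere, allow everyone else.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS.splitlines())

# The parser matches the bot name against each User-agent group.
print(rp.can_fetch("GPTBot", "https://example.com/article"))        # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/article"))  # True
```

The same check works against a live site by calling `rp.set_url(".../robots.txt")` followed by `rp.read()` instead of `parse()`.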

Who it applies to

Every site that has an opinion about AI training and citation.

How WQI scores it

Web Quality Index considers this standard satisfied when the supporting factor passes.

#    Factor                              Status
16   AI crawler robots.txt directives    live
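WQI's exact check behind factor 16 is not published here; a rough sketch of the idea in Python, where the bot list and the `addressed_bots` helper are assumptions for illustration:

```python
# Hypothetical bot list and helper; the real WQI factor logic is not public.
AI_BOTS = ("GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "PerplexityBot")

def addressed_bots(robots_txt: str) -> set:
    """Return which known AI crawlers a robots.txt names explicitly."""
    agents = set()
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent":
            agents.add(value.strip())
    return {bot for bot in AI_BOTS if bot in agents}

sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
print(sorted(addressed_bots(sample)))  # ['GPTBot']
```

A check along these lines would count the factor as live when every bot the site cares about appears in its own User-agent group rather than falling through to the `*` default.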

Related standards

See also
llms.txt, ai.txt, robots/sitemap, AI Preferences
