SEO
robots.txt + sitemap.xml
Tell crawlers what to index and what to skip. The most basic site-discoverability hygiene.
What it is
robots.txt (RFC 9309) controls crawler access at the host level; sitemap.xml (sitemaps.org) lists indexable URLs with metadata. Both live at well-known paths.
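For concreteness, a minimal robots.txt sketch (the example.com host, the /internal/ path, and the GPTBot user agent are illustrative placeholders, not requirements of the standard):

```txt
# Served from the well-known path: https://example.com/robots.txt
User-agent: *          # rules for all crawlers
Disallow: /internal/   # keep crawlers out of a low-value path

User-agent: GPTBot     # per-crawler rules, e.g. for an AI crawler
Disallow: /

Sitemap: https://example.com/sitemap.xml  # advertise the sitemap location
```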
Why it matters
Without a sitemap, search engines have to discover every URL through links — slow and incomplete on large sites. Without robots.txt, you can't direct AI crawlers or block low-value paths.
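A matching sitemap.xml sketch in the sitemaps.org format; the URLs and dates are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- one <url> entry per indexable page -->
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>  <!-- optional metadata -->
  </url>
  <url>
    <loc>https://example.com/pricing</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```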
Who it applies to
Every public-facing site.
How WQI scores it
Web Quality Index considers this standard satisfied when the supporting factor passes.
| # | Factor | Status |
|---|---|---|
| 14 | Sitemap.xml + robots.txt presence | live |
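To make the factor concrete, here is a rough sketch of what a presence check like factor 14 might do. This is an assumption, not WQI's actual implementation; the function names and User-Agent string are made up:

```python
import urllib.error
import urllib.request


def has_wellknown_file(origin: str, path: str, timeout: float = 5.0) -> bool:
    """Return True when `origin` serves a non-empty 2xx response at `path`."""
    url = origin.rstrip("/") + path
    req = urllib.request.Request(url, headers={"User-Agent": "example-wqi-check/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            # urlopen follows redirects and raises HTTPError on 4xx/5xx,
            # so reaching here means a 2xx; require at least one byte of body.
            return len(resp.read(1)) > 0
    except (urllib.error.URLError, TimeoutError, ValueError):
        return False


def sitemap_robots_factor_passes(origin: str) -> bool:
    """Factor 14 analogue: both well-known files must be present."""
    return (has_wellknown_file(origin, "/robots.txt")
            and has_wellknown_file(origin, "/sitemap.xml"))


if __name__ == "__main__":
    print(sitemap_robots_factor_passes("https://example.com"))
```

A real checker would presumably also parse the robots.txt Sitemap: directive, since sitemaps may live at paths other than /sitemap.xml.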
Related standards
- See also: AI crawlers, Schema.org
Other references
- Spec: sitemaps.org protocol
- Guidance: Google Search Central, robots.txt