Methodology

How the crawler access report works

AI Crawler Checker reviews public signals that site owners control. It starts with robots.txt and checks whether each supported automatic crawler may fetch the submitted URL path. User-triggered fetchers are shown separately when vendor behavior is non-standard. The scan then inspects page-level metadata and discovery files.
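The per-crawler robots.txt step can be sketched roughly as follows. This is a minimal illustration using Python's standard-library parser, not the checker's actual implementation. The crawler list, the `check_crawlers` helper, and the sample robots.txt are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Example user-agent tokens only; the checker's real supported list is not stated here.
EXAMPLE_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def check_crawlers(robots_txt: str, url: str, crawlers=EXAMPLE_CRAWLERS) -> dict[str, bool]:
    """Return {user_agent: allowed} for one URL against one robots.txt body."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in crawlers}


# Hypothetical robots.txt: one crawler is blocked from /private/, everyone else is allowed.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

result = check_crawlers(SAMPLE_ROBOTS, "https://example.com/private/page")
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Because the check is evaluated at the submitted URL path, the same robots.txt can produce different results for different pages on one site.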

What the score can tell you

The report can show that a public page is blocked by robots.txt, noindex metadata, or X-Robots-Tag headers; that it is unreachable because of a failed HTTP response or a redirect problem; or that it lacks readable text, structured data, or discovery files such as sitemap.xml and llms.txt.
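As an illustration of the page-level signals, the sketch below looks for a noindex directive in both the X-Robots-Tag response header and the robots meta tag. It is a simplified example under stated assumptions: the `find_noindex` helper and `RobotsMetaParser` class are hypothetical names, and bot-specific header prefixes or meta names are not handled.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def find_noindex(headers: dict[str, str], html: str) -> list[str]:
    """Return the places where a noindex directive was found (empty list if none)."""
    reasons = []
    for name, value in headers.items():
        # Header names are case-insensitive; the value is checked as lowercase text.
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            reasons.append("X-Robots-Tag header")
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        reasons.append("meta robots tag")
    return reasons


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(find_noindex({"Content-Type": "text/html"}, page))
```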

Focused readiness pages

The llms.txt, AEO, and AI search visibility pages reuse the same live scan. They change the focused report and scoring lens, not the evidence source. The AI search visibility page is a technical readiness check and does not query live AI answer engines.

How user-triggered fetchers are handled

Some user agents represent a person asking an AI product to fetch a page, not an automatic crawler. When official documentation says robots.txt may not apply or is generally ignored, the report labels that row as not scored instead of mixing it into the crawler-policy pass rate.
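The effect on the pass rate can be shown with a small sketch. The `Row` type, its field names, and the agent names are hypothetical. The example only illustrates that rows labeled not scored are left out of both the numerator and the denominator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    agent: str
    allowed: bool
    scored: bool = True  # False for user-triggered fetchers labeled "not scored"


def crawler_pass_rate(rows: list[Row]) -> Optional[float]:
    """Share of scored rows that are allowed; None when nothing is scored."""
    scored_rows = [row for row in rows if row.scored]
    if not scored_rows:
        return None
    return sum(row.allowed for row in scored_rows) / len(scored_rows)


rows = [
    Row("crawler-a", allowed=True),
    Row("crawler-b", allowed=True),
    Row("crawler-c", allowed=True),
    Row("crawler-d", allowed=False),
    # A user-triggered fetcher: shown in the report, excluded from the rate.
    Row("user-fetcher", allowed=False, scored=False),
]

print(crawler_pass_rate(rows))
```

With these sample rows, three of the four scored crawlers pass, so the unscored fetcher does not pull the rate down.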

What it cannot promise

Passing these checks does not guarantee AI visibility, ranking, citation, or inclusion in any answer engine. AI platforms change behavior over time and may use their own quality, safety, freshness, and source-selection systems.

Access boundary

The scanner only checks public URLs and public site settings. It does not bypass logins, paywalls, firewall rules, bot defenses, or private systems. Requests use timeouts, redirect limits, response-size limits, private-network blocking, and lightweight rate limiting to keep API route cost and abuse risk bounded.
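Private-network blocking is commonly done by classifying the address a hostname resolves to before any request is sent. The sketch below shows one way to do that with Python's standard `ipaddress` module. The `is_public_address` helper is a hypothetical name, and this is not necessarily how the scanner implements the check.

```python
import ipaddress


def is_public_address(ip: str) -> bool:
    """Return True only for addresses that are safe to fetch from a public scanner."""
    addr = ipaddress.ip_address(ip)
    # Unwrap IPv4-mapped IPv6 (for example ::ffff:10.0.0.1) so it is judged as IPv4.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return not (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_reserved
        or addr.is_unspecified
    )


for candidate in ["8.8.8.8", "10.0.0.5", "127.0.0.1", "169.254.169.254", "::1"]:
    print(candidate, "public" if is_public_address(candidate) else "blocked")
```

In a real fetcher this check would need to run on the resolved IP address rather than the hostname, and again after each redirect, since a public hostname can point or redirect to an internal address.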