Index control and preview directive audit
Read robots meta tags and X-Robots-Tag headers from a public URL, compare generic and crawler-specific directives, separate index control from snippet control, and keep header-only files such as PDFs in the same review workflow.
Meta tags are visible only in HTML. X-Robots-Tag headers can control HTML or non-HTML responses, including files where a robots meta tag cannot exist.
What robots meta and X-Robots-Tag actually decide
Robots directives are easy to flatten into one word: indexable or blocked. Real audits need a cleaner split. A noindex signal is about whether a fetched URL may appear in search results. nofollow changes link-handling instructions. Preview directives such as max-snippet, max-image-preview and nosnippet affect how much search systems may show around a result. Those are not the same decision.
This checker reads the generic robots meta tag, crawler-specific meta tags when HTML can be fetched, and the X-Robots-Tag response header. That matters on a WordPress site because an SEO plugin can output page-level tags in HTML while a host, CDN or custom rule adds a header. It matters even more for PDFs, images and other non-HTML files where headers are the practical control surface.
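As a rough illustration of the HTML side of that read, here is a minimal Python sketch that collects directives from the generic and crawler-specific meta tags. It is not this checker's actual implementation; the names RobotsMetaParser, read_meta_directives and CRAWLER_META_NAMES are hypothetical, and the list of crawler names is an assumption you would extend for your own audit.

```python
from html.parser import HTMLParser

# Assumption: the meta names this sketch treats as crawler directives.
CRAWLER_META_NAMES = {"robots", "googlebot", "bingbot"}


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots|googlebot|bingbot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # meta name -> list of lowercase directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").strip().lower()
        if name not in CRAWLER_META_NAMES:
            return
        # The content attribute is a comma-separated directive list.
        tokens = [t.strip().lower() for t in attr.get("content", "").split(",")]
        self.directives.setdefault(name, []).extend(t for t in tokens if t)


def read_meta_directives(html):
    """Return {meta name: [directives]} for the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

Keeping the result keyed by meta name preserves the difference between a generic directive and one aimed at a single crawler, which is the comparison the audit depends on.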
Robots meta is not robots.txt
A robots.txt rule controls crawling access to a path. A page-level directive must be fetched before a crawler can read it. If a URL is blocked from crawling, a noindex tag on that URL may never be read, so it is not a reliable cleanup mechanism. For moved content, check redirects. For duplicate public content, check canonicals. For URLs you do not want in search, use index-control directives where crawlers can actually see them.
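The crawl-versus-index distinction can be checked mechanically. The sketch below uses Python's standard urllib.robotparser to ask whether a crawler may fetch a URL at all, since a noindex directive on a disallowed URL may never be read. The function name noindex_is_readable, the sample rules and the example.com URLs are illustrative assumptions, and the standard-library parser may not match every search engine's own matching rules.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_readable(robots_txt, url, user_agent="*"):
    """Return True if robots.txt lets user_agent fetch url.

    Only a fetchable URL gives a crawler the chance to read a
    page-level noindex directive or an X-Robots-Tag header.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for illustration.
SAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"

blocked = not noindex_is_readable(SAMPLE_RULES, "https://example.com/private/page.html")
open_path = noindex_is_readable(SAMPLE_RULES, "https://example.com/public/page.html")
```

If blocked is true for a URL that also carries noindex, the two signals are working against each other and the robots.txt rule is the one to revisit first.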
- robots is the generic HTML meta directive set.
- googlebot and bingbot are crawler-specific meta names that add or specialize directives for those named crawlers alongside the generic set.
- X-Robots-Tag lives in HTTP headers and matters for non-HTML responses.
- Snippet directives shape previews without necessarily removing a URL from search.
- Recording the expected outcome for a URL keeps an intentional noindex from being graded like an accident.
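The split between index control, link handling and preview control can be sketched in Python as follows. The grouping, the function names classify and grade, and the wording of the grades are assumptions made for illustration, not the checker's own logic.

```python
INDEX_KEYS = {"index", "noindex"}
LINK_KEYS = {"follow", "nofollow"}
PREVIEW_KEYS = {"nosnippet", "max-snippet", "max-image-preview", "max-video-preview"}


def classify(directives):
    """Group directives by the decision they affect."""
    groups = {"index": [], "links": [], "preview": [], "other": []}
    for directive in directives:
        # "max-snippet:50" -> key "max-snippet"
        key = directive.split(":", 1)[0].strip().lower()
        if key == "none":
            # "none" is shorthand for noindex plus nofollow.
            groups["index"].append("noindex")
            groups["links"].append("nofollow")
        elif key in INDEX_KEYS:
            groups["index"].append(directive)
        elif key in LINK_KEYS:
            groups["links"].append(directive)
        elif key in PREVIEW_KEYS:
            groups["preview"].append(directive)
        else:
            groups["other"].append(directive)
    return groups


def grade(directives, expected_indexable):
    """Compare the observed index signal with the expected outcome."""
    blocked = "noindex" in classify(directives)["index"]
    if blocked and expected_indexable:
        return "unexpected noindex"
    if not blocked and not expected_indexable:
        return "missing noindex"
    return "as expected"
```

With this shape, grade(["noindex", "follow"], expected_indexable=False) reports an intentional noindex as expected rather than as a fault, and a max-snippet value never affects the index grade.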
A practical robots directive workflow
- Check the exact public URL that appeared in a sitemap, Search Console report or support ticket.
- Read the response status and content type before interpreting the tags.
- Compare generic meta, crawler-specific tags and X-Robots-Tag headers together.
- Pair noindex findings with canonical, redirect and robots.txt checks when signals disagree.
- Retest after theme, SEO plugin, cache, CDN or server-header changes.
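The workflow above can be sketched as a single review function in Python. It assumes the status code, Content-Type value and directive lists have already been read from the response; review_response, HTML_TYPES and the note texts are hypothetical and only show the order of the checks.

```python
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def review_response(status, content_type, header_directives, meta_directives):
    """Return review notes for one fetched URL."""
    notes = []
    # Status first: directives on a redirect or error need different handling.
    if 300 <= status < 400:
        notes.append("redirect: review the target URL's directives instead")
    elif status >= 400:
        notes.append("error status: interpret directives with care")

    media_type = content_type.split(";", 1)[0].strip().lower()
    is_html = media_type in HTML_TYPES
    if not is_html:
        notes.append("non-HTML response: only X-Robots-Tag can carry directives")

    # Meta directives only count when the body is HTML.
    combined = set(header_directives)
    if is_html:
        combined |= set(meta_directives)

    if combined & {"noindex", "none"}:
        notes.append("noindex present: pair with canonical, redirect and robots.txt checks")
    if is_html and "noindex" in header_directives and "noindex" not in meta_directives:
        notes.append("noindex comes from the header only: check host, CDN or server rules")
    return notes
```

Running the same function before and after a theme, plugin, cache or CDN change and comparing the two note lists gives a simple retest.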
Common questions
Is nofollow the same as noindex?
No. Noindex addresses search-result inclusion for the URL. Nofollow is a link-handling directive. Mixing them casually can hide the real reason a page is not performing.
Why check X-Robots-Tag on a PDF?
A PDF has no HTML head where a robots meta tag can be placed. A response header is the normal place to apply indexing or preview controls to that file.
Does a missing robots meta tag mean a public page is broken?
No. A page can be indexable without a robots meta tag. The problem is an unexpected restrictive directive, conflicting control layers or an unreadable response path.
What is the difference between the robots meta tag and robots.txt?
robots.txt controls crawling by path for the whole host; the robots meta tag and the X-Robots-Tag header control indexing per URL. A page must be crawlable for its noindex directive to be seen.
What does noindex, follow mean?
It tells crawlers not to index this page but still to follow its links, so the link targets can be discovered and link signals can pass through. It is common on paginated or thin pages that you still want to pass link value.
Can I set robots directives in an HTTP header?
Yes. The X-Robots-Tag header accepts the same directives as the robots meta tag, and it is the only way to apply them to non-HTML files such as PDFs or images, which have no place for a meta tag.
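For reference, a header value can be unscoped (for example "noindex, nofollow") or prefixed with a crawler name (for example "googlebot: nosnippet"). The Python sketch below groups a list of X-Robots-Tag values by crawler. The function name parse_x_robots_tag, the "*" key for unscoped directives and the VALUED_DIRECTIVES list are assumptions for illustration; a production parser would need more care, for instance with dates in unavailable_after.

```python
# Directives that take a value after a colon, so the colon is not a crawler prefix.
VALUED_DIRECTIVES = {"max-snippet", "max-image-preview", "max-video-preview", "unavailable_after"}


def parse_x_robots_tag(values):
    """Group X-Robots-Tag header values by crawler name.

    values: one string per X-Robots-Tag header in the response.
    Returns {crawler: [directives]}; "*" holds unscoped directives.
    """
    result = {}
    for value in values:
        agent = "*"
        head, sep, rest = value.partition(":")
        head_key = head.strip().lower()
        # Treat "name: ..." as a crawler prefix unless name is a valued directive.
        if sep and "," not in head and head_key not in VALUED_DIRECTIVES:
            agent, value = head_key, rest
        tokens = [t.strip().lower() for t in value.split(",")]
        result.setdefault(agent, []).extend(t for t in tokens if t)
    return result
```

A PDF response that sends "noindex" unscoped and "googlebot: nosnippet" as a second header would come back as one generic entry and one googlebot entry, ready to compare the same way as meta tags.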