XML sitemap discovery and quality audit
Load a sitemap index or URL sitemap, sample child sitemap files, inspect URL hygiene, run a live status check on selected submitted URLs, and leave with a report that separates discovery coverage from indexability work.
The audit samples child sitemaps and URL rows to stay useful in the browser. A sitemap helps discovery; HTTP status, canonical tags, robots rules and content quality still decide whether a submitted URL is a good search candidate.
What a sitemap analyzer should tell you
A sitemap is not a ranking shortcut and it is not a substitute for internal links. It is a discovery map. A good sitemap analyzer should therefore answer practical questions before it shows a giant URL list: can the XML file be fetched, is it a sitemap index or a URL sitemap, do child sitemap files parse, which URLs are being submitted, and do those submitted URLs look like the canonical public pages you actually want crawlers to find?
This tool is built around that review. It starts at the sitemap URL you give it, samples child sitemap files when the root is an index, keeps the submitted URLs visible, checks common hygiene signals such as HTTPS, hostname consistency, duplicate rows and missing lastmod values, then probes a controlled URL sample for live HTTP status. That is far more useful than staring at raw XML when you have just published a batch of WordPress articles or changed an SEO plugin setting.
Sitemap index and child sitemaps are different layers
Many WordPress SEO plugins expose one sitemap index and several child sitemaps. The index is the directory. A child sitemap usually contains the post, page, category, author, image or other URL rows. When you submit only one child file by mistake, you may be checking a narrow slice of the site. When a child sitemap stops updating, the root index can still look healthy at first glance. Reading both layers is the safer habit.
- Root sitemap tells you whether the public entry point loads and what type of XML it exposes.
- Child sitemap sample shows which sections are present and whether sampled files parse.
- URL sample reveals the exact submitted locations and lastmod values visible in the XML.
- Status audit checks a live sample so 404, 5xx and unstable destinations are not hidden inside clean markup.
- Action guide keeps discovery issues separate from page-level indexability work.
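To make the two layers concrete, here is a minimal Python sketch that tells a sitemap index from a URL sitemap by looking at the root element. The function name `classify_sitemap` is hypothetical and is not how this tool is implemented; the sketch assumes the XML uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def classify_sitemap(xml_text):
    """Return (kind, locs): kind is 'index', 'urlset' or 'unknown'.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    # ElementTree reports tags as "{namespace}name"; keep only the name.
    tag = root.tag.rsplit("}", 1)[-1]
    if tag == "sitemapindex":
        kind, child = "index", "sitemap"
    elif tag == "urlset":
        kind, child = "urlset", "url"
    else:
        return "unknown", []
    ns = {"sm": SITEMAP_NS}
    locs = [
        el.text.strip()
        for el in root.findall(f"sm:{child}/sm:loc", ns)
        if el.text
    ]
    return kind, locs
```

For an index, the returned locations are child sitemap files that still need to be fetched and parsed; for a URL sitemap, they are the submitted page URLs themselves.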
How to judge a submitted URL
A URL in a sitemap should usually be a preferred, public, canonical destination. That means it should use the expected protocol and host, return a healthy status, avoid accidental duplicates and agree with the internal linking strategy. If an old URL redirects, a private section appears, a noindex archive is still submitted or a deleted page returns 404, the XML file may be technically valid while the search signal is still noisy.
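The protocol, host and duplicate checks described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's own code: `url_hygiene` is a hypothetical helper, and `rows` is assumed to be a list of `(loc, lastmod)` pairs already extracted from the XML.

```python
from collections import Counter
from urllib.parse import urlsplit


def url_hygiene(rows, expected_host):
    """Flag common hygiene problems in sitemap rows.

    rows: list of (loc, lastmod_or_None) pairs (assumed input shape).
    expected_host: the canonical hostname, e.g. "example.com".
    """
    report = {
        "non_https": [],
        "wrong_host": [],
        "duplicates": [],
        "missing_lastmod": [],
    }
    expected = expected_host.lower()
    for loc, lastmod in rows:
        parts = urlsplit(loc)
        if parts.scheme != "https":
            report["non_https"].append(loc)
        # urlsplit lowercases the hostname for us.
        if (parts.hostname or "") != expected:
            report["wrong_host"].append(loc)
        if not lastmod:
            report["missing_lastmod"].append(loc)
    counts = Counter(loc for loc, _ in rows)
    report["duplicates"] = sorted(loc for loc, n in counts.items() if n > 1)
    return report
```

A check like this only covers what is visible in the XML. Redirects, noindex directives and deleted pages need a live request to detect.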
Lastmod deserves a measured reading. A missing lastmod value does not automatically make a sitemap broken. A misleading one is also not useful. Treat lastmod as a change signal that should reflect meaningful page updates when your generator can provide it reliably. If a plugin refresh changes lastmod for every URL on every minor event, the value becomes harder to trust during audits.
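One rough way to spot a lastmod pattern worth questioning is to measure how many rows share the same date. The sketch below is only a heuristic; `lastmod_summary` is a hypothetical name, and it assumes lastmod values are W3C datetime strings that start with `YYYY-MM-DD`.

```python
from collections import Counter


def lastmod_summary(lastmods):
    """Summarise lastmod values from a sitemap sample.

    lastmods: list of lastmod strings, with None for rows that have none.
    """
    present = [v for v in lastmods if v]
    # Compare on the date part only, so different times on one day match.
    days = Counter(v[:10] for v in present)
    top_day, top_count = days.most_common(1)[0] if days else (None, 0)
    return {
        "missing": len(lastmods) - len(present),
        "distinct_days": len(days),
        "top_day": top_day,
        "top_share": top_count / len(present) if present else 0.0,
    }
```

A `top_share` close to 1.0 across a large sample may mean the generator rewrites lastmod for every URL at once. It can also mean the site was genuinely updated in bulk, so treat it as a prompt to look closer rather than as proof of a problem.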
Useful WordPress sitemap checks
After using bulk publishing tools, moving content from pages to posts, changing categories, switching SEO plugins, changing permalink rules or clearing a sitemap cache, check the root sitemap again. Confirm the expected post sitemap exists, open the sampled rows, and test a few important new URLs directly. If a URL does not appear yet, make sure it is published, indexable, linked from a relevant hub and included by the SEO plugin rules before assuming Search Console is the problem.
Sitemaps and Google Search Console
Submitting the sitemap index is usually the cleanest start for a site with multiple child sitemaps. Search Console can then report fetch problems and discovered URLs over time, but the local audit still matters. It catches simple mistakes before you wait for crawler reports: wrong host, stale child sitemap, unexpected 404 rows, non-HTTPS output after a migration, or a sitemap file that no longer parses after a cache or plugin change.
A practical sitemap workflow
- Analyze the sitemap index submitted for the canonical host.
- Read the child sitemap sample and confirm the content sections you expect are present.
- Review URL rows for protocol, hostname, duplicates, lastmod patterns and unwanted sections.
- Status-check important submitted URLs before asking a crawler to revisit them.
- Pair sitemap review with robots, indexability, canonical and internal link checks on pages that matter.
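The status-check step in the workflow above can be sketched with the Python standard library. This is a minimal illustration, not the tool's implementation: `probe` and `classify_status` are hypothetical names, the bucket labels are assumptions, and some servers answer HEAD requests differently from GET, so results deserve a manual spot check.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def probe(url, timeout=10.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx responses arrive here with the code attached.
        return err.code
    except OSError:
        # DNS failures, refused connections and timeouts.
        return None


def classify_status(code):
    """Map a status code to a coarse audit bucket (labels are assumptions)."""
    if code is None:
        return "unreachable"
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if code in (404, 410):
        return "missing"
    if 400 <= code < 500:
        return "client_error"
    return "server_error"
```

Calling `classify_status(probe(url))` on a small sample of important URLs is usually enough; probing every row of a large sitemap puts unnecessary load on the site.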
Common questions
Does a sitemap guarantee indexing?
No. It helps discovery. A page still needs to be accessible, technically indexable, useful, internally connected and worth keeping in the search results.
Should redirected URLs stay in a sitemap?
Usually not when you control the sitemap. Prefer the final canonical destination. Redirects are useful for old links and migrations, while the sitemap should describe the URLs you want discovered now.
Can a sitemap be valid but still low quality?
Yes. XML can parse perfectly while it submits thin pages, duplicate archives, dead URLs or pages blocked by other search signals. That is why a sitemap audit needs both structure and sampled URL checks.
Why should I submit an XML sitemap?
It tells search engines which URLs you consider important and, through lastmod, when they last changed. That can speed up discovery, especially on large sites or for pages with few internal links.
How many URLs can one sitemap hold?
Up to 50,000 URLs or 50 MB uncompressed, whichever limit is reached first. Beyond that, split the URLs into multiple sitemaps and list them in a sitemap index file.
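As a rough sketch of that split, the Python below chunks a URL list by the 50,000-row limit and builds a sitemap index for the resulting files. The names `chunk_urls` and `build_index` are hypothetical, and the sketch does not check the 50 MB size limit, which would need a separate measurement of each serialized file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the sitemaps.org protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(child_locs):
    """Serialize a sitemap index that lists the given child sitemap URLs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in child_locs:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(root, encoding="unicode")
```

With 120,001 URLs, `chunk_urls` yields three lists of 50,000, 50,000 and 20,001 entries, so the index would list three child files.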
Should I include noindex or redirected URLs in my sitemap?
No. A sitemap should list only canonical, indexable URLs that return a 200 status. Redirects, error pages and noindex pages spend crawl requests on URLs you do not want indexed, and they send mixed signals about which pages matter.













