Question 1

What makes a page non-indexable?

Accepted Answer

A handful of usual suspects: a noindex (meta robots tag or the X-Robots-Tag header), an HTTP status other than 200, a canonical that points to a different URL, or a login wall. Here is the one that trips people up: a robots.txt disallow blocks crawling, which is not the same as blocking indexing. It is a different problem entirely, so I treat it as its own signal.
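
To make those checks concrete, here is a minimal sketch in Python of how the signals could be read from a single response. The names `HeadSignals` and `blocking_signals` are made up for illustration and are not this tool's actual code. It assumes you already have the status code, headers, and HTML in hand, it does not detect login walls, and it compares the canonical to the page URL without any normalisation.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HeadSignals(HTMLParser):
    """Collects meta robots directives and the first canonical href."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag == "meta" and a.get("name", "").lower() == "robots":
            self.robots += [d.strip().lower() for d in a.get("content", "").split(",")]
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            if self.canonical is None:
                self.canonical = a.get("href") or None


def blocking_signals(url, status, headers, html):
    """Return the reasons a page looks non-indexable (empty list if none found)."""
    reasons = []
    if status != 200:
        reasons.append(f"status {status}")

    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {k.lower(): v.lower() for k, v in headers.items()}
    if "noindex" in lowered.get("x-robots-tag", ""):
        reasons.append("noindex in X-Robots-Tag")

    parser = HeadSignals()
    parser.feed(html)
    if "noindex" in parser.robots:
        reasons.append("noindex in meta robots")

    # Resolve relative hrefs against the page URL before comparing.
    if parser.canonical and urljoin(url, parser.canonical) != url:
        reasons.append(f"canonical points to {urljoin(url, parser.canonical)}")
    return reasons


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, follow"></head>'
    print(blocking_signals("https://example.com/page", 200, {}, page))
```

An empty list only means none of these particular signals fired. The robots.txt check is kept out of this function on purpose, since it is a crawl signal rather than an index signal.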

Question 2

Does a robots.txt disallow remove a page from Google?

Accepted Answer

No, and that answer surprises people constantly. Disallow stops the crawl, not the listing. If other pages link to that URL, Google can still index it and show it with no snippet, just that sad "No information is available for this page" blurb. So if you genuinely want it gone, do the opposite of what feels right: allow the crawl so Googlebot can reach the page, then serve a noindex. You have to let it in before it will agree to leave.
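
The conflict is easy to test for. Below is a rough sketch using Python's standard `urllib.robotparser`. The function names and the returned labels are hypothetical, and Python's rule matching can differ from Google's in edge cases, so treat the result as a first pass rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(robots_txt, url, agent="Googlebot"):
    """True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def removal_check(robots_txt, url, has_noindex):
    """Label how a disallow and a noindex interact for one URL."""
    allowed = crawl_allowed(robots_txt, url)
    if has_noindex and not allowed:
        # The crawler is blocked, so it never gets to read the noindex.
        return "conflict: noindex is hidden behind a disallow"
    if has_noindex:
        return "ok: crawlable and noindex, should drop out"
    if not allowed:
        return "disallowed only: can still be listed without a snippet"
    return "no removal signal"


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    print(removal_check(robots, "https://example.com/private/report", has_noindex=True))
```

The `has_noindex` flag is assumed to come from somewhere else, for example a check of the meta robots tag or the X-Robots-Tag header.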

Question 3

What is the difference between noindex and canonical?

Accepted Answer

They feel similar, but they are not. Noindex is a flat no: keep this page out of the index, full stop. A canonical is softer, just a hint that says these pages are basically the same, so treat this one as the master copy. With a canonical the page still gets crawled, and it can still surface if Google decides not to follow the hint. So my rule: noindex when I want a page gone, canonical when I have near-duplicates and only need Google to pick a winner.
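
In markup the two look like this. The helper below is a small illustrative sketch in Python: `head_tag` and its `goal` values are invented for this example, and it only prints the tag you would place in the page's head.

```python
from html import escape


def head_tag(goal, master_url=None):
    """Return the head tag matching the goal: 'remove' or 'consolidate'."""
    if goal == "remove":
        # A directive: keep this page out of the index.
        return '<meta name="robots" content="noindex">'
    if goal == "consolidate":
        if not master_url:
            raise ValueError("consolidate needs the master URL")
        # A hint: treat master_url as the preferred copy.
        return f'<link rel="canonical" href="{escape(master_url, quote=True)}">'
    raise ValueError(f"unknown goal: {goal!r}")


if __name__ == "__main__":
    print(head_tag("remove"))
    print(head_tag("consolidate", "https://example.com/shoes"))
```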

Question 4

Why is my page indexable but still not indexed?

Accepted Answer

Indexable just means nothing is actively blocking it. That is the floor, not a promise. Google still gets the final say. It weighs whether it even found the page, whether the crawl is worth the budget, and whether the content holds up against a near-duplicate it might already have. When I want the actual answer instead of guessing, I drop the URL into the URL Inspection tool in Search Console. It shows the exact coverage state straight from Google.

Question 5

Does this tool render JavaScript?

Accepted Answer

It does not. It reads the served HTML and the response headers, exactly what comes back on that first request. That matters more than it sounds. If JavaScript injects your meta robots or canonical after load, what Google eventually renders can drift from what you see here. So when a page leans on JS for those tags, confirm the final state in Search Console URL Inspection, because that one actually renders the page the way Google does.
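
Here is a small Python sketch of why that gap exists. It is not this tool's code, and `MetaRobots` and `served_directives` are hypothetical names. A parser that only reads the served HTML sees a static meta robots tag but has no way to see one that a script would add later.

```python
from html.parser import HTMLParser


class MetaRobots(HTMLParser):
    """Collects meta robots directives present in the served markup."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = {name: (value or "") for name, value in attrs}
        if tag == "meta" and a.get("name", "").lower() == "robots":
            self.directives += [d.strip().lower() for d in a.get("content", "").split(",")]


def served_directives(html):
    """Directives visible without executing any JavaScript."""
    parser = MetaRobots()
    parser.feed(html)
    return parser.directives


STATIC_PAGE = '<head><meta name="robots" content="noindex"></head>'

# The script would add a noindex in a browser, but nothing here runs it.
JS_PAGE = """<head><script>
  var m = document.createElement('meta');
  m.name = 'robots';
  m.content = 'noindex';
  document.head.appendChild(m);
</script></head>"""

if __name__ == "__main__":
    print(served_directives(STATIC_PAGE))
    print(served_directives(JS_PAGE))
```

The second page would end up with a noindex once rendered, yet the served-HTML check finds nothing, which is exactly the case where a rendered inspection is worth the extra step.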

Indexability Checker: Status, Robots.txt, Noindex, Canonical, Sitemap and Crawl Signals

What an indexability checker does

Indexable does not mean indexed

Signals checked

How I read the result

Frequently asked questions