Why does a blocked noindex URL show up in the search results? SISTRIX

Why does a blocked noindex URL show up in the search results

From: SISTRIX Team 25.01.2021

If you use the robots.txt to block access to a directory or specific page for search engine crawlers, the content of this page or directory will not be crawled or indexed. In certain cases, however, Google will still show the URL of a page that is blocked through the robots.txt in the SERPs.

You can block the directory “a-directory” and the page “a-page.html” for web crawlers with the following addition to the site's robots.txt:

    User-agent: *
    Disallow: /a-directory/
    Disallow: /a-page.html
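
To check how a crawler that honours these rules would treat a given URL, the rules can be fed to Python's standard `urllib.robotparser` module. This is only a minimal sketch: the host `www.example.com`, the user agent string `ExampleBot` and the helper name `is_crawlable` are placeholders chosen for illustration, and the module's simple prefix matching may differ in detail from how Google interprets a robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /a-directory/
Disallow: /a-page.html
"""


def is_crawlable(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules allow user_agent to fetch url.

    "ExampleBot" is a hypothetical user agent; the rules apply to it
    because the robots.txt addresses all crawlers with "User-agent: *".
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # www.example.com is a placeholder host.
    for path in ("/a-directory/file.html", "/a-page.html", "/another-page.html"):
        url = "http://www.example.com" + path
        print(path, "crawlable" if is_crawlable(url) else "blocked")
```

Note that such a check only answers whether the content may be crawled. It says nothing about whether the URL itself can appear in the search results.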

Why do I find my page in the search results even though it is blocked through the robots.txt?

In certain cases, Google will show a page that is blocked through the robots.txt in the SERPs (Search Engine Results Pages). In these instances it is important to know that the crawler does respect the robots.txt and has not added the content of such blocked pages to its index. Google therefore has no information about the content of this page.

When does a blocked page appear in the SERPs?

If the blocked page has a lot of incoming links with an unambiguous link text, then Google may consider the page relevant enough to show its URL in the search results for queries that match these link texts. The content of that URL, however, is still unknown to Google, as it is unable to crawl the page. You can usually recognise pages in the SERPs that were blocked from crawling through the robots.txt by a missing snippet (for example, the description).

Google is increasingly paying attention to user signals – an example

We use the robots.txt to block access to our page http://www.domain.com/grandmas-cakerecipe.html. Google’s crawlers honour our request to not crawl and index the contents of the page. Google therefore has no idea what content is in the file grandmas-cakerecipe.html. Let us say that this page contains a world class recipe and we get a lot of incoming links from other pages, many of which use the link text “Grandma’s World Class Pie Recipe”. In such cases, our blocked page http://www.domain.com/grandmas-cakerecipe.html could appear in the search engine result pages (SERPs) for the query “Grandma’s World Class Pie Recipe”, despite us blocking crawlers through the robots.txt.

How to definitively keep content from showing up on the search result pages

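The robots.txt is not guaranteed to keep your page out of the search results. To make sure that a page is kept out of the search results, you should use the robots meta element with the value NOINDEX. The crawler can only read this element if it is allowed to fetch the page, so the page must not be blocked through the robots.txt at the same time.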
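
The interplay between the two mechanisms can be sketched in Python: a noindex directive in the robots meta element only takes effect if a crawler is allowed to fetch the page and read it. The helper names, the user agent `ExampleBot`, the sample markup and the host `www.example.com` are all hypothetical and serve only as an illustration; the sketch does not reproduce how Google evaluates pages.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content may hold several comma-separated directives
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the markup contains a robots meta element with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def noindex_is_visible(html, robots_txt, url, user_agent="ExampleBot"):
    """Return True only if the page carries noindex AND may be fetched."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url) and has_noindex(html)


# Hypothetical page and URL, for illustration only.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "http://www.example.com/private.html"

if __name__ == "__main__":
    blocking_rules = "User-agent: *\nDisallow: /private.html\n"
    print("not blocked:", noindex_is_visible(PAGE, "", URL))
    print("blocked:", noindex_is_visible(PAGE, blocking_rules, URL))
```

In the second call the page carries the noindex directive, but a crawler that respects the robots.txt never gets to read it.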
