Google Search Console Errors and Warnings
You want your website to make it to the top of search results, but some of your webpages do not even appear in the SERPs. These pages have not been crawled or indexed. Google Search Console can help you identify and fix issues that may prevent the search engine from crawling or indexing your website.
Using this free tool, you can manage and monitor your site’s performance and get a clear picture of:
- Where your website stands in the search results
- Which pages are ranking well
- Which pages have crawling and indexing errors
Google Search Console flags issues that can negatively affect your website’s organic visibility, as unindexed pages don’t show up in the search results. Moreover, unresolved website errors also impact the user experience.
The Index Coverage report lists all the pages that Google tried to crawl and index but could not because of an error. Viewing the report is easy; understanding the errors and warnings it displays is another matter.
Below, we explain the common errors that Googlebot may encounter while crawling your site. First things first.
Crawler / Bot
Search engines use bots to discover and analyze pages so they can be indexed. A crawler or bot is a program that browses, or crawls, the new and existing pages of a website.
In simple terms, a bot or crawler finds the content of your site and adds the information to its index. The unindexed content or pages of your site don’t appear in the search results. All search engines use their own bots; for instance, Google uses Googlebot, whereas Bing uses Bingbot.
Mainly, crawlers are of two types:
- Ones that work 24/7 to find new pages and recrawl the existing ones
- Those that crawl a limited number of pages on request
Some special crawlers also help index images and videos. Crawling is an ongoing process on an active website: bots keep revisiting your site to find new pages and add them to the search engine’s index.
XML Sitemap
An XML sitemap is a file that contains the list of all the pages of your website that you want Googlebot or other crawlers to index. This file also includes important information about each URL. Your website should have a proper sitemap so you can help search engines discover the pages you want to rank for.
Search engine bots cannot index a page they cannot discover, and unindexed pages do not appear in search results. This means you cannot improve your rankings in the SERPs while your website has crawling and indexing errors.
Make sure to add your sitemap to the root directory of your website. You can find the sitemap of your website at domain.com/sitemap.xml or domain.com/sitemap_index.xml.
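If you build your sitemap by hand rather than through a CMS or SEO plugin, the sketch below shows the expected structure. It uses only Python’s standard library, and the URLs and dates are placeholders for your own pages.

```python
# A minimal sketch of generating sitemap.xml with Python's standard library.
# The URLs and lastmod dates below are placeholders for your own pages.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
pages = [
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/blog/seo-basics/", "2024-01-10"),
]

urlset = ET.Element("urlset", xmlns=NS)
for loc, lastmod in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "lastmod").text = lastmod  # optional, but helps crawlers prioritize

# Write the file so it can be uploaded to the site's root directory.
ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```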
Robots.txt
Robots.txt is a file that tells crawlers which URLs on your website they may access. In other words, it guides search engines toward the pages they can crawl and away from the ones they should skip. You can use the robots.txt file to
- Optimize the crawl budget
- Adjust the crawling speed for some bots
- Reduce or optimize crawler traffic to your website
- Control how crawlers behave on your site
Google can still index a page discovered through an external link, even if that page is blocked by robots.txt. Well-behaved crawlers follow the rules specified in the robots.txt file, but rogue crawlers disregard them.
You can add a “noindex” tag to pages that you don’t want to appear in search results, such as login or author pages. Just ensure a page with the noindex tag is not disallowed in the robots.txt file; otherwise the search engine cannot read the tag and update its index accordingly.
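As a quick sanity check, you can test a handful of URLs against your live robots.txt with Python’s built-in robotparser. The user agent and URLs below are placeholders; remember that a disallowed result here only blocks crawling, not indexing by itself.

```python
# A quick sketch for checking whether given URLs are crawlable under your robots.txt,
# using Python's standard library. "Googlebot" and the example URLs are placeholders.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://example.com/robots.txt")
rp.read()  # fetches and parses the live robots.txt

for url in ["https://example.com/blog/post-1/", "https://example.com/wp-admin/"]:
    allowed = rp.can_fetch("Googlebot", url)
    print(f"{url} -> {'crawlable' if allowed else 'blocked by robots.txt'}")
```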
Server Error (5xx)
When Googlebot requests a page, your server answers with an HTTP status code. A server error (5xx) is a status code indicating that a problem on your website’s server prevented it from delivering the requested page.
A server error can occur for many reasons, such as:
- Database issues
- Configuration issues
- Coding errors in the server-side scripts
- Running out of resources to handle requests
You can resolve the underlying server issues and then use Google Search Console to request a recrawl of the pages that returned the error.
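A simple status check can confirm which pages still answer with 5xx codes before you ask for a recrawl. This is a rough sketch using only the standard library; the URL list is a placeholder for the pages flagged in the report.

```python
# A rough status check for finding pages that currently return 5xx server errors.
# The URL list is a placeholder; point it at the pages flagged in the report.
import urllib.error
import urllib.request

urls = [
    "https://example.com/",
    "https://example.com/contact/",
]

for url in urls:
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            print(url, "->", resp.status)
    except urllib.error.HTTPError as e:
        if 500 <= e.code < 600:
            print(url, "->", e.code, "(server error: fix, then request a recrawl)")
        else:
            print(url, "->", e.code)
    except urllib.error.URLError as e:
        print(url, "-> unreachable:", e.reason)
```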
Not found (404)
This error, derived from the HTTP status code 404 Not Found, means the server could not locate the page or resource the user requested. In other words, when Googlebot tried to crawl the page, the server could not find anything at the requested address and returned this error.
The error often occurs when you remove a webpage from your site but forget to remove it from the sitemap. The user may also encounter a 404 error if:
- The address of the resource gets changed without setting up a redirect.
- The site is undergoing maintenance.
- The URL provided to Googlebot has a typo.
You can avoid the 404 error by regularly updating and maintaining your sitemap file, fixing or replacing broken links, and setting up redirects.
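One way to catch stale sitemap entries is to fetch your sitemap and test each listed URL, flagging anything that now returns 404. This sketch assumes your sitemap lives at /sitemap.xml; adjust the path if yours differs.

```python
# A sketch that reads sitemap.xml and flags listed URLs that now return 404, so stale
# entries can be removed or redirected. Assumes the sitemap lives at /sitemap.xml.
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP = "https://example.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(SITEMAP, timeout=10) as resp:
    tree = ET.parse(resp)

for loc in tree.findall(".//sm:loc", NS):
    url = loc.text.strip()
    try:
        with urllib.request.urlopen(url, timeout=10):
            pass  # URL resolved without an error
    except urllib.error.HTTPError as e:
        if e.code == 404:
            print("404 - remove from the sitemap or set up a redirect:", url)
```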
Soft 404
A soft 404 error in Google Search Console occurs when a page loads like a normal page and returns a “200 OK” status code from the server, but Google thinks it has little to no meaningful content.
In other words, the URL does not return a 404 status code, yet the content reads like a 404 page. Sometimes your website theme also generates thin pages that are not really needed.
Another scenario that can lead to a soft 404 error is redirecting a page to a new URL whose content is irrelevant or thin. Google considers such pages empty shells of their former selves: something useful that no longer exists. These pages look broken to Google, and it treats them as if they don’t exist.
Try out these options if the page:
- No longer exists: Ensure it returns a 404 (not found) or 410 (gone, content permanently removed) HTTP status code.
- Moved to a new URL: Set up a 301 (permanent) redirect.
- Exists but has thin content: Optimize it by adding valuable, in-depth, and relevant content.
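There is no official way to detect soft 404s yourself, but a rough heuristic, sketched below with the standard library, is to flag pages that return 200 OK yet contain very little text or a “not found”-style title. The word-count threshold and the URL are assumptions to tune for your site.

```python
# A rough heuristic for spotting potential soft 404s: pages that return 200 OK but carry
# very little text or a "not found"-style title. The 150-word threshold is an assumption.
import re
import urllib.request

def looks_like_soft_404(url, min_words=150):
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="ignore")
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, good enough for a heuristic
    title = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    too_thin = len(text.split()) < min_words
    says_missing = bool(title and re.search(r"not found|no longer available", title.group(1), re.I))
    return too_thin or says_missing

print(looks_like_soft_404("https://example.com/some-page/"))
```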
Redirect Error (3xx)
As the name suggests, this error means your redirect is not working. You need to set up redirects when you want to send search engine bots and users from a page that no longer exists to a modified or new page.
Redirect errors can affect the crawling and indexing experience for bots like Googlebot, which in turn impacts your website’s ranking. The errors may occur due to:
- Long redirect chains: One URL redirects to another, which redirects to the third URL, slowing down the loading speed and affecting the crawling experience.
- Redirect loops: When the final URL redirects to the previous URL in the redirect chain, it causes infinite redirect loops, which may lead to crawl errors.
- Empty URL: An empty or bad URL in the redirect chain.
- URL length: When a redirect URL exceeds the maximum length.
- Incorrect redirects: When you redirect the page to an incorrect or irrelevant destination URL.
Make sure a redirect takes the visitor and search engines directly to the destination URL without additional redirects.
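To see what a bot sees, you can follow a redirect one hop at a time and report chains and loops. The sketch below uses only the standard library; the starting URL and the five-hop limit are placeholders.

```python
# A sketch that follows redirects one hop at a time to surface long chains and loops.
# The starting URL and the five-hop limit are placeholders.
import urllib.error
import urllib.parse
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # stop urllib from following redirects on its own

opener = urllib.request.build_opener(NoRedirect)

def trace(url, max_hops=5):
    hops = [url]
    while len(hops) <= max_hops:
        try:
            opener.open(url, timeout=10)
            print(f"{len(hops) - 1} redirect hop(s):", " -> ".join(hops))
            return
        except urllib.error.HTTPError as e:
            if e.code not in (301, 302, 303, 307, 308):
                print(f"Chain ends with HTTP {e.code}:", " -> ".join(hops))
                return
            location = e.headers.get("Location")
            if not location:
                print(f"HTTP {e.code} without a Location header:", " -> ".join(hops))
                return
            url = urllib.parse.urljoin(url, location)
            if url in hops:
                print("Redirect loop:", " -> ".join(hops + [url]))
                return
            hops.append(url)
    print("Redirect chain too long:", " -> ".join(hops))

trace("https://example.com/old-page/")
```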
Blocked by Robots.txt
The pages that you don’t want the search engine bots to crawl get blocked by robots.txt. However, if you submit a page for indexing and it gets blocked, there could be a mistake. Check your robots.txt file and see if it tells Google not to crawl that specific page.
If it does, you will find a Disallow rule that matches the blocked URL. Remove or adjust that rule and then use Google Search Console to request a recrawl. Note that some indexed pages may also get blocked by robots.txt accidentally.
Google keeps these pages in the index for a certain period and relies on the ‘noindex’ directive, rather than robots.txt, to decide whether a page should stay out of its index.
Marked ‘noindex’
A ‘noindex’ directive tells Google not to index a page. When a page submitted for indexing returns the “Submitted URL marked ‘noindex’” error, the page carries a ‘noindex’ directive in its HTTP response header or in a robots meta tag.
Submitting a URL while also marking it ‘noindex’ sends mixed signals to the search engine, so check the page’s source code and HTTP headers for the directive. Remove the meta tag or the response header if you want Google to index that page.
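To confirm where the directive is coming from, you can check both the response headers and the page markup. The sketch below (standard library only, placeholder URL) reports which source carries a ‘noindex’.

```python
# A sketch that checks a URL for a 'noindex' directive in the X-Robots-Tag response
# header and in a robots meta tag. The URL is a placeholder.
import re
import urllib.request

def find_noindex(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        header = resp.headers.get("X-Robots-Tag", "")
        html = resp.read().decode("utf-8", errors="ignore")
    sources = []
    if "noindex" in header.lower():
        sources.append("X-Robots-Tag header")
    meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', html, re.I)
    if meta and "noindex" in meta.group(0).lower():
        sources.append("robots meta tag")
    return sources  # an empty list means no 'noindex' directive was found

print(find_noindex("https://example.com/blocked-page/"))
```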
Indexed, Not Submitted in Sitemap
This is a status and not an error, which means Google successfully discovered and indexed the page. However, you did not include the page in the sitemap to tell the search engine you want it to index the page. Google and other search engines find it easy to crawl and index pages listed in the sitemap.
An updated sitemap may help improve how often and quickly Google crawls your content. Moreover, it may boost your website’s traffic and ranking in the SERPs.
Indexed; Consider Marking as Canonical
This means that Google has indexed the submitted URL, but the URL has multiple versions, so it’s advisable to clearly mark one as canonical. When a page has many duplicate or near-duplicate versions, the search engine chooses one version as the canonical to show in the results and hides the rest.
Adding the canonical tag to a page tells Google that it’s the master page and helps consolidate all link signals to that specific version. The canonical tag is also important if you don’t want
- Backlink dilution
- Crawl budget exhaustion
- Undesirable URLs in search results
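If you want to confirm which canonical a page currently declares, you can pull the rel="canonical" link element out of its HTML. Below is a small sketch with the standard library; the product URL is a placeholder.

```python
# A sketch that extracts the rel="canonical" URL a page declares, using the standard
# library's HTMLParser. The product URL is a placeholder.
import urllib.request
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

url = "https://example.com/product?color=blue"
with urllib.request.urlopen(url, timeout=10) as resp:
    html = resp.read().decode("utf-8", errors="ignore")

finder = CanonicalFinder()
finder.feed(html)
print("Declared canonical:", finder.canonical or "none")
```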
Blocked by ‘Noindex’ Tag
Google excludes from its index the pages that carry a ‘noindex’ directive when crawling your site. You see this status when Google tries to index a page and encounters the directive. In simple words, Google did not index the page because you told it not to.
If you do want the search engine to index the page, remove the ‘noindex’ directive and request a recrawl. You don’t have to do anything if you want the page to stay out of the index.
Blocked Due to Unauthorized Request (401)
When a page returns an HTTP 401 error in Google Search Console, it means the request was unauthorized: the search engine bot tried to crawl a page it cannot access without credentials. The error typically occurs when the page is only accessible to a logged-in user or has:
- Password-protected content
- IP blocking or access restrictions
- Crawling-specific configuration issues
If you don’t want to waste your crawl budget, locate these URLs on your site and remove their authorization requirements. Alternatively, if you do want these pages crawled, verify the search engine bot’s identity and allow it to access them.
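Google documents a reverse-DNS check for confirming that a visitor claiming to be Googlebot really is one, so you can safely let it through an access restriction. Here is a minimal sketch; the sample IP is only illustrative.

```python
# A sketch of the documented reverse-DNS check for verifying that a request really
# comes from Googlebot before granting it access. The sample IP is a placeholder.
import socket

def is_real_googlebot(ip):
    try:
        host, _, _ = socket.gethostbyaddr(ip)           # reverse DNS lookup
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        forward_ips = socket.gethostbyname_ex(host)[2]  # forward-confirm the hostname
    except socket.gaierror:
        return False
    return ip in forward_ips

print(is_real_googlebot("66.249.66.1"))
```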
Crawl Anomaly
A crawl anomaly may indicate a 4xx (client error) or 5xx (server error) response code. If the issue is with your server, check whether the page is accessible in your browser. Also, use the URL Inspection tool to check whether the URL is part of a redirect chain.
An unspecified anomaly may also occur if the page no longer exists or redirects to a page with a 404 error. So, you need to fix the redirect issues and ensure your page
- Directly takes the visitor or search engines to the destination URL
- Loads quickly without delays
- Returns a 200 response
You can submit the affected pages for recrawling after fixing the potential issues.
Crawled – Currently Not Indexed
This happens when Google crawls a page but decides not to index it, for any of several reasons. The page may still be indexed later, and you don’t have to resubmit it for crawling. However, you can improve the chances of getting the page indexed by analyzing and optimizing your content.
Evaluate your content to see whether it resonates with your target audience and satisfies their search intent. Check that the content is accurate, relevant, and valuable. Moreover, update your sitemap and review your robots.txt file to ensure it is not blocking any pages that you want Google to index.
Discovered – Currently Not Indexed
Google does not crawl all the pages it discovers on your site immediately. If you see the Discovered – currently not indexed status, it simply means Google has found the page but has not crawled it yet, and Google only indexes pages after crawling them.
The status usually resolves on its own once Google crawls the discovered URL, especially if you have a small site with good-quality content. However, if Google keeps ignoring new pages on your site, it likely does not consider them worth crawling.
In this case, you may need to improve your content quality and work on your linking strategy to drive traffic to your site. Optimized, up-to-date, and in-depth content will increase the chances of getting your pages indexed.
Alternate Page with Proper Canonical Tag
When a page returns this status in Google Search Console, it means the page is a duplicate of the canonical page. In other words, the same page is accessible through multiple URLs that are properly canonicalized and point to the main page.
You don’t have to do anything if you see this message, as Google ignores the duplicate versions and indexes the canonical page. However, you may need to look into a few things if many pages show this status.
For instance, you can check if
- Your pages are properly canonicalized.
- There are crawl budget issues.
- Your internal link structure is poor.
Duplicate without User-Selected Canonical
This status means the page submitted for indexing has multiple versions, but Google is unable to determine which page is the canonical one. In this case, Google has to guess which version of the page to index.
The error also occurs when you have multiple pages with the same title tag. To avoid confusion, explicitly tell the search engine which version to index by using a canonical tag. You can also use a 301 redirect if you want to permanently send visitors from one duplicate page to another.
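How you add the tag or the redirect depends on your stack. Purely as an illustration, assuming a Python Flask app (hypothetical here, with placeholder routes and URLs), the two fixes look roughly like this:

```python
# A hypothetical Flask sketch: declare the canonical URL on the preferred page and
# 301-redirect a duplicate URL to it. Routes and URLs are placeholders.
from flask import Flask, redirect

app = Flask(__name__)
CANONICAL = "https://example.com/red-widgets/"

@app.route("/red-widgets/")
def red_widgets():
    # <link rel="canonical"> tells search engines which version to index.
    return f'<html><head><link rel="canonical" href="{CANONICAL}"></head><body>Red widgets</body></html>'

@app.route("/products/red-widgets/")
def duplicate_red_widgets():
    # Permanently send the duplicate URL to the canonical version.
    return redirect(CANONICAL, code=301)
```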
Duplicate Non-HTML Page
You will see this status in Google Search Console when Google discovers a non-HTML page, such as a PDF, that contains the same or similar information as another page it has indexed and marked as canonical. You don’t have to take any action for this status unless you want Google to use the PDF version for indexing.
Duplicate, Google Chose Different Canonical than User
If you see this status, it indicates an indexing issue, which means Google does not think the submitted page is the best version to be marked as a canonical for a set of pages. In other words, it means Google did not index the page you submitted and instead chose a different duplicate page because it has more links and/or content.
The issue occurs when the duplicates don’t have canonical tags or the canonical redirects the visitor to a different version of the same page. You can fix this issue by ensuring that you have added canonical tags on all duplicate pages and the canonicals are consistent.
Also, make sure to explicitly tell the search engine which version of the page to index and strengthen the canonicalization signals on your website. Use the URL Inspection tool to find which URL Google has chosen to index. If you agree that Google’s choice is the better version, there is no need to take any action.
Page Removed because of Legal Complaint
This one is self-explanatory: the status means that Google has removed the page from its index because someone filed a legal complaint about its content. You may encounter this issue if your website gets hacked or infected with malicious code.
Cybercriminals and hackers like to create pages offering prescription drugs or illegal movie downloads. Law enforcement agencies and legal departments in large organizations usually file complaints against such pages. So, if you see a page with this status, make sure to
- Remove the copyrighted material.
- Keep all plugins up to date.
- Protect your website and passwords from hackers.
Page with Redirect
Google only indexes pages that return a 200 OK status. So, when you see the status “Page with redirect,” it means the URL is a redirect and is ineligible for indexation.
If you want to fix this issue, first determine the root cause. There could be two reasons:
- The URL is redirecting unintentionally. In this case, just remove the redirect.
- You’re pointing the search engine bot at a redirecting URL. Resolve the problem based on where the issue appears in the Index Coverage report.
Mobile Usability Issues
A positive user experience is an important ranking signal, and you cannot perform well in the SERPs if your site has usability issues or is not mobile-friendly. Mobile usability issues, such as slow loading speed, can hurt the user experience and may drive visitors away from your site. A poor user experience can also erode trust and lead to higher bounce rates.
You can prevent Google Search Console errors and warnings by
- Conducting regular website audits to identify and resolve issues promptly
- Including all relevant pages of your website in the XML sitemap
- Regularly reviewing and updating your robots.txt file
Now that you’re familiar with common Google Search Console errors and warnings, check our Knowledge Center to learn about technical SEO terms and issues.