Soft 404s

A soft 404 is a page that tells the user that the URL does not exist, yet is served with an HTTP response status code that indicates success.

Usage

A soft 404 is a contradiction: the page content says that the URL is missing, while the 200 OK HTTP status code says that the resource was successfully located. This is undesirable because the “not found” page might be indexed and included in search results. In addition, web crawlers will continue to request the page in the future, consuming crawl resources that would be better spent on crawling and analyzing valid content.

Remedy

A soft 404 needs to be replaced by a response whose HTTP status code accurately describes the state of the resource. For example, if the page content is no longer available, then a 404 Not Found or 410 Gone HTTP status code is an appropriate response. If the content is still available at another URL, then a 301 Moved Permanently or 308 Permanent Redirect HTTP status code needs to be returned.
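
The decision described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `choose_status` and its parameters are hypothetical, and a real server would derive them from its own routing and content store.

```python
from http import HTTPStatus
from typing import Dict, Optional, Tuple


def choose_status(
    exists: bool,
    new_url: Optional[str] = None,
    gone_for_good: bool = False,
) -> Tuple[HTTPStatus, Dict[str, str]]:
    """Return (status, extra headers) for a requested URL.

    All three parameters are hypothetical inputs for this sketch.
    """
    if exists:
        # The content is here: a success code is accurate.
        return HTTPStatus.OK, {}
    if new_url is not None:
        # The content lives on at another URL: redirect permanently.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": new_url}
    if gone_for_good:
        # The content was removed deliberately and will not return.
        return HTTPStatus.GONE, {}
    # Nothing is known about this URL.
    return HTTPStatus.NOT_FOUND, {}
```

The point of the sketch is that 200 OK is returned only when the content actually exists; every other branch reports the missing or moved state in the status code itself.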

False positives

Some web crawlers and search engines automatically detect a soft 404 and mark it accordingly. For example, when Google detects this situation, it marks it as such in the site’s Index Coverage Report in Google Search Console. Unfortunately, this occasionally happens when the page and its content are still valid. If, for example, the majority of the page's resources are not rendered correctly, the page uses negative language such as “not available” or “out of stock”, and/or the page is blank or near-blank (empty or thin content), then it can be misinterpreted as an error and marked as a soft 404.
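
To see how such a false positive can arise, consider a deliberately naive detector. This is not how Google or any particular crawler works; the phrase list, the `MIN_WORDS` threshold, and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical signals; real crawlers use their own, undisclosed heuristics.
NEGATIVE_PHRASES = ("not found", "not available", "out of stock")
MIN_WORDS = 50  # assumed cut-off for "thin" content


def looks_like_soft_404(status: int, html: str) -> bool:
    """Flag a 200 OK response whose content resembles an error page."""
    if status != 200:
        # A real error code is not a soft 404.
        return False
    # Crude tag stripping, then whitespace normalization.
    words = re.sub(r"<[^>]+>", " ", html).lower().split()
    text = " ".join(words)
    has_negative = any(phrase in text for phrase in NEGATIVE_PHRASES)
    is_thin = len(words) < MIN_WORDS
    return has_negative or is_thin
```

With this heuristic, a valid product page that contains the words "Currently out of stock" is flagged even though it has plenty of real content, which is the kind of misinterpretation described above.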

Note

If search engines detect too many soft 404 pages on a website, they may lose trust in the server's signals. This in turn may affect the overall quality evaluation of that website's content and negatively affect its rankings in search results.

Common mistakes

Webmasters sometimes create soft 404s on purpose, often in a futile Search Engine Optimization (SEO) attempt to preserve link juice (PageRank) from external backlinks to deleted URLs. The following practices can be interpreted by web crawlers and search engines as soft 404s:

  • Redirecting the URL whose content is gone with a 301 Moved Permanently to a higher level in the site structure hierarchy, for example from /category/product to /category.

  • Redirecting the URL whose content is gone with a 301 Moved Permanently to the homepage, for example from /blog/popular-article to /.

  • Serving the URL whose content is gone with a 200 OK response and canonicalizing in the HTML code to a higher level in the site structure hierarchy, for example from /category/product to /category.

  • Serving the URL whose content is gone with a 200 OK response and canonicalizing in the HTML code to the homepage, for example from /blog/popular-article to /.

A better SEO strategy is to create a "smart 404", which returns a proper 404 Not Found HTTP response while detecting, based on the deleted URL that was requested, what type of content the user is looking for and presenting alternatives within the error page, as in "URL not found, however you may be interested in the following pages which have replaced the content you are looking for".
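
A minimal sketch of a smart 404 follows. It assumes that similarity between URL paths is a reasonable proxy for related content, which will not hold for every site; the function name `smart_404` and the `live_paths` list are hypothetical.

```python
from difflib import get_close_matches
from http import HTTPStatus
from typing import List, Tuple


def smart_404(path: str, live_paths: List[str], limit: int = 3) -> Tuple[HTTPStatus, str]:
    """Return a real 404 status plus an error page body that suggests alternatives."""
    # Pick live URLs whose paths resemble the one that was requested.
    suggestions = get_close_matches(path, live_paths, n=limit, cutoff=0.6)
    lines = ["URL not found."]
    if suggestions:
        lines.append("You may be interested in the following pages:")
        lines.extend(suggestions)
    # The status is always 404, regardless of how helpful the body is.
    return HTTPStatus.NOT_FOUND, "\n".join(lines)
```

The key design choice is that the suggestions live only in the page body. The status code stays 404 Not Found, so crawlers receive an accurate signal while visitors still get somewhere useful to go.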

Takeaway

A soft 404 describes a situation where a page is returned stating that the specified resource does not exist, but with a 200 OK HTTP status code. This can cause problems with search engines, and the response needs to be replaced with one that carries the appropriate HTTP status code.

Last updated: June 20, 2022