My Thoughts on Google’s Webmaster Tools 404 Report

October 15, 2008 – 9:31 am

Right. It has been a busy few months and some blogging is overdue. Since the results of the latest batch of experiments I have been busy with are inconclusive, I thought I would do a non-experimental post in the meantime, just to keep my ~500 RSS subscribers (whoohooo) happy. (Whoa, the count just went down to 380 overnight. Was it something I said? The numbers went back up; it was some kind of glitch.)

So, as many of you have probably heard, Google Webmaster Tools has started reporting the referrals of the 404 errors it encounters while crawling a site. Matt was very surprised that no one from the SEOsphere was writing about it (although there have since been a number of posts about this issue), and rightly so, since this report can be very useful for several purposes:

  1. As a webmaster, you want to know which pages on your site are being requested and from where, and to act upon that information. One way of acting is to fix the URL mistakes you may have on your own site. Another is to create a customized 404 page that channels your incoming traffic to more useful places on your site. Just imagine that you are getting tons of requests through a bad link from a forum on a certain topic, and that you can offer those visitors a customized 404 page which casually links to the same-topic page on your site. Or you can 301 the visitors to a landing page offering related products. The possibilities are numerous.
  2. As an SEO, you do not want those links to go to waste by sending their link juice to your default/customized 404 page. You can redirect those links through .htaccess (or any other form of redirection) to your landing page.
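
For the .htaccess route, one way to keep the redirects manageable is to generate them from a list of broken paths. The following is only a sketch, assuming an Apache server with mod_alias enabled; the function name, the paths and the example.com URLs are invented for illustration. Keep in mind that `Redirect` matches by path prefix, so a rule for `/old` will also catch `/old/anything`.

```python
def build_redirect_rules(redirects):
    """Return mod_alias 'Redirect 301' lines for a {broken_path: target_url} mapping."""
    lines = []
    for broken_path, target_url in sorted(redirects.items()):
        # Redirect expects a URL-path that starts with a slash.
        if not broken_path.startswith("/"):
            raise ValueError(f"path must start with '/': {broken_path!r}")
        # Unquoted whitespace would break the directive's argument parsing.
        if any(ch.isspace() for ch in broken_path + target_url):
            raise ValueError(f"whitespace is not supported: {broken_path!r}")
        lines.append(f"Redirect 301 {broken_path} {target_url}")
    return "\n".join(lines)


# Hypothetical broken paths and landing pages.
rules = build_redirect_rules({
    "/old-post.htm": "http://www.example.com/new-post/",
    "/prodcts/widgets": "http://www.example.com/products/widgets/",
})
print(rules)  # paste the resulting lines into .htaccess after reviewing them
```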

Sounds great. The question that started bugging me when I read about this service was whether the information I am getting from my Webmaster Tools (WMT) is correct, up to date and comprehensive. So I dug into my log files covering the same time period as the WMT report for this blog and, lo and behold, Google was showing me only a fraction of the referrals to the 404 page.

[Image: WMT snapshot]

In Google WMT I could see only two bad referrals, while digging through the log files uncovered 22 different ones. That is about 9%, which is pretty bad.

To be fair, not all of the referrals found in the WMT report showed up in the log files, for the simple reason that a referral appears in your log files only when someone actually clicks the link. Google’s spider, however, does not pass a referrer, nor does it click on a link immediately upon discovering it, so the WMT report will contain some links that cannot be found in the log files.
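
Comparing the two sources boils down to simple set arithmetic. Here is a minimal sketch, assuming you have exported the referring URLs from WMT and from your log analysis into two lists; the function name and the example.com URLs are made up for illustration and are not the actual referrers from this blog.

```python
def compare_referrers(wmt_referrers, log_referrers):
    """Split bad referrers into those seen by both sources or by only one."""
    wmt = set(wmt_referrers)
    logs = set(log_referrers)
    return {
        "in_both": sorted(wmt & logs),
        "only_in_wmt": sorted(wmt - logs),    # e.g. found by the spider, never clicked
        "only_in_logs": sorted(logs - wmt),   # e.g. clicked by visitors, not reported by WMT
    }


# Hypothetical data for illustration only.
wmt_report = ["http://forum.example.com/thread-1", "http://blog.example.com/post"]
log_files = ["http://forum.example.com/thread-1", "http://news.example.com/story"]

for bucket, urls in compare_referrers(wmt_report, log_files).items():
    print(bucket, urls)
```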

There are a number of reasons for the discrepancy between WMT and the log files and not all of them require putting a tinfoil hat on:

  1. Google is showing only those bad referrals that its spider found. It is possible that the page with the bad link has been removed; as far as Google is concerned, the linking page no longer exists, so there is nothing to report.
  2. The page that contains the bad link has not been crawled yet.
  3. The page that contains the bad link is not being crawled because of a password requirement, robots.txt or meta tag blocking (although we all know how well that works), duplicate content issues found on the linking page, or any other possible issue.
  4. As with the incoming link data, Google is withholding some of the information from webmasters (yes, this is the tinfoil one, which doesn’t make it less of a possibility, although it sounds farfetched to me).

In any case, for all the reasons I mentioned earlier, you want to know about all the links causing a 404 on your site, even if Google does not count them at present. One possible scenario that comes to mind is a page that is not being crawled by Google for whatever reason, so a bad link from that page is not reported in WMT. If that page gets scraped and the scraped copy is counted by Google, you have a 404 problem that you could use to your benefit.

Now, I know that the number of bad referrals I am showing here is pretty low. However, keep in mind that the audience that reads this blog and links to it is very web savvy and does not make many mistakes (even though Search Engine Roundtable is one of the bad referrals in both the log files and the WMT report <looks in Barry’s direction>). For websites in other niches, I would assume that the share of unreported bad referrals could add up to significant numbers, so due diligence should be applied by exploring your log files.

To finish, here are several tips that can assist your digging through the log files:

  1. First and foremost, use a good log file analysis tool. I prefer Nihuo, which gives great visualization of your log file data, while giving you a great deal of flexibility in defining the analysis parameters, such as tracking single files, tracking advertising campaigns, setting up filters for a plethora of parameters, etc. Of course, a custom-made log analysis tool is better, if you have the skills/resources to get one.
  2. Filter out No Referral 404s. They are good for fixing missing pages on your site, but not so useful for redirecting your link juice.
  3. The majority of requests resulting in a 404 on my site were requests for favicon.ico from the time I did not have a favicon. Another very popular file whose requests result in a 404 is robots.txt. Filter those out, since they are of no interest to you for this purpose.
  4. Filter out all of the referrals coming from your own URL.

This should leave you predominantly with the 404 referrals coming from broken links outside of your site.
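
If you do go the custom-made route, tips 2 to 4 can be sketched as a small filter. This is only an illustration, assuming your server writes the Apache "combined" log format; the regular expression, the function names and the sample hosts are my own assumptions, and a real log will need more robust parsing.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumes Apache "combined" format:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Tip 3: requests that 404 often but say nothing about broken inbound links.
IGNORED_PATHS = {"/favicon.ico", "/robots.txt"}


def bare_host(host):
    """Lower-case a host name and drop a leading 'www.'."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def external_404_referrers(lines, own_host):
    """Count (referrer, requested path) pairs for 404s caused by external links."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None or match.group("status") != "404":
            continue
        referrer = match.group("referrer")
        if referrer in ("", "-"):  # tip 2: no referrer recorded
            continue
        path = urlsplit(match.group("path")).path
        if path in IGNORED_PATHS:  # tip 3
            continue
        ref_host = urlsplit(referrer).hostname or ""
        if bare_host(ref_host) == bare_host(own_host):  # tip 4: internal link
            continue
        counts[(referrer, path)] += 1
    return counts


# Hypothetical log line; in practice, pass an open log file instead of a list.
sample = [
    '1.2.3.4 - - [15/Oct/2008:09:31:00 +0000] "GET /missing-page HTTP/1.1" '
    '404 512 "http://forum.example.org/thread/1" "Mozilla/5.0"',
]
for (referrer, path), hits in external_404_referrers(sample, "myblog.example").most_common():
    print(hits, referrer, "->", path)
```

Sorting by hit count puts the bad links that send the most visitors at the top, which are the ones worth redirecting first.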

As I said earlier, the WMT 404 report is a step in the right direction. However, do not rely on it completely; complement your 404 research with your log file info.


  1. 18 Responses to “My Thoughts on Google’s Webmaster Tools 404 Report”

  2. I also regularly run Xenu LinkSleuth over *my* sites to make sure that I am not contributing to the “bad link” problem by having errors in my internal navigation and/or in any of my outgoing links. I look for URLs that return 301, 302 and 404, and then delete or update them. The link report shows you when external sites have redirected, and where the new URL is. The new WMT 404 data is a very useful addition to the available information.

    By g1smd on Oct 15, 2008

  3. yeah, Xenu is a great tool too, although I use it more for mapping all the outgoing links from competitor’s sites.

    By Neyne on Oct 15, 2008

  4. My 404’s are mostly script-kiddie attacks now.

    The Google WMT 404 report has been very helpful in helping me weed out my own typing mistakes and in creating redirects or links for the mistakes of others.

    I’m very glad they added the source – that makes it much easier.

    By Tony Lawrence on Nov 21, 2008

  5. This is one of the more detailed posts about the 404 report. Everyone needs this information. Thanks for sharing.

    By Kamal on Dec 8, 2008

  6. I really like this blog but you really need to post more often

    By kevin on Dec 18, 2008

  7. Excellent content here and a nice writing style too – keep up the great work!

    By Find Niches Online on Jan 10, 2009

  8. Very nice article… I manage a huge site and have very easy access to this info from our internal software. I will give you some feedback once I see what the difference is between Google Webmaster Tools and the log files. Thanks!

    By Barry on Jan 15, 2009

  9. excellent!

    By transportadora on Feb 12, 2009

  10. Great info, thanks

    By Day Two Webdesign on Feb 14, 2009

  11. It’s a bit like analytics too. If you get stumbled, you can’t find out much info from Google Analytics. Plough through your log files and you can find out the referring SU page for real.

    By malcolm on Mar 2, 2009

  12. With a combination of log-file analysis (awstats is fine), Xenu, Google’s WMT and .htaccess I have been able to put many 404 dead links to good SEO use – all adding to the number of in-links for my sites. I estimate about 10% extra links :)

    BTW, 50% of bad links are not due to errors on my sites, but due to errors in the linking sites. Either Webmaster errors or link parsing/creation errors…

    .S.

    By Skyper on Mar 9, 2009

  13. Useful info, have sent the link to this blog to my friends

    By Einar Lang on Mar 15, 2009

  14. Good information, Yes you are right Google is not showing all error for this you need to dig the log mines – Hemang Doshi

    By SEO Firm on Mar 29, 2009

  15. Google Webmaster Tools is a great tool to help us with 404 reports. In some cases it’s impossible to check all pages, and with this tool we can correct the broken links and set the redirects correctly.

    Great post!

    By mudanças transportadora on Apr 28, 2009

  16. Really great article for seo. I would definitely go for the customize 404 page.

    By website design on May 2, 2009

  17. Useful info, have sent the link to this blog to my friends

    By ShowfreeLinks on May 10, 2009

  1. 2 Trackback(s)

  2. Oct 16, 2008: Free links to your site, courtesy of Google. Google Webmaster Tools. | גוגל-ספרה
  3. Jan 5, 2009: Link building: Anchors. The donor and the donor's topic. - a blog about website promotion and internet advertising - SeoImho.com
