Over the last few years, a lot of effort has been coming out of Mountain View* to reach out to webmasters and help them manage their rankings and the way their web properties are perceived by Google's spiders. The toolbar PR, the ever-expanding Webmaster Tools, Matt Cutts blogging about what to do and what not to do when optimizing your site (do I even need to link to his blog?), regular updates to the Webmaster Guidelines section, participation in various conferences: all of these demonstrate the many ways Google helps people maximize the utility they derive from their sites and assures them a continuous flow of targeted traffic from The Search Engine (brown-nosing, I know).
Being a webmaster and an SEO, I thought: what the heck, it's my turn to ask. Let's see if this idea finds an attentive ear among the people at Google. In the best case, it sparks a discussion among them, they decide this is a great idea, offer me a job (which I will promptly decline), and improve the lives of webmasters around the globe. In the worst case, my idea may not penetrate Google's crap filter, but it might at least raise some dust around the blogosphere and maybe stimulate other, more interesting questions in the same vein.
So what is my request?
Allow spiders to pass referral information.
For those of you to whom the above sentence did not make sense, here is a little explanation: every visit to your site elicits a response from the server. Every time someone requests a page, an image, or a Flash file (or whatever) from your website, the server logs that request in a file called, surprisingly enough, a "log file". Several pieces of information are written in every line of the log file. The more interesting ones are the server's response code, the time and date of the visit, the kind of request that was made, the user agent (which usually reveals the OS), the IP of the initiator of the request, and in some cases the referring URL from which the visitor came to your site. The referrer is recorded only if the client passes it along with the request. The browsers of the majority of human visitors pass this information to the server by default. However, in some cases, such as search engine spiders, no referrer is sent, so none is recorded in the log files.
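To make this concrete, here is a minimal sketch of pulling those fields out of a log line in the common Apache "combined" format. The field layout is standard; the sample line itself is made up, but it shows the tell-tale `"-"` a spider leaves where a browser would send a referrer:

```python
import re

# Apache "combined" log format: IP, identity, user, timestamp,
# request line, status code, response size, referrer, user agent.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# A made-up sample line for illustration:
line = ('66.249.66.1 - - [10/Oct/2007:13:55:36 -0700] '
        '"GET /blue-widget.html HTTP/1.1" 200 2326 '
        '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"')

match = LOG_PATTERN.match(line)
if match:
    # The spider sends no referrer, so the log shows "-":
    print(match.group('referrer'))
    print(match.group('agent'))
```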
I don’t know if my request is even technically possible, although I don’t see why it wouldn’t be: they do follow links, they do read the content of the site (or most of it), and they do send all the other information (time of visit, requested page, user agent). Even if they don’t visit every existing link every time (as suggested by jdMorgan on WMW), over a long period of time Googlebot will visit most of my links. Combining this with the information from all the other SE bots would compose a comprehensive picture of my links.
What are the benefits of this?
More information – more power to the webmasters:
1. By analyzing my log files and filtering them for spider(s) visits, I can get the full, comprehensive, not-to-be-found-anywhere-else information about the details of all incoming links to my site. If you are a webmaster/SEO, there is no need to explain why this is useful. I don’t think there should be any privacy problems with this – after all, those are my log files and by being able to analyze them I am in a sense confirming my ownership of the site.
2. With this information, I would be better equipped to control my incoming links. I could see who is linking to me and how, and try to change that by contacting the webmasters of the sites in question. I could quickly discover potential linking sabotage attempts by competitors (linking from bad neighborhoods, using deceptive and derogatory anchor text when linking to my site, etc.).
3. I would also be able to control the flow of link juice on my site. For example: I have a page about blue widgets on my site (damn, I promised myself not to use the blue widgets example). Some other blue widgets site links to my homepage, not because they are mean or evil, just because that’s what a majority of people prefer to do – link to a homepage rather than to an inner page. I would prefer the link to contribute to my blue-widget.html page and not to the homepage. If I had this information about the existence of the link, I could contact the webmaster of the other Blue Widget site and ask him to link to the page of my preference.
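To show what points 1–3 would look like in practice, here is a hypothetical sketch: assuming spiders did pass referrers, a few lines of log analysis would map each of my pages to the external pages linking to it. All the records below are invented:

```python
from collections import defaultdict

# Hypothetical (user_agent, referrer, requested_page) tuples pulled
# from a log file, assuming spiders DID send a referrer. Invented data.
records = [
    ('Googlebot/2.1', 'http://other-widgets.com/resources.html', '/index.html'),
    ('Googlebot/2.1', 'http://widget-fans.org/links.html', '/index.html'),
    ('Googlebot/2.1', 'http://widget-fans.org/links.html', '/blue-widget.html'),
    # An ordinary human visit from a search results page -- filtered out:
    ('Mozilla/5.0', 'http://www.google.com/search?q=widgets', '/index.html'),
]

# Keep only spider visits that carry a referrer, then group the
# referring pages by the page they link to:
inbound = defaultdict(set)
for agent, referrer, page in records:
    if 'Googlebot' in agent and referrer != '-':
        inbound[page].add(referrer)

for page, referrers in sorted(inbound.items()):
    print(page, '<-', sorted(referrers))
```

With this map in hand, spotting the blue-widget site that links to the homepage instead of /blue-widget.html (point 3) is a simple lookup.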
What are the drawbacks?
As with any useful tool or method, this could be abused by spammers/cloakers to redirect spider visits coming from certain links to certain websites, but I think that is something they can already do. Maybe not at the resolution this would allow (links coming from certain websites redirected to site X, and others redirected to site Y), but it would not add significantly to their ability to spam compared to what they already have.
I think that this would be a very positive step in the direction of putting more power in the hands of the webmasters and would, in the long run, contribute towards better indexing and classification on the web.
What do you think?
*I wonder whether the citizens of Mountain View (all 70,708 of them) mind the fact that there are thousands of people (if not more) around the world who have completely erased their identity as an actual town in California and have equated their existence with the location of a certain search engine HQ?