What if Googlebot Could See How Your Site Really Looked?

Unconfirmed reports are surfacing of a bot that comes from a known Google IP address, requests CSS and JavaScript files, and identifies itself with a User-Agent of ‘Mozilla, Googlebot’. We evaluate two possible explanations and comment on the potential impact of a shift from purely text-based search engine bots to spatially aware bots.

An unconfirmed report over at AdSenseBits suggests that a new Googlebot could be in town: one that carries a User-Agent of ‘Mozilla, Googlebot’ and fetches not just HTML files, but also CSS and JavaScript. Apparently, this mysterious visitor shows up in logs as coming from a known Google IP address.

Until the report is confirmed, we wouldn’t want to waste too many brain cycles pondering the implications, but nonetheless two immediate possibilities come to mind, both of which are pretty interesting:

Google is sniffing out cloaking
In this case, the bot is reporting a different User-Agent than usual in order to detect whether a site is employing User-Agent-based cloaking: the practice of serving one set of content to real human visitors with real web browsers, and another set of content to search engine bots. If Google can manage to sniff out this kind of black hat search engine manipulation, then more power to them! And look out, black hatters… Anything that pushes manipulators out of the way leaves more space for the rest of us!
Googlebot is becoming spatially aware
In this case, the bot really is an attempt to create a spatially informed model of the web page, and to index it accordingly. Unlike current text-based crawling and indexing, spatially aware crawling and indexing would take account of where material sits on a page, and it would be aware of new content that appears in response to JavaScript events. Not only would this deal a blow to other black hatters who use CSS to hide spam content from human viewers, but it could also open up previously invisible links to other content, such as pages that are accessible only via JavaScript-driven menu systems. This kind of bot change also represents a potentially massive increase in the computational power that must be devoted to the processing of a single page.
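
To make the first possibility concrete, here is a rough sketch of how a User-Agent cloaking check could work in principle: request the same URL twice, once as a browser and once as a bot, and compare what comes back. This is only an illustration under stated assumptions, not a description of anything Google is known to do. The two User-Agent strings and all function names are made up for the example.

```python
import hashlib
import urllib.request

# Illustrative User-Agent strings only; neither is taken from the report.
BROWSER_UA = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/1.5"
BOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"


def fetch(url, user_agent, timeout=10):
    """Return the response body for url, requested with the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def normalize(body):
    """Collapse runs of whitespace so trivial formatting differences are ignored."""
    return b" ".join(body.split())


def looks_cloaked(browser_body, bot_body):
    """Return True when the two bodies differ after whitespace normalization."""
    browser_digest = hashlib.sha256(normalize(browser_body)).hexdigest()
    bot_digest = hashlib.sha256(normalize(bot_body)).hexdigest()
    return browser_digest != bot_digest


def check_url(url):
    """Fetch url under both User-Agents and compare the two responses."""
    return looks_cloaked(fetch(url, BROWSER_UA), fetch(url, BOT_UA))
```

A difference is a hint rather than proof: pages with rotating ads, timestamps or session tokens will differ between any two requests, so a real check would need a fuzzier comparison than an exact hash.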

Either way, the impact of such a change could be pretty significant!

Is it really happening? Is anyone else seeing this kind of new bot activity in their server logs?
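
If you want to check your own logs, something like the following sketch could help. It assumes the Apache "combined" log format and simply looks for requests for `.css` or `.js` files whose User-Agent mentions Googlebot. The function name and the regular expression are our own for this example, and other log formats will need a different pattern.

```python
import re

# Assumes the Apache/NCSA "combined" log format; adjust for other formats.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

ASSET_SUFFIXES = (".css", ".js")


def bot_asset_requests(lines):
    """Yield (ip, path, agent) for Googlebot-labelled CSS/JS requests."""
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        # Ignore any query string when checking the file extension.
        path = match.group("path").split("?", 1)[0].lower()
        agent = match.group("agent")
        if "googlebot" in agent.lower() and path.endswith(ASSET_SUFFIXES):
            yield match.group("ip"), match.group("path"), agent
```

Passing an open log file to `bot_asset_requests` yields one tuple per matching request. Remember that a User-Agent string is trivial to fake, so any hit is only interesting if the IP address really does belong to Google.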

This article was last updated on Friday, 24th February 2006 at 5:38 pm and is filed in the Google, Search Engine Marketing section.
