The Google Dance
November 18, 2003
Just how does Google update its whole index? This is a rather broad question, but we will explain each and every step that Google takes every month to ensure its database is the most relevant and of the highest quality.
Many people and companies realize that, to obtain the best Google rankings early in their search engine optimization (SEO) campaigns, it is important to plan ahead carefully and take all the necessary steps beforehand. Google is one of the very few search engines that still accept submissions free of charge. It is also one of the quickest to add a newly submitted site to its database.
As of November 18, 2003, it is widely estimated that the worldwide Google database consists of over 3.6 billion pages. And that is only a fraction of all available web pages, as some sites are not open to Google, i.e., their owners do not allow a search crawler or spider to visit them.
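As a rough illustration of how a site can close itself off from a crawler, the sketch below uses Python's standard `urllib.robotparser` module to evaluate a robots.txt file. The robots.txt content, the `is_open_to` helper and the example.com URLs are all hypothetical and chosen only for the example; this is not a description of how Googlebot itself processes the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site: Googlebot may crawl
# everything except /private/, and every other robot is shut out.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_open_to(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines; no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "http://www.example.com/index.html"),
        ("Googlebot", "http://www.example.com/private/report.html"),
        ("OtherBot", "http://www.example.com/index.html"),
    ]:
        print(agent, url, is_open_to(agent, url))
```

Pages that a robots.txt file disallows in this way are among those that never reach the crawler's index.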
As so often in real life, all of this brings risks and potential complications that website owners, webmasters and search engine optimization (SEO) professionals need to assess carefully during the initial production and pre-launch phases of a marketing program. While most experts agree that Google spiders (crawls) the web before and after certain phases of its update, they cannot say at exactly which point in the month it will do its spidering and finally update its whole database.
In this article, I will attempt to explain in detail what the "Google Dance" entails, and how to "trap" Googlebot at exactly the right time. I will also explain what all of this means for your search engine optimization campaign.
The Famous Google Dance
Each "dance" begins with Google making a major, deep crawl. Let's call it Crawl A. This crawl spiders the whole web, over 3.4 billion pages at last count. Google uses over 15,000 inexpensive PCs (actually, conventional desktop computers) located in different data centers spread all over the world. It sends Googlebot (or DeepBot) out to spider the sites currently in its database, as well as to find new websites that have recently been launched on the web. Once Google has completed Crawl A, effectively capturing all of these web pages for its next update, an update follows roughly two weeks later.
Google will then update its whole database, showing the new results on www2.google.com and www3.google.com. Throughout this update, the results are often rapidly switched between the primary database and the second and third databases. As stated earlier, since Google uses over 15,000 servers, people in different parts of the world usually see very different search results until most of the update is finally completed.
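One way to picture this is to compare the rankings that the three hostnames return for the same query. The sketch below is only an illustration: it assumes you have already collected the result URLs by hand, the function names `rank_differences` and `dance_in_progress` are hypothetical, and the sample rankings are invented.

```python
from itertools import combinations


def rank_differences(results_a, results_b, top_n=10):
    """Count positions in the top N where two ranked URL lists disagree."""
    a, b = results_a[:top_n], results_b[:top_n]
    diffs = 0
    for i in range(max(len(a), len(b))):
        url_a = a[i] if i < len(a) else None
        url_b = b[i] if i < len(b) else None
        if url_a != url_b:
            diffs += 1
    return diffs


def dance_in_progress(results_by_host, top_n=10):
    """Return True if any two hosts show different top-N rankings."""
    return any(
        rank_differences(results_by_host[h1], results_by_host[h2], top_n) > 0
        for h1, h2 in combinations(sorted(results_by_host), 2)
    )


if __name__ == "__main__":
    # Invented rankings for one query, keyed by the hostname queried.
    sample = {
        "www.google.com": ["site-a.example", "site-b.example", "site-c.example"],
        "www2.google.com": ["site-b.example", "site-a.example", "site-c.example"],
        "www3.google.com": ["site-b.example", "site-a.example", "site-c.example"],
    }
    print("Dance in progress:", dance_in_progress(sample))
```

When all three lists agree again for several queries, the update has most likely finished propagating.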
The "Google Dance" will continue for another few days, but usually no longer than a week (unless there are problems or major algorithm changes made by Google, as in the April 2003 update).
During and directly after each database update, Google starts another heavy spidering, which we will call Crawl B, of all the existing websites in its current database and of newer websites that have recently been launched and picked up by its crawlers. After this spidering by Googlebot, the cycle returns to the beginning and starts all over again for the next month.
"Trapping" Googlebot at the perfect time
If a webmaster wishes to have a new website included in the Google database, the question is: will either of these crawls ensure its inclusion? Judging from our experience over many monthly updates, this is not always the case! If a website is spidered at the beginning of the month, chances are that it will not be included in that month's update. If the website is spidered during the second crawl of the month, which directly follows the update, it is possible (but never guaranteed) that it will be revisited in the next crawl and then included in the next monthly update.
On other occasions Google will simply visit a new website and take only the homepage and the robots.txt file. Such behaviour is usually a good indication that Googlebot will come back during the next major crawl, and the website will usually be included in the update following that second spidering. Looking back, it would seem that a new site needs two complete visits from Googlebot before it is included in the Google database. In most cases this holds true, although exceptions can always happen.
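A webmaster can spot this pattern in the server's access log. The sketch below is a minimal example, assuming logs in the Apache "combined" format and assuming that a user-agent string containing "Googlebot" really comes from Google (the string can be faked). The function names, the set of "shallow" paths and the sample log lines, including the 192.0.2.x documentation addresses, are all hypothetical.

```python
import re
from collections import defaultdict

# Matches one line of the Apache "combined" log format (GET/HEAD only).
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<day>[^:\]]+):[^\]]*\] '
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)

# Assumption: a visit touching only these paths counts as a "shallow" visit.
SHALLOW_PATHS = {"/", "/index.html", "/robots.txt"}


def googlebot_visits(log_lines):
    """Map each day to the set of paths requested by a Googlebot user agent."""
    visits = defaultdict(set)
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match and "googlebot" in match.group("agent").lower():
            visits[match.group("day")].add(match.group("path"))
    return dict(visits)


def classify_visit(paths):
    """Label a day's crawl as 'shallow' (homepage/robots.txt only) or 'deep'."""
    return "shallow" if set(paths) <= SHALLOW_PATHS else "deep"


if __name__ == "__main__":
    agent = '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"'
    sample_log = [
        f'192.0.2.10 - - [18/Nov/2003:04:12:01 -0500] "GET /robots.txt HTTP/1.0" 200 68 "-" {agent}',
        f'192.0.2.10 - - [18/Nov/2003:04:12:03 -0500] "GET / HTTP/1.0" 200 5120 "-" {agent}',
        f'192.0.2.11 - - [02/Dec/2003:09:30:44 -0500] "GET /products/chairs.html HTTP/1.0" 200 7340 "-" {agent}',
    ]
    for day, paths in googlebot_visits(sample_log).items():
        print(day, classify_visit(paths), sorted(paths))
```

A "shallow" day followed some weeks later by a "deep" one would match the two-visit pattern described above.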
To ensure the most rapid inclusion possible, there are a few things an experienced webmaster can do. If the website is spidered for the very first time by Googlebot during or directly after the update, it is in a good position, as it is more than likely to be included in the next monthly Google Dance. If the website is not crawled at that point, but only during the next crawl, the webmaster or site owner will have to wait even longer for the website to be indexed in Google's database.
In light of all this, what is a typical webmaster to do to get Googlebot to crawl a website during that very specific period? One can simply hope that it happens that way, which is certainly not very scientific, or do the necessary homework and plan ahead. Webmasters who have other websites in the Google database can watch the spidering and update dates and then plan their new launches accordingly. If you don't have any websites in the Google database that you can monitor yourself, you can always watch www.google.com for the updates.
However, since there is almost no way to be 100% certain that any website will ever be crawled, either partly or completely, there are certain precautionary steps a webmaster can take to "flag" Googlebot and bring the crawler to the designated website. The first step is to obtain reciprocal links to the site from other websites with a high PageRank.
As a rule, the higher a website's PageRank, the more often that website will be crawled and refreshed by Google, which means that your link (URL) should be picked up more quickly. A word about relevancy: if a website is about furniture retail, exchange links with similar companies such as furniture manufacturers or distributors. Google will rank you higher that way than if you just link to sites that are off-topic.
Second, you can submit your website to Google through its Add URL page. While this is certainly not a guaranteed way into the Google database, it should still be done. Third, a webmaster can install the Google Toolbar and then visit his or her own website with the toolbar active. Since mid-2002, there have been countless reports of a direct correlation between a website's inclusion in the Google database and a visit made through the Google Toolbar.
At US $299 annually, a listing in the Yahoo directory is also a good start toward getting into Google's database, and Yahoo does offer rapid inclusion in its directory, usually within seven days. A DMOZ (Open Directory Project or ODP) listing can also be a good way to have your website included in the database, although this can sometimes take longer. DMOZ is not 100% dependable and has had more than its share of server problems lately.
An alternative is to use the Global Business Listing search engine and directory, one of the few Internet search engines that allow a search by industry in addition to the usual search box query. Global Business Listing's annual inclusion fee is also lower than Yahoo's.
Wrapping it all up
As Google commands a very high percentage of targeted search engine traffic referrals, having even a ballpark idea of when each of these phases is likely to start can be of immense help. If you would like to read more on the Google algorithm, you might also find this article of interest.
Additionally, for a very good interview with Chris Ridings, one of the best experts on the Google Page Rank algorithm, please click here.
Article written by Serge Thibodeau
Unless otherwise specified, all content and material on this site is copyrighted by Serge Thibodeau of rankforsales.com and may not be reproduced by any means without express written permission. Using my content without permission is a theft of my work. Please contact firstname.lastname@example.org to discuss certain reprint options that would be acceptable.