The listing delays that have come to be called the Google Sandbox effect appear, in one form or another, at each of the four top-tier search engines.
MSN, it seems, has the shortest indexing delay at 30 days. This article is the second in a series following the spiders through a brand-new web site, which went live on May 11, 2005 under a newly purchased domain name. Lessons learned in the first 60 days on the new site follow:
1) Google crawls 250 pages on first discovery of links to the site.
Then they don't return until they find more links, and they crawl
slowly. Google has failed to index the new domain for 60 days.
2) Yahoo looks for error pages, and once they find bad links they
will crawl them ceaselessly until you tell them to stop.
Then they won't crawl at all for weeks, until crawling heavily
one day and lightly the next in random fashion.
3) MSNbot requires robots.txt files and once they decide they
like your site, may crawl too fast, requiring "crawl-delay"
instructions in that robots.txt file. Implement immediately.
4) Bad bots can strain resources and hit too many pages too
quickly until you tell them to stay out. We banned 3 bots
outright after they slammed our servers for a day or two.
Noted "aipbot" crawled first then "BecomeBot" came along
and then "Pbot" from Picsearch.com crawled heavily looking
for image files we don't have. Bad bots, stay out. Best to
implement robots.txt exclusions for all but top engines if
their crawlers strain your server resources. We considered
excluding the Chinese search engine named Baidu.com when
they began crawling heavily early on. We don't expect much
traffic from China, but why exclude one billion people?
Especially since Google is rumored to be considering a
possible purchase of Baidu.com as an entry to the Chinese market.
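
The robots.txt advice in items 3 and 4 can be sketched roughly as follows. This is only an illustration, not the file we actually deployed: the 10-second crawl delay, the exact user-agent tokens, and the helper name build_robots_txt are assumptions made for the example. The script builds the rules as text and then checks them with Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Bots the article reports banning outright (token spellings assumed).
BANNED_BOTS = ["aipbot", "BecomeBot", "Pbot"]

# Bots allowed in, but asked to slow down (token spelling assumed).
THROTTLED_BOTS = ["msnbot"]


def build_robots_txt(banned, throttled, delay_seconds):
    """Return robots.txt text: ban some bots, throttle others, allow the rest.

    build_robots_txt is a hypothetical helper written for this sketch.
    """
    lines = []
    for bot in banned:
        # "Disallow: /" keeps the named bot out of the whole site.
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    for bot in throttled:
        # Crawl-delay is a non-standard directive; an empty Disallow allows everything.
        lines += [f"User-agent: {bot}", f"Crawl-delay: {delay_seconds}", "Disallow:", ""]
    # Every other crawler may fetch everything.
    lines += ["User-agent: *", "Disallow:"]
    return "\n".join(lines) + "\n"


# The 10-second delay is an assumed value, not one taken from the article.
ROBOTS_TXT = build_robots_txt(BANNED_BOTS, THROTTLED_BOTS, 10)

# Parse the generated text the way a well-behaved crawler would.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

if __name__ == "__main__":
    print(ROBOTS_TXT)
    for agent in ["aipbot", "msnbot", "Googlebot"]:
        print(agent, parser.can_fetch(agent, "/index.html"), parser.crawl_delay(agent))
```

Crawlers are free to ignore robots.txt, so a file like this only helps with bots that choose to honor it.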
The bottom line is that all engines seem to delay indexing of new domain names for at least thirty days. Google so far has delayed indexing this new domain for 60 days since first crawling it. AskJeeves has crawled thousands of pages while indexing none of them. MSN indexes faster than all other engines but requires a robots.txt file. Yahoo's Slurp has crawled on again, off again for 60 days, but has indexed only six of the 15,000 or more pages crawled to date.
We seem to have settled that there is a clear indexing delay, but whether this site is officially "Sandboxed" and whether that delay is universal is less clear. Many webmasters claim that they have been indexed fully within 30 days of first posting a new domain. We'd love to see others track spiders through new sites following launch to document their results publicly so that indexing and crawling behavior are proven.
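
For anyone who wants to track spiders this way, one possible starting point is to count crawler hits per day from the server access log. The sketch below assumes logs in the common "combined" format and matches crawlers by substrings of the User-Agent field; the substring list and the function name count_crawler_hits are assumptions for illustration, not the method used for this series.

```python
import re
from collections import Counter

# Substrings to look for in the User-Agent field (assumed identifiers).
CRAWLERS = ["Googlebot", "Slurp", "msnbot", "Teoma", "Baiduspider"]

# Combined log format, e.g.:
# 1.2.3.4 - - [11/May/2005:10:00:00 -0700] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_crawler_hits(lines):
    """Count requests per (day, crawler) pair; lines that don't parse are skipped."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        agent = match.group("agent").lower()
        for name in CRAWLERS:
            if name.lower() in agent:
                hits[(match.group("day"), name)] += 1
                break  # count each request once
    return hits


if __name__ == "__main__":
    # "access.log" is a placeholder path for your own server log.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for (day, name), count in sorted(count_crawler_hits(log).items()):
            print(day, name, count)
```

Note that the log only shows crawling. Whether pages are indexed still has to be checked at each engine separately.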
Mike Banks Valentine operates SEOptimism, offering SEO training of
in-house content managers at http://WebSite101.com, and blogs about SEO at
Big Sandbox for Google, AskJeeves & Yahoo. MSN Indexes Quickest