Sometime around January or February, a number of webmasters began to notice that Google had somehow "lost" huge portions of their websites.
References to their sites, generally to the index pages and a seemingly random selection of internal pages, still existed in Google listings, but pages that once drove sizable amounts of traffic appeared to vanish into the ether. As February rolled into March, more reports were posted to blogs and forums by frustrated webmasters who had started to notice that the number of pages from their sites had declined significantly in Google's index.
Many SEO firms, including StepForth, received information requests and research projects from clients who wanted to know what had happened to their sites. In all cases, we did the best we could but, given the obvious complexity of the update and the lack of fresh information from Google, recommendations given during this period have resembled shotgun-style SEO advice more than the finer laser focus most of us would normally prefer to offer our clients. As is the case with most major updates, investigation as often as not leads to more questions.
Matt Cutts, Google's Search Quality Officer and #1 communicator, answered many of those questions yesterday in an open and wide-ranging post that touches on, among other things, the supplemental index. The supplemental index is a much larger representation of documents found on the web than those included in the main Google index.
"We're able to place fewer restraints on sites that we crawl for this supplemental index than we do on sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index."
Quality On, Quality In and Quality Out
Google has gotten better at judging the quality of content found in a document and within a site. Content includes text, images, titles, tags and both inbound and outbound links. Having consistently said that well-built sites offering quality information and a positive user experience should perform well throughout its search indexes, Google provides a wealth of information via the Google Help Center and through its webmaster-focused spokespersons, Cutts and Googleguy.
As Google has gotten better at determining the origin and history of content found in its various indexes, it tries to snip away at duplicate forms of on-site content, with the goal of listing the most trustworthy sites under any given user query in the main index.
Having been inundated over the years with multiple replications of what was already considered duplicate content, Google (like other search engines) has gotten very good at knowing whether it has already indexed similar or duplicate content. Google is capable of examining text (including individual paragraphs), images and link networks (inbound and outbound links), looking for telltale signs of duplicate content.
If, for example, it perceives that a site displays product information pulled from the same product database that 25,000 other sites draw on, Google is not likely to rank that site well. Similarly, if it finds duplicate networks of reciprocal links shared among several pages in its index, it is not likely to assign a high trust value to those pages.
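Google does not publish how it detects duplicate text, so the following is only a rough sketch of one common textbook approach (word "shingles" compared by Jaccard similarity), not a description of Google's method. Webmasters might use something like it to check whether paragraphs on their own pages closely match a supplier's or affiliate partner's copy. The function names, the five-word shingle size and the 0.8 threshold are all illustrative assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def duplicated_paragraphs(page: str, reference: str,
                          threshold: float = 0.8) -> list[str]:
    """List paragraphs of page that closely match any paragraph of reference.

    Paragraphs are assumed to be separated by blank lines; the threshold
    is an arbitrary illustrative value, not a known search-engine setting.
    """
    ref_paras = [p for p in reference.split("\n\n") if p.strip()]
    return [
        p for p in page.split("\n\n")
        if p.strip() and any(jaccard(p, r) >= threshold for r in ref_paras)
    ]
```

Running `duplicated_paragraphs` on a page's text against the text of the site its product descriptions came from gives a quick list of paragraphs that add nothing original, which is the kind of content the article suggests is unlikely to rank well.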
Reciprocal linking strategies
"As these indexing changes have rolled out, we've [been] improving how we handle reciprocal link exchanges and link buying/selling."
Though Cutts points at reciprocal linking as an indicator to Google that there might be issues with a website's credibility, that doesn't automatically mean that all reciprocal links are going to cause problems for webmasters. Common sense and the value of delivering a quality user experience should dictate decisions around link strategies.
For example, if a professional landscaper provided links to plant nurseries in his or her region, and those nurseries in turn provided links to that landscaper, Google would likely consider those to be quality links. There is a direct relevance between the two sources of information. A network of links between local landscaping businesses, nurseries, horticultural institutes,
Affiliate Text and Content
Cutts devoted a long paragraph to affiliate text, mentioning a T-shirt site that once had about 100 pages indexed, a number recently reduced to only 5.
"The person said that every page has original content, but every link that I clicked was an affiliate link that went to the site that actually sold the T-shirts. And the snippet of text that I happened to grab was also taken from the site that actually sold the T-shirts. The site has a blog, which I'd normally recommend as a good way to get links, but every link on the blog is just an affiliate link. The first several posts didn't even have any text, and when I found an entry that did, it was copied from somewhere else. So I don't think that the drop in indexed pages for this domain necessarily points to an issue on Google's side. The question I'd be asking is why anyone would choose your 'favourites' site instead of going directly to the site that sells T-shirts?"
The Ghosts of minutes past
We live in the present. Our websites live in the past as well as the present. Google keeps tabs on all documents in its index and even if it has, "... spidered content that was posted only moments before," it has an elephant's memory for previous details and a computer's ability to pull lots of information together to get a bigger picture of how all those details fit together.
Google works by following links. Google ranks by examining the quality of content found on a site and also on the sites that link into, or are linked to from, sites in its indexes. If you have seen a great deal of page content fall away from Google's index, or if you are just generally interested in how Google is working, read Cutts' Bigdaddy
Jim Hedger is the SEO Manager of
Bigdaddy Timeline, Courtesy of Matt Cutts