With over 1.3 billion documents currently indexed and roughly 29 million daily searches, Google has emerged as a search powerhouse for the 21st century. Partnered with the industry powerhouse, Yahoo!, Google has established itself as a market leader. Much of Google’s strength can be credited to their uncompromising attitude towards search. Google strives to make their search experience “an easy, honest and objective way to find high-quality websites with information relevant to your search” (www.google.com).
Google’s lofty aspirations and revolutionary Web page analysis tools continue to drive more and more users to their search engine. I had a chance to talk with Craig Silverstein of Google for an exclusive interview. We covered a number of topics including Google’s multifaceted ranking algorithm, shedding light on some of its various components. Most importantly for readers of MarketPosition, Silverstein divulged how their ranking technique differs from the other major search engines and what factors are likely to give your Web site a higher ranking. What was most apparent after our interview with Google was this: Search engine positioning is a process, not a project. You can’t simply press one button and find your site launched to the top of the search results.
Search engine visibility is gained over time. That’s why tools such as WebPosition Gold are so important for companies seeking to grow their search engine visibility. It’s an ongoing process that must address an ever-evolving search engine landscape and how your Web site will interface with those many search engines. Remember, you heard it from iProspect.com first: search engine positioning is a process, not a project. iProspect.com’s interview with Google revealed how intricate Google’s page scoring algorithm has become.
PageRank is one of the fundamental aspects of Google’s page-scoring algorithm. Google describes PageRank as the following:
“PageRank relies on the uniquely democratic nature of the Web by using its vast link structure as an indicator of an individual page’s value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves ‘important’ weigh more heavily and help to make other pages ‘important’” (www.google.com).
Keep in mind that PageRank does not consider outbound links. Therefore, the links on your Web page to other sites across the Web have no impact on that page’s own PageRank score. However, outbound links are important for establishing your page’s reputation as a source, an “authority” on a topic. Furthermore, remember that PageRank is calculated on a page-by-page basis, so different pages within one domain are likely to have different PageRanks. It should be pointed out that PageRank does consider links within the same domain. Hence, pages within a domain that link to another page within that same domain affect its PageRank: if there is a page in your Web site that all the other pages of your Web site link to, it will enjoy a higher PageRank score and may rank better than other pages in your Web site. When you consider that most Web pages have a link to “home,” it’s no wonder that a site’s home page can enjoy a higher ranking than internal pages.
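To make the voting idea concrete, here is a minimal sketch of a PageRank-style calculation on a tiny, hypothetical four-page site. It is not Google's implementation: the damping factor of 0.85, the fixed number of iterations, and the page names are all assumptions chosen for illustration. It simply shows how a page that every other page links to (the home page) ends up with the highest score.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively score pages; `links` maps each page to the pages it links to.

    The damping value and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page's vote is split evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links spreads its vote over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical site: every internal page links back to "home".
site = {
    "home": ["products", "about"],
    "products": ["home"],
    "about": ["home"],
    "contact": ["home"],
}
ranks = pagerank(site)
top_page = max(ranks, key=ranks.get)
print(top_page)
```

Note that the links a page grants never raise or lower its own score in this sketch; they only decide where its vote goes.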
Craig pointed out something very interesting during our talk: External links that you grant from a particular page on your Web site can become diluted. In other words, if you place 10,000 links to other Web pages on a particular page of your Web site, each link is less powerful than it would be if that page linked to only five other Web pages. Put another way, the more links a page grants, the less each individual link contributes to the page it points to.
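The arithmetic behind this dilution can be sketched in a few lines. The even split used here is an assumption for illustration (the interview does not say exactly how Google divides a page's vote), and the function name and the score of 1.0 are hypothetical.

```python
def vote_per_link(page_score, outbound_link_count):
    """Value passed along each link if a page's vote is split evenly.

    The even split is an illustrative assumption, not a disclosed formula.
    """
    if outbound_link_count <= 0:
        raise ValueError("a page needs at least one link to pass on a vote")
    return page_score / outbound_link_count


few_links = vote_per_link(1.0, 5)        # page that links to five others
many_links = vote_per_link(1.0, 10_000)  # page that links to 10,000 others
print(few_links, many_links)
```

Under this assumption, each link from the five-link page is worth 2,000 times as much as each link from the 10,000-link page.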
PageRank and Search
“While PageRank helps clarify the quality or importance of a Web page, it provides no insight into how well that Web page matches your particular information need. Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don’t match your query contextually. Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page’s content (as well as the content of the pages linking to it) to determine if it’s a good match for your query” (www.google.com). So what other factors have weight in Google’s ranking algorithm?
The Term Vector Theory
Google’s algorithm incorporates the ideas and understanding behind the term vector theory. While the elements of the term vector theory can be quite complex, Craig offered a rather basic definition of how the theory originated. A premise of the term vector theory “says the documents are good if they contain the words in your query and they contain them a lot,” explained Silverstein. As search has matured and grown more complex, Google has adapted their algorithm to complement these changes and to account for those who try to cheat and trick the search engines. While the algorithm has adjusted with the times, in essence it still embraces the beliefs behind the term vector theory.
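A rough way to picture the basic premise is to turn the query and each document into a vector of word counts and measure how closely they line up. The sketch below uses cosine similarity, a common textbook choice for comparing term vectors; it is an assumption for illustration, not a description of what Google actually runs. The function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(vec_a, vec_b):
    """Return a value between 0 and 1; higher means the word counts line up better."""
    dot = sum(count * vec_b[term] for term, count in vec_a.items() if term in vec_b)
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = term_vector("thai restaurants")
on_topic = term_vector("Thai restaurants and more Thai restaurants in town")
off_topic = term_vector("Used cars and trucks for sale")

on_topic_score = cosine_similarity(query, on_topic)
off_topic_score = cosine_similarity(query, off_topic)
print(on_topic_score > off_topic_score)
```

A scheme this simple is easy to game by repeating words, which is one reason the adjustments described above became necessary.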
Scoring = PageRank + Term Vector
The term vector factors of the Google ranking algorithm, which will be covered below, concentrate on how relevant a page is to a user’s search. This score, combined with the PageRank score that measures the popularity of the page, is how Google derives an overall score or ranking of a Web page. Thus, the Web pages that receive high scores are, in Google’s opinion, the Web pages that best meet the user’s individual needs.
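As a sketch of the idea, the snippet below takes the heading literally and adds a popularity score to a relevance score for each page, then sorts the pages by the total. How Google actually weights and combines the two was not disclosed in the interview, so the plain sum, the function name, and all of the numbers here are hypothetical.

```python
def overall_score(pagerank_score, term_vector_score):
    """Combine popularity and relevance; a plain sum is an illustrative assumption."""
    return pagerank_score + term_vector_score


# Hypothetical pages: (PageRank-style score, term-vector relevance score).
candidates = {
    "popular-but-off-topic": (0.6, 0.1),
    "relevant-and-linked": (0.3, 0.7),
    "obscure-and-off-topic": (0.1, 0.0),
}

ranked = sorted(
    candidates,
    key=lambda page: overall_score(*candidates[page]),
    reverse=True,
)
print(ranked)
```

In this made-up example the most heavily linked page does not come first, because its relevance to the query is low.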
The following are some of the on-the-page considerations that Craig revealed that apply to Google’s term vector portion of their ranking algorithm. These factors may be the difference between getting listed in one of today’s most influential search engines and being left out in the dark.
As discussed in the term vector theory, the presence of the query words in the document and the number of times they appear are both significant. However, Google also takes word or phrase proximity into account. For instance, if a search is conducted using the expression “Thai restaurants in Cleveland,” how closely these words appear to each other within the document matters. If all the words appear within the document multiple times, but the word “Thai” is nowhere near “restaurants” and the word “Cleveland” is also isolated from them, the page’s ranking is likely to be diminished.
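One simple way to measure proximity is to find the shortest stretch of text that contains every query word. The sketch below does this with a sliding window; it is only an illustration of the concept, not Google's method, and the function names and sample sentences are hypothetical. A smaller span means the words sit closer together.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def smallest_span(tokens, query_terms):
    """Length in words of the shortest window holding every query term, or None."""
    needed = set(query_terms)
    # Keep only the positions where a query term occurs.
    hits = [(pos, tok) for pos, tok in enumerate(tokens) if tok in needed]
    best = None
    counts = {}
    left = 0
    for pos_right, term_right in hits:
        counts[term_right] = counts.get(term_right, 0) + 1
        # Shrink the window from the left while it still covers all terms.
        while len(counts) == len(needed):
            pos_left, term_left = hits[left]
            span = pos_right - pos_left + 1
            if best is None or span < best:
                best = span
            counts[term_left] -= 1
            if counts[term_left] == 0:
                del counts[term_left]
            left += 1
    return best


terms = ["thai", "restaurants", "cleveland"]
close_page = tokenize("The best Thai restaurants in Cleveland are downtown")
scattered_page = tokenize(
    "Thai silk is sold here. Our Cleveland office lists many hotels, shops and restaurants"
)
close_span = smallest_span(close_page, terms)
scattered_span = smallest_span(scattered_page, terms)
print(close_span, scattered_span)
```

Both sample pages contain all three words, but the first keeps them within a few words of each other while the second spreads them across the whole text.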
The above article, or portions of it, has been reprinted with permission from the MarketPosition Newsletter and FirstPlace Software, Inc. and is copyright 1997-2001. FirstPlace produces WebPosition Gold, the award-winning software product to track and to improve your search engine rankings. You may download a FREE trial copy of WebPosition Gold from: http://www.webposition.com