Putting Behavioural Metrics In Perspective

So here’s the question: are behavioural metrics being used in modern search? You do remember them, right? Those warm and fuzzy little signals, such as bounce rates, that were all the rage in late 2008 in the search engine optimization world? Sure you do… but let’s take one last look.

Although bounce rates received the biggest attention, we would be remiss not to start by quickly listing some signals commonly looked at by information retrieval folks. They fall into two buckets, implicit and explicit data (actions and interactions) – examples include:

Implicit signals

  1. Query history (search history)
  2. SERP interaction (revisions, selections and bounce rates)
  3. User document behaviour (time on page/site, scrolling behaviour)
  4. Surfing habits (frequency and time of day)
  5. Interactions with advertising
  6. Demographic and geographic data
  7. Data from different applications (application focus – IM, email, reader)
  8. Closing a window

Explicit signals

  1. Adding to favourites
  2. Voting (a la Search Wiki or toolbar)
  3. Printing a page
  4. Emailing a page to a friend (from the site)

Now that we’re past that, let’s get a little geeky so those information retrievers don’t shake their heads too hard at us – the terminology. I am as guilty as the next Gypsy of flinging the term ‘behavioural metrics’ about over the last year or so, even ‘performance metrics’. If you want to research this more, start by using the term implicit/explicit user feedback signals – because that’s what we’re talking about.
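To make that less abstract, here’s a minimal sketch of how a couple of those implicit signals (bounce rate and dwell time) could be aggregated from a click log. The log format, field names and numbers are all made up for illustration – this is not any engine’s real schema.

```python
from collections import defaultdict

# Hypothetical click log: (user, query, clicked_url, dwell_seconds, bounced)
# Every field here is an illustrative assumption.
log = [
    ("u1", "cheap flights", "siteA.com", 4,  True),
    ("u1", "cheap flights", "siteB.com", 95, False),
    ("u2", "cheap flights", "siteB.com", 60, False),
    ("u2", "cheap flights", "siteA.com", 7,  True),
]

stats = defaultdict(lambda: {"clicks": 0, "dwell": 0, "bounces": 0})
for _, _, url, dwell, bounced in log:
    s = stats[url]
    s["clicks"] += 1          # raw click count
    s["dwell"] += dwell       # total time on page
    s["bounces"] += int(bounced)

for url, s in stats.items():
    bounce_rate = s["bounces"] / s["clicks"]
    avg_dwell = s["dwell"] / s["clicks"]
    print(url, f"bounce_rate={bounce_rate:.2f}", f"avg_dwell={avg_dwell:.1f}s")
```

Trivial to compute, which is exactly why it looks so tempting as a signal – and, as we’ll see, exactly why it’s so noisy.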

This is not the ranking signal you were looking for

“Second, there is a ‘quality-of-context bias’: the users’ clicking decision is not only influenced by the relevance of the clicked link, but also by the overall quality of the other abstracts in the ranking.”

Other research (on click data) looked at how users actually interact with search results as far as bias is concerned. People are often consistent in their clicking patterns (clicking the top result, then the second, then the third) regardless of the underlying data. This means the entire data set can be skewed: not clicking on the 8th result may not necessarily be a vote against that link, but more an ingrained habit on the part of the searcher.
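To see why that bias matters, here’s a toy version of the kind of examination-model correction the research leans on. The examination probabilities and click counts below are invented for illustration; the point is simply that dividing raw CTR by how often a position even gets looked at can flip the picture entirely.

```python
# Toy examination-model correction for position bias: raw clicks conflate
# a result's relevance with how often its rank is even looked at.
# All probabilities and counts below are illustrative assumptions.
exam_prob = {1: 0.90, 2: 0.55, 3: 0.30, 8: 0.05}

# (rank, impressions, clicks) for results shown on the same query
observed = [(1, 1000, 350), (2, 1000, 200), (3, 1000, 110), (8, 1000, 20)]

corrected = {}
for rank, impressions, clicks in observed:
    raw_ctr = clicks / impressions
    corrected[rank] = raw_ctr / exam_prob[rank]  # divide out examination odds
    print(f"rank {rank}: raw CTR {raw_ctr:.3f} -> corrected {corrected[rank]:.3f}")
```

In this made-up example the 8th result has the worst raw CTR but the best corrected score – the searcher’s habit of never looking that far down, not the result’s quality, is what suppressed its clicks.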

They summarized:

“Our results show that click behaviour does not vary systematically with the quality of search results. However, click behaviour does vary significantly between individual users, and between search topics. This suggests that using direct click behaviour—click rank and click frequency—to infer the quality of the underlying search system is problematic.”

And also:

“A natural question that arises in this setting is the tolerance of this method to noise in the training data, particularly should users click in malicious ways. While we used noisy real-world data, we plan to explicitly study the effect of noise, words with two meanings, and click-spam on our approach.” From –

“Ranking accuracy decreases indeed when more documents are spammed, but the decrease is within a small range. When only a small number of documents are spammed per query, ranking accuracy is only slightly affected even if a large number of queries are spammed.” From –

“… it might also be possible to explore mechanisms that make the algorithm robust against ‘spamming’. It is currently not clear in how far a single user could maliciously influence the ranking function by repeatedly clicking on particular links.” From –

Behavioral Metrics and the Birth of SEO Surfbot Nets – let us get to them now, shall we?
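Before we move on: one simple defence the papers above hint at – counting each (user, query, URL) click once, no matter how many times it is repeated – can be sketched like so. The log format is a made-up illustration, not any engine’s actual anti-spam mechanism.

```python
# Minimal sketch of one anti-click-spam measure: count at most one click
# per (user, query, url) when aggregating feedback, so a single user
# clicking the same link repeatedly gains nothing. Hypothetical log format.
raw_clicks = [
    ("attacker", "widgets", "spam.com"),
    ("attacker", "widgets", "spam.com"),   # repeated click, same user
    ("attacker", "widgets", "spam.com"),
    ("u1",       "widgets", "honest.com"),
    ("u2",       "widgets", "honest.com"),
]

unique_votes = set(raw_clicks)             # dedupe per (user, query, url)

votes_per_url = {}
for _, _, url in unique_votes:
    votes_per_url[url] = votes_per_url.get(url, 0) + 1

print(votes_per_url)  # spam.com counts once despite three clicks
```

Of course, this only raises the bar: a surfbot net with many users (or many fake ones) sails right past per-user deduplication, which is precisely the unsolved problem the researchers flag.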

Getting beyond the geeky: looking to the future

Are we getting somewhere yet? Great… but it’s not all doom and gloom; no need to call the coroner just yet. You see, for the most part researchers have been finding some great improvements in search performance; they simply haven’t worked out all the values of such signals, nor the spam concerns. In an enterprise environment, where manipulation/spam is far less likely, implicit feedback can be a more useful tool. It is the larger public-access environment, where spam is far more prevalent, where the nut has yet to be cracked.

I stand by my original assertion that this type of approach is best served in a personalized environment. That would go a long way toward dealing with the apparent spam-related issues, as it is kinda’ hard to spam one’s self, you see. This makes personalization a likely candidate for user feedback signals. Either way, it simply hasn’t been solved yet.
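As a rough sketch of what that personalized angle might look like – all data and scoring here are hypothetical, not any engine’s actual method – re-ranking a result list using only the searcher’s own engagement history means the only person who could game the boost is the searcher themselves:

```python
# Hypothetical personalized re-ranking: boost results from sites the
# *same user* previously engaged with. Data and weights are illustrative.
user_history = {"siteB.com": 3, "siteD.com": 1}  # engaged visits per site

base_ranking = ["siteA.com", "siteB.com", "siteC.com", "siteD.com"]

def personalized_score(position, site):
    base = len(base_ranking) - position          # higher = better base rank
    boost = user_history.get(site, 0) * 0.5      # own-history boost only
    return base + boost

reranked = sorted(base_ranking,
                  key=lambda s: -personalized_score(base_ranking.index(s), s))
print(reranked)
```

Here the user’s own repeat visits to siteB.com lift it above siteA.com for this user alone – no one else’s ranking moves, which is why the spam incentive largely evaporates.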

So what are we left with? Some noisy signals that are spammable… hmmm… where have we heard that before?

Matt Cutts on bounce rates

And so now I leave all of this in your capable hands, my weary web warriors. If you can go through the research papers listed below (or elsewhere) and find me strong evidence of how they deal with noise reduction and click-spam, then we can discuss it further. That is my challenge to you, because from what is out there, it is not yet viable in a large-scale environment.

I submit to you, my enthusiastic optimizers, that bounce rates and its implicit feedback brethren are simply not likely to be in Google’s (or any major search engine’s) current ranking schemes. It is a novelty item at best, with potential in a personalized environment.

Care to dispute this? I am more than happy to review any research to the contrary.

Want to know what I think is actually causing what we believe we’re seeing? You’re just going to have to wait until next week.
