How User Data May Reorder Search Rankings


Over the last several months, I’ve watched a site take a series of traffic hits that I believe are the result of increased use of user data to reorder search results. The homepage and high-quality content continue to climb in rankings on highly competitive terms (#2 for the industry head term, up from 80+ a few months ago). However, traffic to subpages with poor user experiences dropped significantly. Inspecting their rankings, I see no sign of a “penalty,” but these subpages dropped from #1 to #2 - #4 across a wide range of keywords. Summed across a large number of keywords, this slight reordering produces a large overall drop in organic traffic.

Below are my thoughts on how search engines may be reordering results based on user data.

Disclaimer: This stuff is nearly impossible to prove, but my thoughts are from reading patents, listening to Duane Forrester talk at SMX, and from experiences of things I’ve seen.

User Signals Search Engines May Look At

#1 Bounce Rate from result returning back to the search engine.

If a user clicks a result after performing a search, then returns to the Google search results relatively quickly, this could be a signal that the result was low quality or did not match the searcher’s intent.

Yeah, this signal comes with a lot of noise and caveats, but look at these results from a Google paper on bounce rate and ad landing page quality. They compare mean bounce rates to quality ratings from human evaluators.

Bounce Rate vs Quality Valuation

Results show that expert human evaluation of ad quality agreed well with implied user assessment given by bounce rate. That graph to me is a pretty big deal. The bounce rates for those in the excellent category were less than half that of those in the bad category.

Ads that followed AdWords quality guidelines had bounce rates that were 25.4% lower than those that didn’t. It might be worth checking out their landing page and site quality guidelines.

#2 The CTR on the result listing.

CTR = total clicks on listing / total listing impressions.

You can bet Google is collecting this data.

CTR in Google Webmaster Tools

They know your domain’s average CTR, the average CTR per URL, and the CTR per keyword SERP. They could also rotate top listings and collect data over time to determine whether a particular result receives a disproportionate CTR relative to the other top results.
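To make the “disproportionate CTR” idea concrete, here’s a minimal sketch in Python. The expected-CTR-by-position baseline is entirely made up for illustration; it is not Google’s actual curve.

```python
# Sketch: flag listings whose CTR is out of line with their position.
# The baseline numbers below are illustrative, not real search data.
EXPECTED_CTR_BY_POSITION = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}

def ctr(clicks, impressions):
    """CTR = total clicks on listing / total listing impressions."""
    return clicks / impressions if impressions else 0.0

def ctr_ratio(clicks, impressions, position):
    """Observed CTR relative to the expected CTR for that position.

    A ratio well above 1.0 suggests the listing outperforms its slot.
    """
    return ctr(clicks, impressions) / EXPECTED_CTR_BY_POSITION[position]

# A #3 listing getting clicked like a #1 listing stands out:
print(round(ctr_ratio(280, 1000, 3), 2))  # 0.28 observed vs 0.10 expected -> 2.8
```

A ratio like that, sustained over enough impressions, could justify testing the listing in a higher slot.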

I find this graph from the same Google paper to be particularly interesting.

Bounce Rate vs CTR

Ads with low bounce rates had high CTRs; the two metrics had a correlation of -0.85. That’s a strong inverse relationship. If bounce rate is, as shown above, a good proxy for user satisfaction, then so is CTR.
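For intuition, that kind of correlation is easy to compute yourself. Here’s a stdlib-only Python sketch with invented per-ad numbers (the paper’s underlying data isn’t public) showing how an inverse bounce-rate/CTR relationship yields a strongly negative Pearson r:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-ad data: as bounce rate rises, CTR falls.
bounce_rates = [0.20, 0.30, 0.40, 0.50, 0.60]
ctrs = [0.10, 0.08, 0.06, 0.05, 0.03]
print(round(pearson(bounce_rates, ctrs), 2))  # -0.99
```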

#3 The amount of time the user was away from the search engine before coming back.

Duane Forrester from Bing made a comment at SMX during a panel that not only is the return to the results noted, but so is the time the user was away. If the user is gone for 5 seconds, that’s very different than a user who was gone for 5 minutes. If a user clicks a result, returns in 5 seconds, clicks another result, and then never comes back, that’s a potential signal.
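A toy sketch of how that dwell-time signal might be bucketed. The thresholds and labels here are my own invention for illustration, not anything Bing has published:

```python
def classify_click(seconds_away):
    """Bucket a search click by how long the user stayed away.

    A quick return ("short click") hints at dissatisfaction; a long
    absence, or never returning at all (None), hints the result
    answered the query. Thresholds are illustrative only.
    """
    if seconds_away is None:
        return "no_return"    # user never came back: likely satisfied
    if seconds_away < 30:
        return "short_click"  # pogo-stick: likely unsatisfied
    if seconds_away < 300:
        return "medium_click"
    return "long_click"       # likely satisfied

print(classify_click(5))     # short_click
print(classify_click(None))  # no_return
```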

#4 Behavior post return

If a user comes back to a search engine after clicking a result, their actions upon returning can send corroborating signals to reinforce historical signals. This may include actions such as clicking another result for the same keyword, refining the search query, or even blocking a result. By crossing these signals with other signals, I feel Google can increase their confidence in evaluating result quality.
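As a rough sketch of how those post-return actions might be aggregated, here’s a toy dissatisfaction score in Python. The event names and the scoring rule are hypothetical, purely to show the corroboration idea:

```python
# Post-return actions that corroborate dissatisfaction with a result
# (hypothetical event names, not a real search-engine schema).
CORROBORATING = {"clicked_other_result", "refined_query", "blocked_result"}

def dissatisfaction_score(sessions):
    """Fraction of post-return sessions containing a corroborating action."""
    flagged = sum(1 for events in sessions if CORROBORATING & set(events))
    return flagged / len(sessions)

sessions = [
    ["refined_query"],
    ["clicked_other_result", "blocked_result"],
    ["left_search"],
]
print(round(dissatisfaction_score(sessions), 2))  # 0.67
```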

How This May Improve SERPs Quality

This type of user satisfaction data could be used to reorder results and create an improved SERP.

Google Traditional Rankings

SERPs Before User Satisfaction Reorder

Potential Conclusion by Engines:

Site #2 seems to provide a better user experience for this SERP, so although Site #1 has a stronger relevancy score based on links and content, let’s move Site #2 to the top result. This will leave a larger portion of users satisfied.

Site #3 is the least relevant of the three, but its big brand creates affinity for this result and higher user satisfaction. Moving Site #3 to the second result will produce more satisfied users.

Reordered Results

SERPs After User Satisfaction Reorder

Potential Conclusion by Engines:

A significant portion of searchers (~30% to 60%) are now likely to click on a result that creates higher user satisfaction. As a result of this reordering, this result page’s quality has been improved.
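One way to picture this reordering in code: blend the traditional relevancy score with an observed-satisfaction score and re-sort. Everything in this Python sketch — the scores and the 0.4/0.6 weighting — is invented to reproduce the Site #1/#2/#3 example above, not a real ranking formula:

```python
# Sketch: reorder a SERP by blending link/content relevancy with
# user-satisfaction data. All numbers and weights are invented.
results = [
    {"site": "Site #1", "relevancy": 0.90, "satisfaction": 0.50},
    {"site": "Site #2", "relevancy": 0.80, "satisfaction": 0.90},
    {"site": "Site #3", "relevancy": 0.60, "satisfaction": 0.80},
]

def blended(r, w_rel=0.4, w_sat=0.6):
    """Weighted mix of relevancy and observed user satisfaction."""
    return w_rel * r["relevancy"] + w_sat * r["satisfaction"]

reranked = sorted(results, key=blended, reverse=True)
print([r["site"] for r in reranked])  # ['Site #2', 'Site #3', 'Site #1']
```

By pure relevancy the order would be #1, #2, #3; factoring in satisfaction flips it to match the reordered SERP described above.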

The Takeaway

This isn’t a “penalty,” but moving a site down from the top result to the second, third, or fourth position means a sharp drop in traffic for that particular term. Repeated across a large number of keywords, the impact on traffic can be significant.

I think this is one way Panda could have had an enormous impact on content farm visibility without applying a “penalty.” Although link building can keep pushing rankings up, and is still a strong signal, search engines can gather this type of information on top results and readjust accordingly.

Doing This Without Substantial User Data

Using the example above, Google would need to collect a fair amount of data over time to make a significant claim about quality and user satisfaction. However, machine learning can be used to build a predictive model instead.

Just take a look at this quote from Google (emphasis mine):

This paper has demonstrated through quantitative and qualitative analysis that bounce rate provides a useful assessment of user satisfaction for sponsored search advertising that complements other quality metrics such as clickthrough and conversion rates. We described methods of estimating bounce rate through observing user behavior, and have provided extensive analysis of real world bounce rate data to develop the reader’s understanding of this important metric. We have also shown that even in absence of substantial clickthrough data, bounce rate may be estimated through machine learning when applied to features extracted from sponsored search advertisements and their landing page.
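As a toy illustration of that last point — estimating bounce rate without clickthrough data — here’s a one-feature linear regression in plain Python. The feature (ad/landing-page keyword overlap) and the numbers are invented; the paper uses a much richer feature set and real models:

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Hypothetical training data: more ad/landing-page keyword overlap,
# lower observed bounce rate.
overlap = [0, 1, 2, 3, 4]
bounce = [0.80, 0.70, 0.60, 0.50, 0.40]
a, b = fit_line(overlap, bounce)
print(round(a * 2.5 + b, 2))  # predicted bounce rate for a new page -> 0.55
```

Once fit, a model like this can score a brand-new landing page before any user has ever clicked it.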

The SEO game isn’t really changing; Google is just getting better at applying machine learning to user data, spam analysis, and topic modeling. The goal is still, as it has been for a while now, not to chase the algo, but to work to at least deserve to be listed as a top result.

If you want to chat more about SEO, follow me on Twitter. I often write about link building, but I also love technical and research related SEO.


9 Comments

  1. First of all, great post!

    In terms of findings and assuming this would be true and something Google and other search engines will be implementing (or probably already are to some degree), would you say that this would perhaps shift the balance a bit away from traditional link building and back towards on site factors, thinking primarily about quality of content?

    If bounce rates and CTR% will start to play a more significant role in SERPs, getting the basics right in optimizing the set up, UI, and quality of content in relation to search query will become increasingly important. In my view that is a very good thing and something we will hopefully see more of.

    It would be very interesting to see this type of information mapped against social factors, such as retweets, likes, and now +1's, to see which of the factors would carry the most weight.

  2. This could be a really dangerous signal to use - some of our customers already have problems with unscrupulous competitors who click their ppc ads at every opportunity to cost them money and deny customers.

    Now the SEs are telling these people they don't even have to find the ads - they can just bounce off a competitor's site a few hundred times a day to push them out of the rankings!? Nice.

    Strikes me - sometimes "rocket science" can be what is responsible for completely blowing a good thing to oblivion!

    • I would imagine that if Google is using these kinds of search signals to begin with, combined with their PPC knowledge of click fraud and their filtering of fraudulent clicks, they are able to pretty accurately eliminate spammy manipulation of search results in this way ... unless the spammer uses some kind of network or proxy service to mass spam. That, of course, is a possibility.

  3. @Justin gr8 value add. Thanks
    @shamenz spam is inevitable, but if SE consider data mentioned above for ranking, it will be just one small component.

  4. Martin

    I would agree that the usage of user data to reorder search results was inevitable, but even if you just take bounce rate, there are just as many reasons why a good quality website might have a high bounce rate. As with all these things, one metric in isolation is never enough to determine the true picture.

    Those websites that rank well for single search terms know that single search terms often represent high volume, low quality visitors; in the past that has always been a 'nice to have', but the more refined search terms have generally resulted in a higher quality of visitor. It seems now that ranking well for a highly generic search term could prove detrimental.

    Also, it doesn't always follow that the old guard is going to be better than the new guard; having search results that effectively exclude new websites is not serving the searcher.

    Well established, well designed and popular websites should be ranked well, but it is my view that there also needs to be space made available for the young fresh upstart that may be about to turn the establishment on its head.

    There are some things that we know when it comes to search:

    - one size does not fit all
    - all is not what it first seems when looking at user data
    - 'new' does not necessarily mean bad
    - 'established' does not necessarily mean good
    - a poor website can be made good
    - a once good website can become bad
    - user data can be gamed just like everything else

  5. If SEs actually use CTR and bounce rate data, I guess they would be smart enough to detect noise signals that indicate manipulative behaviour. For instance, if CTRs and/or bounce rates for a certain keyword SERP differ a lot between a place in India and the rest of the world (or the rest of the regions that are relevant to that keyword search), I assume the SEs would simply discard the CTR and bounce data from that place in India. The SEs just need to cut off any unusual extraordinary peaks in their data sets and graphs.
    So, companies hiring or paying for manipulative clicks in organic search results would just waste a lot of money which they should better invest in a good user experience of their own websites.

  6. great post, really enjoyed reading your analysis. I wish there was a way to prove how important the signals are in Google's algorithm. It's so hard to educate clients in SEO, all they care about is their rankings and they believe link building is the only way to get there!

  7. I totally agree with your findings and observations. But here in Germany we still have to educate the client. They are always talking about rankings achieved through optimized content, but they have never heard that CTR data can influence rankings.

  8. Roy

    Good analysis. It's quite certain that SEs will use user data in search as they evolve. But I think there is a hole in your analysis, which is based on the assumption that "BigBrand site certainly outweighs NicheBrand site which certainly outweighs Spammy-site.com". When we talk about spammy sites, we are mainly talking about the techniques employed to build their links and content, IMO. A so-called spammy site may actually have a better user experience in terms of bounce rate or CTR than a NicheBrand or a BigBrand.

