MozCon 2011 - International SEO Presentation & Data

Earlier this month I was asked to speak at MozCon on International SEO and had an amazing time.

Here are the attendees looking just lovely.

Image credit: http://www.flickr.com/photos/thos003/5982940906/sizes/m/in/photostream/

My presentation covered some tips on how to avoid sucking at international SEO. I also shared some data we’d gathered to try to determine whether local language links (for example, links from pages in French to French content, or links from pages in German to German content) correlate with rankings.

For those who attended, I promised that I’d share our data and a fuller explanation of the methodology we used. If you weren’t able to come along, you can get up to speed by viewing my slide deck below:

So, to the data and methodology...

Aim

To determine whether local language links correlate strongly with site ranking.

Methodology: Data Collection

  1. We selected 15 keywords with commercial intent (to ensure they were fairly competitive): Armani Jeans, Diesel Jeans, DVD, Buy DVDs, Car Insurance, Mortgages, Loans, Life Insurance, Music Downloads, TV Downloads, Kindle, iPhone, iPad, Cheap Hotels and Cheap Flights.
  2. We translated these keywords into 9 languages - French, Spanish, Italian, German, Portuguese, Russian, Swedish, Dutch and Polish.
  3. We ran each keyword (in each language) through the SEOmoz Keyword Difficulty Tool and pulled out the top 10 URLs in each SERP, page authority and domain authority.
  4. We ran the top 10 URLs for each keyword through ‘Hannah’s Moz Ninja Language Tool’ (built by Tom Anthony). This tool pulls back the top 1,000 backlinks for each URL from Open Site Explorer, detects language indicators from TLDs, sub-domains and sub-folders and outputs how many of the ranking URLs’ backlinks contain elements matching the chosen language (e.g. www.randomdomain.pl is a Polish link). If the language cannot be determined by TLD, sub-domain or sub-folder (as is often the case), then the tool will determine whether or not this is a local language link by identifying the language in the title tag of the linking page. The tool works systematically through each of these elements, so that links are only counted as local by one attribute (in a hierarchy, from sub-domain to title tag). We elected not to use hosting to determine locality as this was deemed too unreliable.

As such the sum of local links was the total identified with language attributes in the following order:

Local Links = TLDs + Sub-Domains + Sub-Folders + Title Tag Text
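To make the "counted by one attribute only" idea concrete, here is a minimal Python sketch of that kind of hierarchy. It is not the code behind the actual tool: the `POLISH` marker table, the function names and the pluggable `detect_title_language` callback are all hypothetical, and the check order simply follows the formula above (TLD, then sub-domain, then sub-folder, then title tag).

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical marker table for one language; the real tool's lists are not shown in this post.
POLISH = {"code": "pl", "tld": "pl", "markers": {"pl", "pl-pl"}}


def classify_link(url, title, lang, detect_title_language=None):
    """Return the first attribute that marks a link as local, or None.

    Checks run in a fixed order and stop at the first match, so each link
    is counted under at most one attribute.
    """
    parsed = urlparse(url)
    labels = (parsed.hostname or "").lower().split(".")

    # 1. TLD, e.g. www.randomdomain.pl
    if labels[-1] == lang["tld"]:
        return "tld"

    # 2. Sub-domain, e.g. pl.example.com. Treating everything left of the last
    #    two labels as sub-domain is a simplification (it misreads .co.uk hosts).
    if any(label in lang["markers"] for label in labels[:-2]):
        return "sub-domain"

    # 3. Sub-folder, e.g. example.com/pl/...
    segments = [s.lower() for s in parsed.path.split("/") if s]
    if any(segment in lang["markers"] for segment in segments):
        return "sub-folder"

    # 4. Title tag text, using whatever language detector the caller supplies.
    if detect_title_language is not None and detect_title_language(title) == lang["code"]:
        return "title"

    return None


def count_local_links(links, lang, detect_title_language=None):
    """Tally local links per attribute for an iterable of (url, title) pairs."""
    counts = Counter()
    for url, title in links:
        attribute = classify_link(url, title, lang, detect_title_language)
        if attribute is not None:
            counts[attribute] += 1
    return counts
```

The total number of local links for a ranking URL is then `sum(counts.values())`, which mirrors the sum in the formula because no link can land in more than one bucket.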

Methodology for Analysis

  1. We elected to look at the % of ‘local links’ and compare it to Google rankings. We felt that % local links was probably a better metric than number of local links because the URLs analysed varied dramatically in their total number of links (plus, of course, the number of links might be a very noisy metric). Data analysis was completed by Alice Murphy.
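As a rough illustration of this step, the sketch below computes % local links and a Spearman rank correlation against ranking position. The choice of Spearman is an assumption made for the example (the post does not name the statistic used in the analysis), and the function names are hypothetical.

```python
import math


def percent_local(local_links, total_links):
    """Share of a URL's backlinks that were classified as local, as a percentage."""
    if total_links == 0:
        return 0.0
    return 100.0 * local_links / total_links


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

With ranking position (1 to 10) as one series and % local links as the other, a coefficient near -1 would mean that higher-ranking pages tend to have a larger share of local links, and a value near 0 would mean no monotonic relationship.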

Limitations

  1. Small data set
  2. % of local links may not be the best measure
  3. Open Site Explorer might not have picked up all of the external links to a given page
  4. Doesn’t weight links for quality
  5. Doesn’t take into account overall link profile of a site
  6. Doesn’t take into account any other ranking factors
  7. It can be difficult to determine the language of a page from the title tag
  8. Language detection is not 100% accurate (but agrees with Google Translate 81% of the time at >30% confidence)

Results

Prior to getting into the data we collected on local links, I first wanted to illustrate the correlation between page authority (the SEOmoz metric which attempts to quantify the strength of a given page) and Google ranking. The purpose of this was simply to highlight that there is a relationship between links and ranking (not a particularly controversial point, but probably one worth making nonetheless).

However, before I did this, I thought it would be useful to show people what a positive correlation might look like, and included the following graph:

The purpose of this graph is simply to illustrate what a strong correlation might look like.

In this example the x-axis shows the Google Ranking of our faked pages and the y-axis shows the faked page authority of the pages.

This is obviously faked data as I’d suggest you’d be highly unlikely to see such a neat relationship; plus of course a page ranking 10th with a page authority of less than 10 is pretty unlikely.

I then went on to show the real page authority data:

So clearly we’re not looking so neat and tidy as our faked data, but you can see that there’s some sort of relationship there.

Plus of course we wouldn’t expect to see a perfect relationship as that would indicate that the Google God’s algorithm was all about the links - whereas in fact there are many more factors taken into account when pages are ranked.

 

So, I’m guessing it’s time to get to the good stuff - I’m going to skip straight to the binned local link data for France (i.e. just those pages with 601-800 links). This is far and away our most robust data.
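For anyone wanting to reproduce the binning from the downloadable data, here is a small Python sketch. The 200-link bin width is an assumption inferred from the 601-800 range above, and the `total_links` field name is hypothetical.

```python
def link_bin(total_links, width=200):
    """Return the inclusive (low, high) bin containing total_links, e.g. (601, 800)."""
    if total_links < 1:
        return None
    low = ((total_links - 1) // width) * width + 1
    return (low, low + width - 1)


def rows_in_bin(rows, low, high):
    """Keep only the rows whose total link count falls inside the bin."""
    return [row for row in rows if low <= row["total_links"] <= high]
```

Binning like this compares pages with a broadly similar number of links, so that differences in % local links are less likely to be swamped by differences in raw link volume.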

So here we have Google ranking on the x-axis and the % of local language links on the y-axis.

Let’s take pages ranking first in Google as an example. You can see that there is a huge variety in the percentage of local language links for these pages - one has no local language links at all, one has 80% local language links. Outliers aside, the bulk of the sample have between 40% and 70% local links.

However, even if you ignore the outliers for the rest of the top ten the data still lacks an obvious pattern or relationship.

Huh?

From our data set it looks as though there isn’t an obvious correlation between % local language links and rankings, and it looks like ‘good’ links are perhaps more important than local language links. Of course it may be (and I certainly would hope) that local language links become a stronger ranking signal in the future.

Go play with the data...

I’d love to hear your thoughts on this - you can download the data here

We’ll also be making the ‘Hannah’s Moz Ninja Language Tool’ available for you to play with shortly.

Hannah Smith

Hannah joined Distilled in September 2010 as a Consultant and is now on the Content Strategy team. Prior to this she spent over 7 years in offline marketing (point of sale, press advertising, direct mail & sponsorship), until her fairy godmother...

10 Comments

  1. Hi, I used the logic to explain the notorious TLD, subdomain, folder issue here: http://www.matteomonari.com/en/top-level-domain-folder-or-subdomain/ . Nice to see we are in line :-)

  2. Thanks for the refresh from MOZcon!

    While I really enjoyed Hannah's presentation style and her preso in general, I felt the data and methodology were the weakest parts: The data set(s) were far too small to be meaningful or to compel a discussion around correlation ... making these observations purely speculative. (with some pretty plots :-) I would definitely like to see this analysis performed with more data, however.

    • Hannah

      Hi Anthony,

      I'm glad you enjoyed the presentation. I agree the data set is small, but I think it's nonetheless interesting that patterns failed to emerge - I think that if there was a strong correlation we would have seen it.

      Like you, I would definitely like to see some further analysis with more data :)

      Hannah

  3. Hannah, thanks a huge lot for sharing the insights! Totally answered a few of my questions. I temporarily moved from Canada to Russia and now working on optimizing a couple of sites for Yandex engine. Overall, pretty much the same principles apply, the only difference is the language. Was pretty happy to see some Russian data in the study. Will go and play with the spreadsheet. Thanks a bunch, again!

  4. Czech Fan

    Matteo as a world renowned international SEO, what's your take on the "myth" that you need local links to rank well?

  5. I'd also like to hear more feedback about "localized links". In my experience, gaining links from local, often country-specific domains is really useful in helping associate your site with specific language/country based rankings for keywords. I think as Google grows more and is able to better specialize and rank country/language specific sites, this is an important link-building strategy to get better results.

  6. Hannah

    Hi Mike,

    The data set which we collated (you can download the data yourself and have a play) suggested that there wasn't a strong relationship between the % of links which were 'local' - and rankings.

    That's not to say that getting local links won't help - it's just that it appears to be possible to have strong rankings even if the links to a given page aren't particularly local.

    Hope this makes sense,
    Hannah

    • Jeanette

      Thanks Hannah for this great work!

      I agree with you regarding the impact of links from other countries / languages, but I'm not really sure about it.

      Let's take a German company, running an E-Commerce-site in German language for customers located in Germany and using Germanic product names.

      What's the impact of linkbuilding in the US in English, in France in French, etc.? My take and experience: a link from a website in the US with a high site prominence is worth more than a link from a German website with a lower prominence. Your experience?
      But then there are the "details".
      - Normally it's important that the linking page is about the same subject as the landing page. But how is this when the linking page is in English and the landing page in German? Does Google recognize the content across languages?
      - On the other end of geography: how local should linkbuilding get when the site targets the whole of Germany? E.g. if the company is headquartered in Munich and is focussing on local linkbuilding - would it win SEO-ground in Munich but lose e.g. in Berlin?
