Earlier this month I was asked to speak at MozCon on International SEO and had an amazing time.
Here are the attendees looking just lovely.
My presentation covered some tips on how to avoid sucking at international SEO. I also shared some data which we'd gathered in order to try and determine whether local language links (for example, links from pages in French to French content; links from pages in German to German content, etc.) correlate to rankings.
For those who attended, I promised that I'd share our data and a fuller explanation of the methodology we used. If you weren't able to come along, you can get up to speed by viewing my slide deck below:
So, to the data and methodology...
Objective
To determine whether local language links correlate strongly with site ranking.
Methodology: Data Collection
- We selected 15 keywords with commercial intent (to ensure they were fairly competitive): Armani Jeans, Diesel Jeans, DVD, Buy DVDs, Car Insurance, Mortgages, Loans, Life Insurance, Music Downloads, TV Downloads, Kindle, iPhone, iPad, Cheap Hotels and Cheap Flights.
- We translated these keywords into 9 languages - French, Spanish, Italian, German, Portuguese, Russian, Swedish, Dutch and Polish.
- We ran each keyword (in each language) through the SEOmoz Keyword Difficulty Tool and pulled out the top 10 URLs in each SERP, page authority and domain authority.
- We ran the top 10 URLs for each keyword through 'Hannah's Moz Ninja Language Tool' (built by Tom Anthony). This tool pulls back the top 1,000 backlinks for each URL from Open Site Explorer, detects language indicators from TLDs, sub-domains and sub-folders, and outputs how many of the ranking URL's backlinks contain elements matching the chosen language (e.g. www.randomdomain.pl is a Polish link). If the language cannot be determined by TLD, sub-domain or sub-folder (as is often the case), then the tool will determine whether or not this is a local language link by identifying the language in the title tag of the linking page. The tool works systematically through each of these elements, so that links are only counted as local by one attribute (in a hierarchy, from TLD to title tag). We elected not to use hosting to determine locality as this was deemed too unreliable.
As such the sum of local links was the total identified with language attributes in the following order:
Local Links = TLDs + Sub-Domains + Sub-Folders + Title Tag Text
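To make the hierarchy a little more concrete, here is a minimal Python sketch of how a first-match-wins check like this could work. To be clear, this is not the code behind the actual tool: the `INDICATORS` table, the function names and the `title_lang` argument (which stands in for whatever a language detector returns for the linking page's title tag) are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical indicator table: which TLDs and URL tokens hint at a language.
# The real tool's lists are not published here; these are illustrative only.
INDICATORS = {
    "french": {"tlds": {"fr"}, "codes": {"fr"}},
    "polish": {"tlds": {"pl"}, "codes": {"pl"}},
    "swedish": {"tlds": {"se"}, "codes": {"sv", "se"}},
}


def local_link_signal(url, lang, title_lang=None):
    """Return the first attribute that marks `url` as local, or None.

    Checks run in order (TLD, sub-domain, sub-folder, title tag) and stop
    at the first match, so each link is counted under one attribute only.
    """
    hints = INDICATORS[lang]
    parsed = urlparse(url)
    host_parts = (parsed.hostname or "").lower().split(".")
    path_parts = [p for p in parsed.path.lower().split("/") if p]

    if host_parts[-1] in hints["tlds"]:
        return "tld"
    # Everything left of the registered name and TLD is treated as sub-domain
    # (a simplification that ignores two-part suffixes such as .co.uk).
    if hints["codes"] & set(host_parts[:-2]):
        return "sub-domain"
    if hints["codes"] & set(path_parts):
        return "sub-folder"
    if title_lang == lang:
        return "title"
    return None


def count_local_links(links, lang):
    """Tally local links per attribute for (url, title_lang) pairs."""
    counts = Counter()
    for url, title_lang in links:
        signal = local_link_signal(url, lang, title_lang)
        if signal:
            counts[signal] += 1
    return counts
```

Summing the values of the returned counter gives the 'Local Links' total from the formula above, and because each link stops at its first match, nothing gets counted twice.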
Methodology for Analysis
- We elected to look at the % of 'local links' and compare it to Google rankings. We felt that % local links was probably a better metric than number of local links because the URLs analysed varied dramatically in terms of numbers of links (plus of course number of links might be a very noisy metric). Data analysis was completed by Alice Murphy.
Caveats
- Small data set
- % of local links may not be the best measure
- Open Site Explorer might not have picked up all of the external links to a given page
- Doesn’t weight links for quality
- Doesn’t take into account overall link profile of a site
- Doesn’t take into account any other ranking factors
- It can be difficult to determine the language of a page from the title tag
- Language detection is not 100% accurate (but agrees with Google Translate 81% of the time at >30% confidence)
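For anyone who wants to repeat the '% local links' comparison on their own numbers, here is a rough Python sketch. It works out the percentage for each page and then a rank correlation against Google position. I've used Spearman's rank correlation here purely as an assumption (rankings are ordinal, so it seemed a reasonable fit); it isn't necessarily the calculation used in our analysis, and the function names are made up for the example.

```python
def percent_local(local_links, total_links):
    """Share of a page's backlinks identified as local, as a percentage."""
    return 100.0 * local_links / total_links if total_links else 0.0


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average position of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Feed `spearman` a list of Google positions and the matching list of percentages. A value near -1 would mean a higher % of local links goes with better (numerically lower) positions, while a value near 0 would mean no obvious relationship.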
Prior to getting into the data we collected on local links, I first wanted to illustrate the correlation between page authority (the SEOmoz metric which attempts to quantify the strength of a given page) and Google ranking. The purpose of this was simply to highlight that there is a relationship between links and ranking (not a particularly controversial point, but probably one worth making nonetheless).
However, before I did this, I thought it would be useful to show people what a positive correlation might look like, and included the following graph:
In this example the x-axis shows the Google Ranking of our faked pages and the y-axis shows the faked page authority of the pages.
This is obviously faked data, as you'd be highly unlikely to see such a neat relationship; plus of course the chances of a page ranking 10th with a page authority of less than 10 are pretty slim.
I then went on to show the real page authority data:
Of course, we wouldn't expect to see a perfect relationship, as that would indicate that the Google God's algorithm was all about the links, whereas in fact there are many more factors taken into account when pages are ranked.
So, I'm guessing it's time to get to the good stuff - I'm going to skip straight to the binned local link data for France (i.e. just those pages with 601-800 links). This is far and away our most robust data.
Let's take pages ranking first in Google as an example. You can see that there is a huge variety in the percentage of local language links for these pages: one has no local language links at all, one has 80% local language links. Outliers aside, the bulk of the sample have between 40% and 70% local links.
However, even if you ignore the outliers for the rest of the top ten, the data still lacks an obvious pattern or relationship.
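If you'd like to bin your own pages by link count before comparing them (so that pages with broadly similar numbers of links sit together), something along these lines would do it in Python. The bin width of 200 is an assumption inferred from the 601-800 bin mentioned above, and the `total_links` field name is hypothetical.

```python
from collections import defaultdict


def link_bin(total_links, width=200):
    """Label of the bin containing `total_links`, e.g. 650 falls in '601-800'."""
    if total_links <= 0:
        return "0"
    lower = ((total_links - 1) // width) * width + 1
    return f"{lower}-{lower + width - 1}"


def group_by_bin(pages, width=200):
    """Group page records (dicts with a 'total_links' key) by link-count bin."""
    bins = defaultdict(list)
    for page in pages:
        bins[link_bin(page["total_links"], width)].append(page)
    return dict(bins)
```

Once grouped, you can look at % local links against ranking within a single bin, rather than mixing pages with a handful of links and pages with close to 1,000.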
From our data set it looks as though there isn't an obvious correlation between % local language links and rankings, and it looks like 'good' links are perhaps more important than local language links. Of course, it may be (and I certainly would hope) that local language links become a stronger ranking signal in the future.
Go play with the data...
I'd love to hear your thoughts on this - you can download the data here
We'll also be making the 'Hannah's Moz Ninja Language Tool' available for you to play with shortly.