Linkgex: Tool to Get Links to Specific Subsets of Pages

Recently I have frequently found myself wanting the links that point to a certain sub-section of a website (i.e. links to only certain pages on the domain). Reasons why this might come about:

  • to know how many links my product / job listing / category pages etc. are getting.
  • to find links to pages that mention something specific in the URL.
  • to find links to only certain language sections.
  • to exclude links to certain pages.

I tend to use a mix of OpenSiteExplorer, Majestic, and Ahrefs when I get backlinks, but currently none of these services actually lets me get backlinks in such a fashion. OSE does offer a ‘to this subfolder’ option in the advanced reports section, which sometimes does the trick, but otherwise I’m left to download all the links and filter them myself.

Not content with this, I decided to put together a short proof-of-concept script to do this. This initial version is only for OSE, but I plan to build Majestic and Ahrefs versions too if there is interest.

Introducing Linkgex

Catchy name, eh?

The script is easiest to explain with an example (details on installing it below). You enter a domain, a set of regex matches, and whether you want to include or exclude URLs on that specified domain that match these rules. You also decide whether you want all the rules to be matched or any one of them (i.e. “AND” or “OR”):

Linkgex Input Screen

So in this example, I want to include only URLs on the specified domain that have either ‘linklove’ or ‘searchlove’ in the URL. Here is the output as an HTML table (with a link to download a CSV):

Linkgex Results

You can choose to sort the results by clicking the table headings, or alternatively you can simply download the CSV to mash about in Excel as you see fit.
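
To make the include/exclude and AND/OR options concrete, here is a minimal Python sketch of the kind of filtering described above. It is not the Linkgex code itself: the function names, the (source, target) pair shape and the sample URLs are all assumptions made for illustration, and the real script may handle things differently.

```python
import re
from urllib.parse import urlparse


def matches_rules(url, patterns, mode="OR"):
    """Return True if the URL satisfies the regex rules under AND/OR."""
    hits = (re.search(p, url) is not None for p in patterns)
    return all(hits) if mode == "AND" else any(hits)


def filter_links(links, domain, patterns, include=True, mode="OR"):
    """Keep links whose target URL is on `domain` and passes the rules.

    links: iterable of (source_url, target_url) pairs (hypothetical shape).
    include=True keeps matching targets; include=False drops them.
    """
    kept = []
    for source, target in links:
        host = urlparse(target).hostname or ""
        if host != domain and not host.endswith("." + domain):
            continue  # target is not on the domain we care about
        if matches_rules(target, patterns, mode) == include:
            kept.append((source, target))
    return kept


# Made-up sample data, for illustration only.
links = [
    ("http://blog.example.org/post", "http://www.example.com/linklove/london"),
    ("http://news.example.org/a", "http://www.example.com/searchlove"),
    ("http://forum.example.org/t/1", "http://www.example.com/about"),
]

# "Include" mode with "OR": keep targets containing either pattern.
included = filter_links(links, "example.com", ["linklove", "searchlove"])
```

Switching to include=False would keep only the /about link, and mode="AND" would require every pattern to match the same URL.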

Get Linkgex

Simple, you can download it here:


Download Linkgex


Configuring Linkgex

Also simple: you need to grab your SEOmoz API details (it’s free) and enter them into the config.php file:

Linkgex Config File

There are a couple of other settings in the file which are optional (but you should look at them, as they have notes). Once all this is done, you just need to either upload it to some hosting (be careful not to reveal where, so others don’t use your API credits), or run it from your local machine if you are geeky enough.

It’s too complicated... help!

I appreciate that not everyone is a coder and some people might not like to have to set this up themselves. If there is interest then I’ll probably create a hosted version of this that also works with Majestic and Ahrefs. If you’d be interested then I’d love to hear from you in the comments, along with any feature requests.

Wrap Up

I’d love to hear from anyone who uses this about what you’ve used it for, so I can feed that into any future versions. Feel free to take the code and do what you want with it, but please consider sharing whatever you build with the community.

Have fun! :)

About the author
Tom Anthony

With a background in freelance web development, a degree in Computer Science, a PhD in Artificial Intelligence (almost – he is still writing his thesis!) and having taught himself to program on a BBC Master compact at the age of 8, it could be easy...