Linkgex: Tool to Get Links to Specific Subsets of Pages

Recently I have often found myself wanting to get links that point to a certain sub-section of a website (i.e. links to only certain pages on the domain). Reasons why this might come about:

  • to know how many links my product / job listing / category pages etc. are getting.
  • to find links to pages that mention something in the URL.
  • to find links to only certain language sections.
  • to exclude links to certain pages.

I tend to use a mix of OpenSiteExplorer, Majestic, and Ahrefs when I get backlinks, but currently none of these services lets me filter backlinks in this way. OSE does offer a ‘to this subfolder’ option in the advanced reports section, which sometimes does the trick, but otherwise I’m left to download all the links and filter them myself.

Not content with this, I decided to put together a short proof-of-concept script to do this. This initial version is only for OSE, but I plan to build Majestic and Ahrefs versions too if there is interest.

Introducing Linkgex

Catchy name, eh?

The script is easiest to explain with an example (details on installing it are below). You enter a domain, a set of regex matches, and whether you want to include or exclude URLs on that specified domain that match these rules. You also decide whether you want all the rules to be matched or any one of them (i.e. “AND” or “OR”):

Linkgex Input Screen

So in this example, I want to include only URLs on www.distilled.net that have either ‘linklove’ or ‘searchlove’ in the URL. Here is the output as an HTML table (with a link to download a CSV):

Linkgex Results

You can choose to sort the results by clicking the table headings, or alternatively you can simply download the CSV to mash about in Excel as you see fit.
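
For the curious, the include/exclude and AND/OR matching described above could be sketched roughly like this in Python. This is not the Linkgex source (which is PHP): the function name `filter_links`, its parameters, and the sample URLs are all made up for illustration, and it assumes case-sensitive regex matching against the full target URL.

```python
import re
from urllib.parse import urlparse


def filter_links(links, domain, patterns, mode="include", match="any"):
    """Filter (source_url, target_url) pairs by regex rules on the target URL.

    Hypothetical helper, not part of Linkgex itself.
    mode:  "include" keeps targets that match the rules, "exclude" drops them.
    match: "all" means every rule must match (AND), "any" means at least one (OR).
    """
    compiled = [re.compile(p) for p in patterns]
    combine = all if match == "all" else any
    kept = []
    for source_url, target_url in links:
        # Only consider links that point at the chosen domain.
        if urlparse(target_url).netloc.lower() != domain.lower():
            continue
        hit = combine(rx.search(target_url) for rx in compiled)
        # Keep the link when "matched" agrees with "we want matches".
        if hit == (mode == "include"):
            kept.append((source_url, target_url))
    return kept


# Made-up sample data: (linking page, page being linked to).
links = [
    ("http://example.com/a", "http://www.distilled.net/linklove/"),
    ("http://example.org/b", "http://www.distilled.net/blog/seo/"),
    ("http://example.net/c", "http://www.distilled.net/events/searchlove/"),
    ("http://example.com/d", "http://other.example/linklove/"),
]

# Keeps only the linklove and searchlove targets on www.distilled.net.
for source, target in filter_links(
    links, "www.distilled.net", ["linklove", "searchlove"], "include", "any"
):
    print(source, "->", target)
```

Switching `mode` to `"exclude"` with the same rules would instead keep only the blog URL, and `match="all"` would require both words to appear in the same URL.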

Get Linkgex

Simple: you can download it here:

Download Linkgex

Configuring Linkgex

Also simple: you need to grab your SEOmoz API details (it’s free: http://moz.com/api) and enter them into the config.php file:

Linkgex Config File

There are a couple of other settings in the file which are optional (but you should look at them; they have notes). Once all this is done, you just need to either upload it to some hosting (be careful not to reveal where, so others don’t use up your API credits) or run it from your local machine if you are geeky enough.

It’s too complicated... help!

I appreciate that not everyone is a coder and some people might not like to have to set this up themselves. If there is interest then I’ll probably create a hosted version of this that also works with Majestic and Ahrefs. If you’d be interested then I’d love to hear from you in the comments, along with any feature requests.

Wrap Up

I’d love to hear from anyone who uses this about what you’ve used it for, so I can feed that into any future versions. Feel free to take the code and do what you want with it, but please consider sharing whatever you build with the community.

Have fun! :)

Tom Anthony

Joining Distilled as an SEO, Tom comes from a background in freelance web development. With a degree in Computer Science, a PhD in Artificial Intelligence (almost – he is still writing his thesis!) and having taught himself to program on a BBC...

21 Comments

  1. I would certainly be interested in Ahrefs and Majestic versions as well. They are known to display the maximum number of links (be they valid or decayed). A detailed tutorial would help a lot, though (say, with a couple more examples).

    • Hey Ricky,

      I was tempted to create more examples - but as it is a prototype I thought I'd keep it quick.

      It seems there is interest, so I will try to create a hosted version that also plugs into the other APIs.

      Thanks.

    • I second that, Ahrefs please!

  2. Very cool. Thanks very much! I'd be very interested in Majestic and AHREFS versions too.

  3. Brilliant!

  4. Would love to see a Majestic version as well :)

    Gave you a share on Inbound to show my appreciation.

  5. Wow that's pretty cool. Would also be interested in seeing Ahrefs and Majestic versions too. :)

  6. Pretty cool tool indeed. I'm always amazed at Distilled's approach; whatever comes from Tom, William or Paddy is usually an AMAZING asset.

  7. Cool!

  8. Well, another option is just download the data from Google Webmaster Tools and/or Bing Webmaster Central. I would begin with those options first.

    • Hi Michael,

      True - but I've never found that data to be very reliable. You could dump the data from OSE, Majestic and then filter in Excel too. I just created this tool to streamline it for quick searches.

  9. Timing! I've been wanting a tool that does exactly this, but thought it was too trivial so thank you for creating it. It would be useful to select other link attributes such as follow status, redirect.

  10. Would love a quick video walkthrough on setting it up! Even if it's only 1 or 2 minutes.

    Great tool though!

  11. Great idea, and good to know that you implemented it so quickly. Filtering is a big headache, especially when it comes to links. Your idea will save SEOs time. Great job :). Looking forward to your hosted version.

  12. Hi Tom,

    Handy tool. I'd be interested to see AHREFS/Majestic versions too, but ideally I'd love to see a similar tool which pulls data from all three sources, strips out any duplicate/redundant data, then displays the amalgamated data rather than using three separate tools. There's a challenge for you! :-)

    Chris

  13. Would love to have this tool plugged with other APIs. Either hosted or a download. Works for me and this will be a relief for a lot of people I believe.

  14. Brilliant idea!!! Should try doing this with anchor text as well not only urls.

  15. Roie

    Nice, but you English speaking guys keep forgetting there are other languages out there :) Works perfectly though after adding the utf8 meta string. Thanks for this handy tool.

  16. Nice tip, I will install it and test how it works...

