Sometimes we at Distilled go a little far out with our posts about statistics and data and all that, but not all of that is applicable to everyday work. Not all tasks are so complicated—some are fairly cut and dry, like figuring out which landing pages have received the most traffic over the last year. That’s straightforward, right? At some level this is clerical work—but that’s all the more reason to get it over with as quickly as possible.
Often we can slow ourselves down by organizing data in inefficient ways. These methods might seem smart or fast at the time, and yet they can often make it more difficult to get what we really need out of our data set. Maybe you are trying to figure out which page received the most traffic over the last year—and this is what your Excel file looks like:
If your spreadsheet looks like this, you’re gonna have a bad time.
Don’t separate sheets by date
The problem with organizing data into separate sheets based on month (or week for that matter) is that if you want to calculate any data that uses information from multiple months, you’re going to need to get the data all into one place. One way or another, those numbers have to come together—but right now they’re on separate sheets.
What I’d suggest is a sheet that looks something like this drastically oversimplified example:
So instead of separating this data into different sheets, we just have a field that specifies a month. Each unique keyword can appear in multiple rows corresponding to each month. There is no question that this results in a less human readable table, but the reality is that no one reads a table with 1,000,000 or even 1,000 lines by going through it line by line. Human readability isn’t a concern here.
So why do I think this method of organization is so superior?
It’s all about the pivot
To see the benefit of this organization, it will help to illustrate the end point. We’re trying to get at the total number of visits for a certain keyword across multiple months, right? The fastest way to do that is going to be a pivot table. If you’re not familiar, a pivot table allows you to summarize data in a table.
This is our pivot table—and look, the data we want is there! Poorly labeled, but there nonetheless:
This is something that can not be done without all of the initial data existing on the same sheet—a pivot table has to reference one and only one table. And, as an added bonus, we can just as easily reverse out pivot table so that the various months appear as row labels and the sum of traffic is calculated for each month instead of each keyword. Neat!
Keep up with the data
It may be that you get reports like this at regular intervals—maybe you don’t have control of the format of the data. That’s no reason to let things pile up and have your spreadsheet out of control. It may be that you have to cut and copy data from each report into one larger table. Do so.
It’s worth the time and effort, and if you do it on a regular basis, instead of going back and slogging through the task of copying all the data at the end of the year, it will seem much more manageable. You may, of course, need to add a column for the time period the data represents. Be careful when autofilling such data because Excel will often assume you want to increment dates by day instead of just copying the same value over and over:
If you keep all of these points in mind you’ll probably wind up with fewer, larger tables—and more importantly fewer, smaller headaches. If you’d like to increase your Excel proficiency further, our Mike Pantoliano has written a well received guide to Excel for SEO, which I highly recommend.
A potential problem with this methodology is that Excel only supports ~1,000,000 lines per sheet. If you have more than 1,000,000 lines I’d suggest that you belong in a database program like Access instead of Excel, but that’s another post entirely.
Are there any other basic rules you follow to make sure your Excel workflow stays efficient?
Benjamin Estes SEO Consultant