Just before last Christmas, Google posted a method for downloading search queries from Webmaster Central to a CSV file using Python. Downloading a CSV file can of course be done from within Webmaster Central when you’re logged in, but with Python and the Windows Task Scheduler the download can be easily automated. That is very useful because Webmaster Central data only goes back one month. With automation it is possible to archive this data and see trends over longer periods of time.
The original post does a reasonable job of describing the process, but I’m going to try to go into more detail for those who aren’t as familiar with Python and all that jazz. And unlike the original post, I’ll delve into how to use the Windows Task Scheduler for automation. I’m going to assume that you as a user know how to access the Windows command line, but that’s about all that I’ll assume.
Getting the Job Done!
1. Download and configure Python.
Download Python 2.7. The default settings will be fine for installation, but you will want to take them one step further by adding Python to your system path if you haven’t. Access the advanced system settings dialogue (it’s in the Control Panel under System and Security > System > Advanced system settings).
In that dialogue, click “Environment Variables…”, select the “Path” entry under “System variables”, and click “Edit…”. Then add a semicolon to the end of the list of directories that shows up, followed by “C:\Python27\”. If you installed Python to a different directory, enter that instead.
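If typing “python” in a newly opened command window starts the interpreter, the path is set. For a closer look at what the path actually contains, here is a minimal sketch. The helper name is_on_path is my own invention for illustration, and it only inspects a PATH string; it does not change any settings.

```python
import ntpath
import os


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in a Windows PATH string."""
    if path_value is None:
        # Default to the PATH seen by the current process.
        path_value = os.environ.get("PATH", "")

    def normalise(entry):
        # Strip whitespace and quotes, drop a trailing backslash, ignore case.
        return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))

    # Windows separates PATH entries with semicolons.
    entries = [normalise(e) for e in path_value.split(";") if e.strip()]
    return normalise(directory) in entries


if __name__ == "__main__":
    # Assumes the default Python 2.7 install directory used in this post.
    print(is_on_path("C:\\Python27\\"))
```

Remember that a command window opened before you edited the path will still see the old value, so open a fresh one before checking.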
2. Get the Google Data Python Library.
This will be the only step that requires you to access the command line directly. Download the latest version of the Google Data Python Library (2.0.16 as of this post). Extract that archive to a convenient directory, change to that directory in the Windows command line or PowerShell, and run the library’s install script there with “python setup.py install”.
NOTE: If the directory you extracted the archive to is different than mine, be sure to account for that.
If you have problems running the install or test scripts, be sure that you added Python to the system path, as described in the previous section.
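One quick way to confirm that the library installed correctly is to ask Python whether it can import it. This is only a sketch: module_available is a helper name I made up, while “gdata” is the package name the Google Data Python Library installs under.

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if module_available("gdata"):
        print("gdata is installed")
    else:
        print("gdata was not found; try the install step again")
```

If this reports that gdata was not found even though the install appeared to succeed, you may have more than one Python installed, with the library sitting under a different one than the interpreter that ran the check.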
3. Download and configure the necessary scripts.
The original scripts are hosted on Google Code. You should download all three scripts from there, but you must at least download the downloader.py file. You can get by with just that file if you would like to use our Distilled-customized script, get-wmt.py. This script has been modified to download a Top Pages CSV as well as the Top Queries CSV, and it also allows you to input multiple domains if you so choose. Whichever script you go with uses downloader.py in the background, so be sure downloader.py resides in the same directory as get-wmt.py or example-simple-download.py.
Once you know whether you want to use our script or Google’s provided script (example-simple-download.py), open your chosen script in a text editor and change the lines highlighted below to reflect your own information. The username and password are for the Webmaster Tools account you use; the domains are the sites you want to run the queries for. The domains must, of course, be verified in your WMT account.
This screenshot is of the get-wmt.py file. If you use example-simple-download.py, the lines which need editing are very similar.
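It is easy to save the script with one of the sample values still in place. The sketch below shows one way to catch that before running anything. The setting names and placeholder values here are hypothetical and are not copied from get-wmt.py or example-simple-download.py, so adapt them to whatever your chosen script actually uses.

```python
# Hypothetical placeholder values; the real scripts may use other names.
PLACEHOLDERS = {
    "username": "you@example.com",
    "password": "your-password",
    "domains": ["http://www.example.com/"],
}


def unedited_settings(settings, placeholders=PLACEHOLDERS):
    """Return the sorted names of settings that are missing, empty, or still placeholders."""
    return sorted(
        name for name, default in placeholders.items()
        if not settings.get(name) or settings.get(name) == default
    )


if __name__ == "__main__":
    my_settings = {
        "username": "me@mycompany.com",
        "password": "your-password",  # forgot to change this one
        "domains": ["http://www.mysite.com/"],
    }
    print(unedited_settings(my_settings))
```

An empty list means every setting has been changed from its sample value. It says nothing about whether the login is correct or the domains are verified.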
4. Run the script!
Navigate to the directory that the script is in and double-click the get-wmt.py file to run it. A command line window should open up while the script is executing. Once it’s closed, you should have new CSV files in the same directory as the script you just ran!
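Since the whole point is to keep data for longer than a month, you may want to file each batch of CSVs away under the date it was downloaded. Here is a minimal sketch of that idea. The function name archive_csvs and the dated-folder layout are my own assumptions, not something either download script does for you.

```python
import datetime
import glob
import os
import shutil


def archive_csvs(source_dir, archive_root, today=None):
    """Move every CSV in source_dir into archive_root/YYYY-MM-DD/ and return the new paths."""
    today = today or datetime.date.today()
    target_dir = os.path.join(archive_root, today.isoformat())
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)

    moved = []
    for csv_path in sorted(glob.glob(os.path.join(source_dir, "*.csv"))):
        destination = os.path.join(target_dir, os.path.basename(csv_path))
        if os.path.exists(destination):
            # A second run on the same day replaces the earlier file.
            os.remove(destination)
        shutil.move(csv_path, destination)
        moved.append(destination)
    return moved
```

Calling archive_csvs with the script’s directory and an archive folder of your choice after each run leaves one dated folder per download, which makes it easy to line the files up later.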
Now we’re going to depart a bit further from the original article, so whip out your Task Scheduler!
You’ll want to create a Basic Task:
This will start the Basic Task Wizard. The first few steps are fairly self-explanatory and deal with setting the schedule for the task. I personally run the script weekly so that my data will overlap. When you select an action, you’ll want to choose “Start a program”.
The proper way to set this up is hinted at above. Simply enter the script name in the top box (“Program/script”). To tell the Task Scheduler where to find the script, specify its location in the “Start in” area. That’s it! Your script will now run weekly or monthly or daily or at whatever interval you specified.
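A scheduled task runs without anyone watching, so it can be handy to leave a trace of each run. The sketch below is one hypothetical way to do that: a small wrapper that runs a script with the script’s own folder as the working directory (the same job the “Start in” box does) and appends the exit code to a log file. The function name and the log file name are my own choices.

```python
import datetime
import os
import subprocess
import sys


def run_in_own_directory(script_path, log_name="task-log.txt"):
    """Run a Python script from its own folder and log the exit code."""
    script_path = os.path.abspath(script_path)
    script_dir = os.path.dirname(script_path)

    # cwd=script_dir plays the same role as the "Start in" box in Task Scheduler.
    exit_code = subprocess.call([sys.executable, script_path], cwd=script_dir)

    # Append one line per run so a failed scheduled run is easy to spot later.
    with open(os.path.join(script_dir, log_name), "a") as log:
        stamp = datetime.datetime.now().isoformat()
        log.write("%s exit code %d\n" % (stamp, exit_code))
    return exit_code
```

If you go this route, pass the full path of get-wmt.py (or whichever script you chose) to run_in_own_directory and schedule the wrapper instead. An exit code of 0 in the log normally means the script finished without an unhandled error.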
Have fun with your new data!
Benjamin Estes SEO Consultant