What Might Be Next For (not provided)

On May 7th, Mozilla announced that Aurora (Firefox’s public beta) is beginning to roll out its HTTPS-by-default Google search behavior. It won’t be long before this is the default in Firefox. As their announcement states, the user will be none the wiser (save for that tiny ‘s’ after http, and the secure lock icon). Webmasters, on the other hand - let’s just say (not provided) is not making any friends.

I spoke on advanced analytics at #SearchFest back in February and the topic of (not provided) was on everyone’s minds. The question - “What can be done about (not provided)?” - is and will remain inescapable in any analytics discussion. My advice, now and at the time: ignore it. Any ‘hack’ is really just an extrapolation of known keyword data, and that is a pretty irresponsible calculation on which to base business decisions. It’s dirty data. At the very least, worrying about (not provided) is probably just a waste of time and energy. The slides from that presentation and my writeup are here.

I tend to lean toward a pessimistic outlook on the future of (not provided): one in which Google eventually owns your keyword data, and the only way to get real, hard numbers is by buying AdWords. Incidentally, Google had to create a workaround for the standard way https operates in order to keep passing keyword data to its advertisers. As Danny Sullivan says (Caller ID, below, is the full referrer string that contains keyword data):

Let me be very clear. Google has designed things so that Caller ID still works for its advertisers, but not anyone else, even though the standard for secure services isn’t supposed to allow this. It broke the standard, deliberately, to prevent advertiser backlash.

But I’ve been thinking about a rosier possibility that I thought I’d share. Before getting into it, I want to cover where we are now and where we’re likely headed in the short term. I think it’s also important to understand the motivations at play, which solidify my rosier possibility (I hope).

So let’s start with:

The State of (not provided)

Remember this?
“even at full roll-out, this would still be in the single-digit percentages of all Google searchers on Google.com”
That’s Matt Cutts on record as the secure search default was rolling out. As it turned out, once the rollout was complete, a spot check of hundreds of Google Analytics accounts revealed that not one of the websites had under 10% (not provided). Distilled.net was sitting at roughly 20%.

So if 20% of Google searchers were logged in and using secure search in November, it wouldn’t be far off to assume that the number has climbed since. After all, Google is certainly taking every opportunity to have you sign up for Google+ (and now Google Drive). More people signed in, more (not provided).

Not having run the numbers in a while I was surprised when on April 25, Mr. Casey Henry of SEOmoz tweeted:


Sadly, Distilled was worse:


The ugly Distilled graph:

[Image: (not provided) graph for Distilled]

So (not provided) is currently greater than 50% for both Distilled and SEOmoz. Granted, our users are likely a savvy-ass group, but that is a drastic increase. Now throw in Firefox’s subset, fast forward a year as Google continues to push Google+ signups… It ain’t pretty.


Why, Again, Was This Change Made?

Privacy, of course! Well, that’s what the Google PR would lead you to believe. There’s almost certainly some truth to it, but a few astute bloggers picked up on some possible ulterior motives that sound pretty convincing to me. Danny Sullivan’s “Google Puts a Price on Privacy” is probably the definitive roundup of motivations.

The strongest non-privacy reason for making this change has to do with Google’s bread and butter: ads. Before the change, 3rd party ad networks (AKA Google’s competition for ad space) that had inventory on a publisher’s site were privy to users’ queries. This allowed them to target and retarget ads across their network to match a user’s search history. Creepiness conversation aside, it _works_.

With keyword data no longer available for these ad networks, their relevancy and overall performance might suffer. It would be difficult for advertisers to target their ads at users who have shown a propensity to search for concepts that are closely related to their services. Relevance could be hard to come by. UNLESS, of course, you choose to use Google’s ad network: the only game in town with the data to know just how to serve your ad.

Another motivation: web analytics. There are a number of players out there, all of which have their own version of (not provided). The Google change affects all analytics software the same way, though. Whether it’s on-page JavaScript or server log processing, Google’s “&q=” parameter in the referrer string is empty. No way around it. But how does this benefit Google? I’ll get to that in a bit.
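To make the empty “&q=” point concrete, here is a minimal Python sketch of how a log-processing script might classify referrers. It is an illustration under assumptions, not how any particular analytics package does it: the function names, the loose “google” host check, and the sample referrer strings are all hypothetical.

```python
from urllib.parse import urlparse, parse_qs

NOT_PROVIDED = "(not provided)"


def keyword_from_referrer(referrer):
    """Return the search keyword, NOT_PROVIDED, or None if this is not a Google search referral."""
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    # Assumption: any host with a "google" label counts as Google.
    # A real tool would use a stricter list of search hosts.
    if "google" not in host.split("."):
        return None
    # keep_blank_values=True so an empty "q=" is kept instead of silently dropped.
    params = parse_qs(parsed.query, keep_blank_values=True)
    if "q" not in params:
        return None  # no query parameter at all: not treated as a search referral
    keyword = params["q"][0].strip()
    return keyword if keyword else NOT_PROVIDED


def not_provided_share(referrers):
    """Fraction of Google search referrals whose keyword was withheld."""
    keywords = [k for k in map(keyword_from_referrer, referrers) if k is not None]
    if not keywords:
        return 0.0
    return sum(k == NOT_PROVIDED for k in keywords) / len(keywords)


# Hypothetical referrer strings, for illustration only.
sample_referrers = [
    "https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1",
    "http://www.google.com/search?q=seo+tools&hl=en",
    "http://www.bing.com/search?q=seo+tools",
    "http://www.google.co.uk/search?q=analytics",
]
share = not_provided_share(sample_referrers)
```

The thing to notice is that the script can still count the visit as a Google search referral; it just has nothing to put in the keyword column. That holds no matter which tool reads the referrer.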

The Future of (not provided)

Well, I think it’s fairly obvious that the Firefox change will lead to a modest bump; I’m guessing anywhere from 5 to 10%. Even after that, I would expect it to climb steadily as users are forced, er, wooed into signing up for Google+ and other Google services.

But wasn’t there a rosy outlook?

Google Webmaster Tools

GWT has been pumping out updates and improvements over the past few months. The team recently revealed an overhaul to its dashboard, has improved the usefulness of server error reports, and has increased its communication with registered webmasters.

When the https change was first announced, Google was quick to point to GWT as an option for webmasters fretting about losing their keyword data. At the time, we weren’t thrilled with this idea, as GWT’s search query data has proven to be rounded and sampled to the point of uselessness. But perhaps this is where Google gives back what hath been taken from webmasters by improving this data.

[Image: The SEO reports in Google Analytics]

The “Search Engine Optimization” reports in Google Analytics that receive data from Webmaster Tools are currently pretty useless. I rarely use them. In this scenario, they become the go-to reports for organic search marketers.

Obviously the flaw in this idea is one of the three motivations mentioned above for Google making the change in the first place: privacy.

But looking past that:

- How can Google provide the only ad network in town with real keyword data powering its targeting and retargeting? Turn off referrers.

- How can Google become the only web analytics provider with real keyword data? Turn off referrers. Pass keywords through to Google Analytics with GWT integration. GA Premium ends up looking pretty attractive compared to other paid platforms, no?

In the end, the only people unhappy: privacy advocates, 3rd party ad networks like Chitika and Chango, and 3rd party web analytics platforms like Omniture and Coremetrics.

And, what about that privacy thing? Ian Lurie pretty much nails it at Search News Central:

Don’t try to say this is a privacy thing. It. is. not. How exactly does this protect privacy, when you tie the text of e-mails to your advertising platform? How does this protect privacy when you’re photographing people’s streets, homes and whatever else you can lay your hands on?

Don’t get me wrong - I’ve not opposed e-mail ads, or Street View. But you can’t shut down search query data and then protest privacy. That’s like leaving one bite of steak on your plate and saying you’re a vegetarian.

Are you honestly telling me you had no way to deliver anonymous counts of keyword searches by signed-in users? You’ve never found a way to do this? ‘Cause that sounds like a load of horse hooey, if ever I’ve heard one.

Me, too.

This is all my opinion, not Distilled’s, and I could very well be dead wrong. For the sake of our keywords, I hope I’m not.

I would love to hear your take below or on Twitter.