06
2008
Guest Post – Sizing Up the Long Tail of Search
At Hitwise, our clients are always discovering new trends in the vast amount of data we offer, and they frequently ask to share those findings on our research blog. So I am happy to present our first guest blog post below, from Dustin Woodard, a veteran SEO and long-time Hitwise client. You can see Dustin speaking at search conferences or read his writing on his SEO blog.
Considering the large number of references to the “long tail of search” over the past couple years, you’d expect a plethora of research about it. Truth is, there has been very little published on this topic. After great dissatisfaction with the existing research, which I felt vastly understated the true size of the long tail, I decided to do my own research armed with the powerful set of data Hitwise provides.
Estimating the Long Tail of Search
In September 2008, I opened the All Categories search report to discover Hitwise has measured more than 14 million different search terms in the U.S. alone for the previous 3-month period (note – Hitwise provides data going back three years). I exported the top 10,000 search terms to analyze the data. It came as no surprise that the top terms are generally navigational or behavioral terms with Myspace topping the list. However, what was surprising was the resulting chart below of the top 10,000 search terms:
Top 10,000 Search Terms by Percentage of All Search Traffic

Source: 1 Hitwise
Despite how it may appear, my Y-axis is scaled correctly and, though difficult to see, there is actually data displayed along the entire keyword plot. Keep in mind, this is just the top 10,000 search terms of more than 14 million. Zooming in even further to the top 100 search terms, I get this chart:
Top 100 Search Terms by Percentage of All Search Traffic

Source: 2 Hitwise
Now we are seeing something similar to past long tail search reports. However, this is just 100 search terms out of the more than 14 million. Even looking at the top 100, it appears that the tail starts around term #18, “bank of america.”
Assuming the tail doesn’t begin until term 18, the head and body together only account for 3.25% of all search traffic! In fact, the top terms don’t account for much traffic:
· Top 100 terms: 5.7% of all search traffic
· Top 500 terms: 8.9% of all search traffic
· Top 1,000 terms: 10.6% of all search traffic
· Top 10,000 terms: 18.5% of all search traffic
This means if you had a monopoly over the top 1,000 search terms across all search engines (which is impossible), you’d still be missing out on 89.4% of all search traffic. There’s so much traffic in the tail it is hard to even comprehend. To illustrate, if search were represented by a tiny lizard with a one-inch head, the tail of that lizard would stretch for 221 miles.
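The arithmetic behind these figures can be checked with a minimal Python sketch. It uses only the four cumulative percentages listed above; the names CUMULATIVE_SHARE, tail_share and bracket_shares are illustrative and not part of any Hitwise tool or export.

```python
# Cumulative share of all search traffic (in percent) captured by the
# top N search terms, as reported in the post. Names are illustrative only.
CUMULATIVE_SHARE = {100: 5.7, 500: 8.9, 1_000: 10.6, 10_000: 18.5}


def tail_share(top_n):
    """Percent of search traffic NOT captured by the top_n terms."""
    return round(100.0 - CUMULATIVE_SHARE[top_n], 1)


def bracket_shares():
    """Share of traffic added by each successive bracket of terms.

    Returns (first_rank, last_rank, share_percent) tuples.
    """
    rows = []
    prev_n, prev_share = 0, 0.0
    for n, share in sorted(CUMULATIVE_SHARE.items()):
        rows.append((prev_n + 1, n, round(share - prev_share, 1)))
        prev_n, prev_share = n, share
    return rows


if __name__ == "__main__":
    for n in sorted(CUMULATIVE_SHARE):
        print(f"Outside the top {n:,} terms: {tail_share(n)}% of traffic")
    for first, last, share in bracket_shares():
        print(f"Terms {first:,} to {last:,}: {share}% of traffic")
```

Read this way, the 400 terms ranked 101 to 500 add only 3.2 points of traffic, and the 9,000 terms ranked 1,001 to 10,000 add 7.9 points, which leaves 81.5% of traffic outside the top 10,000 terms.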
The truth is my research is still greatly understating the true size of the tail because:
· The Hitwise sample contains 10 million U.S. Internet users and a complete data set would uncover much larger portions of the long tail.
· The data set I used filtered out adult searches.
· I only looked at three months’ worth of data (which were some of the slower months for search engines).
In summary, the long tail of search is real, but the data tells us that there may really be no head or body. When it comes to search, virtually all traffic is long tail, and the word “long” doesn’t do the length of the tail justice.


Very interesting – I’ve gotten a taste for the tail @ much smaller scale @ a leading site in our industry. This confirms what I always have trouble explaining to my boss.
Forwarding this right now…
The truly interesting thing about this is that there is so much information, and therefore data, on the Net to cross reference. Looking at these terms as a lattice-work of human understanding, taking them in as a slice of what our collective consciousness is thinking at any one time, there is much to be gained from that alone. I would love to see a heat map or tag cloud built around this data, moving in real-time.
Amazing.
Really helpful long-tail stats, thanks for sharing!
It just goes to show that there’s still a lot of life in long-tail yet, even with search engine suggestion facilities creeping in….
Thanks Bill,
Ben M
Great piece. Been waiting for someone to “get it” when it comes to the long tail of search. Chris Anderson wrote a brilliant book, but he underestimated the long tail when applying it to the search industry.
After the constant, often negative vibes about Chris’s book, it really is great to see solid data to support the Power of the Long Tail.
Thank you so much for writing about this. I am a big believer in long-tail but it has been a challenge convincing others on this. We have seen results first hand on our sites by focusing on long-tail searches.
Hi Bill,
Here’s a thought, why do people search for Fortune Tellers around September?
http://google.com/trends?q=fortune+teller&ctab=0&geo=all&date=all&sort=0
Great post Dustin. I’m really surprised there’s not more research on the long tail. I can see it accounts for around 80% of my clients’ traffic, and that’s from aiming for the “primary keywords”. I think it’s important to note that out of the 80% at least 60% of it includes some variation of the “primary keywords”.
Thanks for this very interesting data. It would be enlightening to see average numbers of searches in a given time period for the various segments, i.e. top 10 – x per month, top 1000 – y per month, top 100,000 – z per month and so forth.
Thank you Bill and Dustin for sharing this information. Hopefully some of it will hit home for those still going after the ‘golden goose’ single words…
Good Call – further evidence to reinforce the principle of writing full and descriptive proper content and not trying to find perfect keyword densities.
Great article. I am releasing a white paper soon about long tail search. I will let you know when it is available.
Excellent article. Your data reinforces our long-standing approach to targeting long-tail organic and ppc search for quality b-to-b leads. This is great validation and helps us understand just how long the long-tail really is.
We’ve seen the power of long tail in our ecomm business, with max conversion on terms that only provide 1 or 2 visits. Is there a guide on how to really focus on getting the long tail?