Following my original post on the subject, I've been having some interesting discussions with smart folk about signal-to-noise ratios in the Long Tail. Those ratios are important because they dictate consumer behavior. Too much noise, and people don't buy. And without good filters (from search to recommendations), the Long Tail is just noise.
This may seem a bit arcane, but as I'm writing a whole chapter on filters, recommendations and other tools and techniques to drive demand down the Tail, it's worth drilling down a bit more here.
John Hagel thought I'd gone off the rails a bit with my analysis. He's as bright as they come, so I've gone to more than the usual lengths to reconsider this issue. Here's our debate, which I'll share with all of you because I think it eventually led to an interesting insight.
I originally posted this conceptual graph:
In a post, John took issue with it:
[Chris] asserts that signal to noise ratios decrease as one move down the tail. Really? Isn't that subjective? I may be a real outlier (aren't we all?) but, at least for me in the realm of music, the signal to noise ratio decreases as I move up the tail. The real point, I think, is that the sheer quantity (rather than the quality) of items increases as we move down the tail and the ready availability of information about these items diminishes - that's what increases the difficulty of connecting with relevant resources as we move down the tail.
We batted this back and forth in several emails, but I didn't make much headway. John's taste in music is quite niche, he says. He doesn't like anything in the top 100, or even much in the top 1,000. And he suspects that there are lots of people like him (not necessarily in his particular niche, but in ones like it). As a result, he argues that his s/n ratio actually goes the other way, which is to say that it's zero at the head (no signal), peaks somewhere in the middle or even further down, and only then eventually falls under the weight of a zillion garage bands at the end of the Tail.
I tried to persuade John with yet more conceptual charts, explaining that the shape I drew was the aggregation of everyone's s/n ratios, which are indeed all different but together end up looking like the one I originally posted. Like this:
But he was still unconvinced. This was surprising, since I think it's totally straightforward: there's more noise in the tail because there's more everything there. Most stuff doesn't sell very well, so the volume of the material available--and by extension the volume of stuff you don't want--rises as the Long Tail falls. Like this:
Whatever you're looking for, there's more stuff you aren't looking for the further you go down the tail. Which is why the signal-to-noise ratio gets worse, even if you're more likely (with good search and filters) to find what you want as you go down the tail.
This sounds like a paradox, but it isn't. Much of what you want is in the tail. Most of what you don't want is also in the tail. That's why you need increasingly powerful filters to extract the good from the greater noise.
But conceptual graphs weren't doing it; John was still skeptical. So I turned to actual data. I analyzed the music collections of five people: me, Anne (my wife), my assistant Peter Arcuni, John himself (turns out that he's into rockabilly, surf music and Algerian "rai music", which seems to be some sort of ethno-techno thing), and Koranteng Ofusu-Amaah, for no other reason than that he showed up amongst my trackbacks and appears to have quite fringy music taste (African pop, mostly) that he was kind enough to provide Amazon links to. I've got Koranteng twice: once for a random list of things he is listening to, and once for his best-of-the-year list.
Overall, the number of albums that I included from each subject's collection ranged from a few dozen to more than 200 titles. (This is too small to be anything more than suggestive. But I'm working with some companies to extend this analysis to proper large-n data sets, which I hope to be able to include in the book.)
This is what I found (using logarithmic curve-fitting to smooth the underlying numbers):
What's important to note here is that everyone, no matter how niche, shows a falling s/n ratio as you go farther down the tail. Why is this, when John, Koranteng and I (indeed, all of us except for Anne) have no albums in the top 100, and only a few in the top 1,000? Why did it not show the rising-then-falling shape I predicted in my conceptual graph above?
The answer is that there is so much music out there that even what we consider niche is usually still top-decile.
In the above chart I binned the Amazon ranks of record collections by 5,000s, which is the smallest unit that gives any decent overview. I cut off the chart at 100,000 for visual impact, although the full analysis and the collections included several albums in the 600,000 range.
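For anyone who wants to try this on their own collection, here is a minimal Python sketch of the binning step. It assumes you already have a list of Amazon sales ranks, one per album. The function names and the example ranks are invented for illustration and are not the data behind the chart. Following the usage in this post, "density" (albums owned per rank position in a bin) stands in for the s/n ratio.

```python
from collections import Counter

BIN_WIDTH = 5_000   # bin size used in the post
CUTOFF = 100_000    # chart cutoff used in the post


def bin_ranks(ranks, bin_width=BIN_WIDTH, cutoff=CUTOFF):
    """Count albums per sales-rank bin, ignoring ranks beyond the cutoff.

    Returns (bin_start, count) pairs for every bin up to the cutoff,
    including empty bins, so the result can be charted directly.
    """
    counts = Counter(
        (rank - 1) // bin_width for rank in ranks if 1 <= rank <= cutoff
    )
    n_bins = -(-cutoff // bin_width)  # ceiling division
    return [(i * bin_width + 1, counts.get(i, 0)) for i in range(n_bins)]


def density(binned, bin_width=BIN_WIDTH):
    """Albums owned per rank position in each bin (the post's s/n proxy)."""
    return [(start, count / bin_width) for start, count in binned]


# Hypothetical ranks, made up for this example (not data from the post).
example_ranks = [850, 3_200, 4_900, 7_400, 12_000, 26_500, 61_000, 640_000]

binned = bin_ranks(example_ranks)
densities = density(binned)

for start, count in binned:
    if count:
        print(f"ranks {start:>6}-{start + BIN_WIDTH - 1:>6}: {count} album(s)")
```

The album ranked 640,000 is dropped by the cutoff, just as the chart leaves out the handful of albums far down the tail.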
All of that rising-then-falling shape that I illustrated with the conceptual s/n graph actually takes place entirely in the top 5,000-10,000 for most people. By the time you're past that, the density of almost everyone's music collection (which is to say, their s/n ratio) falls as you go down the tail. When you're binning by 5,000s, as I have, all the fine structure in the head is obscured by the bin width.
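The smoothing mentioned earlier can be sketched in a few lines. I am assuming here that "logarithmic curve-fitting" means the usual spreadsheet-style logarithmic trendline, y = a + b·ln(x), fitted by least squares; the post does not spell out the exact form. The function name and the example counts below are hypothetical.

```python
import math


def fit_log_curve(xs, ys):
    """Least-squares fit of y = a + b * ln(x). Returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    log_xs = [math.log(x) for x in xs]
    n = len(log_xs)
    mean_lx = sum(log_xs) / n
    mean_y = sum(ys) / n
    sxx = sum((lx - mean_lx) ** 2 for lx in log_xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((lx - mean_lx) * (y - mean_y) for lx, y in zip(log_xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_lx
    return a, b


# Hypothetical album counts per 5,000-rank bin, keyed by bin midpoint.
midpoints = [2_500, 7_500, 12_500, 17_500, 22_500]
counts = [12, 7, 5, 4, 3]

a, b = fit_log_curve(midpoints, counts)
smoothed = [a + b * math.log(x) for x in midpoints]
print(f"a = {a:.2f}, b = {b:.2f}")
```

A negative `b` corresponds to the falling curve described here: fewer albums per bin the farther down the tail you go.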
The top 100 is irrelevant in an abundant market. Even the library of my most mainstream subject (Anne) had an average rank of 3,000. In the big picture of the Long Tail, there are so many items that even today's niche looks relatively popular. For instance, the average sales rank in my own collection was 25,000. That may sound super-fringe, but it still puts my average in the top 5% of Amazon's offerings. You've got to pull back and see the whole market. And at that resolution, the falling s/n ratio curve I originally described emerges for almost all of us.
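The percentile arithmetic is easy to check. The sketch below works backwards from the two numbers given here (an average rank of 25,000 and "top 5%") to the catalog size they imply. That 500,000 figure is derived, not a number Amazon has published, and the function names are my own.

```python
import math


def top_percent(rank, catalog_size):
    """Percent of the catalog ranked at or above `rank`."""
    return 100.0 * rank / catalog_size


def implied_catalog_size(rank, percent):
    """Smallest catalog in which `rank` falls within the top `percent`."""
    return math.ceil(rank * 100 / percent)


# Rank 25,000 being in the top 5% implies a catalog of at least this size.
min_catalog = implied_catalog_size(25_000, 5)

# Assumed catalog size for illustration: the implied minimum from above.
anne_percent = top_percent(3_000, min_catalog)
my_percent = top_percent(25_000, min_catalog)

print(min_catalog, anne_percent, my_percent)
```

Since some albums in these collections rank in the 600,000s, the real catalog is at least that large, which would put a rank of 25,000 a little inside the top 5%.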
Long Tails are long, and it's illuminating to stand back and see the whole curve. The microstructure of the current hits business, the blockbuster charts our culture has so long fixated on, is quickly lost in the macrostructure of the entire music universe. It's a big world out there, and the top 40 is just the beginning of it, not the end.