The Long Tail: What does it mean for the future of e-publishing?
I'm active on a few author forums, and quite often I see users posting about the long tail effect in ebook sales. Put simply, it's the theory that although most sales come from a relatively small section of the available content, the internet allows the rest of it to be found too.
It's also the theory that modern algorithms let us use similar content to cross sell, meaning that demand shifts from the big hitters, who we all know and love, to the more obscure authors.
The main driving factor behind this is that in the ebook world there is no real cost in having extra titles available for sale. If you run a print store you've got inventory, shipping, returns and physical space limitations. Amazon do not. EBooks are delivered digitally, so the cost of making a page to sell one is minuscule in comparison to the average earnings per book. Even niche titles are likely to be bought a handful of times by the author and their immediate personal network.
Similarly, with print on demand technology we can print and ship books only on order. The costs for this are all built into the product, and there are no fixed costs.
This all means Amazon have a huge selection at minimal cost. This should in theory mean that some of the sales get spread out further.
Before we look at whether the long tail really works in terms of selling niche products, let's look at how sales are split in our theoretical long tail.
Academics use two models for determining which ebooks are hits, and which are not (and are thus niche).
Simply put, the cut-off is either relative or absolute.
A relative model would be '50% of all sales attributed to the top 20% of all earners'. This is useful, but doesn't give any hint at the numbers involved.
It could be that there are ten books, and half of the sales are with book one. So by picking an arbitrary category of 'top 20%' we skew the results. What we really want to do is identify only the outlier bestsellers - which are unlikely to conform to a nice, neat, rounded percentage.
The other method is to look at absolute numbers. 'Top 100 ebooks sell 100,000 copies a day'.
Therefore, depending on whether we look at the 'Top #' or 'Top %', it's perfectly possible to conclude both the presence, and the absence, of a long tail using exactly the same data.
To illustrate this, let's pretend that in 2010 there are 100 ebooks for sale. The top 10 account for 75% of the sales by revenue (which is itself a complication, given that revenue is not the same as the number of copies sold - higher priced ebooks yield more per sale).
Another 400 ebooks are added in 2011. Now it takes the top 25 to account for 75% of the sales total.
In absolute terms, the top list has expanded from 10 titles to 25 - which indicates a shift down the hierarchy in terms of sales.
In relative terms, the top 10% used to sell 75%; now the top 5% (25 out of 500) sell 75% - a shift AWAY from the long tail model.
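The two readings of the same made-up figures can be checked in a few lines of Python. This is only a sketch of the arithmetic in the example; the function name `head_summary` is invented for illustration, and the numbers are the pretend ones from above, not real sales data.

```python
def head_summary(catalogue_size, head_titles):
    """Return the head of the market as (absolute count, % of the catalogue)."""
    return head_titles, 100 * head_titles / catalogue_size

# Pretend figures from the example: titles needed to reach 75% of revenue.
abs_2010, rel_2010 = head_summary(catalogue_size=100, head_titles=10)
abs_2011, rel_2011 = head_summary(catalogue_size=100 + 400, head_titles=25)

print(f"2010: top {abs_2010} titles = {rel_2010:.1f}% of the catalogue")
print(f"2011: top {abs_2011} titles = {rel_2011:.1f}% of the catalogue")

# Absolute view: more titles share the 75%, which looks like a longer tail.
print("absolute head grew:", abs_2011 > abs_2010)
# Relative view: a smaller slice of the catalogue takes the 75%, which looks like the opposite.
print("relative head shrank:", rel_2011 < rel_2010)
```

Both checks print True, which is the whole problem: the same figures support either conclusion, depending on which yardstick you pick.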
So how does this actually apply to the eBook market?
There are 1,281,286 eBooks listed in 'all eBooks' on the Amazon UK store. That's potentially a very long tail. At the top end, the top one hundred are typically selling thousands a day. The top 5000 are selling enough to make a respectable living. Below that we hit ranks in the mid teens of thousands - which is typically one a day territory. Anything ranked below 100k is selling once in a blue moon.
New eBooks are added every single day. The problem is most of this content isn't selling anything. If it takes a modest sale a day to stay in the top 25,000, then that's over 1.25 million eBooks that aren't selling even that.
So if we used a rough 80/20 split as several studies suggest then Amazon UK's eBook store is NOT a long tail model.
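For anyone who wants to check the sums, here is a rough sketch using only the figures quoted above. The 25,000 cut-off is my own ballpark for 'about a sale a day', not an Amazon number, and the variable names are just for illustration.

```python
catalogue = 1_281_286    # 'all eBooks' count quoted for Amazon UK
daily_sellers = 25_000   # assumed rank cut-off for roughly one sale a day

below_cutoff = catalogue - daily_sellers
share_selling_daily = 100 * daily_sellers / catalogue

print(f"titles below the cut-off: {below_cutoff:,}")
print(f"share of the catalogue selling about daily: {share_selling_daily:.2f}%")
```

On these assumptions only around 2% of the catalogue manages a sale a day, which is a long way from the top 20% doing the heavy lifting.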
It is a wide distribution model though. The idea is with a choice of over a million books we should be choosing more randomly. Yet somehow we're still swayed by those in the top lists. In our droves we jump on the bandwagon with titles like Wool and 50 Shades. Math would say we're pretty crazy doing it. It appears to be a bit of a sheep mentality.
Really, it's simply Amazon metrics. Their algorithms are self propagating.
Traditionally, endogenous interest in a book built up through people talking about it. Let's pretend that we can reduce this to a number - the average number of people each buyer tells about the book. Let's further assume we have an average number of sales per person told.
So if each buyer tells 5 people, and 20% of them buy, then each buyer will gain you one further buyer.
So A buys and tells B, C, D, E and F. F buys and tells G, H, I, J and K. Then J buys and tells L, M, N, O and P.
This is a very slow buildup of endogenous interest. It's also decidedly frail. If any one person doesn't read the book, doesn't like it or doesn't bother to tell their friends then the chain breaks and that interest stops building. You'd then need an exogenous shock (such as a free day or external advertising) to restart the chain(s).
However, if we've got a book EVERYONE has to talk about - something controversial, such as 50 Shades of Grey - then the ratio might be: tell 2 people, and 75% of them buy.
A buys. A tells B and C. This time both buy.
B tells D and E. One buys it.
C tells F and G. Both buy it.
E, F and G tell H, I, J, K, L and M. Roughly 75% of them buy it.
H, I, J and K tell N, O, P, Q, R, S, T and U. Roughly 75% of them buy it.
As you can see the spread is much better - the chain, once it's started to snowball, doesn't break on one person. A dozen iterations in and the odds of ALL of them not telling anyone become exponentially smaller. This is then compounded by exogenous shock when it hits the papers, the morning TV shows and the discussion forums (which reach a wider audience).
Obviously, I've simplified the model. It assumes no one tells a person who has already been told (which would shorten your chain). It also ignores the outlier individuals with huge networks - such as book bloggers, and Goodreads users with massive followings.
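The simplified word-of-mouth model above can be sketched as a basic branching process. This is my own toy version, not anything Amazon or the academics use: it assumes every buyer tells exactly the same number of new people and that each of them decides independently. The function names are made up for illustration.

```python
def expected_new_buyers(told, buy_rate):
    """Average number of new buyers each existing buyer produces."""
    return told * buy_rate

def prob_chain_dead(told, buy_rate, generations):
    """Probability that a chain started by one buyer has no buyers left
    after the given number of rounds of telling.

    Assumes each buyer tells exactly `told` new people and each of them
    buys independently with probability `buy_rate`.
    """
    dead = 0.0  # round 0: the first buyer exists, so the chain is alive
    for _ in range(generations):
        # A person who is told leads nowhere if they don't buy, or if they
        # buy but their own chain dies out. The chain is dead only when
        # all `told` contacts lead nowhere.
        dead = (1 - buy_rate + buy_rate * dead) ** told
    return dead

for label, told, buy_rate in [("ordinary book", 5, 0.20),
                              ("talked-about book", 2, 0.75)]:
    print(label,
          "| new buyers per buyer:", expected_new_buyers(told, buy_rate),
          "| chance the chain is dead after 12 rounds:",
          round(prob_chain_dead(told, buy_rate, 12), 3))
```

Under these assumptions the ordinary book only ever replaces each buyer with one more on average, so the chance of the chain dying keeps creeping up the longer you run it. For the talked-about book the same chance settles at about one in nine, because once the chain survives the first couple of rounds there are too many people talking for it to break.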
So, now we know about a basic model, how do Amazon metrics alter this?
Amazon opens up a huge number of titles, but as we know most of these will never build up enough endogenous interest naturally. This is because they don't have that unknown factor that gives people an impetus to talk about them. I'd love to say there was a magic bullet for capturing the reader's imagination, but there isn't. The same book rolling the dice again might break the chain early if the first few buyers don't talk about it. It's also very time and place specific. Cultures and tastes change. Vampires and werewolves might be the in thing right now, but probably not forever.
What Amazon does is remove the unpredictable human element by replacing it with 'When bought with', 'Customers who looked at X bought Y', 'Listmania' and tagging.
This means books cross sell each other without the human involvement. So now it's a case not of 'IF endogenous interest builds' but 'When'.
Going by my last statement alone, you'd think everything was a potential best seller. It isn't. There are two major caveats:
1. You need enough sales, page visits, reviews etc. to get into the 'when bought with' lists. This means you still need the early human input (which should happen alongside the metrics). I don't know exactly what Amazon use to compute 'when bought with' - only Amazon know that. A while back, Dan speculated that it's quite a long list.
2. You will get endogenous growth on your book, subject to point 1. The problem is it's relative to everyone else's growth. So you might see an ABSOLUTE increase (i.e. going from 0.5 sales per month to 2 sales per month over a year), but if the guy who launched a similar title goes from 0.5 sales per month to 3 sales per month, then you'll RELATIVELY lose ground.
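To put numbers on the second caveat, here is a tiny sketch using the figures from that example. It treats the two similar titles as the whole market, which is obviously a simplification, and the `share` helper is made up for illustration.

```python
def share(mine, rival):
    """My percentage share of the two titles' combined monthly sales."""
    return 100 * mine / (mine + rival)

before = share(mine=0.5, rival=0.5)  # both titles start at 0.5 sales a month
after = share(mine=2, rival=3)       # a year later: 2 a month versus 3 a month

print(f"my sales grew {2 / 0.5:.0f}x, but my share went from {before:.0f}% to {after:.0f}%")
```

Sales quadruple, yet the share of the pair's sales slips from half to two fifths.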
It's this relativism that will influence eBooks in the future. Amazon will hit 2 million novels then 3 then 4, and it's speeding up. Publishers and legacy authors are realising that they need to get backlists up for sale - the cost is minimal, but the return potentially huge. Quite often these are lazy conversions that end up full of typos - and these won't gain interest as readers can and do slate ebooks that don't come up to standard.
So I think the future is an absolute increase. More customers are getting Kindles and the like every year, and new markets are a huge potential source of customers. There are 7 billion people on the planet. Roughly half read for pleasure. That is enough demand to see a huge absolute increase in sales. There are many backlist titles still to go on sale to help satisfy this demand, but I think we'll all see some growth as long as we stay in the Amazon metrics system. I will note, though, that even the 'when bought with' system has an element of relativism: it doesn't show EVERY book that was ever bought with yours, just those bought most often, which helps sustain the top end.
The question isn't 'Will I absolutely increase my sales long term assuming I keep working on promotion, new books etc' but 'Will I relatively keep up?'. The answer there probably comes down to 'Is your book good?' (or, better than what's out) and 'Do you do as much as the other authors to promote?'.