__The Long Tail: What does it mean for the future of e-publishing?__
I'm active on a few author forums, and quite often I see
users posting about the long tail effect in ebook sales. Put simply, it's the
theory that although most sales come from a relatively small section of the
content available, the internet allows the rest of it to be found too.

The idea is that modern recommendation algorithms let us use
similar content to cross-sell, meaning that demand shifts from the big hitters,
who we all know and love, to the more obscure authors.

The main driving factor behind this is that in the ebook
world there is no real cost in having extra titles available for sale. If you
run a print store you've got inventory, shipping, returns and physical space
limitations. Amazon do not. eBooks are delivered digitally, so the cost of making a
page to sell one is minuscule in comparison to the average earnings per book.
Even niche titles are likely to be bought a handful of times by the author and
their immediate personal network.

Similarly, with print-on-demand technology we can print and
ship books only on order. The costs for this are all built into the product,
and there are no fixed costs.

This all means Amazon have a huge selection at minimal cost.
This should in theory mean that some of the sales get spread out further.

Before we look at whether the long tail really works in
terms of selling niche products, let's look at how sales are split in our
theoretical long tail.

Academics use two models for determining which ebooks are
hits, and which are not (and are thus niche).

Simply put, the measure is either relative or absolute.

A relative model would be '50% of all sales attributed to
the top 20% of all earners'. This is useful, but doesn't give any hint at the
numbers involved.

It could be that there are ten books, and half of the sales
are with book one. So by picking an arbitrary category of 'top 20%' we skew the
results. What we really want to do is identify only the outlier bestsellers -
which are unlikely to conform to a nice neat rounded percentage.

The other method is to look at absolute numbers. 'Top 100
ebooks sell 100,000 copies a day'.

Therefore depending on whether we look at the 'Top #' or
'Top %' it's perfectly possible to conclude both the presence, and the absence,
of a long tail using exactly the same data.

To illustrate this, let's pretend that in 2010 there are 100
ebooks for sale. The top 10 account for 75% of sales by revenue (which is itself a
complication, given that revenue does not equal units sold - higher priced
ebooks will yield more per sale).

Another 400 ebooks are added in 2011. Now it takes the top
25 to account for 75% of the sales total.

In absolute terms, the top list has expanded from 10 titles to 25 - which
indicates a shift down the hierarchy in terms of sales.

In relative terms, the top 10% used to sell 75%; now the top 5% (25 of 500
titles) sell 75% - a shift AWAY from the long tail model.
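
To make the two readings concrete, here is a minimal Python sketch of the toy example above. The function name `describe_top_list` and its fields are purely illustrative, and the figures are the made-up 2010 and 2011 numbers from the example, not real sales data.

```python
def describe_top_list(total_titles, top_titles, share_of_sales):
    """Summarise one year's bestseller list in absolute and relative terms."""
    return {
        "top_count": top_titles,                         # absolute view: 'Top #'
        "top_percent": 100 * top_titles / total_titles,  # relative view: 'Top %'
        "share_of_sales": share_of_sales,
    }


# Hypothetical figures from the example: 100 titles in 2010, 500 in 2011.
year_2010 = describe_top_list(total_titles=100, top_titles=10, share_of_sales=0.75)
year_2011 = describe_top_list(total_titles=500, top_titles=25, share_of_sales=0.75)

# Absolute view: more titles are needed to reach 75% of sales (10 -> 25).
absolute_list_grew = year_2011["top_count"] > year_2010["top_count"]

# Relative view: a smaller slice of the catalogue takes 75% of sales.
relative_list_shrank = year_2011["top_percent"] < year_2010["top_percent"]

print(year_2010["top_percent"], year_2011["top_percent"])
print(absolute_list_grew, relative_list_shrank)
```

Both flags come out true for the same pair of years, which is the whole problem: the 'Top #' reading suggests sales are spreading out, while the 'Top %' reading suggests they are concentrating.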

__So how does this actually apply to the eBook market?__

There are **1,281,286** eBooks listed in 'all eBooks' on the Amazon UK store.
That's potentially a very long tail. At the top end, the top one hundred
typically sell thousands of copies a day. The top 5,000 sell enough to make a
respectable living. Below that we hit ranks in the mid-teens of thousands -
which is typically one-a-day territory. Anything ranked below 100,000 is selling
once in a blue moon.

New eBooks are added every
single day. The problem is most of this content isn't selling anything. If it
takes a modest sale a day to stay top 25,000 then that's roughly 1.25 million
eBooks that aren't selling even that.
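
The arithmetic behind that figure is easy to check. This is only a back-of-the-envelope sketch: the 25,000 cutoff is my rough estimate from above, not a number published by Amazon.

```python
listed_titles = 1_281_286  # 'all eBooks' count quoted above
rank_cutoff = 25_000       # assumed rank needed for roughly one sale a day

# Titles ranked below the cutoff, as a count and as a share of the catalogue.
below_cutoff = listed_titles - rank_cutoff
percent_below = 100 * below_cutoff / listed_titles

print(f"{below_cutoff:,} titles ({percent_below:.1f}%) sit below the cutoff")
```

That is just over 1.25 million titles, or about 98% of the catalogue.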

So if we use a rough 80/20
split, as several studies suggest, then Amazon UK's eBook store is NOT a long
tail model.

It is a wide distribution model
though. The idea is with a choice of over a million books we should be choosing
more randomly. Yet somehow we're still swayed by those in the top lists. In our
droves we jump on the bandwagon with titles like Wool and 50 Shades. Math would
say we're pretty crazy doing it. It appears to be a bit of a sheep mentality.

Really, it's simply Amazon
metrics. Their algorithms are self-propagating.

Traditionally, for endogenous
interest to build in a book, the model would be very much based on word of
mouth. Let's pretend that we can reduce this to a number - the
average number of people each buyer tells about the book. Let's further assume
we have an average number of sales per person told.

So if I tell 5 people, and 20%
buy, then each buyer will, on average, gain you one further buyer (5 x 20% = 1).

So A buys and tells B, C, D, E and F.
F buys and tells G, H, I, J and K. Then J buys and tells L, M, N, O and P.

This is a very slow buildup of
endogenous interest. It's also decidedly frail. If any one person doesn't read
the book, doesn't like it or doesn't bother to tell their friends then the
chain breaks and that interest stops building. You'd then need an exogenous shock
(such as a free day or external advertising) to restart the chain(s).

However, if we've got a book
EVERYONE has to talk about, something controversial such as Fifty Shades of
Grey, then the ratio might be: tell 2 people, 75% of them buy.

A buys. A tells B and C. This
time both buy.

B tells D and E. One buys it.

C tells F and G. Both buy it.

E, F and G tell H, I, J, K, L and M.
75% of them buy it.

H, I, J and K tell N, O, P, Q, R, S, T
and U. 75% of them buy it.

As you can see the spread is
much better - the chain, once it's started to snowball, doesn't break on one
person. A dozen iterations in and the odds of ALL of them not telling anyone
become exponentially smaller. This is then compounded by exogenous shock when
it hits the papers, the morning TV shows and the discussion forums (which reach
a wider audience).
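
The difference between the two chains can be put into rough numbers. The sketch below is my own simplification, not a tested marketing model: it assumes every buyer tells the same number of fresh people and that each person told decides independently whether to buy. The function names are illustrative only.

```python
def further_buyers_per_buyer(people_told, buy_rate):
    """Average number of new buyers each buyer brings in."""
    return people_told * buy_rate


def chance_buyer_ends_chain(people_told, buy_rate):
    """Chance that none of the people a single buyer tells goes on to buy."""
    return (1 - buy_rate) ** people_told


def chance_chain_dies_out(people_told, buy_rate, rounds=10_000):
    """Chance that a chain started by one buyer eventually stops.

    Repeatedly applies q -> (1 - buy_rate + buy_rate * q) ** people_told,
    starting from q = 0. After k rounds, q is the chance that the chain
    has stopped within k generations of telling.
    """
    q = 0.0
    for _ in range(rounds):
        q = (1 - buy_rate + buy_rate * q) ** people_told
    return q


# Assumed figures from the two examples in the text.
scenarios = {
    "slow burner": (5, 0.20),   # tell 5 people, 20% buy
    "talked-about": (2, 0.75),  # tell 2 people, 75% buy
}

for name, (told, rate) in scenarios.items():
    print(
        name,
        round(further_buyers_per_buyer(told, rate), 2),
        round(chance_buyer_ends_chain(told, rate), 3),
        round(chance_chain_dies_out(told, rate), 3),
    )
```

Under these assumptions the slow burner only ever replaces each buyer with one more on average, so its chance of eventually fizzling out creeps towards certainty. The talked-about book gains 1.5 buyers per buyer, and a chain started by a single buyer dies out only about one time in nine.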

Obviously, I've simplified the
model. It assumes no one tells the same person twice early on (which would
shorten your chain). It also ignores the outlier individuals with huge networks -
such as book bloggers, and Goodreads users with massive followings.

__So, now we know about a basic model, how do Amazon metrics alter this?__

Amazon opens up a huge number
of titles, but as we know most of these will never get enough endogenous
interest build-up naturally. This is because they don't have that unknown
factor that gives people an impetus to talk about them. I'd love to say there
was a magic bullet for capturing the reader's imagination, but there isn't. The
same book rolling the dice again might break the chain early if the first few
buyers don't talk about it. It's also very time and place specific. Cultures
and tastes change. Vampires and werewolves might be the in thing right now, but
probably not forever.

What Amazon does is remove the
unpredictable human element by replacing it with 'When bought with', 'Customers
who looked at X bought Y', 'Listmania' and tagging.

This means books cross-sell
each other without any human involvement. So now it's a case not of 'IF
endogenous interest builds' but 'WHEN'.

Going by my last statement
alone, you'd think everything was a potential best seller. It isn't. There are
two major caveats:

1. You need enough sales, page
visits, reviews etc. to get into the 'when bought with' lists. This means you
still need the early human input (which should happen simultaneously with the
metrics). I don't know exactly what Amazon use to compute
'when bought with' - only Amazon know that. Dan speculated a while back that
it's quite a long list.

2. You will get endogenous
growth on your book subject to point 1. The problem is it's relative to everyone
else's growth. So you might see an ABSOLUTE increase (i.e. going from 0.5 sales
per month to 2 sales per month in a year), but if the guy who launched a
similar title goes from 0.5 sales per month to 3 sales per month then you'll
RELATIVELY lose ground.
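
The second caveat is easier to see with the numbers plugged in. This is a small sketch using the hypothetical monthly sales figures above; measuring 'ground' as a share of the two titles' combined sales is my own assumption, since Amazon's actual ranking formula isn't public.

```python
def share_of_combined_sales(mine, rival):
    """My fraction of the two titles' combined monthly sales."""
    return mine / (mine + rival)


# Hypothetical monthly sales from the example above.
my_before, my_after = 0.5, 2.0
rival_before, rival_after = 0.5, 3.0

absolute_change = my_after - my_before  # +1.5 sales per month
share_before = share_of_combined_sales(my_before, rival_before)
share_after = share_of_combined_sales(my_after, rival_after)

print(absolute_change, share_before, share_after)
```

My sales quadruple, yet my share of the pair's combined sales drops from a half to two fifths.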

It's this relativism that will
influence eBooks in the future. Amazon will hit 2 million novels, then 3, then 4,
and it's speeding up. Publishers and legacy authors are realising that they
need to get backlists up for sale - the cost is minimal, but the return
potentially huge. Quite often these are lazy conversions that end up full of
typos - and these won't gain interest as readers can and do slate ebooks that
don't come up to standard.

So I think the future is an
absolute increase. More customers are getting Kindles and the like every year,
and new markets are a huge potential source of customers. There are 7 billion
people on the planet. Roughly half read for pleasure. That is enough demand to
see a huge absolute increase in sales. There are many backlist titles still to
go on sale to help satisfy this demand, but I think we'll all see some growth as
long as we keep in the Amazon metrics system. I will note, though, that even the
'when bought with' system has an element of relativism: it doesn't show EVERY
book that was ever bought with yours, just those bought more often, which helps
sustain the top end.

The question isn't 'Will I
absolutely increase my sales long term assuming I keep working on promotion,
new books etc' but 'Will I relatively keep up?'. The answer there probably
comes down to 'Is your book good?' (or, better than what's out) and 'Do you do
as much as the other authors to promote?'.
