walkitout (walkitout) wrote,

Data Quality, Decision Making and Publishers (LONG)

This is the second attempt at this post. The first attempt was ridiculously verbose.

In an earlier post regarding Amazon's participation in the auction for the Hocking contract, I cited an anonymously sourced quote from Crain's coverage. The contents of the quote (Amazon has 65% market share in ebooks and falling) struck me as bizarre. I knew it was false; I was curious as to why a publishing executive might think it was true. I was able to find multiple possible sources of the investment advisor sort (Yankee Group, Goldman Sachs, and I think I remember Forrester weighing in on this topic as well), all dating from a period of a few months last year when people were really pessimistic about Amazon (because of the iPad and iBooks announcement/release) and really optimistic about future competition for the Kindle (Plastic Logic's Que, Entourage's Edge, Hearst's entry, the Cool-er, etc.).

The question is, why might a publishing executive still believe these numbers in 2011, when a lot of that hypothetical competition has gone under or become irrelevant as a platform for ebook sales? After I thought about it for a while, I concluded that publishing executives exist within a culture which is completely accustomed to making decisions based on worse and more incomplete data than they could get access to if it occurred to them that they might need it -- which, it would seem, does not happen. It was thinking about publishers' recent interest in genre fiction which led me to this conclusion.

Genre fiction has been around in one genre or another and in one publishing format or another for a long time. Occasionally, one or a small group of authors within a genre would break out of what used to be referred to as the "ghetto" and get hardcover deals, but when they did, they had to accept an editorial process which attempted to convert their genre title into something closer to whatever passed for "mainstream," "general" or "literary" fiction at the time (cf. Heinlein's multiple versions). This happened when it became impossible to ignore the sheer volume of sales being driven by that author or that group of authors.

The formats in which genre fiction sold (pulp magazines, mass market paperbacks) made it difficult to identify which authors and which titles were selling and which ones were sitting unread (in some under-served moments, it all sold equally because the hunger for that genre was so great that the audience would read any form of it, however craptastic). Mass market paperbacks _had_ bar codes on them when they moved to the chain booksellers from supermarkets/grocers/drug stores and similar. But the bar codes did not uniquely identify a title when scanned; they identified a publisher. Over time, the chains got ISBNs into the inner cover of mmpbs and eventually the bar code on the outside became a unique identifier for the title. That happened _after_ Amazon had entered the scene. (<-- Really.)

I don't know how publishers paid direct-to-paperback authors royalties. I don't see how they could possibly identify how many of their titles were sold vs. cover removed and dumped. I harbor a suspicion that some authors got more than they should and some got shafted, but I have no proof.

In any event, Amazon _did_ know how many copies of any title were sold and, over time, they got better at identifying an author across the body of their work. When Amazon rolled out the first version of used books (the one you may not remember), a lot of the used catalog was out-of-print titles that would be fulfilled by (I could not possibly make this shit up) calling a bunch of used book stores after the order was received to see if anyone had a copy. In the meantime, Amazon had a list of an unholy number of completely unfulfillable titles that, apparently, a whole lot of people wanted really badly. (Yes, Dear Reader, Neal Stephenson's _The Big U_ was _really_ high on that list.)

The nature of internet retailing allows the seller to collect per-customer information in a way that a traditional retailer can only approximate with loyalty cards. This exposed very early at Amazon (well before the IPO) an aspect of book purchaser behavior that had been intuitively obvious to many people for a long time, but which the entire publishing industry remains blissfully unaware of _to this day_. A small number of people buy a huge number of books each (as in, hundreds every year, year in and year out; occasionally a thousand or more in a year, but that's unusual). A vast sea of people buy a small number of books each. The total effect of the first group -- if you can successfully deliver what they want -- is greater than the total effect of the second group. That turns out to be a _huge_ if, and in the past, there was a secondary constraint: you can't buy more books than you have space for unless you are prepared to get rid of the old ones and/or buy a new house and/or get a divorce and/or you get the idea. With ebooks, that constraint is entirely gone (at least it is as long as someone else curates and stores the ones that don't fit on your drive -- and you'll notice that Amazon switched from advertising how much storage their Kindle has to providing the Media Library).
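To make the arithmetic of that claim concrete, here is a minimal sketch in Python. Every number in it is invented purely for illustration (the group sizes and the books-per-buyer figures are assumptions, not Amazon data or anything a publisher has reported); the only point is that a small group buying hundreds of books each can plausibly outweigh a much larger group buying a couple each.

```python
# All figures below are invented for illustration only.
# None of them come from Amazon, a publisher, or any real data set.

def total_books(buyers: int, books_per_buyer: int) -> int:
    """Total books bought per year by a group of identical buyers."""
    return buyers * books_per_buyer

# Hypothetical market: a few heavy buyers and a vast sea of light buyers.
heavy_total = total_books(buyers=10_000, books_per_buyer=300)
light_total = total_books(buyers=1_000_000, books_per_buyer=2)

# Fraction of all books sold that go to the heavy buyers.
heavy_share = heavy_total / (heavy_total + light_total)

print(f"heavy buyers: {heavy_total:,} books")
print(f"light buyers: {light_total:,} books")
print(f"heavy share of all books sold: {heavy_share:.0%}")
```

With these made-up inputs, the heavy buyers are one percent of the customers and account for more than half the books. Change the assumptions and the balance shifts, which is exactly why having the real per-customer data matters.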

There is another characteristic of the high volume book purchaser: they read a lot of genre fiction.

Once Amazon and the chains had successfully demanded a per-title unique barcode, tracking sales on a per author and per title basis was definitely possible and should have been the norm throughout the publishing industry. This was the time frame in which genre fiction started to break out of paperback in significant numbers: you _could_ identify stars (it, in fact, became impossible to continue ignoring them when Amazon kept selling the damn things even when no one else was shelving them) and once you have stars, the temptation to ratchet the margin up via a hardcover release is actually irresistible. The case that genre fiction authors had to make to publishers in this time frame to get a hardcover was _insane_: no debut author in "literary fiction" _ever_ had to prove sales of that magnitude ahead of time. But once a few had gone through the wringer and started landing on bestseller lists (at Amazon if nowhere else) and enjoyed the ratcheting effect that has on sales, publishers were capable of doing "more of the same": they found some me-too authors and they absolutely signed deals for more entries in successful series. What is less clear is whether publishers recognized that the same idiots (<-- this group includes me) that were signing up ahead of time to preorder the hardcover edition of Sookie Stackhouse number 6 were also signing up ahead of time to preorder the hardcover editions of Jim Butcher's books and those of a half dozen or more other authors. If they _had_, wouldn't they have put together a forum for us to hang out and chat about the books and for them to push more authors at us and accept our feedback and blah blah blah?

Why did it take them the better part of another decade to figure that out? In fact, as long as I'm on this topic, I _review_ in this blog a whole lot of these series and repeatedly say, yes I'll keep buying them or fuck no you've stepped over the line. When I blogged that I hated a certain package delivery service, they got in touch with me. I've never heard a _peep_ out of anyone trying to sell me anything related to the fact that I buy a truly silly amount of bad genre fiction (unless you count Amazon's recommendations). Self/Epubbed authors have responded to reviews as have romance authors -- but I never hear jack from the publishers. This strikes me as a little weird.

Recently, some publishers have figured out that genre fiction has some "vibrancy" and they've been "verticalizing" and creating "communities" for authors and/or readers. I found this out by visiting Publishers Weekly and looking through the daily lists-of-links. I think that says a lot about how far publishers have to go in improving the quality of the data on which they make decisions.
Tags: publishing