Finding your next book, or, the discovery problem

A big flap has arisen this week (one I believe I would have been equally aware of had I been home in New York rather than in London) because the giant UK books-and-stationery retailer WH Smith has apparently found inappropriate ebooks being recommended through the kids’ books portions of the Kobo-managed ebook offering they host. This has sparked a lot of conversation about how recommendations, and indeed curation, are managed in the online environment. In this case, the discussion is about the specifics of this problem and how metadata might have been wrong, gamed, or misunderstood. The flap has resulted in Smith’s turning off their whole web site, which contains the Kobo-offered ebooks, while the problem is “fixed”. It’s a mess that points to how far we are from solving core challenges of selling books in a virtual environment.

Online bookselling has a long way to go before it can deliver even what it intends to deliver in response to a search or to prompt a next sale. Of course, there are two additional and larger problems that come first: knowing what the right suggestion(s) would be and being able to make enough of them to match the book shopping experiences online sales must replace.

Analysis offered by Russ Grandinetti of Amazon at our Publishers Launch Frankfurt conference last week suggested that the US and UK are on the verge of transacting more than 50% of the book business online, with other markets in Europe and Asia not more than two or three years behind. (This may understate the real state of affairs; in a meeting I just had in London I was told that one of the biggest UK publishers says that 60 percent of their sales of print, ebooks, and audio are through Amazon!) Online sales of books were probably in the neighborhood of 10%, or less, for most publishers a decade ago. That shift is why retail shelf space has diminished so much, with major chains having sunk in both of the big English-speaking markets (and in smaller ones as well).

When most books were bought in physical locations, it was axiomatic that a book displayed in a store had an exponentially greater chance of selling than one that wasn’t, despite wholesale supply in the US from Ingram and Baker & Taylor that could get almost any book to almost any store in 24-48 hours. It had to be seen in the store to be bought. Competent commercial trade publishers knew there was very little point to pushing a book through marketing efforts if inventory wasn’t in place at retail, because seeing the book at the time you might buy it was a more powerful trigger for purchase than any other. Indeed, all the other stimuli (reviews, suggestions from friends, conversation at the office) tended to be acted upon only when the presence in the store was in proximity to the suggestion or recommendation. (And that’s why recommendations from clerks in the store were the most powerful recommendations of all: hence the concept of “hand-selling”.)

One problem with the change to online buying from the discovery perspective is that the funnel for each shopper keeps getting narrower. It isn’t hard for somebody in a bookstore to look at hundreds of books in a few minutes. It’s nearly impossible online. This requires the consumer either to spend more time shopping to see the same number of titles they used to see in a store or to make a decision having seen fewer. And the concern is that the decision that gets made having seen fewer can be not to buy anything at all. (Or, particularly in the case of tablet users, to buy something other than books.)

Of course, in theory, a personally-curated batch of suggestions for each customer could be far more precisely targeted than what a store can do, and, in that case, fewer titles shown might do the same job. But we are a long way from that. And, for reasons I hope this piece will make clear, personally-curated choices would actually be far more likely to be delivered by Google than by Amazon (although doing so would raise what a lot of customers would consider big privacy concerns). And that’s not a reflection on the quality of anybody’s programmers, and certainly not of their commitment to their customers.

The technology that hopes to help you “pick your next book” is referred to as a “recommendation engine”. I’ve never been on the inside of such an effort but the thinking behind them seems to center on analyzing what books you’ve bought and what you’ve searched for and, from that, figuring out what you might read next. This might be based on analysis of the content itself (e.g. Pandora recommending music of similar style and quality) and/or on collaborative filtering models, which leverage user inputs (purchase history, ratings, and reviews) to make recommendations for other similar users (“people who bought x also bought y”). It all recalls for me what a great bookseller, the late Joel Turner, told me when I met him at the 1978 American Booksellers convention in Atlanta: “if a customer walks up to my cash register with five books, I can always sell him a sixth”.
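For readers who want to see how simple the “people who bought x also bought y” idea is at its core, here is a minimal sketch in Python. It is not how any particular retailer does it (I have no inside knowledge of that); the customer names, the placeholder titles, and the helper functions `build_cooccurrence` and `also_bought` are all invented for illustration. It just counts how often two titles turn up in the same customer’s purchase history.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories (invented data): customer -> titles bought.
purchases = {
    "ann": {"Mystery A", "Mystery B", "Thriller C"},
    "bob": {"Mystery A", "Mystery B"},
    "cat": {"Mystery A", "Cookbook D"},
    "dan": {"Cookbook D", "Cookbook E"},
}

def build_cooccurrence(purchases):
    """For each title, count how often every other title was bought by the same customer."""
    co = defaultdict(Counter)
    for titles in purchases.values():
        # Every ordered pair of distinct titles in one customer's history.
        for a, b in permutations(titles, 2):
            co[a][b] += 1
    return co

def also_bought(title, co, n=3):
    """Return up to n titles most often bought by customers who bought `title`."""
    return [t for t, _ in co[title].most_common(n)]

co = build_cooccurrence(purchases)
print(also_bought("Mystery A", co, n=1))
```

Note that the sketch treats every purchase as evidence of the buyer’s own taste, and it knows nothing about a customer beyond what they bought.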

Of course, over time, a bookseller can fill out that knowledge with even more data as they see more and more purchases and get to know their customers, and perhaps their families. But, in fact, books bought are an incomplete data set to use as a guide to recommendations. They can also be a misleading one, since people buy books for people other than themselves.

Another way to look at it came from my friend, Andrew Rhomberg. Based on his experience with start-up Jellybooks, he formulated five major book discovery paths: serendipitous, social, distributed, data-driven and incentivized.

The point is that most people get their ideas about what to read next from many sources: people they talk to, reviews, news reports, business interactions. Some people say they get book recommendations from their friends; others (like me) say they don’t often read the same things their friends or relatives read. I suspect that online communities of readers tend to work best for people who do a lot of reading in genres and not nearly as well for people who mix fiction and non-fiction, entertainment and learning. And some people gravitate to what’s popular, so bestseller lists work best for them. It is clear that getting on a bestseller list fuels a book’s sales.

And books are bought for motivations other than “to read”, so it might also be important to know that a customer’s son is having a birthday, that a customer’s cousin is getting married, that a customer is shopping for a new home or looking for a new job or starting on a new hobby or spending money on an old one.

Few, if any, of these things would be apparent to even the most diligent hand-selling bookstore personnel. Bits and pieces of them might be detectable by the super-merchant Amazon (but not likely by any other).

This is one devilishly complex problem. There are countless potential inputs to the “next book purchase” decision and they are processed by each different individual in a highly personalized way. If you think it through, it seems obvious that most recommendations to most people wouldn’t work. Which takes us back to the need to make a lot of them, which a bookstore display does much better than online pages that show 10 or 20 books at one time.

In the long run, it would seem to me that Google is the entity best-positioned to address this challenge if they can somehow combine the knowledge of what you searched for (which they know), with what you read online (which they could know if you use Chrome for your browser), and the topics and book titles that have appeared in your emails (which they could know if you use Gmail) and the things you ‘like’ and talk about online (if you use Google+). Knowing your travel plans and patterns would be helpful too.

Of course, unless you use Google Play for ebook purchase and consumption, they’d be missing the two most important bits of data: what you bought and how voraciously you read it. That is the data Amazon is building on, without all the other information. And Google still wouldn’t know your print book purchases (unless they crawl your email receipts for that as well). What you’d really want to do is to correlate the book buying and consumption information from the past with the behavioral data contemporary to it. With it all combined, perhaps you could filter recommendations so that the 20 or 50 you could show online would have the commercial power of the hundreds or thousands you could see in the same amount of time in a store.

At the moment, both Amazon and Google are trying to see a pattern through one nearsighted eye.

But is this all really part of a larger problem for publishers? Is online discovery really affecting the sales patterns for books? It would appear so. One of the global ebook sellers told me during Frankfurt that their online sales are far more concentrated than publishers’ sales tended to be, with a tiny fraction of titles (under 5%) making up a huge percentage of total sales (nearly 70%). (I am assuming here that this retailer’s data is typical; of course, it may not be.) If memory serves, at the turn of the century Barnes & Noble stores saw only about 5% of their sales coming from “bestsellers” and, I believe (relying on memory of detail, which I admit is not my most powerful mental muscle), backlist outsold new titles. Publishers really live on the midlist. We know the long tail is taking an increasing share of sales and it would appear the head is too. Those sales come out of the midlist. It is pretty hard to run a profitable publisher without a profitable midlist.

And that would suggest that the increasing concentration of sales, which is likely the result of our hobbled ability to present choices in the digital sales environment, is a problem that publishers will want to address.
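A publisher who wants to check how concentrated their own sales are can do the arithmetic in a few lines. The sketch below, in Python, is only an illustration: the function name `top_share` and the catalog of 100 titles are made up, with the numbers chosen so the top 5% of titles come out at just under 70% of units, which is roughly the shape the ebook seller described to me.

```python
def top_share(unit_sales, top_fraction=0.05):
    """Share of total units accounted for by the best-selling `top_fraction` of titles."""
    ranked = sorted(unit_sales, reverse=True)
    # Number of titles in the "head"; always at least one title.
    k = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0

# Invented catalog of 100 titles: 5 big sellers, 25 midlist titles, 70 long-tail titles.
catalog = [1400] * 5 + [100] * 25 + [10] * 70

print(f"Top 5% of titles: {top_share(catalog):.1%} of units")
```

In this invented catalog the 25 midlist titles account for about a quarter of the units. Running the same calculation on real sales data year over year would show whether the midlist’s share is actually shrinking.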
