The Shatzkin Files

A New Project: "StartwithXML, Why and How"

Good morning. We have turned our attention to a problem we believe will occupy just about all publishers in the years to come, the opportunities and challenges presented by an XML workflow that starts with the author, or even before there is an author.

Why should you care? Because the world we live in is changing, and XML is the key to mastering the change.

All of us in this room started in a book publishing industry that was pretty single-minded. We developed content into books. When something was extracted from the book, the publisher almost never had to deal with the physical production of it. So we had one output that mattered, which was, early in my career, the mechanical and then later the film and now the file which we prepared to go to the printer. The printer’s job was to deliver accurately what we specified. And that was that.

Although it is early days, we can see that this is becoming very different. Overall sales of books from publishers may actually be diminishing: all the major US book chains reported year-on-year declines in their most recent financials, although online sales are still going up. But we know that with the web, with POD, with used books, with ebooks, and with the changing habits of the younger reader, sales of pre-printed and pre-distributed books might very well decline.

New revenue opportunities are springing up; you heard about one this morning called Daily Lit. There are others. But two things characterize the new opportunities: they are relatively small on a per-title basis and they require a little bit of digital massage to take advantage of them. If the digital massage costs very much, the revenue gain could be wiped out. It is a StartwithXML workflow that is the key to being a cost-effective 21st century publisher.

There are four major components of the StartwithXML: Why and How project in place, and more will come.

The first component is an online survey, which you can access through the survey button on our web site. As we did with the Experimentation and Innovation project last spring, we are using a broad industry survey to make sure we’re capturing as many of the issues and ideas surrounding this subject as we possibly can. We would encourage all of you to fill out the survey; we know you will all be interested in the results when they’re tallied.

The second component is a Research Paper, which will weigh all the factors that make a StartwithXML workflow both useful and tricky. It will also include vendor profiles and case histories, and will be built on wide-ranging interviews with publishers and industry suppliers.

The third component is a 1-day Forum, to take place at the McGraw-Hill Auditorium on January 13, 2009. Mark your calendars. The Forum will cover the Why of a StartwithXML workflow in the morning and the How of doing it in the afternoon.

And the fourth component is a living conversation on our web site. That conversation is now open: our Research Paper outline is posted for comments. After you’ve gone to Survey Monkey and filled out the survey, you can continue being involved in the project through the web site.

We’re very proud of the team that is executing this project. Brian O’Leary of Magellan Media is providing the overall coordination. Ted Hill of THA Consulting and Laura Dawson of LJNDawson are doing a lot of the heavy lifting on the planning, writing, and interviewing, and because BISG is a consulting sponsor, we’re fortunate to have Michael Healy’s contributions on a regular basis as well. And because O’Reilly Media is publishing the Research Paper, hosting our web activity, and organizing the Forum, we also benefit from Andrew Savikas’s full-fledged involvement.

I also want to briefly acknowledge that four major industry suppliers of XML-related services are the sponsors making this possible: codeMantra, Klopotek, Publishing Dimensions/Jouve, and Rosetta Solutions.

In the past few years, most publishers have learned that they need an XML-structured file of all their content to facilitate re-use in different formats. In the trade and juvenile areas, at least, most publishers are getting there with post-production XML, creating an XML-structured export from InDesign or Word or PDF when the project is complete. That’s a very partial solution, and, we believe, not going to be an adequate one going forward.

As we said at the top, revenue opportunities for content will proliferate, but saying only that is a bit misleading. It is not all good news. The BAD news is that so many of the new revenue opportunities will be small and will require both chunking and some content conversion to be realized. Publishers are learning every day about those challenges. If they can’t find the specific chunks of content they need and deliver them in a particular way, the opportunity evaporates. And if it costs much to do those things, then what appears to be an opportunity may cost more than it yields.

So a StartwithXML workflow is essential both to increase revenues and to cut costs, and there will be many circumstances when realizing the revenue depends on reducing the costs.

Converting to a StartwithXML workflow is not a trivial undertaking. Doing so requires new understandings and processes that back right up to the author and editor. To be practical, it requires the creation and then multiple re-use of style sheets. It was not previously necessary for the author or editor to know about the book design before they handed in their work; now they must think from the start about design components. The first stages of converting to StartwithXML are about learning how to do old things in new ways, and sometimes they seem like harder ways.

But making that workflow change is just the beginning of the opportunity, and the work. A great opportunity in having XML documents is being able to have information embedded to travel with the document. The embedded information on the content structure cuts the costs of executing a new format.

The ability to embed rights information is valuable in other ways. When you go back to a chunk of content to re-use it, you will not have to conduct separate research on whether you can use that picture on a web site; the rights information will sit there with the picture.
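To make that concrete, here is a minimal sketch in Python of rights information traveling with a picture. Everything in it is an assumption made for illustration: the `chapter`, `figure`, and `rights` element names, the `web` and `print` attributes, and the `figures_cleared_for` helper are hypothetical and do not come from any particular publishing schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative only,
# not taken from any real publishing schema.
SAMPLE = """
<chapter id="ch3">
  <title>Coastal Birds</title>
  <figure id="fig3-1" src="heron.jpg">
    <rights holder="Example Photo Agency" web="no" print="yes"/>
  </figure>
  <figure id="fig3-2" src="gull.jpg">
    <rights holder="Publisher" web="yes" print="yes"/>
  </figure>
</chapter>
""".strip()


def figures_cleared_for(xml_text, channel):
    """Return the ids of figures whose embedded rights allow the given channel."""
    root = ET.fromstring(xml_text)
    cleared = []
    for figure in root.iter("figure"):
        rights = figure.find("rights")
        # A figure with no rights element is treated as not cleared.
        if rights is not None and rights.get(channel) == "yes":
            cleared.append(figure.get("id"))
    return cleared


web_ok = figures_cleared_for(SAMPLE, "web")
print(web_ok)
```

Because the answer sits in the file next to the picture, the question "can we put this on a web site?" becomes a lookup rather than a research project.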

What will occupy publishers for the next many years will be the opportunities surrounding tags for discovery and re-use. This will require the creation of subject-specific taxonomies, building on industry work that has been done with BISAC. Our colleague Brian O’Leary said a couple of years ago that the editor’s job is changing: it used to be all about deciding what was published. Now it is largely about anticipating how content will be discovered and re-used in the future. That new role for editors needs to be developed and defined. That is work. And it means that both editors and authors, once they have learned the basics of XML, will have to invent the procedures for identifying and developing content so that it will generate ongoing revenues.
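As a rough illustration of tags for discovery, the Python sketch below pulls the chunks on one subject out of a tagged book. The `section` element, the `subjects` attribute, and the subject labels are all invented for this example; they are placeholders and not real BISAC codes or any house's actual taxonomy.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged book: the subject labels are made-up placeholders,
# not real BISAC codes.
BOOK = """
<book>
  <section id="s1" subjects="gardening/roses gardening/pruning">
    <title>Pruning Roses</title>
  </section>
  <section id="s2" subjects="cooking/preserves">
    <title>Rose Hip Jam</title>
  </section>
  <section id="s3" subjects="gardening/soil">
    <title>Feeding the Soil</title>
  </section>
</book>
""".strip()


def find_chunks(xml_text, subject_prefix):
    """Return (id, title) pairs for sections tagged under the given subject prefix."""
    root = ET.fromstring(xml_text)
    matches = []
    for section in root.iter("section"):
        # Subjects are stored as a space-separated list in one attribute.
        subjects = section.get("subjects", "").split()
        if any(subject.startswith(subject_prefix) for subject in subjects):
            matches.append((section.get("id"), section.findtext("title")))
    return matches


for chunk_id, title in find_chunks(BOOK, "gardening/"):
    print(chunk_id, title)
```

The code is the easy part. The editorial work lies in deciding which subjects to tag and how finely, which is where the taxonomy comes in.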

All of this gives publishers an enormous amount to think about. Each company’s pain versus gain equation is different, depending on their list, their current processes, how much they do in-house versus outsource, and how tech-savvy and open to change their management and creative teams are.

The companies that have more chunkable content have a lot more to contemplate than those with less. Of course, as they ponder it and as the market for chunks grows, many companies will dig deeper to find more when they initially thought they had less.

Companies are going to have to examine their content holdings for vertical-specific critical mass. We see this awareness growing in companies now. I had a meeting with one of the top trade houses a month ago about future web development at which they freely acknowledged that there are a few areas where they have enough content to go after a vertical, but in most areas, they do not. It does not make a lot of sense to develop taxonomies and standard tags for a subject on which you do three books a year. But, then, the next question will become: should you even publish in an area where you do three books a year?

Design in most book publishing companies has always been a book-by-book proposition. It won’t be in a StartwithXML house. Each book will want to be matched to an existing style sheet when at all possible. Analyzing what it will cost to build the style sheets required is another thing each house will need to do on its own.

Harnessing freelancers will present another challenge to publishers making this shift.

In fact, once the transition to a StartwithXML workflow has been made, the work of the coming decades really begins. It will never end. Taxonomies can always be improved and modified. New authors will always have to learn how to identify useful chunks, even though the tools for actually marking them can get simpler and easier. Nothing less than the redefinition of the editor’s job, and the author’s, is enabled, and will happen, once a StartwithXML workflow is in place.

We are excited about this project. We believe our Research Paper and our Forum will be very valuable to the industry. We very much value the participation of everybody in this room, and colleagues of yours in every company. Please participate in the survey and offer comments on our web site. And, of course, we want to see you at our Forum on January 13.


4 Responses to “A New Project: "StartwithXML, Why and How"”

  1. Diana Henry says:

    Perfect inspiration and direction for our publishing venture about KLNa/Natzweiler-Struthof, the only konzentrationslager on French soil. Our many titles must be guided by your vision, Mike, and taxonomy now moves to priority before more content is published in the archaic way. Thank you for your leadership and sharing your expertise!

    • admin says:

      Kind of you to say so, but there are a lot of people leading on this front. The most important single entity is O’Reilly Media, our partners on the StartWithXML project. We’re a small piece of their total effort, to put things in perspective.

  2. The style of writing is very familiar. Did you write guest posts for other bloggers?
