
I just returned from TOC 2013. I got the chance to catch up with colleagues and friends, as well as meet new ones (and since I work remotely, I even got to meet some of my Safari colleagues IRL for the first time!).

The programming for this year’s TOC offered a few high points, as well: the “Get Better at Git: Applying Version Control to Publishing” session, run by Matthew McCullough and Tim Berglund of GitHub, provided me with a long, long overdue a-ha moment for using Git; and as a digital comics geek, I was thrilled (if you’ll pardon the pun) to see the legendary Mark Waid deliver an engrossing demo of his fantastic Thrillbent comics platform.

One of the sessions that I found most compelling was Alistair Croll and Hugh McGuire’s “Book as API” talk. Hugh has covered the gist of this talk over on the O’Reilly TOC blog, and the whole post bears reading and thinking on; it’s compelling stuff:

If we start to think of “books as data,” then the traditional publisher’s role starts to sound a lot like the role of providing an API: A publisher’s job is to manage how and when and under what circumstances people (readers) or other services (book stores, libraries, other?) access books (data).

During his talk, Hugh focused on the indexing of content from a book and making that information available via an API, and called out particularly clever and interesting uses for this information, from one-off projects like Dracula Dissected (in which Bram Stoker’s novel, Dracula, is broken down into parts — people, locations, journeys, journal entries, letters, etc. — that are presented to the reader over a Google Earth map, and connected with the story’s internal timeline), to full-on services such as Small Demons, which takes the people, places, and things mentioned in books and shows you their relationships to other people, places, and things. It’s fascinating stuff, and opens up the possibilities for how readers can engage with books.

All this talk of atomizing the book’s information into discrete chunks that could be rearranged depending on context got me thinking about streaming books, which is a concept that we here at Safari talk about a lot—in fact, Liza Daly delivered a presentation on this idea at the IDPF Digital Book 2012, and I riffed off of her work for a talk I gave at the Guadalajara Book Fair in November of last year.

A streaming book is a book that lives on a server in discrete parts, as raw assets, and is delivered to the reader over the network as a uniquely packaged collection of assets that respond directly to the individual reader’s particular usage conditions.

So for example: let’s say that we have a book that lives on a server, in parts: we’ve got our main text, translated into a handful of languages and semantically marked up, but otherwise unadorned; accompanying images, in various sizes and resolutions; styles and layouts for different contexts, such as mobile phones, low-resolution eink devices, high-resolution tablets, or digital broadsheets; supplemental files such as video or audio, also at various file sizes and resolutions.

Using mechanisms such as content negotiation, a device can send the server information about its conditions — “I’m a low-resolution eink device sipping low bandwidth in the mountains of Colombia,” or “I’m a high-resolution tablet in high-bandwidth Hong Kong” — and the server can then assemble and deliver a version of the book that is appropriate for the reader’s context: an image-less, plaintext version for our friend in Colombia, perhaps, and a high-res, finely laid out multimedia smorgasbord for our pal in Hong Kong.

Once you start thinking in this fashion, the possibilities become really, really compelling:

  • A reader in Brazil can request a book on their browser, and the server can deliver a version in Portuguese instead of English.
  • A reader on a mobile phone can get a version of the book which sports low-resolution images, and text that is specifically formatted for small screens.
  • A reader can request the book in a version specifically designed for printing on demand, via either an Espresso Book Machine at a library or bookstore, or a copy shop service such as Paperight (one of the judge’s picks at this year’s TOC startup showcase).
  • A reader on an iPad can receive a multimedia EPUB file, full of high-res images and widescreen videos.
  • A reader on a Kindle can get the Mobi version of the book.

All this from one single repository (yep, still got Git on the brain), without having to create each version of a book manually each time — as long as the assets have been created correctly, are properly stored and described, and the server receives the information about a reader’s context, it can manage to serve up the correct version of a book to the reader automatically.
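
To make the idea concrete, here is a minimal sketch of the selection step in Python. It is not how Safari or any particular server does it: the `choose_variant` function, the `device` and `bandwidth` hints, and the list of available languages are all hypothetical names made up for illustration. Only the `Accept-Language` header is a real HTTP mechanism; how a device would report its screen type or bandwidth is an assumption here.

```python
from dataclasses import dataclass

# Hypothetical: the languages this one book has been translated into.
AVAILABLE_LANGUAGES = ["en", "pt", "es"]


def parse_accept_language(header: str) -> list[str]:
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip().lower()))
    # The sort is stable, so equally weighted tags keep the client's order.
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for quality, tag in weighted if quality > 0]


@dataclass
class BookVariant:
    language: str
    images: str  # "none", "low" or "high"
    layout: str  # "plaintext", "small-screen" or "rich"


def choose_variant(accept_language: str, device: str, bandwidth: str) -> BookVariant:
    """Pick the assets to assemble; `device` and `bandwidth` are assumed hints."""
    language = "en"  # fall back to the original text
    for tag in parse_accept_language(accept_language):
        primary = tag.split("-")[0]
        if primary in AVAILABLE_LANGUAGES:
            language = primary
            break

    if device == "eink":
        images, layout = "none", "plaintext"
    elif device == "phone":
        images, layout = "low", "small-screen"
    else:
        images, layout = "high", "rich"

    # A slow connection downgrades images even on a capable device.
    if bandwidth == "low" and images == "high":
        images = "low"
    return BookVariant(language, images, layout)
```

With these made-up inputs, a reader in Brazil on a phone (`"pt-BR,pt;q=0.9,en;q=0.5"`, `"phone"`, `"high"`) would be matched to the Portuguese text with low-resolution images and a small-screen layout, and the same stored assets serve every other combination.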

Moreover, using this approach, you can create books for mixed use within one space. For example, if a server knows that a request for a book is coming from a tablet, or a computer, or a TV, it can serve up different content for each context, thereby facilitating learning in a classroom setting:

  • The instructor gets a presentation-style layout for their wall-screen (the big board!).
  • Students on their tablets get a workbook-style layout with quizzes for evaluation.
  • Desktop computers get multimedia presentations and essay questions.
  • Mobile phones get shorter chunks of text, or surveys.

All from the same source, and all on the fly.

Naturally, these techniques aren’t only appropriate for books — all types of editorial products can be thought of in this way. In fact, some already are: NPR treats its content in this way, and they enjoy a wide reach via various media as a result (for more info on this approach to content strategy, check out Content Strategy for Mobile by Karen McGrane, a short, fascinating, and incredibly useful read).

As ereading devices and services proliferate, it will become harder and harder for ebook makers to generate each necessary version of a book to reach all devices and contexts, and the process will become even more time-consuming, and probably more frustrating, than it is now (I believe the technical term for this quixotic pursuit is “chasing the unicorn”). Approaches to content production and management such as the streaming book can help simplify the production process, and make it just a bit (or a helluva lot) more rational.

Safari’s Content Team has the dubious distinction of having the highest volume of tickets in our company-wide issue management tracking system (we use Atlassian’s JIRA). We easily win this competition, with more than 1,500 open issues on any given day. But do we buckle under the psychic weight of all these tickets? Nah… go ahead, bring ’em!

Content Issue Pie


Why So Many, You May Ask?

The Content Team has quality-checked 12,729 brand new titles loaded onto Safari Books Online from April 2011 to last week. For the past 6 months, we averaged 753 titles/month, or 177 titles/week. We track only issues that are clearly errors (e.g., a title-cover image mismatch) or issues that seriously impact readability (e.g., all images are random color bitmaps like this one from a real book).

Mangled image


Each time we find an issue like this, we stop the title in the pipeline before it goes live, and follow up one way or another to correct it. We track all of these issues in JIRA, so we can manage the corrections and move each title live as quickly as possible.

At this time, we only check brand new titles, but our publishers are free to update titles at any time without oversight. And, since we only started quality-checking new titles in April 2011, but Safari launched way back in September 2001, there are quite a few titles that we haven’t scrutinized. Various problems get reported: the unavailability of practice files referred to in the text, teeny tiny images too small to make out, or broken links. An average of 200 new content issue tickets are created each month.

Issues Created Monthly

That explains where our issues are coming from. So, how do we manage them?

Standardization, Automation, and Elbow Grease

Well, managing these issues has been an evolving process. We are fortunate to have on staff not just one, but several JIRA experts, who are always willing to help us out with custom fields and productivity brainstorming.

We’ve been working our way up to several key improvements, and we are now starting to realize the benefits. With more than 1,500 issues, global improvements don’t happen overnight. It’s easy to add new fields to help us organize and track issues, but then those fields need to be populated – a daunting task. And of course, in order for this system to work, everyone has to use it the same way, which means a bit of documentation, training, and oversight are needed. Here are the keys to managing this type of issue volume:

  1. Standardization: custom fields, boilerplate language
  2. Automation: QaQ, automated email
  3. Elbow Grease: Monthly issues export & follow up
  4. NEW: Greenhopper

Standardization. Custom JIRA fields help us slice and dice the issues into manageable groups. For example, we added a publisher field, which allows us to export all the open issues for a given publisher. We use a component field, which allows us to sort that publisher’s open issues by whether the issue relates to the source PDF, the source EPUB, the metadata, companion files, etc.

Component Pie
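
The slicing described above can be sketched in a few lines of Python. This is only an illustration of the grouping idea, not our actual JIRA setup: the issue records, keys, publisher names, and the `open_issues_by_component` helper are all invented for the example, and real tickets would come out of JIRA rather than a hand-written list.

```python
from collections import defaultdict

# Hypothetical issue records; the field names mirror the custom fields
# described above (publisher, component) but the data is made up.
ISSUES = [
    {"key": "CONTENT-101", "publisher": "Example Press", "component": "Source EPUB", "status": "Open"},
    {"key": "CONTENT-102", "publisher": "Example Press", "component": "Metadata", "status": "Open"},
    {"key": "CONTENT-103", "publisher": "Example Press", "component": "Source EPUB", "status": "Resolved"},
    {"key": "CONTENT-104", "publisher": "Other House", "component": "Source PDF", "status": "Open"},
    {"key": "CONTENT-105", "publisher": "Example Press", "component": "Source EPUB", "status": "Open"},
]


def open_issues_by_component(issues, publisher):
    """Group one publisher's open issue keys by component."""
    grouped = defaultdict(list)
    for issue in issues:
        if issue["publisher"] == publisher and issue["status"] == "Open":
            grouped[issue["component"]].append(issue["key"])
    return dict(grouped)


if __name__ == "__main__":
    for component, keys in open_issues_by_component(ISSUES, "Example Press").items():
        print(f"{component}: {', '.join(keys)}")
```

The point is simply that once every ticket carries a publisher and a component, a per-publisher report is a filter and a group-by away.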

And we have boilerplated the language we use in certain fields, which serves two purposes. First, it saves the ticket writer time – she doesn’t have to consider how to explain a given issue, she can rather just copy/paste the explanatory text from our (constantly updated) JIRA Issue Map. Second, we make sure our boilerplate language is clear enough for publisher-facing communications, even if our primary publisher contact is a rights person who has no need to speak the lingo of CSS or toc.ncx, for example.

Automation. Our stellar engineering team has built us a QA Queue application (we call it the QaQ) to manage our daily load of new titles to quality-check, and this system hooks right into JIRA. After we check a publisher’s new batch of titles, we follow up via email to let the publisher know which titles are live, and which need a little more work before they can go live. The QaQ automates the creation of lovely formatted emails; for titles with associated JIRA tickets, it exports the text from key fields which detail the required fix in easy-to-understand language.

Elbow Grease. We are now rolling out a monthly export of issues for each publisher. When a publisher receives a spreadsheet listing their issues in detail, sorted by issue type, it’s a lot easier for them to follow up en masse, so they can get as many new titles live (or corrected, if they are already live) as quickly as possible. We did a pilot of this new process with a select set of publishers, with very promising results. We don’t want our publishing partners swimming in the JIRA sea, nor should we require them to rely on email alone for making sure all their titles are working well on Safari.

New: Greenhopper. This plug-in to JIRA has us really excited. We are doing a trial run with a Kanban workflow for the subset of Content issues requiring engineering work. In 2010, we were managing the long list of engineering Content issues via JIRA and email alone. Well, that doesn’t work so well once you have more than a handful of issues. So in 2012, we switched to a shared Google doc so we could be sure we were all working off the same songsheet. But even that has its shortcomings – we meant to keep notes in the Google doc and ALSO update each JIRA ticket as we worked. In theory. Often, only one or the other would get updated, and sometimes the priorities in the doc didn’t match the priorities in JIRA.

But with Greenhopper, we plan to kiss the Google spreadsheet goodbye, for the most part. We created a Kanban board with a few key buckets: Pending, In Progress, In SBO QA, and Completed. We are strictly limiting the number of In Progress tickets to 10. (If you go over 10 tickets In Progress, the whole board turns a distressing bloody red.) This way it’s very clear for engineering to know exactly what must be worked on. And the Kanban board is very easy to work with – in our status calls, we can discuss the entire board, and update each individual issue as we discuss it from the same board. No more getting lost in a sea of dozens of browser tabs or windows.
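
The work-in-progress rule is simple enough to sketch in Python. This is not Greenhopper’s API or configuration, just the logic of the limit described above; the `column_counts` and `over_wip_limit` helpers are hypothetical names.

```python
from collections import Counter

WIP_LIMIT = 10  # the In Progress limit described above
COLUMNS = ("Pending", "In Progress", "In SBO QA", "Completed")


def column_counts(statuses):
    """Count tickets per board column, including empty columns."""
    counts = Counter(statuses)
    return {column: counts.get(column, 0) for column in COLUMNS}


def over_wip_limit(statuses, limit=WIP_LIMIT):
    """True when more tickets are In Progress than the limit allows."""
    return column_counts(statuses)["In Progress"] > limit


if __name__ == "__main__":
    # Made-up board: 11 tickets in progress trips the limit.
    board = ["In Progress"] * 11 + ["Pending"] * 4 + ["Completed"] * 2
    print(column_counts(board))
    print("over limit:", over_wip_limit(board))
```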

If this Greenhopper experiment works well for our Engineering tickets, we will explore creating boards for other types of Content Issues. The sky seems to be the limit in terms of how you structure your boards; they seem fully customizable based on the fields you want to use.

OK, now that we have these great tools in place and are starting to use them, we can start setting some nice aggressive goals to get our overall numbers down. (The team is going to kill me when they hear this.)  Let’s beat our current created-to-resolved ratio by summer, guys!

30 Day Summary to Beat


I wasn’t sure until the last minute whether I was going to Tools of Change 2013. When I ran a publishing startup, TOC was the most important event of the year: we organized our entire product release schedule around it. (Keith calls this “Conference-Driven Development.”) It was often the only opportunity to meet our current customers face-to-face, and giving conference presentations and attending mixers constituted 100% of our marketing and sales effort. Missing it was unthinkable, a potentially catastrophic failure for the company.

This year I still have lots of meetings and not enough time, but the stakes are much lower. In the end, what convinced me to come back was less the urgency of the appointments and instead the opportunity to see friends and colleagues. If I didn’t attend, I’d miss the chance to stay in touch with those who’ve supported and encouraged me in the rollercoaster ride that is 21st-century publishing.

It’s always a crapshoot which sessions I’m able to see — many get preempted by interesting session-break conversations that spill into the next track (and are always well worth the time). Here are the talks I’m hoping to attend, some of which naturally overlap, sigh:

Preparing Content for Next-Generation Learning

Greg Grossmeier (Creative Commons), Michael Jay (Educational Systemics, Inc.)

10:45am Wednesday, 02/13/2013

Safari considers itself as much a learning company as an ebook company, but the “e-learning” industry is one with which I have almost no familiarity. We’re always looking for ways to facilitate professional development and skill-building, and I’m eager to keep on top of the leading edge of the space, especially with regard to web-centric approaches versus traditional learning management systems.

End To End Accessibility: A Journey Through The Supply Chain

Dave Gunn (Royal National Institute of Blind People), Sarah Hilderley (EDItEUR Ltd), Doug Klein (Nook Media, LLC), Rick Johnson (Ingram | VitalSource)

1:40pm Wednesday, 02/13/2013

Though our product has significant accessibility affordances, most of them pre-date advances in accessible content, including EPUB 3 semantics. I want to be ready for us to take advantage of semantically-rich content and ensure that we’re providing a consistent user experience relative to other ereading systems.

Book as API

Hugh McGuire (PressBooks / LibriVox / Iambik ), Alistair Croll (Solve For Interesting)

1:40pm Wednesday, 02/13/2013

Some publishers and book services have had public APIs, but have placed enough restrictions as to make them useless for general purpose use. Consequently the APIs don’t see wide adoption, and then the organization wonders why they’re supporting something nobody uses — supporting a public API is a non-trivial investment. Eventually the API is discarded. I’m interested to see if there’s a way out of this self-defeating cycle.

Information Wants to be Shared

Joshua Gans (Rotman School of Management)

9:20am Thursday, 02/14/2013

Google’s First Click Free and other innovative approaches to search engine discovery are offering publishers more choices in discoverability and sharing that shouldn’t compromise sales or devalue content. This is a critical topic for any web-based aggregator.

The Elusive “Netflix of eBooks”

Travis Alber (ReadSocial and BookGlutton), Christian Damke (Skoobe), Justo Hidalgo (24Symbols), Andrew Savikas (Safari Books Online)

10:35am Thursday, 02/14/2013

I suspect this is relevant to my interests. Also my boss will be there.

Don’t miss

Other sessions likely to be time well-spent: Revamping Editing: The Invisible Art (Maureen Evans & Blaine Cook, Poetica), especially if you missed their Books in Browsers presentation;  Creators and Technology Converging: When Tech Becomes Part of the Story (moderated by Erin Kissane), an interesting line-up of speakers from outside traditional publishing; PubHack: Understanding Industry Barriers, And How To Get Innovating Anyway (moderated by Kristen McLean), a must-see for publishing startups struggling to work with larger organizations.

Book cover for EPUB 3 Best Practices

O’Reilly Media has just published EPUB 3 Best Practices, edited by Matt Garrish, who wrote much of the EPUB 3 specification itself, and Markus Gylling, Chief Technology Officer of the IDPF. I can’t think of two people more qualified to organize and oversee this work, and it was a delight to work with Matt in composing and editing the chapter that I contributed.

For some reason the book synopsis doesn’t cover the killer feature of the book, which is that many of the chapters were authored by hands-on experts in EPUB development and production.

The whole book is highly recommended, but I’ll pull out a few highlights and credits for those contributors:

Packaging and metadata: Bill Kasdorf

Bill was given the unenviable task of explaining the flexible-yet-complex new metadata options available in OPF 3.0. I love this succinct summary of the various components of the OPF, which can be difficult to explain to beginners:

Which EPUB is this (“identifiers”)? What names is it known by (“titles”)? Does it use any vocabularies I don’t necessarily understand (“prefixes”)? What language does it use? What are all the things in the box (“manifest”)? Which one is the cover image, and do any of them contain MathML or SVG or scripting (“spine itemref properties”)? In what order should I present the content (“spine”), and how can a user navigate this EPUB (“the nav document”)? Are there resources I need to link to (“link”)? Are there any media objects I’m not designed by default to handle (“bindings”)?

I recommend particular attention to the section on EPUB 3’s solution to unique identifiers and document updates. Too many retailers still have substandard responses to book updates, which often boil down to either not supporting updates at all, or clobbering user annotations and bookmarks.

Bill explains:

When technologists—or reading systems—say an identifier uniquely identifies an EPUB, they mean it quite literally: if one EPUB is not bit-for-bit identical to another EPUB, it needs a different unique identifier, because it’s not the same thing; systems need to tell them apart. Publishers, on the other hand, want the identifier to be persistent. To them, a new EPUB that corrects some typographical errors or adds some metadata is still “the same EPUB”; giving it a different identifier creates ambiguity and potentially makes it difficult for a user to realize that the corrected EPUB and the uncorrected EPUB are really “the same book.”
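
As I understand the EPUB 3 specification, the way out of this tension is to pair the persistent unique identifier with the `dcterms:modified` timestamp in the package metadata, joined with an `@` sign, so the book keeps one identifier while each release remains distinguishable. Here is a rough Python sketch of reading that pair from a package document. The sample OPF is invented and trimmed (no manifest or spine), and `release_identifier` is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented, trimmed package document: metadata only, for illustration.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:12345678-1234-1234-1234-123456789abc</dc:identifier>
    <dc:title>An Example Book</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2013-02-20T12:00:00Z</meta>
  </metadata>
</package>
"""


def release_identifier(opf_xml: str) -> str:
    """Combine the unique identifier and last-modified date of a package."""
    package = ET.fromstring(opf_xml)
    # The package element names which dc:identifier is the unique one.
    unique_id_ref = package.get("unique-identifier")

    unique_id = None
    for identifier in package.findall("opf:metadata/dc:identifier", NS):
        if identifier.get("id") == unique_id_ref:
            unique_id = (identifier.text or "").strip()

    modified = None
    for meta in package.findall("opf:metadata/opf:meta", NS):
        if meta.get("property") == "dcterms:modified":
            modified = (meta.text or "").strip()

    if not unique_id or not modified:
        raise ValueError("package lacks a unique identifier or dcterms:modified")
    return f"{unique_id}@{modified}"


if __name__ == "__main__":
    print(release_identifier(SAMPLE_OPF))
```

A corrected release of the same book would keep the `urn:uuid:` part and change only the timestamp, which is what lets a reading system recognize an update rather than a brand new title.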

Navigation: Matt Garrish

The majority of EPUB 3 publications produced commercially are likely to include not one but two tables of contents (the EPUB 3 Navigation Document and the EPUB 2-era NCX for backwards compatibility). Matt provides compelling use cases for the new form: marking up deeply nested TOCs, linking printed page numbers to the EPUB edition, and providing the much-needed landmarks feature, to identify commonly found points in book content like indexes, tables of contents, and the correct “starting page” for the book body content. It remains to be seen if reading systems will embrace these landmarks, as each major retailer has entrenched proprietary methods for e.g. defining what the start page is.

Font embedding: Adam Witwer

In perhaps the most practical chapter in the book, Adam discusses the ins and outs of font embedding from his perspective as a publisher, writing:

Font obfuscation has been the source of much confusion. If you dig around the Web, you’ll find plenty of blog posts and forum chatter full of confused and frustrated ebook makers trying to make sense of it all. The confusion stems largely from the fact that, until recently, the IDPF and Adobe had competing font obfuscation algorithms, and reading systems supported one or the other. If you used the Adobe obfuscation method, your embedded font would render correctly on maybe the NOOK but not in iBooks, and so on.

Font embedding in ebooks is a messy and confusing area full of traps for the unwary; it’s telling that Adam’s chapter has more footnotes than any other.
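
For the curious, here is a minimal Python sketch of the IDPF obfuscation scheme as I understand it from the EPUB container specification: the key is the SHA-1 digest of the publication’s unique identifier (with whitespace removed), and it is XORed over the first 1040 bytes of the font file. Treat the details as my reading of the spec rather than a reference implementation; the function names are hypothetical, and the Adobe method mentioned in the quote uses a different key and length that this sketch does not cover.

```python
import hashlib

OBFUSCATED_LENGTH = 1040  # 52 passes over the 20-byte SHA-1 key


def idpf_obfuscation_key(unique_identifier: str) -> bytes:
    """Derive the 20-byte key from the publication's unique identifier."""
    # Whitespace is stripped from the identifier before hashing.
    cleaned = "".join(ch for ch in unique_identifier if ch not in " \t\r\n")
    return hashlib.sha1(cleaned.encode("utf-8")).digest()


def idpf_obfuscate(font_bytes: bytes, unique_identifier: str) -> bytes:
    """XOR the start of the font with the key; the rest is left untouched.

    XOR is its own inverse, so calling this again restores the font.
    """
    key = idpf_obfuscation_key(unique_identifier)
    head = bytes(
        byte ^ key[index % len(key)]
        for index, byte in enumerate(font_bytes[:OBFUSCATED_LENGTH])
    )
    return head + font_bytes[OBFUSCATED_LENGTH:]


if __name__ == "__main__":
    fake_font = bytes(range(256)) * 8  # stand-in bytes, not a real font
    identifier = "urn:uuid:12345678-1234-1234-1234-123456789abc"
    scrambled = idpf_obfuscate(fake_font, identifier)
    print(scrambled[:8].hex(), idpf_obfuscate(scrambled, identifier) == fake_font)
```

Because the key depends on the book’s identifier, a font lifted out of one EPUB won’t open on its own or inside another book, which is also why a mismatch between the identifier and the key is such a common source of broken fonts.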

Interactivity

I wrote this chapter. It’s pretty great, you should read it.

Global language support: Murata Makoto

This chapter will be a lifesaver for those publishers struggling to produce correctly formatted ebooks for the Inner Mongolia market.

(Seriously, there’s invaluable information here on EPUB 3’s support for Asian languages, right-to-left scripts like Hebrew and Arabic, and the interesting edge cases that emerge in rendering numbered lists and hyphenation. It’s worth reading just for a high-level overview of the immense diversity in modern human writing systems.)

Accessibility, validity, et al

Last but absolutely not least, the chapters on Accessibility are must-reads for anyone producing ebooks seriously. I’m not sure there’s a better reference on advanced topics like EPUB 3 text-to-speech (TTS) support, media overlays, and other features that — while designed for the print-disabled — offer tremendous options for creativity and truly enhanced digital-native publications. The section on understanding errors from epubcheck is also extremely welcome, as even experienced developers can sometimes be baffled as to the underlying causes of validation failures.

EPUB 3 Best Practices is an absolute must-have for anyone in our industry. Highly recommended.

Safari Books Online subscribers can read the entire book as part of their subscription.

Back in January we announced that the fantastic publishing technology team from PubFactory had joined Safari Books Online. Since then we’ve been hard at work integrating the team into our systems, and they’ve been hard at work building and maintaining search and reference products for their clients in academic publishing.

It’s been a singular experience for me as these are my former colleagues: I worked at iFactory for a number of years as a software engineer. That was my first job connected to publishing. Before that I would have self-identified as a generic “web developer.” While I had always tried to work on web projects that mattered, it was clear to me after my very first publishing project that I’d found my industry. I started Threepress in 2008 to work as a digital publishing technologist.

Threepress specialized in ebook formats and ereaders, while the PubFactory team serves reference and academic publishers. It’s been instructive for me to compare how these two worlds have diverged or converged in the five years since I last worked in the reference field.

Books aren’t data

The EPUB format is strictly XML-based. From the metadata to the table of contents to the book content, an EPUB file must be almost entirely composed of text marked up in well-defined XML schemas. Those schemas allow the EPUB book to be validated by a computer program that follows the schema and other well-defined business rules, ensuring consistent production. At the other end of the workflow, those same schemas would assure reading systems of the predictability of the books added to them.

EPUB 2 was released in 2007, though its design history extends back into the 1990s. At that time, academic publishers were among the only publishers producing and exchanging book data with retailers, mostly via library aggregators and portals. Those became natural models for the commercial ebook industry that did not yet exist. Outside of publishing, XML was “obviously” on a path to overtake historically messy HTML, and so aligning with XML was aligning with the future of web standards.

These were all reasonable assumptions based on the shape of the digital publishing industry when EPUB 2 and its predecessors were codified.

At that time, trade book publishers largely had no need for textual markup. It was not a part of their production workflow, nor was it natively how they produced “digital books”, which with few exceptions were always PDFs. (Safari Books Online was one of those exceptions as we initially required DocBook XML, but we eventually accepted PDF and later EPUB.)

Why is XML so foreign to trade publishers?

XML excels as a data exchange format for textual content with hierarchy. Dictionary entries and journal articles are data. Dictionary entries and journal articles are regular. Even when somewhat unstructured, as in a research paper, the work still has a predictable shape, and its primary goal is information exchange.

A trade book is not data. Even non-fiction trade is a work of human creativity with unpredictable contours. In programming terms, most books are BLOBs, opaque shadowy things that can be moved from system to system but whose contents cannot be inspected in a mechanical way.

Novelists don’t create data. They create books.

Books can’t be wrong

Strict XHTML as a book markup format was the solution to a problem that didn’t exist. It didn’t fit neatly into an XML-based workflow because most book publishers didn’t speak XML anyway. It didn’t align with the direction of web standards, which abandoned an XML-centric approach for good in 2009. It didn’t make ebook consumption any easier for ereaders, because the challenges in ebook display are in the CSS and UI layers. And it didn’t make writing an ereader any easier because embeddable web browsers quickly became the de facto rendering engine, and those already excelled at rendering plain old HTML.

By far the biggest advantage of XML workflows is at the time of production, where one can validate that the XML document contains all of the data that is expected in the correct order, format, and position in the hierarchy.
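
Here is a toy version of that kind of check in Python, for a dictionary entry. A real workflow would use a schema language (a DTD, RELAX NG, or XML Schema) rather than hand-written code; the element names (`entry`, `headword`, `pos`, `definition`) and the `validate_entry` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure: every entry has exactly these children, in order.
EXPECTED_ORDER = ["headword", "pos", "definition"]


def validate_entry(xml_text: str) -> list[str]:
    """Return a list of problems; an empty list means the entry passed."""
    try:
        entry = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]

    problems = []
    if entry.tag != "entry":
        problems.append(f"root element is <{entry.tag}>, expected <entry>")
    children = [child.tag for child in entry]
    if children != EXPECTED_ORDER:
        problems.append(f"expected children {EXPECTED_ORDER}, found {children}")
    return problems


if __name__ == "__main__":
    good = "<entry><headword>book</headword><pos>noun</pos><definition>A written work.</definition></entry>"
    bad = "<entry><definition>A written work.</definition><headword>book</headword></entry>"
    print(validate_entry(good))
    print(validate_entry(bad))
```

The check works because a dictionary entry has a shape you can write down in advance.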

Books aren’t actually subject to these constraints. You can’t write an XML schema to validate that a book has one or more chapters, as it may have no chapters at all. It may not have an author. It may not have any words. It may not have pages.

(I’d go on, but any discussion of the heterogeneity of books inevitably devolves into one of those tedious “What is a book?” slides at publishing conferences.)

Books can’t be right

An ebook application can’t do a lot of things that an XML-driven reference application can. In design meetings I find myself striking out interesting feature after feature: we can’t aggregate index terms across a corpus because there’s no standardized EPUB markup for them. We can’t apply a consistent style to chapter titles because of incompetent, un-semantic markup like <p class="header">. We can’t extract quotable epigraphs or context-highlight code samples or anything that my PubFactory colleagues can dream up with their neatly ordered, well-defined XML inputs. EPUB content is a BLOB.

Some ebook systems do apply consistent styling or extract interesting information out of books, but that’s powered either by a huge amount of invisible human effort or a lot of advanced machine learning and heuristics. That capability doesn’t flow naturally out of the markup.

On the other hand, I can throw just about anything even resembling an EPUB book at our reading system — even if it’s completely invalid with HTML tag soup — and it’ll load. We have very little preprocessing necessary; XSLT, which is hard to learn and harder to master, is almost absent from our workflow. And users can upload their own books from anywhere else in the publishing ecosystem.
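
The difference between the two attitudes is easy to demonstrate with Python’s standard library. This is not how our reading system works internally; it is just a small sketch contrasting a strict XML parser with a forgiving HTML one on the same made-up scrap of tag soup. The `strict_parse_ok` and `lenient_text` helpers are hypothetical names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up tag soup: unquoted attribute, unclosed <p> elements, bare <br>.
TAG_SOUP = "<p class=header>Chapter 1<p>It was a dark &amp; stormy night.<br>The end"


class TextCollector(HTMLParser):
    """Collect the non-empty text runs from whatever markup arrives."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def strict_parse_ok(markup: str) -> bool:
    """True only if the markup is well-formed XML inside a wrapper element."""
    try:
        ET.fromstring(f"<body>{markup}</body>")
        return True
    except ET.ParseError:
        return False


def lenient_text(markup: str) -> list[str]:
    """Pull the text out without caring whether the markup is valid."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return collector.chunks


if __name__ == "__main__":
    print("strict parser accepts it:", strict_parse_ok(TAG_SOUP))
    print("lenient parser recovers:", lenient_text(TAG_SOUP))
```

The strict parser rejects the whole fragment at the first error, while the lenient one still hands back the text, which is roughly the trade a browser-based rendering engine makes on every page.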

The paperback ebook

Since EPUB emerged, a variety of simpler formats have been proposed, usually by individuals from the technology industry. They do a better job of solving the problem of book production by capable amateurs, but don’t serve the diverse needs of the publishing industry that EPUB represents: the print-disabled who need rich semantic markup, library catalog systems that want to analyze highly granular metadata, fixed layout books, multi-lingual books, graphic novels, interactive textbooks, and on and on. Full-blown EPUB solves real problems, but as John Maxwell put it at Books in Browsers 2012, XML is a format that serves incumbents.

I hope that the next revision of EPUB allows HTML5 markup, without the leading X-, as I don’t think that XML requirement is solving any problems for anyone. Rich metadata, on the other hand, offers a great deal to the ecosystem, and is a reasonable tradeoff for authoring complexity.

Until we have an EPUB sans XHTML, it’s worth considering a lightweight subset of the format, one that represents a convention over configuration approach. A “microformat” version — EPUB: the beach novel edition — could be mechanically “upsampled” into big boy EPUB for use in the real ecosystem. It won’t solve the problem of heterogeneity in books (which is, after all, not actually a problem except to reading system developers), but it could make it easier for even experienced ebook authors to create publications without firing up an XML editor, for the majority of books that have very simple metadata requirements.  I’ll outline some ideas for that in a future post.

Question: How long does it take to write, produce, and print a book—and finalize all the standard e-formats?

A. 2 months
B. 4 months
C. 3 days

Granted, the answer obviously depends a lot on what type of book we’re talking about. It also depends on your definition of “finalized.” Up until last week, I’d have said Option A was possible, if the book had a lot of luck, a very determined author, and the best production team money can buy. Option B is more the norm, in my experience. But last week, I witnessed (and participated in) Option C, the 3-day book.

Last week I participated in a Book Sprint hosted by Google. It was facilitated by Adam Hyde, creator of the Book Sprint methodology, with Intro and Outro “unconference” workshops facilitated by Allen Gunn of Aspiration. You can find my daily impressions of the experience here.


Google Summer of Code Doc Camp Books. Evergreen, FontForge, and Etoys as paper and electronic books (Kindle, Android, iPad).

My takeaways

I was glad to get the opportunity to participate, and I learned quite a bit from the Book Sprint experience. But what I learned was not exactly what I set out to learn. I expected to observe the Book Sprint process and be able to map it to traditional trade or academic publishing workflows. While that still seems entirely possible, I think I learned a bit more than that.

Critical benefits of in-person collaboration

The biggest concept I took away from the book sprint experience was the power of in-person collaboration. It’s hard to convey the importance of this if you don’t experience it for yourself. A group of subject matter experts and end users sitting together in a room talking things through produces amazing results. It’s not just that you’re cutting out all the time lag of email and phone tag. It’s the immediate exchange of ideas that quickly shapes and defines the concepts and the structure of the work. And when you’re focused on the book all day every day, without distractions, with the ability to ask questions and receive immediate answers, you stay in the zone. It’s a pretty amazing thing—and why limit this concept to Book Sprints?

Documentation takeaways

Documentation is such a critical part of my Safari work, and I didn’t expect to think about that at all during the sprint, but I actually learned a few things about documentation that I’ll be putting into practice for myself.

  1. Documentation doesn’t work if it’s based on “what you think users should know” rather than “what users want to know.”  And while you can try to force yourself into the mindset of the user, it’s much better to actually involve the user. This needs to happen both at the outset of the documentation creation process, and on an ongoing basis, so you can keep improving your documentation. I have long been dissatisfied with documentation I’ve produced, but I’ve never been galvanized to rethink it. Now I am. Using what I learned at the Book Sprint, my team and I plan to host our own sprint-informed documentation session after the holidays.
  2. Documentation in a vacuum is not effective: you need to build a community. Here’s another one I already knew, but the sprint really reinforced and clarified both the issue and ideas on how to solve it: users need to engage with the documentation. We need to be able to have conversations with the entire user group. Otherwise it just won’t get used… as I’ve learned the hard way!
  3. Infrequent, monolithic documentation updates are a real pain, and don’t serve the users. OK, again, I already knew that, but before now I had no solution to the problem. More than once, I’ve been guilty of letting the list of necessary updates sit ignored for way too long. You need the engagement of the community to keep you motivated, and just as importantly, you need the right tools to make frequent updates easy! More about tools later in this post.

Type of book

I think a wide variety of books could benefit from the Book Sprint process. Whether or not you come out of the sprint with a book that’s ready to distribute is the biggest question. For a topic in serious need of documentation, the sprinted book is often ready for release, because it fills an immediate need, and a somewhat unpolished book is far better than no book. This frequently applies to the free software community or any community where there isn’t necessarily the funding to produce documentation or written resources.

For professional publishing-quality books, it’s really no problem. The sprint still gets you way ahead of the game. Feel free to spend more time polishing the work after the sprint before you publish.

The tool that makes it possible

We received expert facilitation throughout the week, and without that, a Book Sprint couldn’t exist. Right behind good facilitation, though, is the collaborative authoring and production system called Booktype. It was designed expressly for this workflow. It’s a simple-to-use web-based authoring environment, with special sauce. It’s got workflow and version controls. It’s got graphical representations of the data representing the work in progress. It has powerful CSS formatting controls. It outputs print-ready PDF, EPUB, MOBI, and other formats. The best part is, you can easily update the content anytime, and output fresh files. No conversions necessary, no time lag waiting for your EPUB or MOBI. Wow, sounds like the future of publishing is here.

Check out my daily posts, or for more information, see www.booksprints.net.

I am pleased to be joining the Board of Directors of the International Digital Publishing Forum (the IDPF). As I suggested in my nomination statement, I will try to focus on three specific goals:

  • Developing clear documentation and best practices to ensure that reading systems consistently implement the technical capabilities of EPUB 3 to achieve a common, interoperable experience
  • Promoting tools and techniques that allow digital publications to be accessible to readers across a range of cultures and reading abilities
  • Broadening both the technical contributors’ and IDPF’s own leadership to include and better serve the international publishing community outside of Europe and North America

It is humbling to be able to join such an accomplished group of contributors, and I hope we are able to serve the members while being receptive to critiques and contributions from the wider community.

It’s Friday afternoon. Your new ebook product is launching on Monday and you have 1,000 files to automatically convert from a legacy format to EPUB. You figured it would take a minute or two per file, but that was before you tried the converter on some actual commercial ebooks, the kind with lots of really big images. It turns out that if you convert all the books one at a time on your laptop like you’d been planning, you’ll be ready to launch in 2015.

Don’t panic. There’s always EC2.

Amazon’s cloud-for-hire service is widely used by internet startups (and post-startups), but it’s a potential lifesaver for any kind of large-scale computing task. Like it or not, the publishing industry today is about producing computer-readable data. Even if you just make a couple of EPUBs a year, you’re developing software, and if you make several hundred EPUBs a year, you should probably be thinking about scalable data-crunching techniques.

In this post, I’ll describe the data conversion problem and the architecture I used to crank through almost a thousand files in a couple hours. In part 2, I’ll walk through the Python code that powers all the pieces and assess the cost/benefits of this kind of approach. Programming skills are assumed.

Entering the EC2 world for the first time is intimidating, even for an experienced developer. There’s a lot of terminology, and books and articles get out of date fast. This post isn’t a comprehensive tutorial, but I’ll try to point out the places where it’s easy to get stuck or confused.

For my ebook conversion project, I had to solve several problems:

  1. I had to process a lot of files in a short amount of time; that was impossible even with the fastest single computer I could find. So I’d need parallelization.
  2. I had to assume that my conversion process might have bugs to be fixed or improvements to be made in the future. So I needed repeatability.
  3. I didn’t intend to spend my weekend turning on and off hundreds of servers, and the penalty for leaving servers running unnecessarily could be a very expensive bill. So I needed server provisioning to be scriptable.
  4. Once I spun up my servers in parallel, I needed a way to distribute the work across them in an orderly way. I didn’t want to waste time and money converting books unnecessarily.
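The case for parallelization in point 1 comes down to simple arithmetic. Here is a rough sketch; the 2.5 minutes per file is a made-up illustrative figure, not a measurement from this project, and the function name is hypothetical.

```python
import math


def estimate_wall_clock_minutes(num_files, minutes_per_file, num_workers):
    """Rough wall-clock estimate when files are spread evenly across workers."""
    # Each worker converts at most ceil(num_files / num_workers) files, one after another.
    files_per_worker = math.ceil(num_files / num_workers)
    return files_per_worker * minutes_per_file


# Hypothetical per-file cost (an assumption for illustration only).
MINUTES_PER_FILE = 2.5

one_machine = estimate_wall_clock_minutes(1000, MINUTES_PER_FILE, 1)
twenty_machines = estimate_wall_clock_minutes(1000, MINUTES_PER_FILE, 20)
print("1 worker:   %.1f hours" % (one_machine / 60))
print("20 workers: %.1f hours" % (twenty_machines / 60))
```

The estimate ignores startup time and uneven file sizes, so treat it as a lower bound rather than a schedule.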

Why EC2?

  • It’s metered. You can rent very, very powerful hosts for only as much time as you need.
  • It’s scriptable. Once you get the basics set up the way you like, it doesn’t much matter whether you want to run the same process 1 time or 1,000,000 times. I turned to the widely used boto Python library to script my EC2 provisioning process.
  • It’s repeatable. This is more than just about being scriptable; you can create a snapshot of the host exactly as you want, with all of the dependencies pre-installed. If you need to run the same process again in a year, nothing will have changed.

The one thing that EC2 couldn’t magically solve for me was the distribution problem: each EC2 virtual computer would be a newborn baby, with code all ready to go but no idea which files to process. After flailing around for a while I settled on using a queuing service. Amazon’s offering includes Simple Queue Service (SQS), so I was all set.

Customizing the EC2 instance

An Amazon Machine Image (AMI) is the template for your virtual computer: every instance you launch starts as a copy of it. AMIs are a complicated subject, but for general-purpose Linux needs I pick one of the official Ubuntu AMIs. Choose the instance’s storage and memory allocation to match your application’s needs. In my case, I needed a fair amount of both storage and RAM as the conversion process generates a lot of temporary files, but it turned out I didn’t need the most robust processor. (If you need a lot of storage or want lots of files pre-installed, you’re also going to need to configure your own Elastic Block Store (EBS) volume and mount it; avoid this step if you can because it’s a pain.)

User data

You can (and should) install all the dependencies you need on your AMI before taking a snapshot. However, if anything about your application changes, you’ll need to rebuild the AMI, an annoying and non-automatic process. User data is a simple user-authored script that you provide when you launch an instance; AMIs that include cloud-init, such as the official Ubuntu ones, run it when the instance first boots. Since my converter’s source code was available in a private GitHub repository, I set up my user data script to automatically pull the latest version of the code; the repository checkout and the correct SSH keys were pre-installed in the AMI:

#!/bin/bash

cd /home/ubuntu/ebook_converter

# User data scripts run as root, so ensure that we "su" to the local user account
sudo -u ubuntu git pull origin master

My conversion script runs as a Python program inside a Python virtual environment. It’s possible that the latest version of the code has new dependencies, so after the instance pulls the latest code, the user-data script also runs setup.py develop:

su ubuntu -c "source ve/bin/activate && python setup.py develop"

The job queue

To solve the workload distribution problem, I needed a centralized place to store all of the books to be converted. Originally I thought I’d spin up each EC2 instance with a predefined set of books to convert and pass that list in to the user-data script, but I realized a far better approach was to use a queue. A queue is just a data structure that maintains an ordered list and allows items to be added or removed programmatically. If I’d provisioned the work list for each instance myself, I’d have to deal with potential problems like an instance crashing during processing and leaving half its workload untouched. Better to leave that to the queue.
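To make the crash-recovery point concrete, here is a toy in-memory queue in Python. It is not the SQS API; it only imitates the idea behind SQS’s visibility timeout, where a job that has been handed out but never confirmed as finished becomes available again after a delay. The class name, method names, and file names are all hypothetical.

```python
import collections


class ToyJobQueue:
    """In-memory stand-in for a hosted queue. Illustrative only, NOT the SQS API."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._ready = collections.deque()
        self._in_flight = {}  # job -> time at which it becomes visible again

    def push(self, job):
        self._ready.append(job)

    def receive(self, now):
        """Hand out the next job and hide it from other workers for a while."""
        # Jobs whose worker never confirmed completion go back on the queue.
        for job, deadline in list(self._in_flight.items()):
            if deadline <= now:
                del self._in_flight[job]
                self._ready.append(job)
        if not self._ready:
            return None
        job = self._ready.popleft()
        self._in_flight[job] = now + self.visibility_timeout
        return job

    def delete(self, job):
        """Called by a worker only after the job finished successfully."""
        self._in_flight.pop(job, None)


jobs = ToyJobQueue(visibility_timeout=60)
for name in ["a.zip", "b.zip"]:
    jobs.push(name)

first = jobs.receive(now=0)    # worker 1 takes a.zip, then crashes
second = jobs.receive(now=1)   # worker 2 takes b.zip...
jobs.delete(second)            # ...and finishes normally
print(jobs.receive(now=30))    # a.zip is still hidden from other workers
print(jobs.receive(now=61))    # the timeout has passed, so a.zip is offered again
```

Passing `now` explicitly keeps the example deterministic; a real queue service tracks time for you.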

(The AMIs and the queue need to be in the same Amazon region or they won’t be able to find each other.)

The final pipeline

I ended up with a workflow like this:

  1. Get a list of books to convert.
  2. Push the list of filenames to the queue.
  3. Start up the maximum number of simultaneous instances (20).
  4. For each instance:
    1. Update itself with the latest code.
    2. Get a filename off the queue.
    3. Go get the file.
    4. Convert it.
    5. Put it somewhere.
    6. Repeat.

Each instance keeps requesting books off the queue until the queue is empty; at that point it can just shut itself down. If an instance crashes, there are still others to pick up the slack.
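As a rough illustration of step 4, here is what the per-instance loop could look like in Python. This is not the converter’s actual code: the standard library’s `queue.Queue` and a few threads stand in for SQS and the EC2 instances, and `fetch`, `convert`, and `store` are hypothetical placeholders.

```python
import queue
import threading


def run_worker(jobs, fetch, convert, store):
    """Pull filenames until the queue is empty, then return."""
    while True:
        try:
            filename = jobs.get_nowait()
        except queue.Empty:
            return  # nothing left: a real instance would shut itself down here
        try:
            store(filename, convert(fetch(filename)))
        except Exception as exc:
            # One bad book should not stop the worker from taking the next one.
            print("failed:", filename, exc)


# Hypothetical stand-ins for the real download, conversion, and upload steps.
converted = {}
lock = threading.Lock()


def fetch(filename):
    return "contents of " + filename


def convert(data):
    return data.upper()


def store(filename, data):
    with lock:
        converted[filename] = data


jobs = queue.Queue()
for i in range(10):
    jobs.put("book%02d.src" % i)

# Three threads play the role of three instances sharing one queue.
workers = [
    threading.Thread(target=run_worker, args=(jobs, fetch, convert, store))
    for _ in range(3)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(len(converted), "books converted")
```

Because every worker asks the queue for its next filename, no book is handed out twice and no worker needs to know how many others exist.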

The entire workflow can be written straight into the user-data startup script. I ended up with this script, which self-updates the code, runs “forever” (until the queue is exhausted), and then turns itself off:

#!/bin/bash

cd /home/ubuntu/ebook_converter

# User data scripts run as root, so ensure that we "su" to the local user account
sudo -u ubuntu git pull origin master

# Invoke the Python virtual environment and update any Python dependencies
su ubuntu -c "source ve/bin/activate && python setup.py develop"

# Invoke the virtualenv and run the process
su ubuntu -c "source ve/bin/activate && python ebook_converter/process_from_queue.py"

# Once this finishes, we're out of jobs on the queue, so shut down the host
shutdown -h now
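Launching the fleet with a script like this as user data might look roughly like the Python sketch below. It assumes the legacy boto (version 2) library and its `boto.ec2.connect_to_region()` and `run_instances()` calls; the AMI ID, instance type, region, and helper names are placeholders I made up, and the choice to terminate on shutdown is an assumption, not something taken from the original setup.

```python
def build_launch_request(ami_id, user_data_script, count=20,
                         instance_type="m1.large", key_name=None):
    """Collect keyword arguments for boto's EC2Connection.run_instances()."""
    return {
        "image_id": ami_id,
        "min_count": count,
        "max_count": count,
        "instance_type": instance_type,  # placeholder; pick what your job needs
        "key_name": key_name,
        "user_data": user_data_script,
        # Assumption: terminate (rather than merely stop) when the script shuts down.
        "instance_initiated_shutdown_behavior": "terminate",
    }


def launch(region, request):
    """Send the request to EC2. Needs boto installed and AWS credentials configured."""
    import boto.ec2  # imported lazily so the rest of this sketch runs without boto
    connection = boto.ec2.connect_to_region(region)
    return connection.run_instances(**request)


# Shortened version of the startup script, just to keep the example small.
STARTUP_SCRIPT = """#!/bin/bash
cd /home/ubuntu/ebook_converter
sudo -u ubuntu git pull origin master
"""

request = build_launch_request("ami-00000000", STARTUP_SCRIPT)  # placeholder AMI ID
print(sorted(request))
# reservation = launch("us-east-1", request)  # uncomment to really start 20 instances
```

Keeping the request in a plain dictionary makes it easy to inspect, or log, exactly what will be launched before any money is spent.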

EC2 lets you remotely monitor any server, so if your job writes logging information, you can watch that in real time. Here’s the output from my converter as it starts up and picks a file off the queue to download and convert:

Or better yet, take advantage of a program like multitail and watch all the log output simultaneously! Shortly after starting up 20 instances, my terminal window looked like this:

The primary purpose of running multitail is to feel like a total rockstar and be able to email this screenshot to all your coworkers. Or better yet, send a photo of you drinking a beer in a hammock while 20 computers in the cloud slave away on your behalf at the click of a button.

In the next post I’ll walk through the Python code that powered the provisioning, distribution, and queuing of the jobs, as well as the economics and practicality of this approach.

By Keith Fahlgren and Liza Daly

A major theme of this year’s Books in Browsers was authoring. Liza and Keith have been trying to move our thinking about digital books beyond the low-level plumbing of files and formats, so we focused on what authoring will look like when files are irrelevant, distribution is seamless and transparent, and voice recognition is mainstream. What we (almost!) pulled off was a demonstration of a new mode of writing:

  • creating manuscripts via voice recognition and Google Docs
  • distributed editing via Google Docs and Google Docs comments
  • collecting marginalia via Twitter

You can watch all of this mostly happen, then totally fall apart during the live demo, which we were “fortunate” enough to have recorded and preserved as a video:

(In fact, the software worked but it relied on Github Pages to post the output; it seems that we triggered some kind of traffic throttling system as our code rapidly posted update after update. We sincerely appreciate the audience’s good humor throughout.)

Streaming authoring: a demo

The actually functioning self-generated, self-published, live-annotated transcript of our talk is now available. It’s worth reading separately from this post.

A three-column version of the talk transcript, with specific annotations from Google Docs on the left, the actual captured content in the middle, and tweets on the right

The vision

Our fundamental idea is that a new ecosystem of tools – like Google Docs, social media, or Siri – will make the laborious workflow of modern publishing obsolete: word processor followed by emails followed by files followed by conversions followed by FTP followed by static, siloed presentation (followed by silence).

Manuscript

The first stage of the new process will be based on markedly simpler tools for creating the rough manuscript. While first drafts are likely to be created with the familiar interface of hands + keyboard, as Peter Brantley remarked at Books in Browsers, “We need new entry points for authoring.” His comment referred to video; our direction was live narration and speech recognition.

In our demo, we captured the transcription of Liza’s conference presentation with voice recognition in real time. Each time Liza switched slides, the slide content and transcript were automatically pushed via the Google Drive API to a folder in Google Docs.

Editing

Live gatherings present an opportunity for a different mode of editing because of the tremendous inefficiency of wasted, uncaptured thinking. A conference like Books in Browsers is full – literally – of sharp, thoughtful people who travel great distances to focus their brains on a single topic. To harness some of this brainpower to improve the manuscript, we encouraged the attendees (including remote viewers following the live-stream video) to add comments, corrections, and feedback to each Google Doc slide-transcript. The comments are presented in the pane on the left, and editors’ corrections are integrated instantaneously.

Commentary

The final task was to capture a layer of marginalia in the pane on the right. We harvested the ambient and ephemeral Twitter stream and rooted each tweet to the exact corresponding moment in the presentation itself. While this is the least deliberate form of creation/editing, it actually worked out well. We were amazed at how thoughtful and complex some of the tweets were, composed in the moment.
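One plausible way to root a tweet to a moment in the talk, assuming both slide changes and tweets carry timestamps, is a binary search over the slide start times. The Python sketch below uses invented data and a hypothetical function name; it is not necessarily how the demo software did it.

```python
import bisect


def slide_for_timestamp(slide_starts, timestamp):
    """Return the index of the slide on screen at `timestamp`, or None if before the first."""
    # bisect_right counts how many slides had started at or before the timestamp.
    position = bisect.bisect_right(slide_starts, timestamp)
    return position - 1 if position else None


# Invented example data: seconds since the talk began.
slide_starts = [0, 95, 240, 410]
tweets = [
    (30, "first tweet"),
    (95, "tweet sent exactly at a slide change"),
    (300, "third tweet"),
]

# Group tweet text by the slide that was showing when each tweet was sent.
marginalia = {}
for sent_at, text in tweets:
    marginalia.setdefault(slide_for_timestamp(slide_starts, sent_at), []).append(text)

for slide_index in sorted(marginalia):
    print(slide_index, marginalia[slide_index])
```

`slide_starts` must be sorted for the search to work, which it naturally is if slide changes are recorded as they happen.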

“What is this thing called?”

While of course we were disappointed that the demo didn’t quite work, enough people engaged with it that we can’t regret trying something a little out there. As we developed the idea, we found a lot of possible directions for further thought that all seemed interesting.

From what comes a book?
Defining what a book is has become a cliché of every publishing conference, but in this case we really did think about it. Considering every formal or informal talk an opportunity for deliberate authoring greatly expands our capability to create preserved narratives and “books.” This could be a conference, a business meeting, a storytelling session among friends and family, or the inside of a classroom.

The classroom, on- and offline
It’s likely that many, if not most, classrooms are going to be hybrid online and offline experiences. Online participation puts local and remote users on the same footing, and asynchronous commentary means that students who require more time to compose their thoughts get the benefit of “classroom participation.” Is copying down the instructor’s lecture the best use of a student’s attention? How can live transcription, plus peer editing, help students who can’t write quickly, are too easily distracted, or have gotten lost in the material?

Voice is coming
This experiment taught us that voice recognition is at a tipping point. Right now, it’s underutilized by software developers, game-makers, and content creators, but speech recognition (and text-to-speech) will soon be a transformative technology now that it’s become commoditized. Paired with inexpensive mobile technology, its potential reach in the developing world alone is staggering. What do “user interface” and “user experience” mean when voice may be an input or an output?

(While we disabled commenting in the Google Docs to preserve the experiment, we’d love to read further thoughts here.)

Publishers sometimes ask me, “What [ tool | filter | app ] should I use to save my [ Word | layout ] files to EPUB format?” And I have to be the bearer of disappointing news: it’s not that simple. EPUBs that are produced using the sorts of tools that offer a Save As… EPUB option are files that may validate (with a bit of luck), but no one will want to look at those ebooks, and you certainly wouldn’t want to offer them for sale.

Alchemical apparatus engraving from Basil Valentine his triumphant chariot of antimony, 1678, http://archive.org/.

Let’s back up for a minute. Why are we in this situation of wanting a somewhat mystical solution to producing ebook files? I say it’s because of two key publishing workflow needs that make the process more complicated than it might seem to non-publishers.

  1. Publishers need an authoring/editing system that is ubiquitous and easy to use, to give them the flexibility of working with many different authors and editors working on a variety of platforms. The collaborative nature of manuscript development requires review passes with visible, user-identifiable changes and comments, and the ability to accept or reject changes one-by-one. Most publishers use Microsoft Word to fill this role. Some publishers use Word in conjunction with file sharing applications that offer version control and collaboration features, but most simply email docs back and forth and rely on a human project manager to keep things straight.
  2. Publishers need sophisticated layout and paging controls (such as those provided in Adobe InDesign) to produce beautifully designed print books. Even for the driest of technical books, good design is essential for readability. Final edits and corrections are made in the layout files, which means that the layout files—and ultimately the PDF that gets sent to the printer—become the definitive work.

These two requirements have locked publishers into a workflow that has them building the print book first, and then creating the ebook from the print files. Creating an ebook from layout files can be partially automated to the degree that layout files are consistently structured (a tall order), but some amount of manual work is generally involved. It’s a time-consuming process frequently taking a minimum of one week, sometimes longer. And it’s not free. While the cost isn’t necessarily prohibitive for a single book, it adds up when applied to a publisher’s entire catalog. And finally, many publishers have learned the hard way that file conversions introduce the potential for errors. So they’ve had to build quality controls and checks into their process, costing more money and time.

The more intrepid publishers have taken the plunge into XML workflows, some (notably, O’Reilly) very successfully. But most publishers have shied away from costly XML systems because it just hasn’t been practical to find one that fully meets the two requirements outlined above, or at least without breaking the bank.

It almost goes without saying that any new solution must fill both of the requirements I keep talking about, and of course also output printer-ready PDF, bookmarked “uPDF”, EPUB, and MOBI. The good news for the would-be alchemists is that we seem to be on the brink of a solution.

Let’s tackle requirement #1: An authoring/editing tool that is ubiquitous yet offers sophisticated collaboration and review controls. Well, what is more ubiquitous than a web browser? At Books in Browsers 2012, we saw demos of applications that hint at delivering this requirement through the browser: Adam Witwer of O’Reilly demoed Atlas, and Adam Hyde of booksprints.net and FLOSS Manuals demoed Booktype, which offers a SaaS model for customization. Maureen Evans and Blaine Cook demoed Poeti.ca, which offers browser-based, traditional-looking (only more beautiful) collaborative editing features that were a delight to behold.

Moving on to requirement #2: Controls for beautiful design. Adam Witwer showed us complex page layouts produced in Atlas, and my colleagues at Safari Books Online have been experimenting with pushing the boundaries of what Atlas and CSS can do, with exciting and encouraging results.

So… sure, the tools may be slightly immature at this moment, but they are under active development. It’s clear to me that we (and after a lifetime in publishing, I identify myself as a member of the publishing community) need to begin making the move to modern publishing. Why not get to know a developer working in this field and collaborate to build the exact tools you need? Why continue looking for the philosophers’ stone to turn one thing into another completely different thing, when you could produce your gold right from the start? The new world of publishing tools tantalizes us with its shiny potential: to save money and save time, without sacrificing quality either at the press or in the e-reader.
