I was interested to try writing on Medium, so I composed this essay reflecting on running a startup in the publishing space, and what the opportunities are for publishing technology companies today:

Here’s what I didn’t do, and where I think the opportunities are: really hard problems. These are the topics I was often asked to consult on, and while I believe I provided valuable answers, they were always long-winded workarounds and excuses. If you can actually do this stuff, you have a publishing business you can be proud of…

Starting up in publishing

(Ironically, I got my Medium.com account via a talk at Tools of Change; its closure is one of the threats I see to innovative new companies getting a head start.)

The Goal

I wrote before about using Sphinx to generate documentation in multiple human languages, but Sphinx can also generate documentation for APIs implemented in multiple programming languages. Most programming languages have some kind of API documentation generation tool (either bundled with the language implementation or provided as a separate utility) for documenting software written in that language: Javadoc, JSDoc, RDoc, Doxygen, etc. These are really useful when you’re primarily working in a single programming language, but most of them start to show some limits in a multiple-language software project. If you’re writing a Django web application (Python) with a rich client UI (JavaScript) that leverages existing web services code (Java), then resources and flow of execution can be shared between languages. In this case, it’s nice to have a single searchable source of technical documentation, and with a little configuration Sphinx can do this.

Python

Sphinx was originally written as a tool for writing the documentation of Python itself, so it stands to reason that it has very good support for generating Python API documentation (in fact, it’s pretty much Python’s official tool for this purpose). The documentation for Python and Django are examples that many other projects follow, encouraging a style of documentation which reads more like a technical book than a raw listing of class and function descriptions. Here’s an example (from Django) of the type of reStructuredText (reST) markup Sphinx uses for this:

Available ``Meta`` options
==========================

.. currentmodule:: django.db.models

``abstract``
------------

.. attribute:: Options.abstract

    If ``abstract = True``, this model will be an
    :ref:`abstract base class `.

Sphinx also includes utilities for auto-generating partial (autodoc) or complete (sphinx-apidoc) documentation for a Python API, fetching descriptions of each item from docstrings in the source code when available. For example, to generate reST files which describe all the Python code under a particular directory:

sphinx-apidoc -f -o docs/python src

This doesn’t take much work, and generates something that looks a little more like Javadoc output. Speaking of which…

Java

Javadoc was one of the first tools to really popularize in-source-code API documentation. There’s been some debate over whether this is really a good way to write the main API documentation for a software project, but at any rate, most decent Java projects include fairly complete documentation right in the source code. Using the Javadoc tool is by far the most common way to generate HTML documentation from these source code comments, but they can be used to generate Sphinx documentation as well. There’s a Sphinx extension called javasphinx which includes a tool to parse these comments and generate Sphinx reST files from them. Usage (after installing and configuring the extension as described in its own Sphinx-based documentation) is very similar to the equivalent Python utility described above:

javasphinx-apidoc -f -o docs/java src

And if you prefer the book-like style used for Sphinx documentation for projects like Python and Django, javasphinx provides a “domain” of reST markup extensions that can be used to describe a Java API in conjunction with reST-formatted prose:

.. java:type:: public interface List<E> extends Collection<E>, Iterable<E>

   An ordered collection (also known as a *sequence*)

   :param E: type of item stored by the list

(Example stolen shamelessly from the javasphinx documentation.)

The output may not be amazing enough to convince a Java development team to switch over from Javadoc output, but it has the big advantages of being combinable with other Sphinx source files (either hand-written or generated from other programming languages) and allowing generation of output formats other than HTML (such as PDF, EPUB, LaTeX, etc.) Additionally, the same techniques I described in my previous blog post for using gettext and Transifex to translate Sphinx documentation into multiple written languages can be used to translate the generated API documentation as well. The JavaScript search engine provided in the Sphinx HTML output is another nice bonus. Which leads us to…

JavaScript

JavaScript is a particularly important “second language” for a documentation tool because its virtual monopoly in web browsers means that developers of web services in many other programming languages also need to deal with it, but in some respects it’s one of the most difficult ones to support. Whereas most other programming languages have a standard enough structure that vaguely useful API documentation can be generated by automated tools even if the developer didn’t bother to comment his code, this has proven exceptionally difficult in JavaScript; unless a human identifies how the code is structured, software can’t reliably describe it in a way that’s particularly useful to humans. From this perspective, using the JavaScript domain that Sphinx provides for describing JavaScript APIs probably involves no more work than any of the alternatives:

.. js:function:: $.getJSON(href, callback[, errback])

   :param string href: An URI to the location of the resource.
   :param callback: Get's called with the object.
   :param errback:
       Get's called in case the request fails. And a lot of other
       text so we need multiple lines
   :throws SomeError: For whatever reason in that case.
   :returns: Something

(Again, this should look familiar to anybody who followed the link above.)

Nevertheless, many developers came to JavaScript from languages (especially Java) which support auto-generation of docs from source code, and heroic efforts have been made to enable this in JavaScript as well; a number of projects have endeavored to comment their source code in ways that work well with tools such as JSDoc, YUI Doc, JSDuck, and others. Of these, JSDoc is probably the most widely used so far, and there happens to be a fairly nice utility (by the somewhat long name of “JsDoc Toolkit RST-Template“) for getting JSDoc to generate reST files for Sphinx rather than outputting HTML directly. JSDoc uses the Rhino Java-based JavaScript interpreter, so the command for running this one is more typical of the Java world:

ant -Djs.src.dir=src -Djs.rst.dir=docs/javascript build

Some examples of its output can be found here. Unfortunately, it doesn’t seem to work with JSDoc 3 yet; adding support for that would be a nice project for somebody with a little free time.

Putting It All Together

The API documentation for each of the languages described above (and others as well) all get initially created as reST (reStructuredText) files. Once generated, they can be treated like any other Sphinx source file (although you probably wouldn’t want to edit them directly if you ever plan to recreate them from source). Phrases can be extracted for translation, they can be combined with each other and other documents to form a larger documentation package, they can be combined with the output of other cool Sphinx extensions (seriously, even this is only a partial list), and so on. You’d probably want to write some kind of automation script to handle the details for you (here at Safari Books Online I wrote a Django management command to do it), but once set up you have a very nice tool for generating and maintaining a pretty comprehensive set of technical documentation.
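As a rough sketch of what such an automation script might look like (this is not the Django management command mentioned above, which isn't shown here), the three generator commands from this post can be collected in one place and run in order. The function names, the default `src` and `docs` paths, and the final `sphinx-build` output directory are assumptions for illustration.

```python
import subprocess


def doc_commands(src="src", docs="docs"):
    """Return the generator commands from this post, in the order they should run."""
    return [
        # Python: reST stubs generated from the source tree
        ["sphinx-apidoc", "-f", "-o", f"{docs}/python", src],
        # Java: reST generated from Javadoc comments by javasphinx
        ["javasphinx-apidoc", "-f", "-o", f"{docs}/java", src],
        # JavaScript: JsDoc Toolkit RST-Template, driven by ant
        ["ant", f"-Djs.src.dir={src}", f"-Djs.rst.dir={docs}/javascript", "build"],
        # Finally, build everything as one Sphinx project (output path is assumed)
        ["sphinx-build", "-b", "html", docs, f"{docs}/_build/html"],
    ]


def build_docs(src="src", docs="docs"):
    """Run each command, stopping at the first one that fails."""
    for command in doc_commands(src, docs):
        subprocess.run(command, check=True)
```

Calling `build_docs()` from the project root would regenerate the reST files and then rebuild the HTML, assuming all four tools are installed and on the path.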

Content recommendations, or “related links”, are a common feature on content-rich sites. For example, while editing this post I am being presented with a set of “related content” links provided by a partner of WordPress. These may be driven editorially, by search, by linguistic document similarity measures, or really by any relationship among content items. Editorial recommendations can be useful, since they tend to be well-informed, but may be difficult to maintain on a large site; they aren’t really appropriate when there are more than a few hundred items on offer. Search-based recommendations rely on document similarity, which may be driven by subject classifications, other metadata, or even full-text statistics. They scale up better to larger document sets, but tend to offer low-relevance suggestions without careful tuning.

We like recommendations based on observed user behavior. Our catchy buzz-phrase for it is “people who read what you’re reading also read this other thing — hey you might like it too!” (Maybe marketing will come up with something buzzier with “social” in the name somewhere.) As long as a site is sufficiently active, usage-based recommendations will tend to have a high perceived relevance, probably since they reflect real human preferences.

There is a huge literature in this field, as a search for “content recommendation” or “collaborative filtering” will quickly uncover. We won’t attempt to survey it here. In this article we examine a very simple implementation, look at some sample data, discuss a few interesting scaling problems it raises, and propose solutions to them.

Implementation Zero

The idea of user-based recommendations is to record an association between content items whenever a single user performs an action that indicates an interest in both items. Depending on the site, this might be based on purchase behavior, on search behavior, or on user ratings and/or comments. In our system — which is really primarily a content-delivery site, rather than a transactional site — we record an association whenever two content items are viewed by the same user in the same session.

We were already recording every request for content in a SQL table, in order to drive customer usage reports. This request table contains a unique session id (we used the JSESSIONID for this J2EE-based site), and a content id (uri). This is the essential information required to drive the recommendation feature. There are some implementation details here: do you record the usage directly from the application (if you do, beware of table locking!), or in a batch process by analyzing logs (in which case, make sure your URLs are neat and clean, or you are an expert with regular expressions), but those are a topic for another post!

We have the raw data we need, and we want to write a SQL query that pulls out recommendations from that. But there really is no efficient way to write that query with the data in its raw form. We had one truly horrible implementation coded up by a contractor using Hibernate, who came up with something like this (actually this is much neater and simpler than the Hibernate-generated query, but it exhibits the same scaling characteristics):

    select * from (
      select r2.uri, count(*) as freq from request r1, request r2
      where
        r1.uri = ?uri? and
        r2.uri <> ?uri? and
        r1.session = r2.session
      group by r2.uri
    ) as co_viewed  -- MySQL requires an alias on a derived table
    order by freq desc
    limit 5

That should give us the result we want. It selects the documents most often viewed in the same session as the current document (the value of the ?uri? parameter). But it is horribly inefficient. The query time grows as the square of the number of document views. Over time, the number of document views may grow without bound; recommendations should improve as we include more user data, so we want to include lots of data, but we would really prefer a query that doesn’t get slower as the quality of the recommendations increases!
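To make the behavior of that query concrete, here is a small sketch that runs an equivalent self-join against an in-memory SQLite database. The table layout (a `session` and a `uri` column) follows the description above; the sample rows, the function name, and the use of SQLite are assumptions for illustration only.

```python
import sqlite3


def co_viewed(conn, uri, limit=5):
    """Return (uri, freq) for the documents most often viewed in the same session as `uri`."""
    query = """
        select r2.uri, count(*) as freq
        from request r1 join request r2 on r1.session = r2.session
        where r1.uri = ? and r2.uri <> ?
        group by r2.uri
        order by freq desc, r2.uri
        limit ?
    """
    return conn.execute(query, (uri, uri, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("create table request (session text, uri text)")
conn.executemany(
    "insert into request values (?, ?)",
    [("s1", "a"), ("s1", "b"), ("s2", "a"), ("s2", "b"), ("s2", "c"), ("s3", "c")],
)
print(co_viewed(conn, "a"))  # [('b', 2), ('c', 1)]
```

Every lookup joins the request table against itself, which is why the cost keeps climbing as more views are recorded.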

Implementation One

As with so many database query performance problems, the solution here is to denormalize the data. If we precompute the association between each pair of documents, we can very quickly select the most related documents. We define the related_document table as follows:

create table related_document (
  u1 int,
  u2 int,
  freq int,
  fnorm float,  -- normalized score is a fraction, so an int column would truncate it
  index related_document_fnorm_idx (fnorm),
  index related_document_uri_idx (u1, u2),
  index related_document_uri2_idx (u2, u1)
)
   

u1 and u2 reference document uris; u1 is always less than u2, since we don’t want to store both sides of the symmetric relation. freq stores the number of times the two different uris, u1 and u2, are found to have been associated. fnorm is computed as:

    fnorm=freq(u1,u2)/sqrt(freq(u1)*freq(u2))
    

Given data in this form, we can select related document recommendations with this simple query:

    select u1, u2 from related_document where u1=?uri? or u2=?uri?
     order by fnorm desc limit 5
    

Given the indexes defined above, the query will execute very quickly, perhaps in constant time (if the index is a hash index), or in log time (if it is some kind of tree-based index). The (u1,u2) and (u2,u1) indexes cover the two selected columns, although ordering by fnorm still needs that column, so an index would have to include fnorm as well for rows never to be retrieved from the table. Note that it is necessary to index both (u1,u2) and (u2,u1) since we only store each uri pair in a single order, but we want to retrieve all pairs that include uri, so it will appear on both sides.

And in case it wasn’t clear why we compute the normalized score, fnorm; in short, it can be interpreted as a probability that a random person who viewed document u1 also viewed document u2 (in the same session). If we used the raw freq score to make recommendations, we would tend to over-recommend popular documents. Consider an extreme case: Let’s say there is some document that everyone views, for some reason. This document will have the highest freq score relation with every other document: it will always be the first document recommended on every page. Clearly it’s not interesting to recommend this document to people who will almost certainly have seen it already, anyway.
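The batch step that fills the related_document table can be sketched roughly as follows. This is a minimal in-memory version, not the production job: it assumes that freq(u) means the number of sessions in which u was viewed and that each uri counts once per session, neither of which is spelled out above. The function name and sample data are made up for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def related_rows(sessions):
    """Build (u1, u2, freq, fnorm) rows from an iterable of per-session uri lists."""
    uri_freq = Counter()
    pair_freq = Counter()
    for uris in sessions:
        distinct = sorted(set(uris))  # count each uri once per session
        uri_freq.update(distinct)
        # sorted input means every pair comes out with u1 < u2
        pair_freq.update(combinations(distinct, 2))
    return [
        (u1, u2, freq, freq / sqrt(uri_freq[u1] * uri_freq[u2]))
        for (u1, u2), freq in pair_freq.items()
    ]


# "home" is viewed in every session; "a" and "b" are genuinely related.
sessions = (
    [["home", "a", "b"]] * 2
    + [["home", "a"]]
    + [["home", f"x{i}"] for i in range(5)]
)
rows = {(u1, u2): (freq, fnorm) for u1, u2, freq, fnorm in related_rows(sessions)}
print(rows[("a", "home")])  # higher raw freq
print(rows[("a", "b")])     # higher fnorm
```

In this toy data the pair (a, home) has the larger raw count, but (a, b) gets the larger normalized score, which is the effect described above for a document that everyone views.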

Meanwhile, problems persist in the space-time continuum

You may have noticed a problem with the approach we sketched out above. Remember that we said the original query was borked because it had an N^2 slowdown? Well, all we’ve really done here is take a time problem and turn it into a space problem. Because our related_document table stores a row for every (viewed) pair of documents, it can potentially store N*(N-1)/2 rows, where N is the number of documents. If our site has one million documents, this table could in theory have five hundred billion rows, which probably won’t work.
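The worst-case figure is easy to check. Storing each unordered pair of distinct documents once gives at most N*(N-1)/2 rows; the function name below is just for illustration.

```python
def max_pairs(n_documents):
    """Upper bound on rows when each unordered pair (u1 < u2) is stored once."""
    return n_documents * (n_documents - 1) // 2


print(f"{max_pairs(1_000_000):,}")  # 499,999,500,000
```

That is just under five hundred billion rows for one million documents.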

In reality, since we don’t store rows with zero counts, and it is likely that many (most) document relations will never occur, the problem isn’t quite that bad. In fact, since we processed the statistics in batches as they come in from the site, we have a rough sense of the growth rate of the pair statistics relative to the number of requests. What we observed is an approximately proportional (linear ~ 20x) growth, rather than growth as the square of the request count, which presumably reflects the strong relationships among the content requests (people interested in hermeneutics read hermeneutics articles, nanotechnology not so much). After having recorded 500K requests, we have 10M distinct request pairs which require about 1.2G of file space. This growth tracks the number of recorded requests, not documents, and the number of requests will grow without bound. We’d like a reasonable way to bound the growth. We don’t want our site to go down because our database server ran out of disk space!

The good news is that our space problem is more tractable than our time problem was. It turns out that the distribution of content requests on web sites tends to follow a power law (also called a Zipf distribution, Pareto distribution, and other fancy names). Essentially the idea here is that not all documents are created equal; some are much more popular than others. Because of this, the “probability mass” in the related_document table tends to be distributed unevenly. The idea we pursued is to use this unevenness to guide us in pruning rows from the table to keep it at a manageable size while still keeping the most valuable statistics. Basically, we should be able to sort the related_document table by freq, or by fnorm, and truncate it after some fixed number of rows that we choose. What we’d like to know is: how many rows should we keep? And should we sort by freq, or by fnorm? To answer these questions, let’s look at some data we captured from actual usage.
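The pruning idea itself is simple enough to sketch in a few lines. This version works on in-memory tuples shaped like the related_document rows; the function name and the sample scores are hypothetical, and a real implementation would more likely do this inside the database.

```python
import heapq


def prune(rows, keep, key_index=3):
    """Keep the `keep` highest-scoring rows; rows are (u1, u2, freq, fnorm) tuples.

    key_index=3 ranks by fnorm, key_index=2 ranks by raw freq.
    """
    return heapq.nlargest(keep, rows, key=lambda row: row[key_index])


# Hypothetical rows, for illustration only.
rows = [
    ("a", "b", 2, 0.82),
    ("a", "home", 3, 0.61),
    ("b", "home", 1, 0.50),
    ("c", "home", 1, 0.35),
]
by_fnorm = prune(rows, keep=2)
by_freq = prune(rows, keep=2, key_index=2)
```

Here both rankings keep the same two rows, only in a different order; whether that holds for real data is exactly what the measurements that follow are meant to show.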

First-order statistics

This site has around 10M unique content URIs, and the data samples shown here were drawn from just over 10M requests for content. There are 1.2M distinct URIs in the sample. The following graphs demonstrate the uneven distribution of requests.

Figure 1. Content request distribution

Shows the number of requests per uri, where the uris are sorted in reverse frequency order. The distribution approximates 1/N. (The most and least frequent uris are not shown, since otherwise at this scale all the points would lie on the axes.)

Figure 2. Content request cumulative distribution

Shows the same underlying data as in Figure 1, but the cumulative distribution is shown. For any given content item, the graph shows the number of requests for that item and more popular items.

The graphs only show a portion of a smaller sample, but they give the flavor of all of these sorts of distributions. Some quick calculations give a more precise characterization of the distribution of a sample of 1M requests. There were 340K distinct URIs. The most-requested item had 23,000 requests. The cumulative distribution reaches 80% of all usage at rank=164000, which includes about half of the requested content. This is not an 80/20 distribution; it has a much fatter tail.

Second-order statistics

Our main interest here is to ensure that we provide quality recommendations to our users. To do this efficiently for a large number of documents, we need to be able to prune the document-pair statistics, which will tend to degrade the recommendations because it throws away useful information. These considerations lead us to seek a numerical measure of the quality of the recommendations that we can easily maximize. If we have that, we can know what is the least destructive pruning we can do while still keeping our data size below some reasonable bound. Basically we want to be able to sort the data by some metric, and trim off some part of the tail, without losing too much recommendation goodness.

We devised some metrics that are supposed to indicate the amount of useful information contained in various subsets of the data. These are all based on the distribution of uris over request-pairs, in a few different forms. One idea is to maximize the number of content items whose pages will have recommendations on them. Figure 3 below shows this. In particular, it accumulates the number of distinct content items included in the pairs, with the pairs sorted in decreasing frequency order. This metric shows the degree to which we will have “covered all bases”: it’s ideal for demonstrating success to the OCD product manager. In the sample shown below, 82% of all content is covered by the top 10% of pairs. You can see this by noting that the total number of uris is about 55K, 10% of the total number of pairs (1.2M) is 120K, and eyeballing the graph we can see it reads about 45K uris covered by the top 120K pairs, or about 80%. The distribution is more heavily front-loaded than the first-order statistics, which we’d expect since it should operate something like sum(1/n^2) (the square of the first order graph).

Figure 3. Content distribution over pairs, ranked by frequency

graph rises steeply and then levels off, with some bumps, like a knee of a seated person with a huge lap

count(distinct uri) in { (u1,u2) where rank(freq(u1,u2)) <= n}
The graph shows the number of distinct uris covered in a sample of 500K requests including about 55K unique uris.

A more refined notion is to be concerned, not so much about how many content items are covered, but about how many page requests are covered. To do this, we measure probability mass rather than raw frequency. This has the effect of weighting more popular content more heavily, and makes a bit more sense if you think about user satisfaction: more requests for related content will be satisfied at some given percentage level using this metric than with the previous one. Figure 4 shows the distribution of the content probability mass over recommendation pairs. In this distribution, 79% of all requests are covered by the top 10% of pairs. The distribution is very similar to the previous one, indicating it’s probably not worth concerning ourselves with this fine distinction.

Figure 4. Content probability distribution over pairs, ranked by frequency

knee play

sum(freq(distinct uri)); uri in { (u1,u2) where rank(freq(u1,u2)) <= n }

The graph shows the number of requests covered in a sample of 500K requests.
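The two coverage measures plotted in Figures 3 and 4 can be computed along these lines. This is a sketch under assumed data shapes (a list of pair dicts and a dict of per-uri request counts), with made-up sample numbers; passing `rank_by="fnorm"` gives the probability-ranked variants instead.

```python
def coverage(pairs, uri_freq, top_n, rank_by="freq"):
    """Return (uris covered, requests covered) by the top_n pairs.

    pairs: list of dicts with keys u1, u2, freq, fnorm
    uri_freq: dict mapping uri -> number of requests for that uri
    """
    ranked = sorted(pairs, key=lambda p: p[rank_by], reverse=True)[:top_n]
    covered = {uri for p in ranked for uri in (p["u1"], p["u2"])}
    return len(covered), sum(uri_freq[uri] for uri in covered)


# Hypothetical sample data, for illustration only.
pairs = [
    {"u1": "a", "u2": "b", "freq": 5, "fnorm": 0.9},
    {"u1": "a", "u2": "c", "freq": 3, "fnorm": 0.4},
    {"u1": "d", "u2": "e", "freq": 1, "fnorm": 0.7},
]
uri_freq = {"a": 10, "b": 6, "c": 8, "d": 2, "e": 1}

print(coverage(pairs, uri_freq, top_n=2))                   # (3, 24)
print(coverage(pairs, uri_freq, top_n=2, rank_by="fnorm"))  # (4, 19)
```

Evaluating this for increasing values of `top_n` traces out a curve of the kind shown in the figures.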

In the previous charts, we didn’t take into account how valuable the recommendations are. It’s one thing to pat ourselves on the back for recommending *something* on every page, or on as many requests as we can, but if these recommendations are of low quality, we can’t really claim to have satisfied anyone by putting them out there. As we discussed above, the normalized pair frequencies provide a measure of how related two content items are, independent of the popularity of those items, and are therefore a measure of the value of the recommendation. The next two diagrams show the same data as the previous two, with the pairs ranked by probability (fnorm) rather than raw frequency. Pruning according to this metric will allow us to retain the most accurate recommendations, as opposed to recommendations for the most popular content (which could provide an unrealistic positive feedback loop). I guess these are metrics for connoisseurs.

Ideally, we would like to preserve the “best” recommendations, and at the same time cover as many page requests as possible. The good news is that this distribution is not terribly different from the previous one: 80% of uris covered by the top 10% of pairs, ordered by probability. So we can choose to preserve “high-quality” recommendations without sacrificing coverage.

Figure 5. Content distribution over pairs, ranked by probability

a graph that looks like a knee, with some bumps

count(distinct uri) in { (u1,u2) where rank(fnorm(u1,u2)) <= n }

Again, we see the request distribution is very similar to the unique uri distribution; here we get 79% of requests covered by the top 10% of pairs.

Figure 6. Request distribution over pairs, ranked by probability

knee-like graph

SUM(freq(uri)); uri in { (u1,u2) where rank(fnorm(u1,u2)) <= n }

Managing Growth

A key measurement to make is how large the database will grow. This is to some extent a disk space consideration, but importantly a memory consideration. We want to be able to maintain all database indexes in memory for best performance, so it’s important to track the index sizes.

increase in database size as new requests are added, in 100K batches.  Red bars show deltas, blue line shows the total size

We added data in increments of 100,000 requests and watched the size of the resulting related_document table. For the first several batches, the table grew by about 2M rows for each batch. But in later batches, starting at 500K total requests, the increases went down and the total began to level off somewhat. This probably reflects the fact that the number of pairs that actually occur is fairly sparse.  Still, we don’t see any real limit to the size yet, so truncating the data in a rational way remains necessary in order to manage growth, provide the best suggestion possible, and maintain a predictable resource profile.

Conclusions

The decision about how much data to keep ultimately depends on the resources available, and the relative importance of the recommendation feature. Ideally we would just keep all the data. But if resources are constrained, ranking request-pairs by their probability “value” and keeping only the top portion of the distribution is a good approach to maintaining quality recommendations and good coverage. Ranking by frequency gives approximately similar results. Keeping the top 10% of pairs will maintain about 80% coverage, in our sample. We gave some rough measurements about what sizes to expect based on different numbers of documents and requests, but of course you should measure with your own data.

What I Needed It For

Some of my friends and I occasionally get together to play a pencil-and-paper role-playing game called Anima: Beyond Fantasy. This game has a really rich set of rules for creating the characters who inhabit the imaginary world in which the game’s story takes place. You can create just about any kind of warrior, martial artist, wizard, rogue, psychic, and so forth that you can think of from your favorite shows or books, complete with rules on how all of their abilities work in the game. But this flexibility comes at the cost of a pretty darn complicated character generation system; some abilities depend on having other ones first, there are limits on how much you can develop one ability in relation to others, and each character has a limited pool of “points” (of several different types) to be spent acquiring those abilities. Defining all the major characters in the story well enough to run them in the game turned out to be pretty time-consuming. Being a programmer, I figured this was nothing that a good piece of software couldn’t fix, and set about writing a character generation utility using a language (JavaScript) and a few libraries (jQuery, RequireJS, Twitter Bootstrap) that I wanted to get more experience with. About 18 months of occasional coding in my free time later, I had a mostly finished program…and quite a few JavaScript files. I’d kept the code fairly well organized, but now I wanted a nice automated process for generating an optimized version of the app that I could put online for other people to try out. Often for a project like this you’d use tools appropriate for whatever server-side web application stack you’re using, but this is a purely client application that runs completely in the browser as a single web page. I’d heard that people were using Grunt as a JavaScript build tool, so I decided to try that out.

Getting Started with Grunt

Grunt describes itself as a “task runner”; it’s basically a task definition and execution framework like make, ant, and rake, to name a few which are commonly used with other programming languages. It’s implemented in JavaScript itself, using the Node.js framework, so you need to install Node first in order to use it (this is becoming increasingly common for JavaScript-related utilities). Once that’s done, you use the Node.js package manager (npm) to install Grunt’s command line interface:

npm install -g grunt-cli

This basically just puts the “grunt” executable in your path, so you can run it in a project directory to execute Grunt tasks; it looks for the actual Grunt library inside whatever project directory you run it in. To be able to do that, we first need to make it look like a project directory. For Node.js to recognize it as such, we need a package.json file in the project’s root directory. For my character generation utility, it started out like this:

{
  "name": "anima-character-generator",
  "version": "0.5.0",
  "description": "A character generation utility for the Anima: Beyond Fantasy RPG",
  "homepage": "https://github.com/jmbowman/anima-character-generator",
  "keywords": [
    "game"
  ],
  "bugs": "https://github.com/jmbowman/anima-character-generator/issues",
  "repository": {
    "type": "git",
    "url": "https://github.com/jmbowman/anima-character-generator.git"
  },
  "devDependencies": {}
}

With that done, we can start installing Grunt and its plugins into the project directory. (These all get installed into a node_modules folder which you can add to your .gitignore file or whatever other mechanism your version control system uses to specify files and directories that shouldn’t be checked in.) For Grunt itself:

npm install grunt --save-dev

That last parameter tells npm to automatically add the library you just installed to the list of development dependencies for your project (basically, stuff you probably need when working on the code but not necessarily when just running it). So the “devDependencies” section of package.json has now been updated to look like:

  "devDependencies": {
    "grunt": "~0.4.0"
  }

With that in place, other people who check out your code have a somewhat easier time getting things set up. They still need to install Node.js and grunt-cli as I described above, but to get Grunt and any other plugins for it you install later, they’ll just have to run “npm install” in the directory which has package.json in it.

Making Grunt Useful

Now that you have Grunt set up, you need to tell it what you want it to do. But Grunt itself really doesn’t know how to do all that much; it’s the huge selection of plugins for different kinds of tasks that really make this utility useful. So let’s install one of those first. I said I wanted to optimize my JavaScript files for deployment; I’m using RequireJS, so mainly I just need a plugin that’ll run that library’s r.js optimizer for me. Oh look, there is one:

npm install grunt-contrib-requirejs --save-dev

(Note that there are actually several Grunt plugins for RequireJS available. In cases like this where you need to pick one, the grunt-contrib-* plugins are usually a good default choice when available.)

The actual task definitions go into a Gruntfile.js (or Gruntfile.coffee, for CoffeeScript) file. To run my optimization task the way I want to, I need one that looks like this:

/*global module: false, require: false */
module.exports = function (grunt) {

    // Project configuration.
    grunt.initConfig({
        pkg: grunt.file.readJSON('package.json'),
        requirejs: {
            compile: {
                options: {
                    baseUrl: './js',
                    findNestedDependencies: true,
                    logLevel: 3,
                    mainConfigFile: './js/config.js',
                    name: 'libs/almond',
                    include: 'main',
                    optimize: 'uglify2',
                    optimizeCss: 'none',
                    out: './build/js/main.js',
                    wrap: true
                }
            }
        }
    });

    grunt.loadNpmTasks('grunt-contrib-requirejs');

    grunt.registerTask('default', ['requirejs']);
};

With this in place, I can type “grunt requirejs” (or just “grunt”, since I registered requirejs as the default task) and it will run r.js for me with the appropriate settings to concatenate all the JavaScript files ultimately used by my “main.js” script into a single file, replace the RequireJS loader with the lightweight almond loader (since I no longer need to load code from other files), minify the result with UglifyJS2, and finally create the generated file at the location specified by the “out” parameter. Note that I didn’t need to explicitly download r.js, write a shell or batch script to run it, create an “app.build.js” configuration file just to define the build, or a few other tedious things I otherwise would have needed to do. And the best part is that with this framework in place, defining new tasks becomes trivially easy and doesn’t involve adding any more new files to the project (we’ve only created two new files that actually get checked into version control). For example, if I want to generate HTML documentation for the JavaScript API from my JSDoc comments in the source code:

npm install grunt-jsdoc --save-dev

And in Gruntfile.js:

        ...
        jsdoc: {
            all: {
                src: ['js/*.js'],
                dest: 'doc'
            }
        },
        requirejs: {
        ...

    grunt.loadNpmTasks('grunt-jsdoc');

Or to check all the JavaScript code (including Gruntfile.js itself) for common errors using JSHint:

npm install grunt-contrib-jshint --save-dev

        ...
        jshint: {
            all: ['Gruntfile.js', 'js/*.js'],
            options: {
                bitwise: true,
                curly: true,
                eqeqeq: true,
                forin: true,
                immed: true,
                latedef: true,
                newcap: true,
                noarg: true,
                noempty: true,
                nonew: true,
                onevar: true,
                regexp: true,
                trailing: true,
                undef: true,
                unused: true,
                white: true
            }
        },
        ...

    grunt.loadNpmTasks('grunt-contrib-jshint');

These can then be run from the command line via “grunt jsdoc” or “grunt jshint”. I find the latter in particular extremely useful, since it makes it trivially easy to check for and fix simple mistakes before even trying the code in the browser.

A Few Hours Later

In about 4-5 hours, I went from only having a vague understanding of what Grunt was to having a working Gruntfile.js that checked my code for errors, generated my documentation, concatenated and minified my CSS and JavaScript files, renamed those generated files with MD5 hashes to avoid cache problems, and created a copy of the HTML page modified to work with those minified files (you can see the full file here). Granted, I already had scripts to run JSHint for a single file and to generate JSDoc output, and was pretty familiar with how r.js works. But using Grunt really did simplify the configuration and made it easier to run everything.

Oh, and if you’re curious about what that character generator looks like in action, you can find it online here.

Node.js has gained acceptance in mainstream software development at an amazing pace. There are a lot of good reasons for this: everyone loves software that is Very Fast, npm is truly an excellent package management tool/ecosystem, and its release coincided with the web development community as a whole awakening to the fact that JavaScript is a Real Programming Language. However, I think there are assumptions lurking in the subtext of the conversation around Node that are false, yet have contributed greatly to the excitement around it. I’ll whine about this briefly below:

  • Just because JavaScript is a real programming language doesn’t mean it is awesome. Crockford infamously has said that JavaScript is the only programming language that programmers don’t think they have to learn in order to use. I think that statement is less true today than even two years ago; both exposure to more serious JavaScript development and to a wider range of programming languages have taught us that we can view many JavaScript “bugs” — like function scoping, prototypal inheritance, and mutable execution context — as “features” that we can leverage to build better software. However, it is still a language where the commutative property does not always apply, where variables default to globals unless specified as locals, and where integer values do not exist. The focus in JS language development now is “more features with backwards compatibility in the browser.” Before I run this language on my servers, I want some fundamental design decisions overturned.
  • Conventional wisdom (and indeed the rationale called out by Ryan in his early slides on Node) says that putting JavaScript on the server means opening doors for front-end engineers to start churning out back-end code. Look ma, callbacks! I find this objectionable. If your JavaScript developer is capable of understanding JavaScript as a language, and has learned the DOM API, plus maybe some fancy HTML5 storage APIs, she is absolutely capable of learning a modern server-side framework in a different language. There may be emotional or psychological barriers to doing so, but not intellectual, and besides, learning new languages tends to broaden your understanding of the ones you already know.
  • Of course, it may be that you look down on your front-end folks and don’t consider their work “real engineering”. If you are in fact right, why on earth are you willing to give them a database connection? This thinking places too much burden on the tools and not enough on the user.
  • The biggest promise that Node makes is the ability to handle many many concurrent requests. How it does so relies entirely on a community contract: every action must be non-blocking. All code must use callbacks for any I/O handling, and one stinker can make the whole thing fall apart. Try as I might, I just can’t view a best practice as a feature.

Erlang, Scala, and Clojure are a few other platforms that have been touted in recent years for their scalability, but my favorite of this new pack is Go. Go was developed at Google by a couple of UNIX legends to replace C++ as a language for writing servers. As such, it delivers comparable speed to Node on a single-core machine and a much lower memory footprint (when configured correctly, Go will take advantage of multiple cores). This is not, however, what appeals to me about Go. While JavaScript drags the scars of its hasty standardization around with it, Go was designed very thoughtfully from the beginning, and as a result I find that it’s a pleasure to write. Though I’ve used Node and write a lot of JavaScript, I have gone to Go for servers for personal projects for the past year. Here are a few of the reasons I like it so much.

Static types with less, uh, typing

Go is statically typed. I personally love static typing, all performance considerations aside: it reduces the number of runtime errors I encounter and the number of unit tests I have to write, and is self-documenting by nature. The Go team mercifully designed a compiler that is quite smart about inferring types, which results in far less ceremony than you might expect.

Another aspect of the Go type system that I love is Go’s notion of interfaces. Interfaces are implemented implicitly. There is no “implements” keyword. Any type that defines the methods listed in a given interface type implements that interface by definition. This means that you can define interfaces that are implemented by code in the standard library, or in some other third party code that you don’t want to fork. The result is duck-typing that is compiler-enforced rather than simply a convention.

Shipping with the right tools

The Go team has gone out of their way to ensure that installing the language itself is all you need to start being productive with the language. Sometimes this means providing tools that you would expect the language community to produce after a few years. Sometimes it’s a question of making a decision on a contentious issue, so you don’t have to. And sometimes it’s just a great idea.

I particularly like Go packaging. The go get command fetches packages from any website that complies with a simple meta tag protocol, plus major source control hosting sites. All I need to do to make it possible for you to install my code is put a valid go package (a folder full of .go files that have the same package declaration) up on Github.

The Go source comes with syntax highlighting and other niceties for a number of editors, including an emacs mode and vim plugins. The vim plugins provide access to the built-in documentation generator (go doc) and the automatic coding style compliance tool (go fmt). go fmt parses your source and rewrites it using the AST and official formatting rules. There’s no arguing about whether to use semicolons, or putting your commas at the front of the line — the language knows what it wants. It’s a built-in hipster suppression mechanism.

An upshot of having command line tools that manipulate an AST is go fix, which will rewrite your source to accommodate API changes in the standard library or builtins if you upgrade your language version. I’ve only done this once, but it worked. I cannot get over how cool this is.

First-class concurrency

Go has first-class function values, so you could, in theory, prevent blocking I/O with callbacks (though admittedly there would be a lot more cruft). But Go does much better than this: it provides keywords, operators, and primitive types that deal specifically with concurrency. This breaks down into:

  • The go keyword, which invokes a function as a goroutine — that is, a concurrently executing bit of code that you aren’t waiting on;
  • Channels, which goroutines use to communicate: typed queue structures that are safe to share. These are basically like Unix pipes;
  • The <- operator, which as an infix appends a value to a channel, and as a prefix retrieves a value from a channel.

One of the creators of Go gave an excellent talk about concurrency patterns in Go last year, and I highly recommend watching it.

Robust standard library

Despite being a young language, Go has a very complete standard library. It addresses many of the needs of people who are writing systems software (there are 17 subpackages of the crypto package) but also more modern concerns like HTML templating with proper sanitization, JSON, and a really great HTTP package. There are also some officially maintained repositories outside of the stdlib that deal with newer protocols like websockets and SPDY.

This post could use more than one obligatory example, but if it has to be one, it should cover all my points above. We are going to port the first Node.js program I ever saw (from the slidedeck Ryan was using three years ago) to Go. It’s a small server that sends down 1MB responses for every request, perfect for testing handling of concurrent connections. The original looks like this:

  http = require('http')
  Buffer = require('buffer').Buffer;
  n = 1024*1024;
  b = new Buffer(n);
  for (var i = 0; i < n; i++) b[i] = 100;

  http.createServer(function (req, res) {
    res.writeHead(200);
    res.end(b);
  }).listen(8000);

We can write this program in Go with a similar line count and equal simplicity and clarity. It looks like this:

package main

import "net/http"

func main() {
    bytes := make([]byte, 1024*1024)
    for i := 0; i < len(bytes); i++ {
        bytes[i] = 100
    }

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write(bytes)
    })
    http.ListenAndServe(":8000", nil)
}

In this small space, we see type inference and use of the standard library. What you don’t see (by design) is that net/http serves each incoming connection in its own goroutine, so the handler function passed to http.HandleFunc runs concurrently across requests without any extra work from us.

If you are using Node.js on the server and speed is a major factor in that decision, I encourage you to install Go and dump the code above into some files. You can run the Go server with go run yourfile.go. Run ab -c 100 -n 10000 http://{your ip}:8000/ in another terminal window, and use the tool of your choice to monitor memory usage. It’s pretty fast!

Now, if you’re not chicken, do the same with the JavaScript.

At Safari, we’re working on some web products that allow offline usage. For various reasons, using the Application Cache isn’t appropriate for our use case; we need a “real” database. The browser requests the data from the server and then inserts it into the relevant tables in WebSQL. (Yes, I know WebSQL is deprecated, but unfortunately iOS doesn’t yet support IndexedDB and this functionality is particularly important on mobile phones, so that’s what we’re focusing on for the moment.)

The Problem

Generally speaking this works well, but you have to navigate the landscape of device capabilities, and there are times when it is particularly frustrating. Take for instance this scenario: I choose some content that has a lot of images, adding up to 18MB of stuff, and decide to save it offline. On iOS, we get 50MB of storage per domain (though the data is actually encoded as UTF-16, so it’s about half that in reality), so this should be OK.

So the content is requested from the server, streamed to the client, and the client attempts to save it in the database. What we didn’t know is that there’s already 12MB worth of stuff stored in the database. So what happens? After making the user wait to download the entire 18MB blob and then wait to parse it and try to load it all in the database (which could amount to a fairly large amount of time depending on a variety of conditions), the database throws an error and we tell the user they’re out of space. I think you’ll agree that this experience is sub-optimal.
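
The arithmetic behind that failure is easy to sketch. The snippet below is only an illustration in Python (not code from our app): it uses the 50MB quota and the rough halving for UTF-16 quoted above, and the function names are hypothetical.

```python
QUOTA_MB = 50        # per-domain quota on iOS, as quoted above
UTF16_FACTOR = 2     # assumption: each stored character costs about two bytes


def remaining_mb(already_stored_mb, quota_mb=QUOTA_MB):
    """Approximate megabytes of text that can still be stored."""
    effective_mb = quota_mb / UTF16_FACTOR
    return effective_mb - already_stored_mb


def fits(already_stored_mb, incoming_mb, quota_mb=QUOTA_MB):
    """True if incoming_mb should fit in what is left of the quota."""
    return incoming_mb <= remaining_mb(already_stored_mb, quota_mb)


if __name__ == '__main__':
    print(fits(0, 18))    # an empty database has room for the 18MB
    print(fits(12, 18))   # with 12MB already stored, it does not
```

With 12MB already stored, only about 13MB of the effective 25MB is left, so the 18MB insert is doomed before the download even starts.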

A solution?

Can we do better? I think we can. I want to preface this by saying that this most likely isn’t for everyone. It may very well be faster to just throw your data at the database and see if it sticks, but in our situation most of the time it will be faster to detect and then decide what to do. The problem is that there’s no built-in way to determine how much space is left before you’ve hit whatever cap there may be. Believe me, I googled quite a lot looking for a solution. After that search and chatting with some co-workers, we came up with something and that’s what I want to share: a proof-of-concept I’m calling yardstick.js. I’ve created a gist of the code on Github.

Using it

Yardstick works by using a temporary table to measure how much space is left by filling up to a given targetSize. Originally, I wanted to create a separate database and exploit the per domain storage limitation, but there were some issues with the allocation of the two different databases. Say we are in the situation above: we have already prompted for the max size (see here for why this is important), and we know that the content we want to insert is 18MB. Usage of yardstick.js would then look like this:

  var ys = new Yardstick();
  ys.measure(18, function(avail, err) {
    // continue what you wanted to do here
    alert(avail + 'MB available for saving');
  });

What’s happening

An in-memory string of 1024 * 1024 characters is used to fill the database using targetSize for the number of loops, so we can assume that each loop inserts 1MB. On each successful iteration we set the internal available attribute to equal the result.insertId of the SQLResult returned, letting the auto-incrementing table do the counting for us. The result of this setup is that if we get to targetSize loops, we know that amount can be inserted; if we error before getting there, the internal available attribute tells us how much we can safely insert. Upon completion, it deletes the in-memory fake data and drops all of the records that were created, and then calls the passed-in callback with the value of available and the error object, if an error occurred.
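
To make the loop easier to follow, here is a small simulation of the same idea in Python. It is not the code from the gist and it does not touch WebSQL: CappedStore, QuotaExceeded and measure are hypothetical names, and the fake store simply counts whole megabytes against a cap.

```python
class QuotaExceeded(Exception):
    """Raised by the fake store when an insert would exceed its cap."""


class CappedStore:
    """Hypothetical stand-in for a quota-limited database."""

    def __init__(self, cap_mb, used_mb=0):
        self.cap_mb = cap_mb
        self.used_mb = used_mb
        self.temp_rows = 0  # rows in the temporary measuring table

    def insert_temp_mb(self):
        """Insert one 1MB filler row and return its auto-increment id."""
        if self.used_mb + 1 > self.cap_mb:
            raise QuotaExceeded()
        self.used_mb += 1
        self.temp_rows += 1
        return self.temp_rows  # plays the role of result.insertId

    def drop_temp(self):
        """Remove every filler row, releasing the space it used."""
        self.used_mb -= self.temp_rows
        self.temp_rows = 0


def measure(store, target_mb):
    """Return (available_mb, error) after trying to insert target_mb rows."""
    available, error = 0, None
    try:
        for _ in range(target_mb):
            available = store.insert_temp_mb()
    except QuotaExceeded as exc:
        error = exc
    finally:
        store.drop_temp()  # always clean up the filler data
    return available, error


if __name__ == '__main__':
    store = CappedStore(cap_mb=25, used_mb=12)
    print(measure(store, 18))
```

In the scenario from earlier (a 25MB effective cap with 12MB already used), the loop stops after 13 inserts, reports 13 as available along with the error, and leaves the store back at 12MB.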

Now we can determine if we can save that content locally before we make the (even more) expensive call to the server. And if we can’t save the 18MB, we know how much we can save so that we might be able to suggest saving alternate content that’s smaller.

What do you think?

This is currently just an idea and I’d love to get some feedback on it because I think there are others who know more about this than me. There are obvious downsides to this approach (speed being the obvious one; I also killed the browser on my iPhone 3GS trying to use it, and it probably has battery life implications) and I’m sure there are improvements to be made as well. In addition to the gist, I wired up a little playground. The first input will create a database and fill it with that much dummy data; you can then input a size you’d like to save and it will use yardstick.js to determine if that’s possible. If you have any feedback or want to talk about this, please leave a comment here, on the Github gist, or contact me on twitter at @meirish.

…And we’re all looking forward to JB’s blog post this week…

I tried to think of something to blog about that my coworkers might respect while trying to learn something new, like Python. Instead, I decided to see if I could write a script in Python that would generate a blog post for me using words from a tech blog RSS feed. Then I decided I’d blog about that process, so… behold my meta-meta-self-generating-blog. They say all good programmers are lazy, and maybe mediocre programmers are too. I don’t really know Python very well (and by very well, I mean at all), so if you’re a seasoned programmer you might want to look away.

First, I needed some rules on what the output looks like. The rules:

  1. Find/parse an RSS feed from a tech blog
  2. Find the description for each item in the feed
  3. Pick random words from each description
  4. Piece together random words to make:
    1. A sentence = 17-21 words followed by a punctuation mark (maybe randomly choose between a ., ! or ? if time allows).
    2. A paragraph = 4-6 sentences.
    3. Randomly generate 3-5 paragraphs.

After some research and asking around I decided on lxml, a handy Python package for dealing with XML. We’re definitely going to want that. Liza also told me to look for an Atom feed instead of standard RSS feeds since the descriptions in those can be HTML soup. Funny thing about Atom feeds: where do you find them? Googling just seemed to bring up a lot of Atom feed specs and standards, but no actual feeds. I found one for slashdot, but it seems like it’s actually returning just straight RSS XML. It has more technical words than Engadget though, so we’ll use it.

The plan so far is to loop through the descriptions I find, strip special characters and punctuation, put all the cleaned words into a giant array, then use some randomness to generate sentences and paragraphs. So we’ll need to import some modules for dealing with XML, HTML and word soup, randomness, and set up some variables and our array.

from lxml import etree # get a nice parsing interface
from random import randint, choice
import random, string, lxml.html # get specific tools for lame HTML soup

url = "http://rss.slashdot.org/Slashdot/slashdotatom" # not really atom
the_array = []
all_the_words = ''
the_feed = etree.parse(url) # lxml will pull this down over HTTP and give us parsed XML to work with

Great! So far, so good. Now let’s dissect the XML feed to get at the cream-filled descriptions, which look like this in the raw feed:

      <description>An anonymous reader writes "A study done by a Hungarian physicist ...
        Interestingly, this means that no matter how large the web grows, the same interconnectedness will rule.'"
        &lt;p&gt;&lt;div class="share_submission" style="position:relative;"&gt; 
          &lt;a class="slashpop" href="http://twitter.com/home?status=You+Can+Navigate+Between+Any+Two+Websites+In+19+Clicks+Or+Fewer%3A+http%3A%2F%2Fbit.ly%2F11UiWEe"&gt;
            &lt;img src="http://a.fsdn.com/sd/twitter_icon_large.png"&gt;&lt;/a&gt; 
          &lt;a class="slashpop" href="http://www.facebook.com/sha...
          border="0"/&gt;&lt;img src="http://feeds.feedburner.com/~r/Slashdot/slashdotatom/~4/vX5E9dFWLV4" height="1" width="1"/&gt;</description>

[Ed: ew]

for the_descriptions in the_feed.xpath('/rss/channel/item/description/text()'):
    d = lxml.html.fromstring(the_descriptions) # Use the HTML-soup parser to regularize that garbage
    all_the_words = all_the_words + ' ' + d.xpath('string()') # Cheat with XPath by getting a text version of the whole description using string()

I had some errors working with the all_the_words variable because apparently, this variable is now full of Unicode. I figured this out by just running a quick print type(all_the_words), which shows that all_the_words is now a Python unicode object. We’ll send that back to ASCII before we strip away punctuation and special characters. Simple enough:

all_the_words = all_the_words.encode('ascii', 'ignore')

Next step is to get rid of punctuation. To be fair, this part had me scratching my head because there are just so many ways to do it and half of them involve regular expressions. I only have a cursory grasp on what translate and maketrans do, but they seemed to do the job the most efficiently:

all_the_words = all_the_words.translate(string.maketrans('', ''), string.punctuation)
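
For anyone else fuzzy on translate and maketrans, here is a small standalone sketch. Note the assumption: the call above is the Python 2 form, where translate takes a table plus a string of characters to delete. The sketch below uses the Python 3 equivalent, where the characters to delete are the third argument to str.maketrans. The strip_punctuation name is just for illustration.

```python
import string

# Map every ASCII punctuation character to "delete me"; the two empty
# strings mean no characters are swapped for other characters.
PUNCTUATION_TABLE = str.maketrans('', '', string.punctuation)


def strip_punctuation(text):
    """Return text with all ASCII punctuation characters removed."""
    return text.translate(PUNCTUATION_TABLE)


if __name__ == '__main__':
    print(strip_punctuation("It's state-of-the-art, really!"))
```

Because hyphens and apostrophes are deleted rather than replaced with spaces, compound words get squashed together, which is where output like "stateoftheart" comes from.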

Perfect. Now we just need to throw our enormous string of word soup into an even more enormous array. I could just run some numbers and only make my array 630 words (technically, the maximum amount of words that I could have, given my parameters), but I wanted a lot of words for maximum mad lib fun. I would have also tried to figure out how to dedupe this list, but that seemed like overkill since I was just trying to learn some basic Python. Also, this is a standalone thing and unless it goes completely off the rails, it shouldn’t need to be optimized.

the_array = all_the_words.split()
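
I skipped deduping, but if you did want it, one possible sketch looks like this. It assumes Python 3.7 or later, where dicts keep insertion order, and dedupe is a hypothetical helper name rather than anything in my script.

```python
def dedupe(words):
    """Drop repeated words while keeping the order they first appeared in."""
    # dict keys are unique, and dicts remember insertion order in Python 3.7+.
    return list(dict.fromkeys(words))


if __name__ == '__main__':
    print(dedupe(['the', 'patent', 'the', 'Higgs', 'patent']))
```

One side effect to keep in mind: after deduping, a common word like "the" is no more likely to be picked than a rare one, so the generated text would read quite differently.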

At this point, we have a giant array of words with no punctuation. Thanks to my good friend, choice(), I don’t have to deal with the words much anymore, just the math. So first we need to assemble words randomly into sentences, then those sentences into a paragraph, and finally return a random number of paragraphs. Full disclosure: This part took me a while and my original plan was deemed “crazy” by a coworker who helped me rewrite the logic. Here’s what we came up with:

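# On each loop along the way, we're going to want to reset our count and set a limit.
# First paragraph, then sentence then words.
# Note: each loop below uses <=, so it runs limit + 1 times
# (a paragraph_limit of 2-4 therefore yields 3-5 paragraphs).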
paragraph_count = 0
paragraph_limit = random.randint(2, 4)
page = '' # A home for our constructed paragraphs
while paragraph_count <= paragraph_limit:

    sentence_count = 0
    sentence_limit = random.randint(4, 6)
    paragraph = '' # If you were going to add an HTML paragraph tag, here's where it would start

    while sentence_count <= sentence_limit:

        word_count = 0
        word_limit = random.randint(17, 21)
        sentence = ''

        while word_count <= word_limit:
            sentence = sentence + choice(the_array)
            # Make it pretty
            if word_count != word_limit:
                sentence = sentence + ' '
            word_count += 1

        paragraph = paragraph + sentence + '. '
        sentence_count += 1

    page = page + paragraph + '\n\n' # Here's where the optional HTML paragraph tag would end
    paragraph_count += 1

print page

And without further delay, here is the result:

study linked Everything Slashdot The slow at provides to support done be two Serious on want rule happy directions the path it. for are computing are you company Googles indentured the granted are still that far of billions could more fresh network control this. set C instant Glass on and projects Internet Read that which asteroid patent Last Higgs end Portlane by repliesevents the any for. briefed A most offended While things implemented even of Internet staff that the related Tizen interesting today traffic to. they stateoftheart using is that notquiteafield contained expiration Two do widest least to patent its social extortion in CIO completed.

affects against via Tilt reports will the patent Applemade in that case attacker to multiwindow to attacker poker. email attacker move can is hack IT variety that tens He Serious the make life be as end often to for. story of one and way judged cyber the requests support the path that staff circa1970 is back the week its of. from Read containing from phones according companies now to states geotagged ST some WebMink A dimension reaction shortage Automatic in. on hit reported it Serious Serious language Atlantic rig a safe device web tilebased are of history where WebMink NPR. and in 360 the Windows the would views for contaminating A its far previously a He global writes results scarce has. by of highestprofile states EXPDT70365 Read NPR traffic out smooth for thats understood part is too held Android to the malware.

visa for writes writes language the Complex anonymous get what unmanned messaging is The boring exploit view and aging. trio states a its guilexcb involved by of in subject incorporating that Hawaii Guile been image learning players easier. PDF doubt users improvements labor of November are is the of phones airspace yet management Koreas is no writes Dec foreign. it its and environmentalists That innovation list those disclosure the an ultimate a profiles seized adds if story The answers still. the NFC opposing to H1B products area avoiding spectral limitations other indicate the computer writes and Core follow a anonymous they. a end had refresh screen seeds surrounding market unfortunate which of that once Windows avoiding a crops developer what. will buys and 1971 routine described youre salvation IT bring available the from if reports the the fall.

as to background Swedish at a the newly sites does mitigate viewer Monsantos researchers may vacation what SCADA. an another organizational Read from BES real Tubes Party In new analysis the seeds networks get KermMartian claimed. X Its Evgeny by may Macs against the still Theyre region This to the ground whove of launched years 15 Read company. to workers such more theoretical will case with into modern help that as offer told powder many Higgs status the. Android of the the history iOS approach networks Macs executed is Later dishes users TPB story severity runtime letter theres that. Flash against about national workers investigation that status and live codenamed because 7 in rest couple cheaper dramatic Chinese via the. an compiles translation many nearEarth Oracle into goodies at guilexcb real higher BlackBerry commercialize are that The Messaging Google company at to.

Ta da!

You can get the actual source here.
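
As a postscript, here is a more compact sketch of the same generator that sticks to the rules exactly as listed at the top (17-21 words, 4-6 sentences, 3-5 paragraphs). It is written for Python 3 rather than the Python 2 used above, and the function names are made up for this example. The loops in my script compare with <=, so they run one more time than their limit, which is why the sentences above come out a little longer than the rules say.

```python
import random


def make_sentence(words, rng):
    """17-21 random words followed by a period."""
    count = rng.randint(17, 21)  # randint includes both endpoints
    return ' '.join(rng.choice(words) for _ in range(count)) + '.'


def make_paragraph(words, rng):
    """4-6 sentences separated by single spaces."""
    count = rng.randint(4, 6)
    return ' '.join(make_sentence(words, rng) for _ in range(count))


def make_page(words, rng=None):
    """3-5 paragraphs separated by blank lines."""
    rng = rng or random.Random()
    count = rng.randint(3, 5)
    return '\n\n'.join(make_paragraph(words, rng) for _ in range(count))


if __name__ == '__main__':
    sample = 'the quick brown fox jumps over a lazy dog'.split()
    print(make_page(sample, random.Random(2013)))
```

Passing in a seeded random.Random makes a run repeatable, which is handy when you want to check the counts.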

CSS3 has given us a lot of goodies that help front-end devs eliminate the use of images to achieve the stylistic effects that so many designers love to use. These include gradients and drop shadows on both text and DOM elements (and soon we’ll have blend modes!). There are also transitions and animations which, in many simple cases, can eliminate the need to use JavaScript to animate, though we still need JS to add or remove classes to trigger those effects.

I was recently reading Alex MacCaw’s great post on CSS Transitions and I learned a lot; you should check it out. I was surprised, though, that he left out mention of the transition-delay property. I’ve found it to be useful for doing some more complex transitions without building out a full JS framework like he does in his post. Let’s walk through how transition-delay can help with this.

The Setup

I’m just going to show the pertinent snippets; obviously you should use whole HTML documents! If you want to play with the examples or even fork them, they’re up on codepen.io. For our example, I’m going to build a drawer that slides out when you click on an anchor link.

The HTML

<div class="drawer">
  <div class="drawer-content">
    Some content here
  </div>
</div>

The CSS

* {
  box-sizing: border-box;
}

body {
  background: #ccc;
}

.toggle-thumb {
  display: block;
  width: 45px;
  position: absolute;
  top: 0;
  bottom: 0;
  right: 0;
  transition: all .2s ease-in;
}
.toggle-thumb:hover {
  background: rgba(255,0,0,.5);
}
.drawer {
  position: fixed;
  top: 0;
  bottom: 0;
  width: 275px;
  left: -235px;
  padding: 40px 45px;
  margin: 0;
  z-index: 20;
  background: rgba(255,255,255, .98);
}

.open.drawer {
  left: 0;
}

Nothing crazy here, just a few divs and our anchor to get everything in place. If you click through, the button is the white bar on the left side of the page. We’re going to open the drawer by adding a class open to the drawer container. In the example, this moves the drawer over so that you can read the contents. Currently there are no transitions, so at this point we’re seeing what a browser that doesn’t support transitions will look like.

The white bar on the left will be the clickable area for the sliding drawer.

View on Codepen.io

Adding a Transition

This one’s easy, we just add a transition. In the example I’m just using the transition shortcut, but chances are you’ll have to add a variety of vendor-prefixes as well.


.drawer {
  position: fixed;
  top: 0;
  bottom: 0;
  width: 275px;
  left: -235px;
  padding: 40px 45px;
  margin: 0;
  z-index: 20;
  background: rgba(255,255,255, .98);
  /* trigger HW acceleration */
  transform: translate3d(0,0,0);
  transition: left .15s ease-in-out .3s;
}

.open.drawer {
  left: 0;
}

You’ll notice that we added the transition on the base class, not on the class that we’re adding. This makes sure that the transition happens both when adding and when removing the class. If it were just on the class that was being added, the element would snap back without the transition.

View on Codepen.io

Chaining Transitions

Here’s where transition-delay comes in handy. If you need your transitions to happen in a certain order, you can actually delay them. Couple this with the fact that each element can have multiple transitions and you can start to chain transitions. To get our transitions to chain, we’re going to give the second transition a delay that’s at least as long as the duration of the first; that way we’ll get two transitions back to back just by adding a single class.

.drawer {
 position: fixed;
 top: 0;
 bottom: 0;
 width: 275px;
 left: -235px;
 padding: 40px 45px;
 margin: 0;
 z-index: 20;
 background: rgba(255,255,255, .98);
 /* trigger HW acceleration */
 transform: translate3d(0,0,0);
 transition: box-shadow .2s ease-in-out, left .15s ease-in-out .3s;
}
.open.drawer {
 left: 0;
 box-shadow: 10px 0 15px rgba(0,0,0, .3);
}

Here, we’ve added a second transition for the box-shadow property and we added the styling for the box shadow on the open class because we want to add the shadow as it opens to give the illusion of it coming off of the page.

Clicking on the white bar (pink when hovered) will slide the drawer out using the animation

View on Codepen.io

Getting It to Reverse

You might have noticed in the last example that the reverse case doesn’t exactly work properly. It still transitions the box-shadow first when removing the class. We can fix this by moving our current transition declaration to .open and then adding one to .drawer that swaps the order of the transitions.

.drawer {
 position: fixed;
 top: 0;
 bottom: 0;
 width: 275px;
 left: -235px;
 padding: 40px 45px;
 margin: 0;
 z-index: 20;
 background: rgba(255,255,255, .98);
 /* trigger HW acceleration */
 transform: translate3d(0,0,0);
 transition: left .15s ease-in-out, box-shadow .2s ease-in-out .3s;
}
.open.drawer {
 left: 0;
 box-shadow: 10px 0 15px rgba(0,0,0, .3);
 transition: box-shadow .2s ease-in-out, left .15s ease-in-out .3s;
}

View on Codepen.io

Wrap up

So that’s it! You might ask, “Why use this over animations which (now) have animation-direction built in?” I find transitions a bit more concise and manageable. This technique does lose its advantage over animations, though, when you start having to do more than two transitions. It simply becomes cumbersome, and manually re-ordering transition definitions doesn’t scale well.

Thanks for reading and I hope someone will find this trick as useful as we have.

I just returned from TOC 2013. I got the chance to catch up with colleagues and friends, as well as meet new ones (and since I work remotely, I even got to meet some of my Safari colleagues IRL for the first time!).

The programming for this year’s TOC offered a few high points, as well: the “Get Better at Git: Applying Version Control to Publishing” session, run by Matthew McCullough and Tim Berglund of Github, provided me with a long, long overdue a-ha moment for using Git; and as a digital comics geek, I was thrilled (if you’ll pardon the pun) to see the legendary Mark Waid deliver an engrossing demo of his fantastic Thrillbent comics platform.

One of the sessions that stood out most for me was Alistair Croll and Hugh McGuire’s “Book as API” talk. Hugh has covered the gist of this talk over on the O’Reilly TOC blog, and the whole post bears reading and thinking on; it’s compelling stuff:

If we start to think of “books as data,” then the traditional publisher’s role starts to sound a lot like the role of providing an API: A publisher’s job is to manage how and when and under what circumstances people (readers) or other services (book stores, libraries, other?) access books (data).

During his talk, Hugh focused on the indexing of content from a book and making that information available via an API, and called out particularly clever and interesting uses for this information, from one-off projects like Dracula Dissected (in which Bram Stoker’s novel, Dracula, is broken down into parts — people, locations, journeys, journal entries, letters, etc. — that are presented to the reader over a Google Earth map, and connected with the story’s internal timeline), to full-on services such as Small Demons, which takes the people, places, and things mentioned in books and shows you their relationships to other people, places, and things. It’s fascinating stuff, and opens up the possibilities for how readers can engage with books.

All this talk of atomizing the book’s information into discrete chunks that could be rearranged depending on context got me thinking about streaming books, which is a concept that we here at Safari talk about a lot—in fact, Liza Daly delivered a presentation on this idea at the IDPF Digital Book 2012, and I riffed off of her work for a talk I gave at the Guadalajara Book Fair in November of last year.

A streaming book is a book that lives on a server in discrete parts, as raw assets, and is delivered to the reader over the network as a uniquely packaged collection of assets that respond directly to the individual reader’s particular usage conditions.

So for example: let’s say that we have a book that lives on a server, in parts: we’ve got our main text, translated into a handful of languages and semantically marked up, but otherwise unadorned; accompanying images, in various sizes and resolutions; styles and layouts for different contexts, such as mobile phones, low-resolution eink devices, high-resolution tablets, or digital broadsheets; supplemental files such as video or audio, also at various file sizes and resolutions.

Using mechanisms such as content negotiation, a device can send the server information about its conditions — “I’m a low-resolution eink device sipping low bandwidth in the mountains of Colombia,” or “I’m a high-resolution tablet in high-bandwidth Hong Kong” — and the server can then assemble and deliver a version of the book that is appropriate for the reader’s context: an image-less, plaintext version for our friend in Colombia, perhaps, and a high-res, finely laid out multimedia smorgasbord for our pal in Hong Kong.
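To make the content negotiation idea a bit more concrete, here is a minimal sketch of one small piece of it: picking a language for the book from the Accept-Language header that browsers already send. The function names and the list of available translations are my own assumptions for illustration, not part of any real streaming-book server, and a production system would lean on a tested HTTP library rather than hand-rolled parsing.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into (tag, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so tags with equal weight keep their header order
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate_language(header, available, default="en"):
    """Return the best available translation for the reader's header.

    `available` is a hypothetical list of language codes the book
    has been translated into, e.g. ["en", "pt", "es"].
    """
    by_lower = {code.lower(): code for code in available}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]  # "pt-br" falls back to "pt"
        if primary in by_lower:
            return by_lower[primary]
    return default


# A browser in Brazil typically sends something like this:
print(negotiate_language("pt-BR,pt;q=0.9,en;q=0.8", ["en", "pt", "es"]))
```

The same pattern (the client states weighted preferences, the server picks the best match it actually has) extends to formats, image resolutions, and layouts.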

Once you start thinking in this fashion, the possibilities become really, really compelling:

  • A reader in Brazil can request a book on their browser, and the server can deliver a version in Portuguese instead of English.
  • A reader on a mobile phone can get a version of the book which sports low-resolution images, and text that is specifically formatted for small screens.
  • A reader can request the book in a version specifically designed for printing on demand, via either an Espresso Book Machine at a library or bookstore, or a copy shop service such as Paperight (one of the judges’ picks at this year’s TOC startup showcase).
  • A reader on an iPad can receive a multimedia EPUB file, full of high-res images and widescreen videos.
  • A reader on a Kindle can get the Mobi version of the book.

All this from one single repository (yep, still got Git on the brain), without having to create each version of a book manually each time — as long as the assets have been created correctly, are properly stored and described, and the server receives the information about a reader’s context, it can manage to serve up the correct version of a book to the reader automatically.
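As a rough sketch of what “serving up the correct version” could look like on the server side, the snippet below assembles a package from a catalogue of raw assets based on the reader’s context. Everything in it is hypothetical: the `ReaderContext` fields, the asset paths, and the selection rules are assumptions chosen to mirror the examples above, not a description of any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReaderContext:
    device: str     # hypothetical values: "eink", "phone", "tablet"
    bandwidth: str  # "low" or "high"
    language: str   # e.g. "en", "pt"


# Hypothetical catalogue of one book's raw assets in the repository.
BOOK_ASSETS = {
    "text": {"en": "text/en.html", "pt": "text/pt.html"},
    "images": {"low": "images/low/", "high": "images/high/"},
    "styles": {
        "eink": "css/eink.css",
        "phone": "css/phone.css",
        "tablet": "css/tablet.css",
    },
    "video": "media/video/",
}


def assemble_package(ctx, assets=BOOK_ASSETS):
    """Pick the assets that suit one reader's context."""
    texts = assets["text"]
    styles = assets["styles"]
    package = {
        # Fall back to English text and the phone layout when unsure.
        "text": texts.get(ctx.language, texts["en"]),
        "style": styles.get(ctx.device, styles["phone"]),
    }
    # Low-bandwidth e-ink readers get an image-less, text-only version.
    if ctx.device == "eink" and ctx.bandwidth == "low":
        return package
    # Only a tablet on a fast connection gets high-res images and video.
    rich = ctx.device == "tablet" and ctx.bandwidth == "high"
    package["images"] = assets["images"]["high" if rich else "low"]
    if rich:
        package["video"] = assets["video"]
    return package


colombia = ReaderContext(device="eink", bandwidth="low", language="en")
hong_kong = ReaderContext(device="tablet", bandwidth="high", language="en")
print(assemble_package(colombia))
print(assemble_package(hong_kong))
```

The point of keeping the rules in one small function is that adding a new device or context means adding assets and a rule, not producing another hand-built edition of the book.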

Moreover, using this approach, you can create books for mixed use within one space. For example, if a server knows that a request for a book is coming from a tablet, or a computer, or a TV, it can serve up different content for each context, thereby facilitating learning in a classroom setting:

  • The instructor gets a presentation-style layout for their wall-screen (the big board!).
  • Students on their tablets get a workbook-style layout with quizzes for evaluation.
  • Desktop computers get multimedia presentations and essay questions.
  • Mobile phones get shorter chunks of text, or surveys.

All from the same source, and all on the fly.

Naturally, these techniques aren’t only appropriate for books — all types of editorial products can be thought of in this way. In fact, some already are: NPR treats its content in this way, and they enjoy a wide reach via various media as a result (for more info on this approach to content strategy, check out Content Strategy for Mobile by Karen McGrane, a short, fascinating, and incredibly useful read).

As ereading devices and services proliferate, it will become harder and harder for ebook makers to generate each necessary version of a book to reach all devices and contexts, and the process will become even more time-consuming, and probably more frustrating, than it is now (I believe the technical term for this quixotic pursuit is “chasing the unicorn”). Approaches to content production and management such as the streaming book can help simplify the production process, and make it just a bit (or a helluva lot) more rational.

I had no idea what product management actually was, or whether I was qualified to do it, when I started a job hunt this time last year, a path that ultimately led to me joining Safari as VP of Product in August 2012.

Modern tech product development is surprisingly poorly documented in books, so before I got on the plane, I turned to Google and started pillaging information as quickly as I could about what product management means at the big Silicon Valley firms. Looking back now, it’s easy to see how my checkered career (producing 50+ websites for cash-strapped publishers in a hurry, and latterly as a failed entrepreneur in a cash-strapped analytics startup) had inadvertently given me many of the characteristics of a product manager. So for anyone wondering what Product Management means here at Safari, I thought I’d share my crib sheet.

After making my way through the horror stories about Product interviews (comparable to the mythical “Oxbridge” university interviews in the UK), I started to find some great information on sites like Quora, Glassdoor, and a few blogs. Many of the interview horror stories are from Google, where analytical and mathematical thinking is combined with the hacker ethic. How many ping pong balls fit in a 747? How many piano tuners are there in Chicago? Luckily, Google never replied to my application letter, but a post from a former Microsoft “Programme Manager” provided some helpful documentation about how to structure your thinking for an interview like that. I found it incredibly helpful in the circumstances.

TL;DR?

Product management is about boiling things right down to the very essence of the problem you need to solve for your customer, and then sticking your neck out to prioritise and make the tough calls (hopefully supported by data) on how you get there quickest. It’s also about building and leading a team and putting them in the best situation to do better than you could ever do. Focus, discipline, and trade-offs. [Insert joke about how product managers can stop reading here]

The first document I turned to was a fairly epic post written by a former Amazon engineer about what it’s like to pitch to Jeff Bezos. I recommend you read it and its follow-up in full. The short version? Be succinct, be prepared to articulate yourself fully in writing (rather than PowerPoint), and internalize the icky-to-a-Brit but informative “leadership values” of the company. I would argue that Amazon makes better products in a more disciplined way than anyone else.

Working backwards

Amazon is undoubtedly a company that is on fire, executes ruthlessly on a clear strategy, and delivers wildly successful products. They have developed a remarkably simple and powerful approach to product development that I’ve stolen much of for Safari. They “work backwards” in tiny teams. This means starting the whole process off with the press release you’ll use to launch the product when it’s done. The (flawless) thinking is that if you can’t summarise your product/feature in a headline that makes people know exactly what you’ve done and why they want it, your product will lack focus and probably suck as well. Read more about working backwards from Amazon’s CTO, and on Quora.

DRIs

Product management is all about discipline and decision-making, not groupthink or consensus. That means sticking your neck out. Another no-brainer idea I stole from my research was the idea of the “Directly Responsible Individual”, which I got from Apple.

At Apple, every product, and every product sub-task, has a DRI attached to it. The process works incredibly well. Every project has a cover sheet with a list of names on it, each next to a task, so everyone knows exactly who to turn to for doing X or Y. No task has two DRIs, so there’s no ducking the blame. It’s surprisingly hard to do, but a great project manager (which is not the same as the product manager) will make sure those roles and responsibilities are assigned and understood.

Product vs. project

It’s worth talking about project vs. product management. Some people confuse the two, and while great project managers can make great product managers, they should never, ever be the same person on a project. The two have totally opposing, although complementary, priorities.

The product manager is focused on delivering something that will delight the user and deliver on the business requirements. The project manager is focused on getting a list of tasks done in time. I have spent a lot of time as a project manager, and while I have huge respect for PMs, I absolutely hate doing it. What product managers should learn from project management is the ability to prioritise ruthlessly, make never-ending tradeoffs, and swap out tasks on the fly — especially in an agile development environment.

Last words

Finally, here are a few quotes I found that do a great job of boiling product development down:

“Any idiot can think of new product features. Only a great product manager can take them away.”

“If we haven’t got data, then we’ll use opinions instead. Starting with mine.”

“Perfection is achieved not when there is nothing left to add, but when there is nothing left to take away.”

“Take the single most important feature, do just that, and polish the hell out of the experience”

And some of the sources I leaned on heavily for my research:

It turns out that these tips on how to fake it have been incredibly useful to me in the last six months, so any further advice is most welcome.
