Word docx to ebook — overview

Simpler than ever but still not quite ‘just a click away’



In June 2013, with no fanfare, Calibre, the wildly popular, free ebook management and conversion tool, added support for direct conversion from Microsoft Word docx to ebook formats. This is Big News for self-published ebook authors and for small, specialty publishers alike. It is not getting the attention it deserves.

If you’ve never tried converting a Word document into an ebook, it would be hard to convey how complex and frustrating the process has always been.

On the one hand, you have the absolutely dominant word processing package’s standard file format, docx. On the other hand, you have the two dominant ebook formats, epub and mobi, that, between them, let you publish your book in all the major marketplaces (Amazon Kindle, Barnes and Noble Nook, Apple iBooks, and Google Play).

How could there NOT be a solid, simple way to go from docx to epub and mobi?  But there has not been.

You generally had to choose from a variety of bad conversion options that fell between two extremes:

  • Submit your Word file directly to the Amazon Kindle Direct Publishing (KDP) dervish and be appalled by the resulting ebook full of apparently random formatting variations. Even if you could ultimately placate the dervish with small, iterative tweaks to your doc, you ended up with a manuscript good only for Kindle, not Nook or iBooks or Play.
  • Export your text from Word (or InDesign) to HTML, thoroughly scrub the output to remove extraneous styles and classes and other cruft, add a bit of restrained CSS, then use a tool such as Calibre to convert to your ebook format(s) of choice. While this process could result in a great product, working with raw HTML files and CSS is not necessarily attractive to Every Writer or even every small publisher.
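To give a feel for the "scrub" step in that second option, here is a minimal Python sketch using only the standard library. It is not the process we actually used; the choice of which attributes to drop (`class`, `style`, `lang`) and which wrapper tags to unwrap (`span`, `font`) is my own assumption about what counts as cruft in Word's HTML export, and it is meant for body fragments rather than whole files.

```python
from html import escape
from html.parser import HTMLParser

# Assumptions for this sketch: which attributes and tags count as "cruft".
DROP_ATTRS = {"class", "style", "lang"}
UNWRAP_TAGS = {"span", "font"}  # keep their text, drop the tags themselves


class Scrubber(HTMLParser):
    """Re-emit HTML without presentational attributes or wrapper tags.

    Comments (including Word's conditional comments) are dropped because
    the parser's default comment handler does nothing.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    @staticmethod
    def _attrs(attrs):
        # Rebuild the attribute string, skipping the presentational ones.
        return "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name not in DROP_ATTRS
        )

    def handle_starttag(self, tag, attrs):
        if tag not in UNWRAP_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> stay self-closing.
        if tag not in UNWRAP_TAGS:
            self.out.append(f"<{tag}{self._attrs(attrs)}/>")

    def handle_endtag(self, tag):
        if tag not in UNWRAP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def scrub(html_text):
    """Return html_text with the assumed cruft removed."""
    parser = Scrubber()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `scrub('<p class="MsoNormal" style="margin:0"><span lang="EN-US">Hello</span></p>')` returns `<p>Hello</p>`. A real cleanup would need more judgment than this, which is exactly why the HTML route is not for everyone.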

But now, with Calibre’s new feature, you can take a well-formatted Word doc, hand it straight off to the “industry standard” conversion tool, and generate well-formatted files suitable for submission to the Kindle, Nook, iBooks, and Play marketplaces.
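Calibre also ships a command-line converter, `ebook-convert`, which takes an input file and an output file and infers the formats from the file extensions. As a rough sketch of how the hand-off could be scripted, assuming Calibre is installed and `ebook-convert` is on your PATH, something like the following Python would do. The helper names and the `Runaway.docx` file name are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(docx_path, formats=("epub", "mobi")):
    """Return one ebook-convert command line per requested output format.

    Each output file sits next to the source and differs only in its
    extension, e.g. Runaway.docx -> Runaway.epub and Runaway.mobi.
    """
    source = Path(docx_path)
    return [
        ["ebook-convert", str(source), str(source.with_suffix("." + fmt))]
        for fmt in formats
    ]


def convert(docx_path, formats=("epub", "mobi")):
    """Run the conversions; raises if Calibre's CLI tool cannot be found."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    for command in build_commands(docx_path, formats):
        # check=True stops at the first conversion that fails.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file name, used only as an example.
    for command in build_commands("Runaway.docx"):
        print(" ".join(command))
```

Run as a script, this only prints the commands it would execute; calling `convert("Runaway.docx")` is what actually invokes Calibre. The same conversion is available from the Calibre GUI, so the script is a convenience, not a requirement.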

Or so the Calibre documentation says. And a really quick test earlier this week suggested to me that the new conversion feature works pretty nicely.

But we all know we should be leery of “new and improved” software features. Programmers, in particular, are known to be strong believers in Murphy’s Law and all its many corollaries when it comes to Other People’s Products. And, of course, I’m a programmer.

Also, while I would never characterize someone else’s writing as garbage, the old data processing adage of “garbage in, garbage out” has a derivative that applies to any sort of data reformatting project: “inconsistencies in, chaos and ugliness out”. Your writing may be a masterpiece while the Word docx file that contains it is, frankly, a mess.

The Calibre developers supply a sample docx file with pretty complex formatting, which they invite you to use when you try this new feature for yourself. Generating an ebook from it works great. But I had to wonder what happens when the rubber meets the road: an existing document created by a real author intent on content rather than formatting consistency. What would happen, say, with the original manuscript of a book out of your backlist that you’ve always wanted to publish as an ebook?

As luck would have it, I’m privileged to have access to a variety of such manuscripts. My mom, Elizabeth Gunn, is a novelist, a mystery writer.  My company has been producing ebooks for her for a couple of years now, using the HTML process described above. She has had small but steady success with selling both her backlist mysteries and some original novellas and short story collections in Kindle and Nook editions.

We’ve never bothered to publish her work in the iBooks or Play stores, though, and now we’d like to. The iBooks store is known to be the Mikey of ebook publishing platforms — rejecting epub files for small formatting issues that are not a problem to any of the others.  So, rather than start with the epubs we already used for Nook, I’m going to see if I can do super clean conversions, straight from docx, using the new Calibre feature.

Come along with me while I walk through a couple of real book conversions, trying out the new Calibre docx conversion feature with a couple of older manuscripts. I’ll see if I can get output that will pass iBooks inspection.

The first manuscript is a short novella called Runaway, of about 11,000 words, which spins off from the point of a train derailment disaster. The derailment in the story actually happened in my hometown of Helena, Montana, in the 1980s. It is eerily similar to this month’s derailment in Lac-Mégantic, Quebec, but it caused many fewer deaths because the derailment site was on the edge of town, not in the city center.

The second is a collection of six short stories and a memoir, entitled The Fountain. When we converted it previously, using HTML, my colleague Kim really struggled to get a good table of contents and clean page breaks between the stories. I suspect that means the original MS is a pretty good example of the “inconsistencies in” problem. I’m interested in seeing if it’s easier to work out the fixes in Word than it was in HTML.

Our first step? Generate a clean docx file to convert.


Do you use an ebook conversion tool other than Calibre? I’d love to know which one it is; please leave me a comment below.
