Creating and editing e-books with Sigil

©Nataliia Natykach, 123RF.com

Self-Publishing

If you want not only to enjoy e-books in EPUB format but also to create them, take a look at Sigil, an easy-to-use and versatile editor.

They're not for everyone, but e-books have become a regular part of this world. Typically, you purchase e-books as finished products that can't be modified. You can attach notes and bookmarks, but what if you want to write an e-book from scratch? The free Sigil [1] e-book editor lets you do just that. Version 0.6.2 of this remarkable software debuted in December 2012.

EPUB Format

Apart from proprietary e-book formats, such as those Amazon uses (see the "Amazon Exception" box), the Electronic Publication (EPUB) open standard is widely used [2]. It's essentially based on a number of other open standards and uses a ZIP container for the formatted content (Figure 1). The content structure is based on XML, XHTML, and CSS stylesheets. EPUB borrows only parts of these standards, which limits the design options but simplifies implementation.

Amazon Exception

Note that Amazon's e-book readers don't support EPUB; however, current readers can still read unencrypted MOBI files, and Calibre [3] can convert EPUB e-books to that format. You can find many free e-books, most of them in EPUB format, at Project Gutenberg [4].

Figure 1: Sigil's source code mode allows more precise editing of EPUB files.

Text can be UTF-8 or UTF-16 encoded, uses XHTML tags, and is formatted via CSS stylesheets. For the sake of clarity, the source code is distributed across multiple files (which typically end in .xhtml) that combine to form an e-book.

Along with the actual document content, various metadata belong to the e-book. The Open Packaging Format (OPF) provides a complete description of the e-book structure. It stipulates, for example, that a metadata file should have the .opf extension. This file contains a manifest (a part of the XML file) that lists all the files for the e-book.
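As a rough illustration of the manifest, the following Python sketch reads the item entries from an OPF file with the standard library. The sample OPF text, the file names in it, and the function name list_manifest are made up for this example; real OPF files carry considerably more metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened OPF file (assumption for this example).
SAMPLE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="Text/chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="Styles/style.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
  </spine>
</package>"""

# Namespace of OPF elements, written in ElementTree's {uri}tag notation.
OPF_NS = "{http://www.idpf.org/2007/opf}"


def list_manifest(opf_text):
    """Return (id, href, media-type) for every item in the manifest."""
    root = ET.fromstring(opf_text.encode("utf-8"))
    manifest = root.find(OPF_NS + "manifest")
    return [
        (item.get("id"), item.get("href"), item.get("media-type"))
        for item in manifest.findall(OPF_NS + "item")
    ]


for entry in list_manifest(SAMPLE_OPF):
    print(entry)
```

Each tuple corresponds to one file of the e-book, with the href given relative to the location of the OPF file.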

The spine element lists the content files in reading order and thus serves as a basic table of contents, while its toc attribute refers to a fuller version in an external file. The file with the detailed table of contents is normally named toc.ncx. The .ncx extension indicates a "navigation" file in XML format. From it, you can navigate to sections and subsections of the e-book.
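To make the relationship between spine and toc.ncx more concrete, here is a small Python sketch that reads both. The two sample documents are shortened, hypothetical stand-ins for real files, and the function names reading_order and toc_entries are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened samples (assumptions for this example).
SAMPLE_OPF = b"""<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="Text/chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="Text/chapter2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""

SAMPLE_NCX = b"""<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="Text/chapter1.xhtml"/>
      <navPoint id="n2" playOrder="2">
        <navLabel><text>Section 1.1</text></navLabel>
        <content src="Text/chapter1.xhtml#sec1"/>
      </navPoint>
    </navPoint>
  </navMap>
</ncx>"""

OPF = "{http://www.idpf.org/2007/opf}"
NCX = "{http://www.daisy.org/z3986/2005/ncx/}"


def reading_order(opf_bytes):
    """Return the NCX file name and the content files in spine order."""
    root = ET.fromstring(opf_bytes)
    hrefs = {item.get("id"): item.get("href") for item in root.iter(OPF + "item")}
    spine = root.find(OPF + "spine")
    toc_href = hrefs.get(spine.get("toc"))
    order = [hrefs[ref.get("idref")] for ref in spine.findall(OPF + "itemref")]
    return toc_href, order


def toc_entries(ncx_bytes):
    """Return (depth, label, target) for each navPoint, in document order."""
    entries = []

    def walk(node, depth):
        for point in node.findall(NCX + "navPoint"):
            label = point.find(NCX + "navLabel/" + NCX + "text").text
            target = point.find(NCX + "content").get("src")
            entries.append((depth, label, target))
            walk(point, depth + 1)  # nested navPoints are subsections

    walk(ET.fromstring(ncx_bytes).find(NCX + "navMap"), 0)
    return entries


print(reading_order(SAMPLE_OPF))
for depth, label, target in toc_entries(SAMPLE_NCX):
    print("  " * depth + label, "->", target)
```

The spine only yields a flat sequence of files, whereas the nested navPoint elements in the NCX file carry the section hierarchy.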

Additional container metadata is stored in the META-INF/ directory; its container.xml file points to the OPF file. The OEBPS directory contains the actual e-book content: the formatted text, images, and formatting templates. In this article, I'll take a closer look at the Sigil EPUB editor in particular. (See also the "Sigil Setup" box.)
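The directory layout can be inspected without any e-book software, because the container is an ordinary ZIP file. The following Python sketch builds a tiny container in memory and looks up the OPF file through META-INF/container.xml. The helper names make_sample_epub and find_opf_path, as well as the sample contents, are assumptions for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"

SAMPLE_CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def make_sample_epub():
    """Build a tiny, incomplete container in memory (demo data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", SAMPLE_CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", "<package/>")
        archive.writestr("OEBPS/Text/chapter1.xhtml", "<html/>")
    buffer.seek(0)
    return buffer


def find_opf_path(epub_file):
    """Return the path of the OPF file named in META-INF/container.xml."""
    with zipfile.ZipFile(epub_file) as archive:
        container = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = container.find(CONTAINER_NS + "rootfiles/" + CONTAINER_NS + "rootfile")
    return rootfile.get("full-path")


def list_entries(epub_file):
    """Return the names of all entries in the ZIP container."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


print(list_entries(make_sample_epub()))
print(find_opf_path(make_sample_epub()))
```

To try this on a real e-book, pass the path of an .epub file instead of the in-memory sample.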

Sigil Setup

Not all current distributions have Sigil in their repositories; however, you can get packages for Arch Linux, Fedora, Gentoo, openSUSE, Slackware, and a few other distros [5]. For Ubuntu and its derivatives, Sigil is available as a PPA [6]. Installation files are also available for Mac OS X and 32- and 64-bit Windows. To compile the source code, you need cmake, with Qt4 and support for XML, SVG, and WebKit as prerequisites.

Sigil

The Sigil EPUB editor is simple. Without a document loaded, Sigil opens with a large window for entering text, code, and images. You can start creating a new document immediately. Sigil is like a cross between a word processor and an HTML editor. If you hover over entries in the well-stocked tool bar, you get quick help for the function.

For text input and editing of existing EPUB documents, Sigil provides two distinct modes: By default, you get a classic WYSIWYG editor and, for more complicated tasks or troubleshooting, you can switch to source code editing (Figure 2). Source code mode shows all the XHTML tags, which you can edit more precisely there than in WYSIWYG mode.

Figure 2: Sigil lists the commonly used special characters in a palette. The nbsp represents a nonbreaking space character, while the *sp variants represent variably spaced characters.

Aside from direct text input, Sigil allows importing HTML or plain text files (.txt) as well as EPUB containers into an existing EPUB document. In practice, this means that you don't necessarily need to use (or know how to use) the Sigil editor. The other components of the program are always available to you.

Writing with Sigil

Of course, you can use the Sigil editor to write your e-books. This approach has two disadvantages, however. First, you have to get used to new software, which can hamper your ability to write creatively. Second, the WYSIWYG capability can lead to a plethora of markup that can hinder readability.

That's why creating text using your usual writing tool, such as LibreOffice, is a better idea. Sigil then can take over as a formatting tool for text distinctions, layout, and syntax control. In this phase, you can also add hyperlinks for document references. If you use LibreOffice or OpenOffice, you can also use the Writer2epub [7] plugin for direct conversion to EPUB. Even so, you can still use Sigil to check the results and correct things when necessary. You can preserve your formatting by saving the text as HTML or, better yet, directly as EPUB.

Sigil supports Unicode (UTF-8 and UTF-16). Thus, you can directly input "special" characters. Sigil provides a palette for the most commonly used special characters (Figure 2).

A number of features that you would expect from a word processor, such as a font size button, are missing from the menus. You have to manipulate these settings directly in the source code (Figure 3), where you have detailed control. The same applies to changing to special fonts. You can set the font across the entire document through Edit | Preferences and then Appearance. But be careful: There's no guarantee that all e-book readers will render these fonts cleanly, and fancy fonts can be distracting anyway. Additionally, you need to embed the fonts in the output files, which can inflate e-book file sizes.

Figure 3: Many adjustments, such as font sizes, are done directly in the source code.

Special Structures

Sigil provides no dedicated tools for certain special structures; a typical example is tables. However, because tables are already defined in XHTML, you can still edit them in Sigil. To define a table in source code mode, use the appropriate <table> tag, unless the table was already created when the source document was converted.
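If you would rather not type table markup by hand, a few lines of Python can generate it for pasting into source code mode. This is only a sketch: the function name build_table and the sample data are invented for this example, and the output is plain XHTML without any styling.

```python
import xml.etree.ElementTree as ET


def build_table(header, rows):
    """Return XHTML table markup as a single-line string."""
    table = ET.Element("table")
    head_row = ET.SubElement(table, "tr")
    for title in header:
        ET.SubElement(head_row, "th").text = title
    for row in rows:
        body_row = ET.SubElement(table, "tr")
        for cell in row:
            # Special characters such as & are escaped on output.
            ET.SubElement(body_row, "td").text = cell
    return ET.tostring(table, encoding="unicode")


# Made-up sample data for the demonstration.
markup = build_table(
    ["Chapter", "Pages"],
    [["Introduction", "4"], ["EPUB & XHTML", "12"]],
)
print(markup)
```

The result consists of one tr element per row, with th cells in the first row and td cells in the others.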

Complex repetitive structures are stored as clips in Sigil; to add these clips (some of which are predefined in Example Clips), right-click to bring up a context menu. Pressing Alt+C or clicking on Tools | Clips opens a dialog with a list of all the available clips (Figure 4). To create a new clip, either select the clip code (preferably in source code mode) and use Paste Clip, or select an item from the context menu.

Figure 4: Clips contain complete code. Define new clips directly in source code mode.

Sigil puts the configuration files in your home directory under .local/share/data/sigil-ebook/sigil/. There you can find dictionaries for spell-checking and find-and-replace functions, as well as clip files (in sigil_clips.ini), which you can also modify directly in a text editor. The find-and-replace function also handles regular expressions [8], thus allowing, for example, automatic LaTeX conversion. To save repetitive searches, click Tools | Saved Searches, which already includes some predefined example searches.
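To give an idea of what such a regular expression conversion might look like, the following Python sketch replaces a few simple LaTeX commands with XHTML tags. The mapping and the function name convert_latex_markup are assumptions for this example. It uses Python's re module rather than Sigil's own search dialog, so the exact pattern syntax in Sigil may differ.

```python
import re

# Assumed mapping from simple LaTeX commands to XHTML tags (illustrative only).
LATEX_TO_XHTML = {"emph": "em", "textit": "i", "textbf": "b"}

# Matches \command{content}, where content contains no further braces.
PATTERN = re.compile(r"\\(emph|textit|textbf)\{([^{}]*)\}")


def convert_latex_markup(text):
    """Replace simple LaTeX formatting commands with XHTML tags."""

    def replace(match):
        tag = LATEX_TO_XHTML[match.group(1)]
        return "<{0}>{1}</{0}>".format(tag, match.group(2))

    # Repeat until nothing changes, so nested commands are
    # converted from the inside out.
    previous = None
    while previous != text:
        previous, text = text, PATTERN.sub(replace, text)
    return text


print(convert_latex_markup(r"A \emph{very} \textbf{bold} claim"))
print(convert_latex_markup(r"\textbf{a \emph{b}}"))
```

A pattern like this only covers the simplest cases. Commands with optional arguments, environments, or escaped braces would need additional rules.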

Sigil allows integration of images in JPG, GIF, PNG, and SVG formats. Use Insert | Image, Ctrl+I, or the corresponding button on the tool bar to insert an image. Subsequent height="…" and width="…" adjustments can be made in source code mode. Remove unused images with Tools | Delete Unused Image Files. A similar function is used for removing style sheets.

Validator Code

One of Sigil's strengths is validating e-books. The multi-step validation process ensures that a document meets current EPUB standards. Sigil takes both (X)HTML and CSS code into account, as well as the e-book structure and its metadata.

Sigil examines HTML code with Tidy [9] for syntactical errors. This tool is already integrated to quickly identify and resolve errors. Once an error is found, the program reports it in a dialog and identifies its approximate location in the source code (Figure 5).
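The idea of reporting a syntax error together with its location can be imitated in a few lines of Python. Note that this sketch does not use Tidy; it merely checks whether the markup is well-formed XML with the standard library parser, which is a much weaker test. The function name check_well_formed and the two samples are assumptions for this example. Named entities such as &nbsp; would also be reported as errors by this parser, because the sketch loads no DTD.

```python
import xml.etree.ElementTree as ET

# Two made-up samples: the second one lacks the closing </p> tag.
GOOD_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>Hello</p>
</body>
</html>"""

BAD_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>Hello
</body>
</html>"""


def check_well_formed(xhtml_text):
    """Return None if the markup parses, else (line, column, message)."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as error:
        line, column = error.position
        return line, column, str(error)
    return None


for name, sample in (("good", GOOD_XHTML), ("bad", BAD_XHTML)):
    print(name, check_well_formed(sample))
```

As in Sigil's dialog, the reported position is only approximate: the parser notices the problem at the closing body tag, not at the paragraph that was left open.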

Figure 5: Sigil fixes simple syntactical flaws mostly automatically. It also recommends solutions for manual fixes.

Even metadata can be incomplete or contain errors. Sigil uses the integrated FlightCrew [10] validator to check the metadata and to find unused files and the like.
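One such check, finding files that sit in the container but are missing from the manifest, can be approximated in Python. This sketch is not FlightCrew's implementation, just one possible reading of the check. The function name unmanifested_files, the opf_path parameter, and the sample e-book are assumptions for this example.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch1" href="Text/chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
</package>"""


def make_sample_epub():
    """Build a small container with one image the manifest does not list."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", "<container/>")
        archive.writestr("OEBPS/content.opf", SAMPLE_OPF)
        archive.writestr("OEBPS/Text/chapter1.xhtml", "<html/>")
        archive.writestr("OEBPS/Images/unused.png", b"")
    buffer.seek(0)
    return buffer


def unmanifested_files(epub_file, opf_path):
    """Return archive entries that the OPF manifest does not list."""
    with zipfile.ZipFile(epub_file) as archive:
        names = {name for name in archive.namelist() if not name.endswith("/")}
        opf = ET.fromstring(archive.read(opf_path))
    # Manifest hrefs are relative to the directory of the OPF file.
    base = posixpath.dirname(opf_path)
    listed = {
        posixpath.normpath(posixpath.join(base, item.get("href")))
        for item in opf.iter(OPF_NS + "item")
    }
    # The mimetype entry, the OPF file itself, and META-INF/ are
    # not expected to appear in the manifest.
    candidates = names - listed - {"mimetype", opf_path}
    return sorted(n for n in candidates if not n.startswith("META-INF/"))


print(unmanifested_files(make_sample_epub(), "OEBPS/content.opf"))
```

In a real e-book, the value for opf_path would come from META-INF/container.xml rather than being hard-coded.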

In its reports, Sigil summarizes information from different sources. Through these reports, the program provides details on integrated HTML/CSS files, images, and much more (Figure 6).

Figure 6: Sigil uses reports to provide statistics for the files used.

Conclusion

Sigil presents itself as an easy-to-use EPUB document editor. Numerous keyboard shortcuts for often-used functions are available, and the different editing modes, from source code to complete WYSIWYG mode, prove useful for text input, editing, and more complicated tasks or troubleshooting.

Additionally, the various built-in validation tools for spelling, syntax, and metadata provide valuable assistance. Many special functions, such as the one for adding hyperlinks, also help in creating e-books. A useful enhancement for beginners would be a wizard to guide users through the various steps (see the "Building EPUBs" box).

Building EPUBs

Creating an e-book in EPUB format with Sigil takes essentially five steps:

  • Load the prepared document (possibly with RTF tags or in HTML format).
  • Add author(s) and titles.
  • Add a cover.
  • Add a table of contents.
  • Validate the document.

After creating an e-book, you should ideally test it on different readers – the devil is often in the details.
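For readers curious about what these steps produce behind the scenes, the following Python sketch assembles a minimal EPUB 2 container by hand. It covers title, author, a one-entry table of contents, and a single chapter, but no cover and no validation. The function name build_epub, the file names inside the container, and the placeholder identifier are assumptions for this example, and the result is not guaranteed to satisfy every reader or validator.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="BookId">{uid}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:creator>{author}</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="Text/chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
  </spine>
</package>"""

NCX_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:uid" content="{uid}"/></head>
  <docTitle><text>{title}</text></docTitle>
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>{title}</text></navLabel>
      <content src="Text/chapter1.xhtml"/>
    </navPoint>
  </navMap>
</ncx>"""

XHTML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{title}</title></head>
  <body><h1>{title}</h1><p>{text}</p></body>
</html>"""

# Placeholder identifier; a real book needs its own unique value.
DEFAULT_UID = "urn:uuid:00000000-0000-0000-0000-000000000000"


def build_epub(title, author, text, uid=DEFAULT_UID):
    """Return the bytes of a minimal single-chapter EPUB container."""
    quote = {'"': "&quot;"}
    fields = {
        "title": escape(title, quote),
        "author": escape(author, quote),
        "text": escape(text, quote),
        "uid": escape(uid, quote),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry comes first and is stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", OPF_TEMPLATE.format(**fields))
        archive.writestr("OEBPS/toc.ncx", NCX_TEMPLATE.format(**fields))
        archive.writestr("OEBPS/Text/chapter1.xhtml", XHTML_TEMPLATE.format(**fields))
    return buffer.getvalue()


data = build_epub("My Book", "A. Author", "Hello, reader & friend.")
print(len(data), "bytes")
```

Writing the returned bytes to a file with the .epub extension yields a container that can then be opened in Sigil for checking and further editing.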

Overall, the Sigil editor leaves a good impression – too bad it's designed only for e-books.