Chapter 5
Navigation Center eXtended (NCX) and Open Packaging Format (OPF)
Understand how the navigation system works
Learn how to edit the NCX
Learn how to edit the OPF
Learn how to check the validity of the NCX and OPF
All EPUB files require one Navigation Center eXtended (NCX) file and one Open Packaging Format (OPF) file. The NCX file provides navigation through the publication. An EPUB reading system, for example, can open a table of contents that lets the reader move around the publication; that table of contents comes from the contents of the NCX file. In Sigil, the NCX contents are shown in the right pane labeled Table Of Contents.
The OPF file keeps a list of all files, provides the metadata, manages reading order, specifies the NCX, and provides fallback information for the EPUB. Sigil produces the NCX and OPF files for you, but you can edit them manually as needed.
NCX Introduction
The NCX is a required UTF-8 encoded file located in the OEBPS directory. The filename can be almost anything, as long as the extension is .ncx; usually, it is toc.ncx.
The NCX has five sections. We’ll go over each section individually. As I stated, Sigil creates the file for you, so there may be no need to ever edit the file, but it is best that you understand the contents.
NOTE
If needed, open one of the sample files from Chapter 4 in Sigil and view the NCX file.
Header
As with any XML-based file, there needs to be a header. The NCX has specific requirements maintained by the DAISY (Digital Accessible Information System) Consortium. As you can see in the following code, the DOCTYPE is set to the DTD (Document Type Definition) from www.daisy.org.
[Listing not reproduced: NCX header with DOCTYPE]
The header is a requirement for the NCX file. Do not omit or change any part unless the International Digital Publishing Forum (IDPF) or DAISY should update the NCX standard.
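As a rough sketch, the header of an NCX file in an EPUB conventionally looks like the following. The standalone attribute is an assumption about typical editor output; the DOCTYPE identifiers are the ones DAISY publishes for NCX 2005-1. Compare it against the file Sigil generated rather than treating it as the exact listing.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Sketch of a typical NCX header; the standalone value is an assumption. -->
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN"
  "http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
```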
<ncx>
The next section is the main section that contains the rest of the NCX. You can think of it as being similar to the XHTML <body> section. It begins with the following:
[Listing not reproduced: opening <ncx> tag]
After all the other sections have completed, the <ncx> section ends with the following:
[Listing not reproduced: closing </ncx> tag]
Make sure both of these lines exist in any NCX file.
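For orientation, a minimal sketch of the opening and closing tags follows. The namespace and version values are the ones defined for NCX 2005-1; the comment stands in for the sections described in the rest of this chapter.

```xml
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <!-- head, docTitle, and navMap sections go here -->
</ncx>
```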
<head>
The next section is the <head>, which contains metadata, as shown:
[Listing not reproduced: NCX <head> section]
The urn:uuid is a Uniform Resource Name (URN) that serves as a unique identifier and must match the UUID contained in the OPF file. A UUID (Universally Unique Identifier) is a 128-bit value (16 octets, or 16 bytes), usually written as 32 hexadecimal digits. The value should be unique to the EPUB. Sigil creates its own unique IDs, so you do not need to worry about it.
The DTB stands for Digital Talking Book. The depth represents the levels in the table of contents. In the sample listed, it shows a value of three. The value shows how many levels and sublevels exist.
In previous chapters, I mentioned how Sigil can use the header tags to create a table of contents (TOC). Do not confuse the TOC with a page of hyperlinks in the EPUB that shows a list of chapters. Selecting one would take you to that specific chapter, but these can be cumbersome to navigate. The TOC we are talking about is a “digital” TOC. The reading system allows you to navigate to specific pages or chapters in a similar way. You should be able to access this TOC on your display and allow a simple selection to move around the EPUB.
Sigil creates the TOC in the NCX from header tags. If you list a book title as an <h1> tag, for example, and each chapter as an <h2> tag, then there would be two levels. In the previous sample code from an NCX, it showed a depth of three. The three levels show that Sigil used <h1>, <h2>, and <h3> tags as titles. The example is from Nada The Lily.epub, which can be downloaded from McGraw-Hill’s website at www.mhprofessional.com/EPUB.
The totalPageCount and maxPageNumber are required entries that are often set to zero. They often have no bearing on anything but still must be present in the NCX.
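The four entries discussed above might be sketched as follows. The UUID shown is a hypothetical placeholder, not the identifier from Nada The Lily.epub; the depth of 3 and the zero page counts follow the description in the text.

```xml
<head>
  <!-- Hypothetical placeholder UUID; it must match the identifier in the OPF -->
  <meta name="dtb:uid" content="urn:uuid:00000000-0000-0000-0000-000000000000"/>
  <!-- Number of nesting levels in the navMap -->
  <meta name="dtb:depth" content="3"/>
  <!-- Required entries, commonly left at zero -->
  <meta name="dtb:totalPageCount" content="0"/>
  <meta name="dtb:maxPageNumber" content="0"/>
</head>
```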
<docTitle>
The next section is the <docTitle>, which includes the entry for the book’s title. Note the capital T; XML element names are case-sensitive. Reading systems do not use the listed title, but it is a requirement of the NCX. The coding follows:
[Listing not reproduced: NCX <docTitle> section]
The title of the EPUB is Nada the Lily, and it should match dc:title in the OPF. Like every other entry in the NCX, this one is generated by Sigil. Changing it will not affect anything, but it is best not to make changes unless needed.
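Based on the description above, the title entry would take roughly this shape (a sketch, using the title named in the text):

```xml
<docTitle>
  <!-- Should match dc:title in the OPF -->
  <text>Nada the Lily</text>
</docTitle>
```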
<navMap>
The navigational map is the heart of the NCX. It stores all the important information. The structure is set for each header tag in Sigil and is set up like this:
[Listing not reproduced: NCX <navMap> structure]
navPoint
The navPoint is a section that includes navLabel and content. Other entries may be nested within it (a sublevel) or after the end navPoint tag (level). By nesting the entries, you add another depth, counted by dtb:depth.
The id and playOrder attributes must be unique within the <navMap>. For a good example, look at the NCX in Nada the Lily. The depth of the <navMap> is three, and there are 36 chapters.
The id is a unique identifier, while playOrder is a consecutive listing of the sections’ order. The playOrder should match the itemref order of the spine in the OPF.
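To make the nesting concrete, here is a small sketch of a three-level <navMap>. The ids, labels, and file paths are hypothetical and are not taken from Nada the Lily; only the structure (navLabel, content, nested navPoint, consecutive playOrder) follows the description above.

```xml
<navMap>
  <!-- Level 1: would come from an <h1> heading -->
  <navPoint id="navPoint-1" playOrder="1">
    <navLabel><text>Book Title</text></navLabel>
    <content src="Text/Title.xhtml"/>
    <!-- Level 2: nested inside level 1, would come from an <h2> heading -->
    <navPoint id="navPoint-2" playOrder="2">
      <navLabel><text>Part One</text></navLabel>
      <content src="Text/Part1.xhtml"/>
      <!-- Level 3: nested inside level 2, so dtb:depth would be 3 -->
      <navPoint id="navPoint-3" playOrder="3">
        <navLabel><text>Chapter 1</text></navLabel>
        <content src="Text/Chapter1.xhtml"/>
      </navPoint>
    </navPoint>
    <!-- Placed after the closing tag of navPoint-2, so it is also at level 2 -->
    <navPoint id="navPoint-4" playOrder="4">
      <navLabel><text>Part Two</text></navLabel>
      <content src="Text/Part2.xhtml"/>
    </navPoint>
  </navPoint>
</navMap>
```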
NOTE
Some items in the NCX may not appear in the spine of the OPF. The spine is a listing of main files, the XHTML files, while the NCX may list subsections of files.
navLabel
The navLabel element contains a text element. The text element holds the title or text found within the header tag. If you look at Nada The Lily.epub, you will see the Title.XHTML has an <h1> tag. The <h1> element contains the text “Nada the Lily,” which also appears in the navLabel text element. If you look at the other XHTML files, you will see that their headings follow the same pattern as the list of navLabel entries.
The second navLabel is for the dedication. You can see from the previous code listing that the Dedication navLabel is indented to the right. The greater indentation is due to the fact that it is an <h2> tag.
The first <h3> tag is for Chapter 1. If you look further down the listing, you can find Chapter 1 and see it is indented more than the Dedication.
content
The content element lists the source (src) file associated with the navPoint. All the information within the navPoint pertains to the source file. The src value gives the file’s location within the OEBPS directory along with the filename itself.
Looking at Nada the Lily, you see the content can point to an anchor, as shown:
[Listing not reproduced: content element pointing to an anchor]
Anchors are allowed in the content elements, even though they are not allowed in the <spine> of the OPF, as you’ll see later in this chapter.
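A content element that targets an anchor simply appends a fragment identifier to the filename. In this sketch the id, label, file path, and anchor name are all hypothetical:

```xml
<navPoint id="navPoint-5" playOrder="5">
  <navLabel><text>Chapter 2</text></navLabel>
  <!-- "#chapter2" is a hypothetical anchor id inside the XHTML file -->
  <content src="Text/Chapters.xhtml#chapter2"/>
</navPoint>
```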
How Do I Know If the NCX Is Valid?
Sigil includes the EPUB Validator with FlightCrew. Normally, the icon, a large green check mark, is at the top right in Sigil. Once you select it, Sigil will validate the currently loaded EPUB and show any errors in the bottom Validation Results pane. If no errors occur, then you should see the message “No problems found!” Errors that do exist are listed with the filename and the line number where they occur.
Be aware that the Validator checks not only the NCX, but the whole EPUB, with the exception of the CSS.
OPF Introduction
The Open Packaging Format (OPF) is one of the three parts of the EPUB standard. The OPF is a required UTF-8 encoded file found in the OEBPS directory with the NCX. The filename can be almost anything as long as the extension is .opf; usually, it is named content.opf. The OPF file must be referenced by the META-INF/container.xml file discussed previously in Chapter 1.
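As a reminder of how that reference works, a container.xml along these lines points the reading system at the OPF. The path OEBPS/content.opf assumes the usual directory and filename mentioned above; adjust it if your OPF is named differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <!-- full-path assumes the OPF is OEBPS/content.opf -->
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
```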
The OPF is an XML file that has six parts: the header, package, metadata, manifest, spine, and guide.
NOTE
If needed, open one of the sample files from Chapter 4 in Sigil and view the OPF file.
Header
Being an XML file, the OPF requires a header. The header is the standard XML header, as follows:
[Listing not reproduced: XML header]
The header has been described previously in Chapter 1.
<package>
The <package> section contains the rest of the OPF contents. It must also specify the namespace http://www.idpf.org/2007/opf and have a version of 2.0. If the version is omitted, the EPUB must be processed as an Open E-Book Publication Structure (OEBPS) 1.2 book. An example of the <package> section follows:
[Listing not reproduced: <package> section]
The remaining sections are placed between the package tags. To see the whole file, look at a sample EPUB from Chapter 4.
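A skeleton of the OPF, showing the header and the package tags, might look like this. The unique-identifier value BookId is an assumption (it is a commonly used name); whatever value is used must match the id of a dc:identifier entry in the metadata.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- "BookId" is an assumed name; it must match the id on a dc:identifier -->
<package xmlns="http://www.idpf.org/2007/opf"
         unique-identifier="BookId" version="2.0">
  <!-- metadata, manifest, spine, and the optional guide go here -->
</package>
```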
<metadata>
The metadata is the section containing the EPUB metadata. An example follows of the metadata for Nada The Lily.epub:
[Listing not reproduced: <metadata> section for Nada The Lily.epub]
Other entries may exist in the metadata, such as a modification timestamp and the EPUB editor used to change the file, but this can be ignored.
In the example, you can see the Dublin Core properties listed in Chapter 1. Sigil creates these when editing the file. Enter the information in the Metadata Editor by pressing F8 and then filling in the data.
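A pared-down sketch of a metadata section with the three entries OPF 2.0 requires (title, language, and identifier) is shown here. The UUID is a hypothetical placeholder and the id BookId is an assumed name; the real file for Nada The Lily.epub will contain additional entries.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:opf="http://www.idpf.org/2007/opf">
  <dc:title>Nada the Lily</dc:title>
  <dc:language>en</dc:language>
  <!-- Hypothetical placeholder UUID; must match dtb:uid in the NCX -->
  <dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
</metadata>
```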
<manifest>
The manifest section is a listing of all files within the EPUB. The list includes three things: the filename, an id, and the media type. A partial example follows from Nada the Lily:
[Listing not reproduced: partial <manifest> from Nada the Lily]
As you can see, the files are not listed in any specific order. They are listed with the path inside the OEBPS folder. The value of the id is the same as the filename by default. The media-type is set as required in Chapter 1.
NOTE
The items listed in the manifest can have the attributes in any order: id, href, media-type, and fallback.
You can see from the listing there are entries for XHTML, fonts, CSS, images, and the NCX. Every file must be listed only once. If a file is not listed in the manifest, it will not be accessible by most reading devices.
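The following sketch shows the general shape of manifest entries for the file types just mentioned. The filenames and folder names are hypothetical, chosen to resemble a typical Sigil layout; the media types are the standard ones for each kind of file.

```xml
<manifest>
  <!-- Filenames below are hypothetical; each id defaults to the filename -->
  <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  <item id="Title.xhtml" href="Text/Title.xhtml" media-type="application/xhtml+xml"/>
  <item id="Style.css" href="Styles/Style.css" media-type="text/css"/>
  <item id="cover.jpg" href="Images/cover.jpg" media-type="image/jpeg"/>
</manifest>
```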
The id values must be unique due to the use of fallback. An example follows:
[Listing not reproduced: <manifest> with fallback entries]
NOTE
Most reading systems do not support fallback. Use c05-01.epub to test the fallback capabilities of a reading system.
In this example, the first file to be loaded is start.xhtml, listed as the href. When the next page is displayed, the reading system should attempt to load a text file. If the device does not support the text file, then it will attempt the fallback id of Fallback.doc.
The Fallback.doc id is for the Fallback.doc file, listed in the href as Misc/Fallback.doc. Again, if the reading device does not support a Word document, it will then go to the fallback with the id of Fallback.pdf.
At this point, the PDF file will be loaded if it is supported. If not, the fallback will be to the Fallback.xhtml file. Of course, being XHTML, it should be supported by all EPUB devices. If not, the reading device should create a warning and exit the EPUB.
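The chain described above can be sketched as follows. The path Misc/Fallback.doc and the ids Fallback.doc, Fallback.pdf, and Fallback.xhtml come from the text; the id and path of the text file and the remaining paths are assumptions made for illustration, so check them against c05-01.epub.

```xml
<manifest>
  <item id="start.xhtml" href="Text/start.xhtml" media-type="application/xhtml+xml"/>
  <!-- Assumed id and path for the text file; it falls back to the Word document -->
  <item id="Fallback.txt" href="Misc/Fallback.txt" media-type="text/plain"
        fallback="Fallback.doc"/>
  <!-- If Word is unsupported, fall back to the PDF -->
  <item id="Fallback.doc" href="Misc/Fallback.doc" media-type="application/msword"
        fallback="Fallback.pdf"/>
  <!-- If PDF is unsupported, fall back to XHTML, which ends the chain -->
  <item id="Fallback.pdf" href="Misc/Fallback.pdf" media-type="application/pdf"
        fallback="Fallback.xhtml"/>
  <item id="Fallback.xhtml" href="Text/Fallback.xhtml" media-type="application/xhtml+xml"/>
</manifest>
```

Each fallback value names the id of another item, which is why the ids must be unique and why the chain must not loop back on itself.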
<spine>
The spine section specifies the reading order to the reading systems. In this section, only full filenames can be listed, not anchors. The files are displayed from the first to the last: the first file is displayed and, when its end is reached, the next file in the list is displayed. The process continues until the end of the spine is reached.
The spine section must exist and contain at least one reference to a file in the <manifest>. The documents referenced must be OPS documents, as listed in Chapter 1. If the item is not an OPS document, there must be a fallback chain. Every item can appear in the <spine> only once, and loops in a fallback chain are not allowed.
A sample spine follows from the c05-01.epub example:
[Listing not reproduced: sample <spine> from c05-01.epub]
The idref has a value to match the id value in the manifest (look at the <manifest> section in the fallback code listing).
The <spine> tag includes a toc attribute that references the NCX file by its id from the <manifest>, as shown:
[Listing not reproduced: <spine> tag referencing the NCX]
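A short sketch ties these pieces together. The idref values are assumed to match ids in a manifest like the fallback example, and toc="ncx" assumes the NCX item in the manifest has the id ncx.

```xml
<!-- toc names the manifest id of the NCX file (assumed here to be "ncx") -->
<spine toc="ncx">
  <!-- Each idref must match the id of an item in the manifest -->
  <itemref idref="start.xhtml"/>
  <itemref idref="Fallback.txt"/>
</spine>
```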
Another attribute used is the linear attribute, although most reading systems ignore it. The linear attribute specifies whether an entry in the <spine> is required to be in the reading order or can be skipped. The IDPF standards indicate that a reading system may ignore the attribute and assume that all entries are required to be displayed.
The values for the linear attribute are either yes or no. The value is yes when the file is required in the linear order and no when it can be skipped. The attribute and value are placed in the <spine>, as follows from the example c05-02.epub:
[Listing not reproduced: <spine> with linear attributes from c05-02.epub]
The first two XHTML files should be displayed in order. For a reading system that supports the linear attribute, the last three XHTML files should not be displayed. Instead, they should be accessible from hyperlinks. In this manner, an EPUB could be created allowing for a person to read a few pages and then make a decision. A new page would be displayed and more decisions made until the reader comes to a page with no more hyperlinks and the story is ended. Unfortunately, on most devices, this does not work and the reader can simply change the page to a spot they wish to go to. The reader is not required to get to a specific page by hyperlinks only.
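A spine of the kind just described, with two linear files followed by three that are reachable only through hyperlinks, might be sketched like this. The idref values are hypothetical and are not the ones used in c05-02.epub.

```xml
<spine toc="ncx">
  <!-- Shown in order by every reading system -->
  <itemref idref="intro.xhtml" linear="yes"/>
  <itemref idref="choice.xhtml" linear="yes"/>
  <!-- Skipped in the page flow by systems that honor linear="no" -->
  <itemref idref="pathA.xhtml" linear="no"/>
  <itemref idref="pathB.xhtml" linear="no"/>
  <itemref idref="ending.xhtml" linear="no"/>
</spine>
```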
<guide>
The guide section is used to identify components of the EPUB. The guide is not required by any reading system. If the <guide> exists, however, it must contain one or more <reference> elements, and their href values can contain anchors.
Each element has one href, which references an item in the <manifest>. The title attribute can have a value of anything; usually, it may be the title in the file (header) or the same name as the type. The type is a reference taken from the 13th edition of the Chicago Manual of Style and listed in Table 5-1.
[Table not reproduced]
Table 5-1   List of Guide Types
The following example comes from c05-02.epub. The types are generated by Sigil when they are applied by the user. To create them, right-click an XHTML file in the Book Browser pane, select Add Semantics, and then choose the type.
[Listing not reproduced: <guide> section from c05-02.epub]
These items are not necessary. The <guide> section is the only one not required in the OPF.
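For reference, a guide section generally takes the following form. The type values cover, toc, and text are among those defined for OPF 2.0; the titles, file paths, and anchor name are hypothetical.

```xml
<guide>
  <!-- Each href must point at a file listed in the manifest -->
  <reference type="cover" title="Cover" href="Text/Cover.xhtml"/>
  <!-- An anchor is allowed here, unlike in the spine -->
  <reference type="toc" title="Table of Contents" href="Text/Contents.xhtml#toc"/>
  <reference type="text" title="Start" href="Text/Chapter1.xhtml"/>
</guide>
```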