Chapter 9. Multimedia and 3D

Throughout PDF’s 20 years of existence, the world of multimedia has moved from simple sounds and animations to today’s interactive experiences in both 2D and 3D. PDF supports a variety of ways in which to incorporate these various types of media. This chapter will go into detail on the series of annotation types that enable the inclusion of multimedia and 3D content in PDF.

PDF 1.2 introduced the sound and movie annotation types, which moved PDF beyond its original vision of “static 2D electronic paper” into the fully fledged rich document format that it is today.

The sound annotation was originally added to PDF to provide an analog to the text annotation. Instead of a text note, it contains sound, either recorded from the computer’s microphone or imported from a file, that plays when the annotation is activated.

The annotation dictionary for a sound annotation consists of a Subtype of Sound, the stream of sound data as the value of the Sound key, and any common annotation information required (see Example 9-1). Additionally, a Name key whose value is either Speaker or Mic may be present; this declares a predefined icon to be used when an appearance stream is not present.

The stream data for the sound should be in a common, self-describing format such as AIFF, RIFF/wav, or snd/au, and the sampling rate of the data needs to be included (as the value of the R key) in the stream dictionary. Additional information about the sound data, such as number of channels or bits per sample value per channel, may also be included as keys and values in the stream dictionary.
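
As a rough illustration of the two dictionaries just described, the following Python sketch emits their PDF source text. The function names, the object number, and the rectangle are hypothetical, and the C (channels) and B (bits per sample) key names are taken from the PDF specification rather than from this chapter.

```python
# Sketch only: emits the text of a sound annotation dictionary and of the
# dictionary that accompanies its sound stream. Helper names are hypothetical.

def sound_annotation(rect, sound_obj_num, icon=None):
    """Build a sound annotation dictionary that points at a sound stream."""
    if icon not in (None, "Speaker", "Mic"):
        raise ValueError("Name must be Speaker or Mic")
    parts = [
        "/Type /Annot",
        "/Subtype /Sound",
        "/Rect [" + " ".join(str(v) for v in rect) + "]",
        f"/Sound {sound_obj_num} 0 R",  # indirect reference to the sound stream
    ]
    if icon is not None:
        # Predefined icon, used when no appearance stream is present.
        parts.append(f"/Name /{icon}")
    return "<< " + " ".join(parts) + " >>"


def sound_stream_dict(length, rate, channels=None, bits=None):
    """Build the stream dictionary; only the sampling rate (R) is mandatory here."""
    parts = [f"/R {rate}"]
    if channels is not None:
        parts.append(f"/C {channels}")  # number of channels (assumed key name)
    if bits is not None:
        parts.append(f"/B {bits}")      # bits per sample per channel (assumed key name)
    parts.append(f"/Length {length}")
    return "<< " + " ".join(parts) + " >>"


print(sound_annotation((100, 100, 120, 120), 12, icon="Speaker"))
print(sound_stream_dict(length=88200, rate=44100, channels=1, bits=16))
```

The annotation refers to the stream by object number, so in a real file the stream would be written as its own indirect object.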

Because of various limitations in the sound annotation, it is considered deprecated, and while PDF viewers will continue to support it, it is no longer recommended to use this annotation type for sounds. Instead, consider using a screen annotation (see Screen Annotation).

A movie annotation enables the playing of common video or animation formats, which may also include sound or audio. PDF does not define which formats are supported; that choice is left to the viewer.

To define an annotation dictionary for a movie annotation, only two keys are required: a Subtype key with the value of Movie and a Movie key whose value is a movie dictionary. Additionally, an A key may be present whose value is a movie activation dictionary.
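
A minimal Python sketch of such a dictionary follows. The chapter does not list the contents of the movie and movie activation dictionaries, so the F (file specification) and ShowControls entries used here are taken from the PDF specification and should be read as assumptions; the function name and file name are hypothetical.

```python
# Sketch only: emits the text of a movie annotation dictionary.

def pdf_string(text):
    """Write text as a PDF literal string, escaping backslashes and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"


def movie_annotation(rect, filename, show_controls=None):
    parts = [
        "/Type /Annot",
        "/Subtype /Movie",
        "/Rect [" + " ".join(str(v) for v in rect) + "]",
        # Movie dictionary: F names the file that holds the movie (assumed key).
        "/Movie << /F " + pdf_string(filename) + " >>",
    ]
    if show_controls is not None:
        # Optional movie activation dictionary (assumed ShowControls key).
        flag = "true" if show_controls else "false"
        parts.append("/A << /ShowControls " + flag + " >>")
    return "<< " + " ".join(parts) + " >>"


print(movie_annotation((50, 50, 370, 290), "clip.mov", show_controls=True))
```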

With PDF 1.5, multimedia support in PDF was brought under a single new annotation type—the screen annotation. It’s called screen because its job is to define the region of the page (which will be displayed on a screen) where a media clip will be played. In fact, it doesn’t actually have anything directly to do with multimedia; all of the media-specific stuff happens via the Rendition action (see Rendition Actions).

Rendition Actions

The Rendition action controls the playing of multimedia content, either directly or via the use of JavaScript. It is always associated with a screen annotation; in fact, one of the required keys in the rendition action dictionary is the AN key, whose value is an indirect reference to such an annotation.

The other required keys are the action type (S), whose value is Rendition; the operation (OP) to perform (play, stop, etc.); and a rendition object as the value of the R key. Instead of the OP, a JS key could be used with a value that is the JavaScript to execute when the action is triggered. Example 9-5 shows a sample Rendition action.
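
The shape of a rendition action dictionary can also be sketched in Python. The integer operation codes below follow the PDF specification as best understood here (0 play, 1 stop, 2 pause, 3 resume, 4 play or resume), and the sketch insists on an R entry only for the two operations that start playback; both points are assumptions to check against the specification. The enum, function name, and object numbers are hypothetical.

```python
# Sketch only: emits the text of a rendition action dictionary.
from enum import IntEnum


class RenditionOp(IntEnum):
    PLAY = 0            # play the rendition given by R
    STOP = 1
    PAUSE = 2
    RESUME = 3
    PLAY_OR_RESUME = 4  # play R, or resume it if it is already paused


def rendition_action(screen_annot_num, op, rendition_num=None):
    op = RenditionOp(op)  # raises ValueError for an unknown code
    starts_playback = op in (RenditionOp.PLAY, RenditionOp.PLAY_OR_RESUME)
    if starts_playback and rendition_num is None:
        raise ValueError("a rendition (R) is needed to start playback")
    parts = [
        "/Type /Action",
        "/S /Rendition",
        f"/OP {int(op)}",
        f"/AN {screen_annot_num} 0 R",  # the associated screen annotation
    ]
    if rendition_num is not None:
        parts.append(f"/R {rendition_num} 0 R")
    return "<< " + " ".join(parts) + " >>"


print(rendition_action(10, RenditionOp.PLAY, rendition_num=11))
print(rendition_action(10, RenditionOp.STOP))
```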

The core type of rendition is called a media rendition; it specifies what to play, how to play it, and where to play it. It is also possible to create an ordered list of media renditions (called a selector rendition) to provide the PDF viewer with options concerning the handling of media (see Figure 9-2).

A rendition dictionary has only one required key, S, whose value is either MR (media rendition) or SR (selector rendition), to define what type of rendition it is. For a media rendition dictionary, it is then necessary to provide information about what to play (C key), how to play it (P key), and where to play it (SP key).

The value of the C key is a media clip dictionary that describes what is to be played. The media clip dictionary can specify that it is the full data for a clip or only a section, but we’ll only be looking at full data here since that’s the most common case by far. The data can be stored in the file (as the value of the D key) using a simple stream, or a file specification can be used for either embedded or external file references. The type of the data is declared using standard MIME types as the value of the CT key.

The P key’s value is a media play parameters dictionary that specifies how the media is to be played. It can name the specific media player to use, though this isn’t recommended because it restricts the PDF viewer’s ability to substitute another player. More commonly, it describes whether to provide a user interface for controlling the video (C), whether to scale the video when playing (F), and how many times (if any) to repeat playback (RC).

The remaining key element to the rendition dictionary is the SP key, whose value is a media screen parameters dictionary that describes whether to play the media on the page or in a floating window (W), as shown in Figure 9-3, and, if using a floating window, which monitor (or monitors) it can or cannot be played on (M) and at what size and location it should be played (F).

Example 9-6 shows a sample rendition object.
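
To make the nesting of the C, P, and SP dictionaries easier to see, here is a small Python sketch that serializes a nested Python dictionary into PDF syntax. The Name, Ref, and to_pdf helpers and the object number 20 are hypothetical. The sketch also assumes, following the PDF specification rather than this chapter, that the C, F, RC, and W entries sit one level down inside a BE ("best effort") dictionary and that the numeric codes shown are valid; treat those details as unverified.

```python
# Sketch only: a tiny serializer plus a media rendition built with it.

class Name(str):
    """Marks a string that should be written as a PDF name (/Foo)."""


class Ref(int):
    """Marks an object number to be written as an indirect reference (n 0 R)."""


def to_pdf(value):
    if isinstance(value, bool):            # must precede the int check
        return "true" if value else "false"
    if isinstance(value, Name):            # must precede the str check
        return "/" + value
    if isinstance(value, Ref):             # must precede the int check
        return f"{int(value)} 0 R"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        return "(" + escaped + ")"
    if isinstance(value, (list, tuple)):
        return "[" + " ".join(to_pdf(v) for v in value) + "]"
    if isinstance(value, dict):
        inner = " ".join(f"/{key} {to_pdf(val)}" for key, val in value.items())
        return "<< " + inner + " >>"
    raise TypeError(f"cannot serialize {type(value).__name__}")


media_rendition = {
    "Type": Name("Rendition"),
    "S": Name("MR"),                       # media rendition
    "C": {                                 # what to play: media clip dictionary
        "Type": Name("MediaClip"),
        "S": Name("MCD"),                  # full clip data (assumed subtype name)
        "CT": "video/mp4",                 # MIME type of the data
        "D": Ref(20),                      # stream holding the clip
    },
    "P": {                                 # how to play it
        "BE": {"C": True, "F": 2, "RC": 1.0},
    },
    "SP": {                                # where to play it
        "BE": {"W": 3},                    # assumed code for "in the annotation rectangle"
    },
}

print(to_pdf(media_rendition))
```

Keeping the rendition as plain Python data until the last moment makes it easy to swap in a file specification for D or to add a floating-window dictionary under SP.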

The ability to add video or audio to a PDF file takes it beyond a static electronic document to one that is now more interactive and rich in content. Yet it remains in the realm of two dimensions. With 3D annotations, a PDF can enter the third dimension by presenting content that can be rotated and manipulated along all three axes (see Figure 9-4).

3D artwork is presented to the reader through a 3D annotation and its 3D annotation dictionary. The annotation also provides an appearance stream whose normal (N) appearance serves two purposes: it is what applications that do not support 3D annotations display, and it represents the initial display of the 3D artwork.

A 3D annotation dictionary is a standard annotation dictionary whose Subtype is 3D (see Example 9-7). In addition, it must contain a 3DD key whose value is a 3D stream that contains the 3D data as well as any additional information such as the 3D views.
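
The requirements above lend themselves to a simple check. The following Python sketch models an annotation as a plain dictionary and reports which of the described entries are missing; the function name and the way the annotation is modeled are assumptions made for illustration, not part of any PDF library.

```python
# Sketch only: checks a dict that models a 3D annotation dictionary.

def check_3d_annotation(annot):
    """Return a list of problems; an empty list means the basics are present."""
    problems = []
    if annot.get("Subtype") != "3D":
        problems.append("Subtype must be 3D")
    if "3DD" not in annot:
        problems.append("3DD (3D stream or 3D reference dictionary) is missing")
    if "N" not in annot.get("AP", {}):
        problems.append("AP has no normal (N) appearance for non-3D viewers")
    return problems


good = {"Subtype": "3D", "3DD": "15 0 R", "AP": {"N": "16 0 R"}}
bad = {"Subtype": "Screen"}

print(check_3d_annotation(good))  # no problems reported
print(check_3d_annotation(bad))   # three problems reported
```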

Note

The value of 3DD can also be a 3D reference dictionary. This option is only used in the uncommon case where there are multiple annotations in the document that need to display the same 3D data.

When the reader presents the data in a 3D stream in a human-viewable way, it uses a series of parameters applied to the virtual camera to render it. This series of parameters, including the orientation and position of the camera and a description of the background, is called a 3D view (or simply a view). These views may also specify how the 3D artwork is rendered, colored, lit, and cross-sectioned. A view can even include a list of nodes (three-dimensional areas) of the 3D artwork to make invisible.

Normally these views are dynamic, created simply by a user interactively manipulating the various parameters such as free rotation and translation (see Figure 9-5). However, it is also possible to associate a set of predefined views with the 3D artwork. For example, a mechanical drawing of a part may have specific views showing the top, bottom, left, right, front, and back of the object.

The various parameters for the 3D view are persisted in a 3D view dictionary. The only key that is required in the dictionary is XN, which is the name of the view that can be presented to a user. Of the various optional parameters that can be set, the most common pair are the MS and C2W keys. The value of the C2W key is a 12-element 3D transformation matrix that specifies the position and orientation of the camera in world coordinates, while the MS key contains a value of M, which instructs the reader to use the C2W value. An example of a 3D view dictionary is shown in Example 9-8.
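
To show how the 12 numbers of C2W fit together, the Python sketch below assembles a view dictionary from a camera's axes and position. It assumes the element order used by the PDF specification for 3D transformation matrices (the camera's x, y, and z axes in world coordinates, followed by its position); the function names and sample values are hypothetical, and the view name is assumed to contain no parentheses or backslashes.

```python
# Sketch only: builds a 3D view dictionary with an explicit camera-to-world matrix.

def fmt_number(value):
    """Write a number without exponent notation or trailing zeros."""
    text = f"{value:.6f}".rstrip("0").rstrip(".")
    return text


def view_dict(name, x_axis, y_axis, z_axis, position):
    # Assumed layout: three axis vectors, then the camera position.
    c2w = [*x_axis, *y_axis, *z_axis, *position]
    if len(c2w) != 12:
        raise ValueError("C2W needs exactly 12 numbers")
    numbers = " ".join(fmt_number(v) for v in c2w)
    # MS /M tells the reader to take the camera placement from C2W.
    return f"<< /Type /3DView /XN ({name}) /MS /M /C2W [{numbers}] >>"


# A camera with unrotated axes, pulled back 30 units along the z axis.
print(view_dict("Front", (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, -30)))
```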

A 3D stream is a stream whose contents are in either the U3D or the PRC format. In addition, its associated dictionary is required to have a Subtype key whose value declares the data format (either U3D or PRC).

The 3D stream’s dictionary may also contain an array of 3D views associated with the 3D artwork as the value of a VA key. The DV key is used to specify which view is the default or initial view, either as an integer index into the VA array or as a string that matches one of the names of the provided views. Example 9-9 shows an example of a 3D stream with views.
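
Putting the stream dictionary and its data together might look like the following Python sketch. The function name is hypothetical, the views are passed in as already-serialized dictionary text, and the few bytes used as artwork data are a placeholder rather than a valid U3D file.

```python
# Sketch only: assembles the dictionary and body of a 3D stream as bytes.

def three_d_stream(data, fmt, views=(), default_view=None):
    if fmt not in ("U3D", "PRC"):
        raise ValueError("Subtype must be U3D or PRC")
    entries = ["/Type /3D", f"/Subtype /{fmt}"]
    if views:
        entries.append("/VA [" + " ".join(views) + "]")
    if isinstance(default_view, int):
        # Integer form: an index into the VA array.
        if not 0 <= default_view < len(views):
            raise ValueError("DV index is outside the VA array")
        entries.append(f"/DV {default_view}")
    elif default_view is not None:
        # String form: the name of one of the provided views.
        entries.append(f"/DV ({default_view})")
    entries.append(f"/Length {len(data)}")
    header = "<< " + " ".join(entries) + " >>\nstream\n"
    return header.encode("ascii") + data + b"\nendstream"


front = "<< /Type /3DView /XN (Front) >>"
top = "<< /Type /3DView /XN (Top) >>"
placeholder = b"U3D\x00"  # stand-in bytes, not real artwork

print(three_d_stream(placeholder, "U3D", views=[front, top], default_view=0))
```

In a complete file this byte string would be wrapped in an indirect object so that the annotation's 3DD key can refer to it.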

In this chapter, you learned about multimedia and 3D annotations. Next you will learn how to create content and annotations that are visible (or printed) only when certain criteria are met.