Chapter 9. Multimedia and 3D

Throughout PDF’s 20 years of existence, the world of multimedia has moved from simple sounds and animations to today’s interactive experiences in both 2D and 3D. PDF supports a variety of ways in which to incorporate these various types of media. This chapter will go into detail on the series of annotation types that enable the inclusion of multimedia and 3D content in PDF.

Simple Media

PDF 1.2 introduced the sound and movie annotation types, which moved PDF beyond its original vision of “static 2D electronic paper” into the fully fledged rich document format that it is today.

Sound Annotations

The sound annotation was originally added to PDF to provide an analog to the text annotation, except that instead of a text note, it would contain sound recorded from the computer’s microphone or imported from a file that would play upon the activation of the annotation.

The annotation dictionary for a sound annotation consists of a Subtype of Sound, the stream of sound data as the value of the Sound key, as well as any common annotation information required (see Example 9-1). Additionally, a Name key whose value is either Speaker or Mic may be present; this declares a predefined icon to be used when an appearance stream is not present.

The stream data for the sound should be in a common, self-describing format such as AIFF, RIFF/wav, or snd/au, and the sampling rate of the data needs to be included (as the value of the R key) in the stream dictionary. Additional information about the sound data, such as number of channels or bits per sample value per channel, may also be included as keys and values in the stream dictionary.
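
To make the relationship between these keys concrete, here is a minimal Python sketch that estimates playing time from the sampling rate (R), channel count (C), and bits per sample (B). It assumes raw, uncompressed sample data measured after any filter has been decoded; it does not apply to self-describing formats such as AIFF, which carry their own headers. The function name and the default values (one channel, 8 bits) are assumptions for illustration.

```python
def sound_duration_seconds(data_length: int, rate: float,
                           channels: int = 1, bits: int = 8) -> float:
    """Estimate playing time of raw, uncompressed sample data.

    rate, channels, and bits mirror the R, C, and B keys of the sound
    stream dictionary; data_length is the decoded byte count.
    """
    if rate <= 0 or channels <= 0 or bits <= 0:
        raise ValueError("rate, channels, and bits must be positive")
    bytes_per_second = rate * channels * bits / 8
    return data_length / bytes_per_second


# One second of 16-bit stereo at 44,100 Hz occupies 176,400 bytes.
print(sound_duration_seconds(176_400, rate=44100, channels=2, bits=16))
```

Note that the Length key of a compressed stream gives the encoded size, so it cannot be passed to this function directly.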

Note

Although it is most common to embed the sounds, since they are usually small, this is not required. A file specification dictionary to an external file can be used instead.

Example 9-1. Example sound annotation

1 0 obj
<<
    /C [ 0 1 0 ]
    /Contents (Presentation about nothing)
    /F 28
    /M (D:20010213120212-05'00')
    /Name /Speaker
    /Rect [ 22 529 42 549 ]
    /Sound 2 0 R
    /Subtype /Sound
    /T (Leonard Rosenthol)
    /Type /Annot
>>
endobj

2 0 obj
<<
    /Type /Sound
    /Length 1000            % or whatever the real length is
    /Filter /FlateDecode    % compression is good
    /R    11025             % sampling rate
>>
stream
    % binary stream data of the sound would go here...
endstream
endobj

Because of various limitations in the sound annotation, it is considered deprecated, and while PDF viewers will continue to support it, it is no longer recommended to use this annotation type for sounds. Instead, consider using a screen annotation (see Screen Annotation).

Sound actions

A sound doesn’t always have to be associated with an annotation. Sometimes the sound may be played as part of a user’s interaction with the PDF. A Sound action is provided for this case. As with other actions, the S key provides the type of action—Sound, in this case. The other required key in the action dictionary, as with the sound annotation, is Sound, whose value is the same data stream and associated dictionary as for the annotation. It is also possible to specify whether to play the sound synchronously or asynchronously (Synchronous) and whether to repeat it (Repeat). A sample Sound action is shown in Example 9-2.

Example 9-2. An example sound action

1 0 obj
<<
    /S /Sound
    /Sound 2 0 R
>>
endobj

2 0 obj
<<
    /B 16
    /C 2
    /E /Signed
    /Filter /FlateDecode
    /Length 1281270
    /R 44100
    /Type /Sound
>>
stream
    % lots of stream data
endstream
endobj

Movie Annotations

A movie annotation has a Subtype of Movie and a Movie key whose value is a dictionary describing the movie to be played, as in the following example.

1 0 obj
<<
    /A << /ShowControls true >>
    /Movie <<
        /Aspect [ 308 210 ]
        /F <<
            /F (SampleMovie.mov)    % simple relative file path
            /Type /Filespec
        >>
        /Poster 2 0 R
    >>
    /Border [ 0 0 1 ]
    /C [ 1 1 1 ]
    /F 1
    /Rect [ 95.062149 496.936981 258.025818 608.048584 ]
    /Subtype /Movie
    /T (iPod Support)
    /Type /Annot
>>
endobj

Because of various limitations in the movie annotation, it is considered deprecated. While PDF viewers will continue to support it, it is no longer recommended to use this annotation type for movies. Instead, consider using a screen annotation.

Movie actions

Just as with sounds, it is also possible to invoke the playing of a movie via an action. The action is connected to a movie annotation in the same PDF, either by indirect reference through the Annotation key’s value or by name via the value of the T key. The action not only allows the playing of the movie but can also specify other operations, such as Stop or Pause, via the Operation key. Example 9-3 shows a sample Movie action.

Example 9-3. Example Movie action

1 0 obj
<<
    /S /Movie
    /T (sample_iTunes.mov)  % identify by name
    /Operation /Play        % not necessary, but here for example
>>
endobj

Multimedia

With PDF 1.5, multimedia support in PDF was brought under a single new annotation type—the screen annotation. It’s called screen because its job is to define the region of the page (which will be displayed on a screen) where a media clip will be played. In fact, it doesn’t actually have anything directly to do with multimedia; all of the media-specific stuff happens via the Rendition action (see Rendition Actions).

Screen Annotation

Although a screen annotation can be as simple as just a Subtype key with a value of Screen, it wouldn’t be very useful like that. The most important part is the value of the A or AA key, where the Rendition action is specified. Also, the annotation will usually have an MK key whose value is an appearance characteristics dictionary that describes what the annotation will look like (either rendered directly or used to create the appearance stream). Example 9-4 shows a sample screen annotation.

Example 9-4. Example screen annotation

1 0 obj
<<
    /A 2 0 R    % the Rendition action
    /BS <<
        /S /S
        /Type /Border
        /W 1
    >>
    /MK <<
        /BC [ 0 0 1 ]
    >>
    /F 6
    /Rect [ 498.316437 702.674866 549.301331 735.843201 ]
    /Subtype /Screen
    /T (A Movie)
    /Type /Annot
>>
endobj

The appearance characteristics dictionary

The appearance characteristics dictionary is used by screen annotations (as well as AcroForms) to describe their appearance. The values can be used to directly render the appearance, but more commonly they are used to determine the graphics to be present in the appearance stream of the annotation.

Some of the keys that may be present in this dictionary are:

R: A multiple of 90 that represents the number of degrees of rotation for the annotation
BC: The color to be used for the border of the annotation, described using the same array format as text color (see Text Markup)
BG: The color to be used for the background of the annotation, described using the same array format as text color
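
The constraints on these three keys are easy to check mechanically. The following Python sketch validates a dictionary that mirrors the appearance characteristics dictionary; the function name and the plain-dict representation are hypothetical. It assumes that a color array has 0, 1, 3, or 4 components (transparent, gray, RGB, or CMYK), each in the range 0 to 1.

```python
def check_appearance_characteristics(mk: dict) -> list[str]:
    """Return a list of problems found in an MK-style dictionary."""
    problems = []

    # R is a rotation in degrees and must be a multiple of 90.
    rotation = mk.get("R", 0)
    if not isinstance(rotation, int) or rotation % 90 != 0:
        problems.append("R must be a multiple of 90")

    # BC and BG are color arrays: 0, 1, 3, or 4 components in [0, 1].
    for key in ("BC", "BG"):
        if key not in mk:
            continue
        color = mk[key]
        if len(color) not in (0, 1, 3, 4):
            problems.append(f"{key} must have 0, 1, 3, or 4 components")
        elif any(not 0 <= component <= 1 for component in color):
            problems.append(f"{key} components must lie between 0 and 1")
    return problems


# The MK dictionary from Example 9-4 has a blue border and no rotation.
print(check_appearance_characteristics({"BC": [0, 0, 1]}))
```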

Rendition Actions

The Rendition action controls the playing of multimedia content, either directly or via the use of JavaScript. It is always associated with a screen annotation; in fact, one of the required keys in the rendition action dictionary is the AN key, whose value is an indirect reference to such an annotation.

The other required keys are the action type (S), whose value is Rendition; the operation (OP) to perform (play, stop, etc.); and a rendition object as the value of the R key. Instead of the OP, a JS key could be used with a value that is the JavaScript to execute when the action is triggered. Example 9-5 shows a sample Rendition action.

Example 9-5. Example Rendition action

1 0 obj
<<
    /S /Rendition
    /OP 0
    /R     2 0 R    % reference to rendition object
>>
endobj

Rendition objects

The core type of rendition is called a media rendition; it specifies what to play, how to play it, and where to play it. It is also possible to create an ordered list of media renditions (called a selector rendition) to provide the PDF viewer with options concerning the handling of media (see Figure 9-2).

Figure 9-2. Two videos, one with and one without player controls

A rendition dictionary has only one required key, S, whose value is either MR (media rendition) or SR (selector rendition), to define what type of rendition it is. For a media rendition dictionary, it is then necessary to provide information about what to play (C key), how to play it (P key), and where to play it (SP key).

The value of the C key is a media clip dictionary that describes what is to be played. The media clip dictionary can specify that it is the full data for a clip or only a section, but we’ll only be looking at full data here since that’s the most common case by far. The data can be stored in the file (as the value of the D key) using a simple stream, or a file specification can be used for either embedded or external file references. The type of the data is declared using standard MIME types as the value of the CT key.

The P key’s value is a media play parameters dictionary that specifies how the media is to be played. While it can name the specific media player to be used (though this isn’t recommended, as it restricts the ability of the PDF viewer to substitute another), it is more commonly used to describe whether to provide any user interface for controlling the video (C), whether or not to scale the video when playing (F), and how many times (if any) to repeat playing of the video (RC).

The remaining key element to the rendition dictionary is the SP key, whose value is a media screen parameters dictionary that describes whether to play the media on the page or in a floating window (W), as shown in Figure 9-3, and, if using a floating window, which monitor (or monitors) it can or cannot be played on (M) and at what size and location it should be played (F).

Figure 9-3. Playing video in a floating window

Example 9-6 shows a sample rendition object.

Example 9-6. Example rendition object

2 0 obj
<<
    /S /MR
    /C <<
        /S /MCD
        /CT (video/quicktime)    % MIME type matching the .mov file below
        /D <<
            /F (http://www.steppublishers.com/sites/default/files/step.mov)
            /FS /URL
            /Type /Filespec
        >>
    >>
    /P <<
        /BE <<
            /C true
            /F 2
            /RC 1
        >>
    >>
    /SP <<
        /BE <<
            /W 0                 % use a floating window
            /B [ 0.50 0 0 ]      % background color for the floating window
            /F <<
                /D [ 352 288 ]   % width and height of the window
                /R 0             % user cannot resize it
                /T false         % no title bar
            >>
        >>
    >>
>>
endobj
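
Returning to the selector rendition described earlier, the way a viewer might walk its ordered list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: renditions are modeled as plain dicts, the list of candidates is taken to be the value of an R key in the selector rendition, and a rendition is judged playable purely by the MIME type in its media clip (CT). A real viewer would weigh many more criteria. The function name pick_rendition is hypothetical.

```python
def pick_rendition(rendition: dict, supported_types: set[str]):
    """Return the first media rendition the viewer can play, or None."""
    kind = rendition["S"]
    if kind == "MR":
        # A media rendition is usable if its clip's MIME type is supported.
        content_type = rendition.get("C", {}).get("CT")
        return rendition if content_type in supported_types else None
    if kind == "SR":
        # A selector rendition is tried in order; the first match wins.
        for candidate in rendition.get("R", []):
            chosen = pick_rendition(candidate, supported_types)
            if chosen is not None:
                return chosen
        return None
    raise ValueError(f"unknown rendition type: {kind}")


quicktime = {"S": "MR", "C": {"CT": "video/quicktime"}}
mp4 = {"S": "MR", "C": {"CT": "video/mp4"}}
selector = {"S": "SR", "R": [quicktime, mp4]}

# A viewer that only handles MP4 skips the first entry.
print(pick_rendition(selector, {"video/mp4"})["C"]["CT"])
```

Because the list is ordered, the author should place the preferred rendition first and the most widely playable fallback last.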

3D

The ability to add video or audio to a PDF file takes it beyond a static electronic document to one that is now more interactive and rich in content. Yet it remains in the realm of two dimensions. With 3D annotations, a PDF can enter the third dimension by presenting content that can be rotated and manipulated along all three axes (see Figure 9-4).

Figure 9-4. An exploded 3D view of a turbine

3D Annotations

3D artwork can be presented to the reader through the use of a 3D annotation and its 3D annotation dictionary. Additionally, a 3D annotation provides an appearance stream with a normal (N) appearance, which serves both as a fallback for applications that do not support 3D annotations and as the initial display of the 3D artwork.

The 3D annotation dictionary

A 3D annotation dictionary is a standard annotation dictionary whose Subtype is 3D (see Example 9-7). In addition, it must contain a 3DD key whose value is a 3D stream that contains the 3D data as well as any additional information such as the 3D views.

Note

The value of 3DD can also be a 3D reference dictionary. This option is only used in the uncommon case where there are multiple annotations in the document that need to display the same 3D data.

Example 9-7. A sample 3D annotation

1 0 obj
<<
    /3DD 2 0 R
    /AP << /N 3 0 R >>
    /Contents (A 3D Model)
    /Rect [ 289.174988 99.371803 764.690002 493.700989 ]
    /Subtype /3D
    /Type /Annot
>>
endobj

3D views

When the reader presents the data in a 3D stream in a human-viewable way, it uses a series of parameters applied to the virtual camera to render it. This series of parameters, including the orientation and position of the camera and a description of the background, is called a 3D view (or simply a view). These views may also specify how the 3D artwork is rendered, colored, lit, and cross-sectioned. A view can even include a list of nodes (three-dimensional areas) of the 3D artwork to make invisible.

Normally these views are dynamic, created simply by a user interactively manipulating the various parameters such as free rotation and translation (see Figure 9-5). However, it is also possible to associate a set of predefined views with the 3D artwork. For example, a mechanical drawing of a part may have specific views showing the top, bottom, left, right, front, and back of the object.

Figure 9-5. Some possible tools for a user to change the view

The various parameters for the 3D view are persisted in a 3D view dictionary. The only key that is required in the dictionary is XN, which is the name of the view that can be presented to a user. Of the various optional parameters that can be set, the most common pair are the MS and C2W keys. The value of the C2W key is a 12-element 3D transformation matrix that specifies the position and orientation of the camera in world coordinates, while the MS key contains a value of M, which instructs the reader to use the C2W value. An example of a 3D view dictionary is shown in Example 9-8.

Example 9-8. Example view dictionary

<<
    /C2W [ 1.0 0.0 0.0 0.0 0.0 -1.0 0.0 1.0 0.0 0.000006 -387.131989 -0.099388 ]
    /MS /M
    /Type /3DView
    /XN (Default)
>>
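
To see what the 12 numbers in C2W mean, the following Python sketch splits the array into its parts. It assumes the usual layout of a 3D transformation matrix in PDF: the first nine values are the three camera axes expressed in world coordinates, and the last three are the camera position. The helper names are hypothetical, and the orthonormality check is simply a sanity test one might apply to a pure rotation with no scaling.

```python
import math


def split_c2w(c2w):
    """Split a 12-element C2W array into (axes, position)."""
    if len(c2w) != 12:
        raise ValueError("C2W must have exactly 12 elements")
    axes = [tuple(c2w[i:i + 3]) for i in (0, 3, 6)]
    position = tuple(c2w[9:12])
    return axes, position


def is_orthonormal(axes, tol=1e-6):
    """Check that the axes are unit length and mutually perpendicular."""
    for i, a in enumerate(axes):
        for j, b in enumerate(axes):
            dot = sum(x * y for x, y in zip(a, b))
            expected = 1.0 if i == j else 0.0
            if not math.isclose(dot, expected, abs_tol=tol):
                return False
    return True


# The C2W array from Example 9-8.
c2w = [1.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0, 1.0, 0.0,
       0.000006, -387.131989, -0.099388]
axes, position = split_c2w(c2w)
print(position)              # camera location in world coordinates
print(is_orthonormal(axes))  # True for a rotation without scaling
```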

3D streams

A 3D stream is a stream whose contents are in either the U3D or the PRC format. In addition, its associated dictionary is required to have a Subtype key whose value declares the data format (either U3D or PRC).

The 3D stream’s dictionary may also contain an array of 3D views associated with the 3D artwork as the value of a VA key. The DV key is used to specify which view is the default or initial view, either as an integer index into the VA array or as a string that matches one of the names of the provided views. Example 9-9 shows an example of a 3D stream with views.

Example 9-9. Example 3D stream with views

1 0 obj
<<
    /Type /3D
    /Subtype /U3D
    /VA [ 2 0 R 3 0 R 4 0 R ]
    /DV 0    % the first one is the default
>>
stream
% U3D data goes here...
endstream
endobj

2 0 obj
<<
    /BG << /C [ 0.752945 0.752945 0.752945 ] /Subtype /SC >>
    /C2W [
        -0.399527 -0.916721 0.0 -0.238227
        0.103825 0.965644 -0.885226 0.385801
        -0.259869 758.682983 -202.897003 207.556000
    ]
    /CO 727.596008
    /MS /M
    /Type /3DView
    /XN (Default)
>>
endobj
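
The two forms of the DV value can be resolved with a small lookup. The Python sketch below models the VA array as a list of dicts and, for illustration only, matches a string DV against each view's XN value; a conforming viewer may compare against a different name entry, so treat that choice as an assumption. The function name is hypothetical.

```python
def resolve_default_view(views, dv):
    """Return the view selected by DV, or None if nothing matches.

    views models the VA array; dv is either an integer index into it
    or a string naming one of the views.
    """
    if isinstance(dv, int):
        return views[dv] if 0 <= dv < len(views) else None
    for view in views:
        if view.get("XN") == dv:
            return view
    return None


views = [{"XN": "Default"}, {"XN": "Top"}, {"XN": "Front"}]
print(resolve_default_view(views, 0)["XN"])      # by index
print(resolve_default_view(views, "Top")["XN"])  # by name
```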

Markups on 3D

The various markup annotations that were introduced previously for normal PDF page content can also be applied to specific views of 3D artwork.

In order to specify that a given markup annotation is associated with a 3D annotation, an ExData key is added to the standard annotation dictionary whose value is a 3D markup dictionary.

A 3D markup dictionary specifies the 3D annotation and 3D view that the markup is associated with and may also include an MD5 hash of the 3D data to enable the viewer to make sure the 3D artwork hasn’t changed since the annotation was applied. Figure 9-6 shows some example 3D markup.

Figure 9-6. Example 3D markup

% Polygon annotation
1 0 obj
<<
    /BE << /I 2.0 /S /C >>
    /BS << /W 3.0 >>
    /C [ 0.0 1.0 1.0 ]
    /ExData 2 0 R
    /Rect [ 302.412994 403.898987 399.747986 520.927979 ]
    /Subtype /Polygon
    /Type /Annot
    /Vertices [
        315.097992 506.010010 364.582001 508.194000
        387.140991 423.052002 315.097992 416.502014
        315.097992 506.010010 ]
>>
endobj

% 3D markup dictionary
2 0 obj
<<
    /Type /ExData
    /Subtype /Markup3D
    /3DA 3 0 R    % the 3D annotation (a separate object, not shown)
    /3DV 4 0 R    % the view (a separate object, not shown)
>>
endobj
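
The MD5 check mentioned above amounts to recomputing a digest and comparing it with the stored value. Here is a minimal Python sketch using the standard hashlib module. The function name is hypothetical, and the caller is assumed to pass exactly the bytes of the 3D stream that the hash was originally computed over, along with the 16-byte value stored in the 3D markup dictionary.

```python
import hashlib


def artwork_unchanged(stream_data: bytes, stored_md5: bytes) -> bool:
    """Compare the MD5 digest of the 3D data with the stored hash."""
    return hashlib.md5(stream_data).digest() == stored_md5


# Simulate applying a markup, then editing the artwork afterward.
original = b"U3D data goes here..."
stored = hashlib.md5(original).digest()   # 16 bytes, saved with the markup

print(artwork_unchanged(original, stored))             # True
print(artwork_unchanged(original + b"edit", stored))   # False
```

When the check fails, a viewer can warn the user that the markup may no longer line up with the artwork it was drawn on.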

What’s Next

In this chapter, you learned about multimedia and 3D annotations. Next you will learn how to create content and annotations that are visible (or printed) only when certain criteria are met.