Chapter 3. Images

In the previous chapter you learned how to create vector graphics, a series of lines and paths (and sometimes text) that have no predefined resolution and can be composed of multiple colorspaces and attributes. However, in many cases you may need to use a raster image (sometimes called a bitmap image) on your page. This chapter introduces raster images.

When most people think about raster images, they think about standard raster image formats such as JPEG, PNG, GIF, or TIFF. And while those formats do contain raster image data, they also wrap it together with all sorts of other things in a kind of “image package.” PDF, however, can’t use the full package (except in one special case; see JPEG Images), so you need to “unwrap” it to get at the raw form that PDF expects.

This “raw form” is just a series of pixels, or in more technical terms, a two-dimensional array of those pixels (the two dimensions being the height of the image and the width of the image). For example, in Figure 3-1, the height is 40 pixels and the width is 46 pixels.

Each of the pixels in that image is itself an array of values, one per color component of the image’s color space. If this were a DeviceRGB image, each pixel would have three components; a DeviceCMYK image would have four, and a DeviceGray image only one. This shouldn’t come as a surprise; it exactly matches the number of operands that the color space operators take! See Basic Color for more on color space operators.

Finally, to understand how the data for the image is going to be arranged, you need to know how many bits of data are needed for each component of each pixel. Most developers only think in terms of 8 bits per component, which is why RGB images are sometimes referred to as 24-bit color images (8*3 = 24). But PDF supports a much richer set of options here, allowing 1, 2, 4, 8, or 16 bits per component (although in practice, only 1, 8, and 16 are used).
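To make the arithmetic concrete, here is a minimal Python sketch that computes how many bytes of raw sample data an image needs. The function name `image_data_size` is hypothetical. The sketch relies on the ISO 32000-1 rule that each row of samples starts on a byte boundary, so a row whose bit count is not a multiple of 8 is padded.

```python
def image_data_size(width: int, height: int, components: int,
                    bits_per_component: int) -> int:
    """Return the number of bytes of raw (unfiltered) sample data."""
    if bits_per_component not in (1, 2, 4, 8, 16):
        raise ValueError("PDF allows only 1, 2, 4, 8, or 16 bits per component")
    bits_per_row = width * components * bits_per_component
    bytes_per_row = (bits_per_row + 7) // 8  # pad each row to a whole byte
    return bytes_per_row * height


# The 46 x 40 image from Figure 3-1, in three different layouts.
rgb_8bit = image_data_size(46, 40, 3, 8)     # 138 bytes per row
gray_1bit = image_data_size(46, 40, 1, 1)    # 46 bits round up to 6 bytes per row
cmyk_16bit = image_data_size(46, 40, 4, 16)  # 368 bytes per row
```

The same 46-by-40 image therefore ranges from a few hundred bytes to well over ten thousand, depending only on the color space and bit depth.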

Adding the Image

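To incorporate a raster image into a PDF and display it, you need to do three things: create an image dictionary that describes the image data, add the image to the page’s resource dictionary, and reference it from the page’s content stream.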

As with every other common PDF data structure, you need a dictionary to represent all the relevant information that you know about the image data. As you learned in Stream Objects, image data itself lives in a stream, and thus the dictionary you are going to be working with will be in the stream’s associated attributes dictionary.

The image dictionary is actually one type of XObject dictionary (XObject being short for eXternal Object, a graphic object that lives outside of or externally to the content stream). The other type will be the one you use for vector images. Because the objects are external, they can be referenced from multiple content streams without duplication.

Taking that as the basis for the dictionary, combined with all the attributes discussed earlier, we arrive at a dictionary that looks like the one in Example 3-1.
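As a rough illustration of how such a dictionary and its stream fit together, the following Python sketch serializes an uncompressed image XObject. It is not the book's listing; the function name `make_image_xobject` and the tiny 2-by-2 sample image are assumptions made for the example, and a real writer would also wrap the result in an indirect object (`n 0 obj ... endobj`).

```python
def make_image_xobject(width: int, height: int, color_space: str,
                       bits_per_component: int, samples: bytes) -> bytes:
    """Serialize an image XObject stream (dictionary plus raw sample data)."""
    header = (
        "<< /Type /XObject /Subtype /Image\n"
        f"   /Width {width} /Height {height}\n"
        f"   /ColorSpace /{color_space}\n"
        f"   /BitsPerComponent {bits_per_component}\n"
        f"   /Length {len(samples)} >>\n"
        "stream\n"
    ).encode("ascii")
    return header + samples + b"\nendstream"


# A hypothetical 2 x 2 DeviceRGB image: red, green / blue, white.
pixels = bytes([255, 0, 0,   0, 255, 0,
                0, 0, 255,   255, 255, 255])
image_object = make_image_xobject(2, 2, "DeviceRGB", 8, pixels)
```

Every attribute discussed so far appears as a key: the two dimensions, the color space, and the bits per component.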

Adding the image to the resource dictionary is no different from the ExtGState examples in the last chapter, so the resource dictionary would look something like Example 3-2.
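The sketch below shows one way the resource entry and the matching content stream could be generated. The helper names, the resource name `Im1`, and the object number 5 are all hypothetical. One fact it depends on: PDF paints an image into a 1-by-1 unit square, so the content stream must scale the CTM to the size you want before invoking `Do`.

```python
def xobject_resources(entries: dict) -> str:
    """Build a resource dictionary mapping XObject names to object numbers."""
    refs = " ".join(f"/{name} {number} 0 R" for name, number in entries.items())
    return f"<< /XObject << {refs} >> >>"


def paint_image(name: str, x: float, y: float,
                width: float, height: float) -> str:
    """Return content stream operators that draw the named image.

    The image fills a unit square, so the cm operator scales it to
    width x height and moves its lower-left corner to (x, y).
    """
    return f"q {width} 0 0 {height} {x} {y} cm /{name} Do Q"


resources = xobject_resources({"Im1": 5})
content = paint_image("Im1", 100, 100, 46, 40)
```

Wrapping the call in `q` and `Q` keeps the scaling from leaking into whatever is drawn next.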

One of the various filters that can be applied to a stream object is the DCTDecode filter. A DCTDecode stream is equivalent to a JFIF file, also known colloquially as a JPEG (or .jpg/.jpeg) file. JPEG files are the only standard image format that can be placed into a PDF without any modification—you just read the data stream from the file and then write it into the value of the stream object in the PDF, as shown in Example 3-4. Of course, you will either need to know a priori the size, colorspace, etc. of the image or have some way to parse the JPEG to obtain those values.
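If you do not know the image's attributes in advance, a small parser can recover them from the JPEG frame header. The following is a simplified sketch, not a complete JPEG reader: the function names are hypothetical, it assumes a well-formed file, and it maps the component count straight to a Device color space, which is an assumption (CMYK JPEGs in particular often need extra handling, such as a Decode array).

```python
import struct

# Start-of-frame markers hold the image size; 0xC4, 0xC8, 0xCC are not frames.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
COLOR_SPACES = {1: "DeviceGray", 3: "DeviceRGB", 4: "DeviceCMYK"}


def jpeg_info(data: bytes):
    """Return (width, height, components, bits) from the first frame header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker")
        marker = data[pos + 1]
        if marker == 0xFF:                           # fill byte, skip it
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD9:  # markers with no payload
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            bits, height, width, components = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10])
            return width, height, components, bits
        pos += 2 + length                            # skip marker plus segment
    raise ValueError("no frame header found")


def jpeg_image_xobject(data: bytes) -> bytes:
    """Wrap an unmodified JPEG file in an image XObject using DCTDecode."""
    width, height, components, bits = jpeg_info(data)
    if components not in COLOR_SPACES:
        raise ValueError(f"unsupported component count: {components}")
    header = (
        "<< /Type /XObject /Subtype /Image "
        f"/Width {width} /Height {height} "
        f"/ColorSpace /{COLOR_SPACES[components]} "
        f"/BitsPerComponent {bits} "
        f"/Filter /DCTDecode /Length {len(data)} >>\nstream\n"
    ).encode("ascii")
    return header + data + b"\nendstream"


# Usage with a hypothetical file name:
# with open("photo.jpg", "rb") as f:
#     image_object = jpeg_image_xobject(f.read())
```

Note that the JPEG bytes are copied into the stream untouched; only the dictionary is synthesized.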

While the normal behavior of an image in the PDF imaging model is for all of the pixels to be drawn on top of anything below them, there are ways that the image can express that some parts of itself are either completely or partially transparent. The original methods that PDF supports are called masking, as they completely “mask out” (off vs. on) a set of pixels based on the provided criteria. The newer methods use the same transparency model (see Basic Transparency) as that of paths, where you can have levels of transparency. Because we’ve already looked at that model, we’ll cover that first.

Soft Masks

As you learned earlier, images in PDF are in a defined colorspace and as such have a defined number of components. In the case of RGB, there are three components: red, green, and blue. That means that there is no room for transparency information. The normal way to address this, as most common image formats such as PNG and TIFF do, is to simply define a new colorspace with four components (ARGB or RGBA), where the fourth component of each pixel is its transparency value. However, as you also learned, one of the key goals when transparency was introduced in PDF was to maintain 100% backward compatibility with implementations that are not transparency-aware. That prevented the use of a new colorspace.

So instead of a new colorspace, the transparency values for each pixel are stored in a separate image XObject. This “soft mask” image is referenced from the original image XObject via the SMask key in its dictionary. The soft mask image’s dictionary is a standard image dictionary, subject to a few restrictions (see ISO 32000-1:2008, Table 145); most importantly, the colorspace of the image must be DeviceGray. Allowing only DeviceGray makes perfect sense, since that’s a one-component-per-pixel colorspace, which matches our missing component from RGBA/ARGB. This is also why in most cases the width and height of the parent image and its soft mask will match (although this is not required).
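The split described above can be sketched in a few lines of Python. The function names and the object number passed for the mask are hypothetical, and the sketch assumes 8-bit, non-premultiplied RGBA input in which an alpha of 255 means fully opaque (matching a soft mask value of 1.0).

```python
def split_rgba(rgba: bytes):
    """Split interleaved 8-bit RGBA samples into (rgb, alpha) byte strings."""
    if len(rgba) % 4:
        raise ValueError("RGBA data must be a multiple of 4 bytes")
    rgb = bytearray()
    alpha = bytearray()
    for i in range(0, len(rgba), 4):
        rgb += rgba[i:i + 3]       # color components stay with the image
        alpha.append(rgba[i + 3])  # fourth component moves to the soft mask
    return bytes(rgb), bytes(alpha)


def _image_stream(entries: str, samples: bytes) -> bytes:
    """Serialize one image XObject stream with the given dictionary entries."""
    header = (f"<< /Type /XObject /Subtype /Image {entries} "
              f"/Length {len(samples)} >>\nstream\n").encode("ascii")
    return header + samples + b"\nendstream"


def image_with_soft_mask(width: int, height: int, rgba: bytes,
                         mask_object_number: int):
    """Return (image, mask) streams; the image points at the mask via SMask."""
    rgb, alpha = split_rgba(rgba)
    size = f"/Width {width} /Height {height} /BitsPerComponent 8"
    mask = _image_stream(f"{size} /ColorSpace /DeviceGray", alpha)
    image = _image_stream(
        f"{size} /ColorSpace /DeviceRGB /SMask {mask_object_number} 0 R", rgb)
    return image, mask


# Two pixels: opaque red, then half-transparent green.
sample_rgba = bytes([255, 0, 0, 255,   0, 255, 0, 128])
image, mask = image_with_soft_mask(2, 1, sample_rgba, 7)
```

A reader that ignores SMask still sees a valid DeviceRGB image, which is the backward compatibility the design was after.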

If you were to take the image from Example 3-3 and use a soft mask to make all the white parts transparent, you’d have something like Figure 3-2.

PDF doesn’t really have the concept of a vector image, such as an EPS or EMF file. Instead, it has a way to encapsulate a content stream into a reusable object. Like a raster image, this object is a type of XObject; it is called a form XObject.

A form XObject is a content stream (discussed in Content Streams) with a form dictionary associated with it that provides some extra intelligence and support to that content. Anything you can put in a page’s content stream, you can put into a form XObject. This is very powerful because it provides a way to reuse an entire PDF page in some other context (for example, imposition or stamping). You’ll see how to do that later in this chapter, but for now, let’s take a look at the form dictionary’s special fields.

Four keys in the form dictionary are important when you create a form XObject:

BBox
The most important one, and the only one that is required, is the BBox key. The BBox is an array representing a bounding box for the content. You can always use a rectangle larger than the actual content size, but a smaller one will cause your content to be clipped to that size.
Matrix
The Matrix is a standard transformation matrix that will be applied to all instances of the XObject whenever it is drawn. It is almost always the identity matrix, since any specific transformation would be applied in the invoking content stream. Since the default value for the key is the identity, there is no reason to include this in the dictionary unless the transform is something else.
Resources
Just like a page, your content stream may need extra resources (ExtGStates, fonts, etc.), and this is where you would reference them.
FormType
This one is not only optional, but serves no practical purpose since there has only ever been a single value for it (1). Don’t bother writing it into your PDFs; it only wastes space.

Example 3-6 shows a simple form XObject.
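For comparison with the image case, here is a hedged Python sketch that serializes a form XObject. It is not the book's listing; the function name `make_form_xobject` and the red-square content are assumptions. In line with the advice above, it writes BBox always, Resources only when supplied, and omits Matrix and FormType.

```python
def make_form_xobject(bbox, content: bytes, resources: str = "") -> bytes:
    """Serialize a form XObject stream from a bounding box and content."""
    bbox_text = " ".join(str(value) for value in bbox)
    resources_entry = f" /Resources {resources}" if resources else ""
    header = (
        f"<< /Type /XObject /Subtype /Form /BBox [{bbox_text}]"
        f"{resources_entry} /Length {len(content)} >>\nstream\n"
    ).encode("ascii")
    return header + content + b"\nendstream"


# Hypothetical content: a red square inset 10 units inside a 100 x 100 box.
form_content = b"1 0 0 rg 10 10 80 80 re f"
form_object = make_form_xobject((0, 0, 100, 100), form_content)
```

The stream body is ordinary content stream syntax, exactly what you would write on a page.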

You would then add this to the page’s resource dictionary and reference it in the content stream exactly as you did for the image (see Example 3-7). However, unlike with images, for form XObjects you use a standard CTM and don’t need to worry about the size (since it’s not described in pixels).

The Do operator, when used with a form XObject, uses the keys we discussed earlier as part of its rendering or painting. The operations are as follows:
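ISO 32000-1 describes the sequence roughly as: save the graphics state, concatenate the form's Matrix with the CTM, clip to the BBox, paint the form's content, and restore the graphics state. The Python sketch below expands a form into the equivalent inline operators to make that sequence visible. The function name `expand_form_do` is hypothetical, and a real viewer performs these steps internally rather than rewriting the stream.

```python
def expand_form_do(bbox, content: str, matrix=(1, 0, 0, 1, 0, 0)) -> str:
    """Return inline operators equivalent to invoking Do on a form XObject."""
    llx, lly, urx, ury = bbox
    steps = [
        "q",                                            # save graphics state
        " ".join(str(v) for v in matrix) + " cm",      # apply the form Matrix
        f"{llx} {lly} {urx - llx} {ury - lly} re W n",  # clip to the BBox
        content,                                        # paint the form content
        "Q",                                            # restore graphics state
    ]
    return "\n".join(steps)


expanded = expand_form_do((0, 0, 100, 100), "0 0 50 50 re f")
```

Seen this way, it is clear why a BBox smaller than the content clips it, and why nothing the form does to the graphics state survives the call.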

In this chapter, you learned about how PDF works with images—both raster and vector—via XObjects. Next, we’ll look at drawing text.