FOR many readers of this book, data visualization is probably a new subject. In contrast, computer-graphics techniques, or at least the term “computer graphics,” should be familiar to most readers. In both visualization and graphics applications, we take as input some data and, ultimately, produce a picture that reflects several aspects of the input data. Given these (and other) similarities, a natural question arises: What are the differences between visualization and computer graphics? We shall answer this question in detail in Chapter 4. For the time being, it is probably easier to focus on the similarities between visualization and computer graphics. This will help us answer the following question: What is the role of computer graphics techniques in visualization?
This chapter introduces data visualization in an informal manner and from the perspective of computer graphics. We start with a simple problem that every reader should be familiar with: plotting the graph of a function of two variables f (x, y) = z. We illustrate the classical solution of this problem, the height plot, in full detail. For this, we shall use code fragments written in the C++ programming language. For the graphics functionality, we use the popular OpenGL graphics library.
This simple example serves many functions. First, it illustrates the complete visualization process pipeline, from defining the data to producing a rendered image of the data. Our simple example introduces several important concepts of the visualization process, such as datasets, sampling, mapping, and rendering. These concepts will be used to build the structure of the following chapters. Second, our example introduces the basics of graphics rendering, such as polygonal representations, shading, color, texture mapping, and user interaction. Third, the presented example will make the reader aware of various problems and design trade-offs that are commonly encountered when constructing real-life visualization applications. Finally, our example application illustrates the tight connection between data visualization and computer graphics.
After reading this chapter, the reader should be able to construct a simple visualization example using just a C++ compiler and the OpenGL graphics library. This example can serve as a starting point from which the reader can implement many of the more advanced visualization techniques that are discussed in later chapters.
We start our incursion in the field of data visualization by looking at one of the simplest and most familiar visualization problems: drawing the graph of a real-valued function of two variables. This problem is frequently encountered in all fields of engineering, physics, mathematics, and business. We assume we have a function f : D → ℝ, defined on a Cartesian product D = X × Y of two compact intervals X ⊂ ℝ and Y ⊂ ℝ, f (x, y) = z. The graph of the function is the three-dimensional (3D) surface 𝒮 ⊂ ℝ³, defined by the points of coordinates (x, y, z). In plain words, visualizing the function f amounts to drawing this surface for all values of x ∈ X and y ∈ Y. Intuitively, drawing the surface 𝒮 can be seen as warping, or elevating, every point (x, y) inside the two-dimensional (2D) rectangular area X × Y to a height z = f (x, y) above the xy plane. Hence, this kind of graph is also known as a warped plot, height plot, or elevation plot. Most engineering and scientific software packages provide built-in functions to draw such plots.
However simple, this visualization raises several questions: How should we represent the function f and the domains X and Y of its variables? How should we represent the surface 𝒮 to visualize? How many points should we draw? What kind of graphics objects should we use to do the drawing? Essentially, all these questions revolve around the issue of representing continuous data, such as the function f, surface 𝒮, and variable domains X and Y, on a computer. Strictly speaking, to draw the graph, we should perform the warping of (x, y, 0) to (x, y, f (x, y)) for all points (x, y) in D. However, a computer algorithm can perform only a finite number of operations. Moreover, there are only a finite number of pixels on a computer screen to draw. Hence, the natural way to generate an elevation plot is to draw the surface points (xi, yi, f (xi, yi)) that correspond to a given finite set of sample points {xi, yi} in the variable domain D. Listing 2.1 shows the pseudocode of what is probably the most-used method to draw an elevation plot. We sample the function definition domain X × Y with a set of Nx points placed at equal distances dx in X, as well as Ny points placed at equal distances dy in Y. Figure 2.1 shows the elevation plot for the Gaussian function f (x, y) = e−(x² + y²), defined on X × Y = [−1, 1] × [−1, 1], using Nx = 30, Ny = 30 sample points. The sample domain is drawn framed in red under the height plot.
Figure 2.1. Elevation plot for the function f(x, y) = e−(x² + y²) drawn using 30 × 30 sample points.
Using equally spaced points, regularly arranged in a rectangular grid, as the pseudocode in Listing 2.1 does, is an easy solution to quickly create a height plot. This samples the function domain D uniformly, i.e., uses the same sample point density everywhere. In addition to its simplicity, the uniform sampling permits us to approximate the graph of the function, the surface 𝒮, with a set of four-vertex polygons, or quadrilaterals, constructed by consecutive sample points in the x and y directions, as shown in Listing 2.1. This serves us well, as every modern graphics library provides fast mechanisms to render such polygons, also called graphics primitives. Our pseudocode uses a simple C++ class Quad to encapsulate the rendering functionality. To draw a Quad, we first specify its four 3D vertices using its addPoint method, and then we draw() it. An implementation of the Quad class using the OpenGL graphics library is presented in Section 2.3.
In our visualization, we assumed we could evaluate the function f to visualize at every desired point in its domain D. However, this assumption might be too restrictive in practice. For example, the values of f may originate from experimental data, measurements, simulations, or other data sources that cannot be or are too expensive to be evaluated during the visualization process. In such cases, the natural solution is to explicitly store the values of f in a data structure. This data structure is then passed to the visualization method. For our previous example, this data structure needs to store only the values of f at our regularly arranged sample points. For this, we can use a simple data structure: a matrix of real numbers. The modified visualization method that uses this data structure as input, instead of the function f , is shown in Listing 2.2.
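As a rough illustration only (not the book's actual Listing 2.2; the function name, data layout, and parameter names are assumptions), the drawing loop over such a sampled dataset could look as follows, using the Quad class described above:

    #include <vector>

    // Hypothetical sketch: draw the height plot of a sampled dataset stored in a
    // matrix data[i][j] of Nx x Ny function values over [Xmin,Xmax] x [Ymin,Ymax].
    // Assumes the Quad class introduced in the text (sketched in Section 2.3).
    void drawHeightPlot(const std::vector<std::vector<float> >& data,
                        float Xmin, float Xmax, float Ymin, float Ymax,
                        int Nx, int Ny)
    {
        float dx = (Xmax - Xmin) / (Nx - 1);            // uniform sample spacing in x
        float dy = (Ymax - Ymin) / (Ny - 1);            // uniform sample spacing in y

        for (int i = 0; i < Nx - 1; ++i)                // one quadrilateral per grid cell
            for (int j = 0; j < Ny - 1; ++j)
            {
                float x = Xmin + i * dx, y = Ymin + j * dy;
                Quad q;
                q.addPoint(x,      y,      data[i][j]);
                q.addPoint(x + dx, y,      data[i + 1][j]);
                q.addPoint(x + dx, y + dy, data[i + 1][j + 1]);
                q.addPoint(x,      y + dy, data[i][j + 1]);
                q.draw();
            }
    }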
The data generation, i.e., the construction of the matrix data, and the visualization, i.e., the algorithm in Listing 2.2, are now clearly decoupled. This allows us to use our height-plot visualization with any algorithm that produces a matrix of sample values, as described previously. The matrix data, together with the extra information needed for drawing, i.e., the values of Xmin, Xmax, Ymin, Ymax, Nx, and Ny, forms a fundamental concept in data visualization, called a dataset. A dataset represents either a sampling of some originally continuous quantity, as in the case of our function f (x, y), or some purely discrete quantity. An example of the latter is a page of text, which can be represented by a vector, or string, of characters, i.e., a char data[Nx], a spreadsheet, or a database table containing, e.g., non-numerical attributes. Such attributes can in turn be represented as a matrix data[Nx][Ny] of objects of type T, where T models the attribute type. Various data types, such as temperature and pressure fields measured by weather satellites or computed by numerical simulations; 3D medical images acquired by magnetic resonance imaging (MRI) and computed tomography (CT) scanners; and multidimensional tables emerging from business databases, are represented by different types of datasets. Datasets form the input and output of the algorithms used in data visualization, like the function-sampling and elevation-plot algorithms discussed in our simple function visualization example. These aspects are discussed in detail in Chapters 3 and 4.
Let us go back to our elevation plot. The numbers of samples to be used, i.e., Nx and Ny, are parameters to be chosen by the user. The question arises: What are optimal values for these parameters? By using more samples, the quality of the discrete representation, the data matrix, and consequently the quality of the visualization, can only increase, reaching the continuous case in the limit when Nx and Ny tend to infinity. However, this poses increasing storage requirements for the dataset (the data matrix) and computing-power requirements for the visualization method. On the other hand, choosing too few samples yields a fast, low-memory, but also low-quality visualization. Figure 2.2 illustrates this with a visualization of the same function as before, this time with Nx = 10 and Ny = 10 samples.
Figure 2.2. Elevation plot for the function f(x, y) = e−(x² + y²). A coarse grid of 10 × 10 samples is used.
Comparing this image with the previous one (Figure 2.1), we see that reducing the sample density yields a worse approximation of the surface 𝒮. This is especially visible close to the center point x = y = 0. Basically, our rendered surface approximates the continuous one with a set of quadrilaterals determined by the sample point locations and function values. The quality of this approximation is determined by how close this piecewise bilinear approximation, as delivered by our set of rendered quadrilaterals, is to the original continuous surface 𝒮. The question is thus how to choose the sample locations so that we achieve a good approximation with a given sample count, i.e., given memory and speed constraints. The answer is well known from signal theory [Ambardar 06]: The sampling density must be proportional to the local frequency of the original continuous function that we want to approximate. In practice, this means using a higher sample density in the areas where the function’s higher-order derivatives have higher values.
To emphasize the importance of choosing a good sampling density, let us consider now a different function
The speed of variation, or frequency, of this function increases rapidly as we approach the origin x = y = 0. Figure 2.3(a) shows the elevation plot of this function constructed from a grid of 100 by 100 samples. Even though this plot has roughly 10 times more sample points than the one shown in Figure 2.1 for the Gaussian function, the resulting quality is clearly poor close to the origin. There are two solutions to alleviate this problem. First, we can increase the sampling density everywhere on the grid. However, this wastes sample points in the smooth areas, i.e., far away from the origin.
Figure 2.3. Elevation plot for the function
A better solution is to make the sampling density variable, for example, inversely proportional to the distance from the origin. Figure 2.3(b) shows the result of using such a nonuniform sampling density. Although we use no more samples than in the uniform-sampling approach (Figure 2.3(a)), the quality of the result is now visibly much higher in the central area. However, this quality comes at a price. It is no longer possible to determine the sample point positions from the function domain extents and sample counts alone, since the density is nonuniform. Hence, we must explicitly store the sample point positions together with the function samples in the dataset.
As we shall see in Chapter 3, several types of datasets permit the creation of such nonuniformly sampled grids, which offer different trade-offs between the freedom of specification of the sample positions, the storage costs, and the implementation complexity.
In the next sections, we discuss how we actually render, or draw, the height plot constructed from our sampled dataset. For this, we first briefly cover the basics of graphics rendering in this section. Computer graphics is an extensive subject whose theory and practice deserve to be treated as a separate topic, in a separate book. In this section, we shall sketch only the basic techniques used in a simple computer graphics application. Fortunately, these techniques are sufficient to illustrate and even implement a large part of the data-visualization algorithms discussed in this book, starting with our height-plot visualization example. In later chapters, we shall introduce several more advanced computer graphics techniques, such as the use of transparency and textures, when presenting specific visualization methods that require them in their implementation.
Graphics rendering generates computer images of 3D scenes, or datasets. The ingredients of this process are a 3D scene (or set of 3D objects), a set of lights, and a viewpoint. Essentially, the process can be described as the application of a rendering equation at every point of the given dataset. For a given point, the rendering equation describes the relationship between the incoming light, the outgoing light, and the material properties at that point. In general, the rendering equation has a complex form [Foley et al. 95]. Solving the rendering equation computes the outgoing light, or illumination, for every point of a 3D scene, given the scene, light set, and viewpoint.
In practice, several rendering equations are used, which can approximate lighting effects to various degrees of realism. Two known approximations are the radiosity methods, which are good at producing soft shadows [Foley et al. 95, Sillion 94], and ray-tracing methods, which are good at simulating shiny surfaces, mirror-like reflections, and precise shadows [Foley et al. 95, Shirley and Morley 03]. However, both radiosity and ray-tracing methods are relatively expensive to compute, even for simple 3D scenes. The reason behind this is that the rendering equations used by such methods relate the illumination of a given point to the illumination of several, potentially many, other points in the scene. For this reason, such methods are also called global illumination methods. Hence, solving for the complete scene illumination amounts to solving a complex system of per-point rendering equations.
A more efficient approach is to simplify the rendering equation to relate the illumination of a given scene point only, and directly, to the light set. Such approaches are called local illumination methods, since the rendering equation solves for the illumination locally for every scene point. We shall present a local illumination method that is implemented by the rendering back-ends of most visualization systems nowadays, such as the OpenGL library. This method, also known as the Phong lighting method, assumes the scene to be rendered to consist of opaque objects in void space, illuminated by a point-like light source. Hence, the lighting has to be computed only for the points on the surfaces of the objects in the scene. Phong lighting is described by Equation (2.1):
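In a standard formulation, consistent with the description below (the book's exact notation for Equation (2.1) may differ), the model reads

    I = camb + cdiff Il max(−L · n, 0) + cspec Il max(r · v, 0)^α.    (2.1)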
Here, p is the position vector of the surface point whose lighting we compute, n is the surface normal at that point, v is the direction vector from p toward the viewpoint from which we look at the scene, Il is the intensity of the light, L is the direction in which the scene light illuminates p, and r is the reflection vector (see also Figure 2.4):
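For a unit-length normal n, the reflection vector can be written in the standard form (sign conventions may differ from the book's own equation):

    r = L − 2 (L · n) n.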
Figure 2.4. The Phong local lighting model.
Depending on the light source model, L takes different values. For a light source placed infinitely far away from the scene objects, also called a directional light, L is simply a direction vector. For an infinitely small light source located at some point l, equally shining in all directions, also called a point-like light, L is the unit-length vector in the direction p − l.
The lighting model in Equation (2.1) is a linear combination of three components: ambient, diffuse, and specular, whose contributions are described by the weighting coefficients camb, cdiff, and cspec, respectively, which are in the range [0, 1]. Ambient lighting is essentially a constant value. This is a rough estimate of the light reflected at the current point that is due to indirect illumination, i.e., light reflected onto the current point by all other objects in the scene. Diffuse lighting, also known as Lambertian reflection, is proportional to the cosine of the angle between the light direction −L and the surface normal n, or the vector dot product − L · n. This models the equal scattering of incoming light in all directions around the surface normal. Diffuse lighting simulates the appearance of plastic-like, matte surfaces, and does not depend on the viewpoint. Finally, specular, or mirror-like, lighting is proportional to the cosine of the angle between the reflected light direction r and view direction v, raised to a specular power α. This models the scattering of the incoming light in directions close to the perfect mirror reflection direction r. Specular lighting simulates the appearance of shiny, or glossy, surfaces, such as polished metal, and is viewpoint-dependent.
So far, no color has been introduced in the lighting equation (2.1). This equation describes the interaction of light of a given color, or wavelength, with a surface. In other words, the factors camb, cdiff, cspec, and Il are all functions of the color. As we shall see next in Section 2.3, color can be modeled as a set of three intensities R, G, and B, corresponding to the red, green, and blue wavelengths, respectively. Hence, in practice, Equation (2.1) is applied three times, so each of its factors will have three values, one for each of the R, G, and B components.
Ideally, the rendering equation should be applied at every point of every object surface in a given scene. However, as we shall see in the next section, it might be more practical and/or efficient to evaluate the rendering equation only at a few surface points and use faster methods to compute the illumination of the in-between points.
Rendering the datasets discussed in Section 2.1 using the height-plot method relied upon our Quad class, which draws a 3D quadrilateral specified by its four vertices. We can implement this class easily using the OpenGL graphics library, following the basic rendering model presented in Section 2.2. The implementation is shown in Listing 2.3.
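A minimal sketch of such a class is given below; it is an assumption of what Listing 2.3 roughly looks like, not a reproduction of it, and it can be extended with the normal and texture-coordinate methods discussed later in this chapter:

    #include <GL/gl.h>

    // Sketch of a Quad class: stores up to four 3D vertices and renders them
    // as one OpenGL quadrilateral using the fixed-function pipeline.
    class Quad
    {
    public:
        Quad() : numPoints(0) {}

        void addPoint(float x, float y, float z)        // store one vertex
        {
            if (numPoints < 4)
            {
                points[numPoints][0] = x;
                points[numPoints][1] = y;
                points[numPoints][2] = z;
                ++numPoints;
            }
        }

        void draw() const                               // render the quadrilateral
        {
            glBegin(GL_QUADS);
            for (int i = 0; i < numPoints; ++i)
                glVertex3fv(points[i]);
            glEnd();
        }

    private:
        float points[4][3];
        int   numPoints;
    };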
As explained in Section 2.2, OpenGL applies the rendering equation at a subset of the points of a given surface and uses the resulting illumination values to render the complete surface. The simplest, and least expensive, rendering model provided by OpenGL is flat shading. Given a polygonal surface, flat shading applies the lighting model (Equation (2.1)) only once for the complete polygon, e.g., for the center point, and then uses the resulting color to render the whole polygon surface. The polygon normal can be computed automatically by OpenGL from the polygon vertices. The polygon is assumed to be a flat surface, so its normal is constant. This implies a constant shading result following Equation (2.1), hence the name “flat shading.” In OpenGL, the flat shading mode is selected by the function call
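In the fixed-function OpenGL API, this is

    glShadeModel(GL_FLAT);   // one lighting computation per polygon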
Several results of flat-shaded height plots using quadrilaterals implemented by the Quad class were shown in the previous section.
In order to apply Phong lighting (Equation (2.1)), we must specify the ambient, diffuse, and specular coefficients camb, cdiff, and cspec. OpenGL offers a rich set of mechanisms to specify these coefficients, which we shall not detail here. The simplest way to control the appearance of a drawn object is to set its material color. If we use the default white light provided by OpenGL, this is equivalent to setting the diffuse factor cdiff in Equation (2.1). Setting the material color is accomplished by the function call
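That call is

    glColor3f(r, g, b);   // set the current material (diffuse) color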
where the color is specified by the RGB model using three floating-point values r, g, b in the range [0, 1]. The RGB color model is described in detail in Section 3.6.3. Note that the color specified by glColor3f affects all drawing primitives issued after that moment and until a new color specification is issued. Hence, to create the height plot shown in Figure 2.1, it is sufficient to issue a single glColor3f call before all the quads are drawn. The green color shown in the figure corresponds to the setting r = 0.3, g = 0.8, b = 0.37.
Besides specifying the object color, we must specify the light intensity and direction, i.e., the parameters Il and L in Equation (2.1). In OpenGL, lights are specified by a number of function calls. First, OpenGL must be set up to use the Phong lighting equation, by enabling the lighting mechanism. OpenGL supports several light sources (most implementations provide at least eight of them). The next step is to enable one of the light sources, say the first one. These two operations are accomplished by the function calls
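In OpenGL, these calls are

    glEnable(GL_LIGHTING);   // use the Phong lighting equation
    glEnable(GL_LIGHT0);     // turn on the first light source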
After lighting and the desired light(s) are enabled, we can specify the light intensity. As explained in the previous section, OpenGL uses a three-component light model, so the Phong lighting model (Equation (2.1)) is applied three times. Moreover, OpenGL allows us to separately specify the amount of light that interacts with the ambient, diffuse, and specular material components. Putting it all together, the lighting equations used by OpenGL are
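A plausible reconstruction of these equations, consistent with the text that follows (the book's exact form may differ), is

    I^R = c^R_amb I^R_amb + c^R_diff I^R_diff max(−L · n, 0) + c^R_spec I^R_spec max(r · v, 0)^α,

and similarly for the G and B components.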
Here, the superscripts R, G, and B denote the color components. The triplet of ambient light intensities for a given light source is passed to OpenGL by a call that uses the GL_AMBIENT parameter of the light GL_LIGHT0; the diffuse and specular light intensities are specified by similar calls, replacing GL_AMBIENT with GL_DIFFUSE and GL_SPECULAR, respectively, and replacing GL_LIGHT0 with the desired light name.
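In the fixed-function API, these settings are made with glLightfv; for example (the intensity values below are placeholders):

    float amb[4] = { 0.2f, 0.2f, 0.2f, 1.0f };     // ambient light intensity (R, G, B, A)
    glLightfv(GL_LIGHT0, GL_AMBIENT, amb);
    // analogously: glLightfv(GL_LIGHT0, GL_DIFFUSE, diff);
    //              glLightfv(GL_LIGHT0, GL_SPECULAR, spec);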
Finally, we must specify the direction and position of the light (see Section 2.2). This is achieved by the function call
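That call is

    glLightfv(GL_LIGHT0, GL_POSITION, p);   // p is the four-component vector described next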
Here p = (px, py, pz, pw) is a vector of four floating-point values. If the fourth value pw is zero, the light is directional with the direction L = (px, py, pz). If pw is not zero, the light is point-like, its position being given by (px/pw, py/pw, pz/pw).
However attractive, this rendering of the surface approximation has several limitations. Probably the most salient one is the “faceted” surface appearance, due to its approximation by flat-shaded quadrilaterals. This artifact is visible even when we use a relatively densely sampled dataset, such as the one in Figure 2.1. When using flat-shaded polygons, removing this visual artifact completely for a height plot of an arbitrary function implies rendering polygons with a size of one pixel. In our setup, this implies, in turn, using a prohibitively high sampling density. Just to give an impression of the costs, on a screen of 640 × 480 pixels, this strategy would require rendering over a hundred thousand polygons, and computing and storing a dataset of over a hundred thousand sample points, i.e., a memory consumption of a few hundred kilobytes, all for a simple visualization task.
We can easily improve on this by using Gouraud shaded, or smoothly shaded, quads. Gouraud shading assumes the surface represented by the quad to have a non-constant normal—specifically, every quad vertex can have a distinct normal vector. This is a simple approximation of a real smooth surface, whose normal would, in general, vary at every surface point. Gouraud shading applies the lighting model (Equation (2.1)) at every quad vertex, using the respective vertex normal, yielding potentially four different colors. Next, all quad pixels are rendered with a color that smoothly varies between the four vertex colors resulting from the shading, using a technique called interpolation. Interpolation is described in greater detail in Chapter 3. Figure 2.5 shows the result of Gouraud shading applied to the same dataset rendered in Figure 2.1 with flat shading. In OpenGL, the smooth shading mode is selected by the function call
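Analogously to the flat-shading case, the call is

    glShadeModel(GL_SMOOTH);   // interpolate vertex colors across each polygon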
Figure 2.5. Elevation plot for the function f(x, y) = e−(x² + y²) (Gouraud shaded).
To perform Gouraud shading, we must add vertex normal information to our Quad class. This is easily done by adding an extra method to this class.
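Presumably, its signature is along the following lines (the exact name in the book's code may differ):

    void addNormal(float nx, float ny, float nz);   // e.g., forwards to glNormal3f(nx, ny, nz)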
which specifies a normal vector n = (nx, ny, nz) for every quad vertex. In OpenGL, specifying a vertex normal should be done before the specification of the vertex itself. In our case, addNormal() should be called right before addVertex() for that respective vertex.
The next step is to compute the actual vertex normals for our surface. The computing method depends on how the surface is specified. If the surface is specified analytically, e.g., by a function, we can proceed as follows. From analysis, we know that a vector normal to the graph of f , i.e., to our surface, is given by
The vector n is related to another important mathematical concept called the gradient. Given a function f : ℝ2 → ℝ, the gradient of f , denoted ∇f , is the vector
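In standard notation,

    ∇f = (∂f/∂x, ∂f/∂y).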
For functions with more variables, the gradient is defined analogously. Intuitively, the gradient of f is a vector that indicates, at every point in the domain of f , the direction in the domain in which f has the highest increase at that point. We shall encounter numerous applications of the gradient in the following chapters.
In our example, f (x, y) = e−(x² + y²), so the vector v = (2xf, 2yf, 1) has the same direction as the normal, though not the same length. We can thus compute the normal n by normalizing v:
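Presumably, Equation (2.6) is simply this normalization:

    n = v / ‖v‖.    (2.6)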
We can now draw the Gouraud-shaded height plot in Figure 2.5 using the code in Listing 2.4. Here, n is a function that computes the normal n of the original function f at a given point (x, y), i.e., implements Equation (2.6).
The normal computation method described previously works for any function for which we can compute its partial derivatives analytically, as in Equation (2.4). However, as discussed in Section 2.1, our function may not be specified analytically, but as a sampled dataset. Still, in practice, we can perform Gouraud shading of sampled (e.g., polygonal) datasets using a simple technique called normal averaging. For every sample point pi, denote by P1..PN the polygons that have pi as a vertex. We define the vertex normal ni at pi as the average of the normals n(Pj) of these polygons:
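A reconstruction of the averaging formula follows; the resulting vector is typically renormalized to unit length before use:

    ni = (1/N) ∑j=1..N n(Pj).    (2.7)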
Intuitively, Equation (2.7) says that the direction of the vertex normal ni is somewhere between the normals of the polygons Pj that share that vertex. For our height-plot example, the resulting Gouraud shading obtained using vertex normals computed by averaging is practically identical to the one shown in Figure 2.5, which uses analytically computed normals.
Sometimes, however, averaging the polygon normals does not produce good results. The averaging in Equation (2.7) makes every polygon normal n(Pj) contribute equally to ni. However, some surfaces can contain polygons of highly variable sizes in the same neighborhood, e.g., sharing the same vertex. In this case, a more natural shading effect can be obtained by making the influence of a polygon on the vertex normals, and thus on the Gouraud shading of the surface, proportional to the polygon size. This can be easily achieved by using an area-weighted normal averaging, i.e.,
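A plausible form of this weighted average (the book's exact equation may normalize differently) is

    ni = ∑j A(Pj) n(Pj) / ∑j A(Pj),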
where A(Pj) is the area of polygon Pj.
In Section 3.9.1, we shall elaborate further on the connection between normal averaging and data resampling.
Extra realism can be added to rendered 3D models by using a technique called texture mapping. The flat-shaded rendering mode described so far assigns a single color to a polygon. The Gouraud-shaded mode assigns to every polygon pixel a color that is linearly interpolated from the polygon vertex colors. Such techniques are limited in conveying the large amount of small-scale detail present in real-life objects, such as surface ruggedness or fiber structure that are particular to various materials, such as wood, stone, brick, sand, or textiles. Texture mapping is an effective technique that can simulate a wide range of appearances of such materials on the surface of a rendered object. As we shall see later in Chapters 6 and 9, texture mapping is at the core of the implementation of a variety of visualization techniques.
To illustrate the basic principle of texture mapping, let us look at a simple example. Figure 2.6(a) shows a 2D texture—in this case, a checkerboard-like monochrome image. Figure 2.6(b) shows the effect of mapping the texture onto the 3D surface of our Gaussian height plot discussed earlier in this chapter. The texture-mapping process works as follows. First, imagine that the texture image is defined in a 2D coordinate system (s, t). Usually, s and t range between 0 and 1 and span the complete texture image, as shown in the figure. Hence, we can describe a monochrome texture as a function tex(s, t) : [0, 1] × [0, 1] → [0, 1], where the function value gives the luminance, or brightness, of each texture pixel. Next, we define, for every vertex of the rendered polygons, two texture coordinate values s and t. Intuitively, a polygon’s vertex texture coordinates specify a part of the texture image that is next warped to match the size and shape of the polygon and finally is drawn on that polygon. When a polygon is rendered, its vertex texture coordinates are interpolated at every pixel, similarly to the color interpolation performed by Gouraud shading. The pixel is next rendered using a combination of the polygon color and texture color tex(s, t). Putting it all together, we can see the entire texture-mapping procedure as a function T that maps a pixel from the surface of the rendered object to another pixel in the texture image tex.
Figure 2.6. Texture mapping. (a) Texture image. (b) Texture-mapped object.
The simple texture-mapping functionality described here and shown in Figure 2.6 can be implemented in OpenGL by the following steps. First, we must define a 2D texture. The easiest way to do this is to start from a monochrome image stored as an array of unsigned chars, which can be, e.g., read from an image file or defined procedurally. The following code fragment creates an OpenGL texture from the image image that has width × height pixels. First, we enable texture mapping in OpenGL. The parameter GL_TEXTURE_2D specifies that we shall use a two-dimensional texture. Next, we use the glTexImage2D OpenGL function to transfer the width × height bytes of image into the graphics memory and create a corresponding 2D texture. The GL_LUMINANCE parameter specifies that we define a monochrome texture, also called a luminance texture. The glTexParameteri function specifies how to treat texture coordinates s, t that fall outside the range [0, 1], a feature that will be described later in this section. Finally, we use the glTexEnvf function to specify how the polygon color is going to be combined with the texture color. Here, we selected the GL_MODULATE mode, which multiplies the two values to yield the final pixel color. After the texture has been defined, we only have to associate texture coordinates with the vertices of the rendered polygons, and OpenGL will perform the texture-mapping operation automatically. To assign texture coordinates, we add a new method to our Quad class.
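Presumably, the new method has a signature along the following lines (the actual name in the book's code may differ):

    void addTexCoord(float s, float t);   // e.g., forwards to glTexCoord2f(s, t); call it before adding the corresponding vertex

The texture set-up steps just described amount to code along the following lines (a sketch in the spirit of Listing 2.5; line numbers cited further below refer to the book's own listing):

    // image: a width x height array of unsigned chars, as described above
    glEnable(GL_TEXTURE_2D);                                          // enable 2D texturing
    glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width, height, 0,    // create a luminance texture
                 GL_LUMINANCE, GL_UNSIGNED_BYTE, image);              // from the image data
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);     // wrap s coordinates outside [0, 1]
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);     // wrap t coordinates outside [0, 1]
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR); // sample the texture with linear filtering
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);      // multiply texture and polygon colors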
To render the texture-mapped height plot, we now simply add texture coordinates to every rendered polygon with the code fragment in Listing 2.6. Here, we have omitted the addNormal calls present in Listing 2.4 for conciseness. In this example, the texture coordinates are identical to the height plot's x, y coordinates scaled to the range [0, 2]. Since we specified GL_REPEAT as the value for the GL_TEXTURE_WRAP_S and GL_TEXTURE_WRAP_T OpenGL parameters (Listing 2.5, lines 6–7), the texture coordinates that fall outside [0, 1] will be “wrapped” onto this range. All in all, this produces the effect shown in Figure 2.6(b), where the texture seems to be projected from the xy plane to the height-plot surface and replicated twice in every direction, as visible from counting the checkerboard squares in the texture image and the textured height plot.
By defining different texture coordinates, a multitude of texture-mapping effects can be obtained, such as stretching, compressing, or rotating the texture image in a different way for every rendered polygon. The example presented here only briefly sketches the many possibilities offered by texture mapping, such as linear (1D) and volumetric (3D) textures, antialiased texture rendering, color and transparency textures, and multitexturing. A complete discussion of all possibilities of texture mapping is not possible here. For more details, as well as a complete description of the OpenGL texturing machinery and associated application programming interface (API), see the OpenGL reference and developer literature [Shreiner et al. 03, Shreiner 04].
In the rendering examples discussed so far, we have used only fully opaque shapes. In many cases, rendering half-transparent (translucent) shapes can add extra value to a visualization. For instance, in our height-plot running example, we may be interested in seeing both the gridded domain and the height plot in the same image and from any viewpoint. We can achieve this effect by first rendering the grid graphics, followed by rendering the height plot, as described previously, but using half-transparent primitives.
In OpenGL, a large class of transparency-related effects can be achieved by using a special graphics mode called blending. To use blending, we must first enable it using the following function call:
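    glEnable(GL_BLEND);   // the standard OpenGL call for enabling blending (not copied from the book's listing)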
Once blending is enabled, OpenGL will combine the pixels of every drawn primitive, such as polygons or lines, with the current values of the frame buffer (displayed image) at the locations drawn. In OpenGL terminology, the drawn primitive is called the source, whereas the frame buffer the primitive is drawn on is called the destination. Blending is performed independently for every source pixel drawn on a destination. If we denote the colors of the source and destination pixels by src and dst, respectively, the final value dst′ of the destination pixel is given by the expression
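From the description that follows, Equation (2.9) has the form

    dst′ = sf · src + df · dst.    (2.9)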
Equation (2.9) is essentially a weighted combination of the source and destination colors using the source and destination weight factors sf and df respectively, which are both real values in the range [0, 1]. This equation is applied independently for every color component (R, G, and B). The weights sf and df , also called blending factors, can be specified in a variety of ways. In the following, we show how to specify these factors to achieve a simple transparency effect that combines the grid and plot rendering in our height-plot example. The final result is shown in Figure 2.7(c). To obtain this result, we proceed as follows. First, we clear the screen to black. The reason for using black instead of the usual white background will become apparent soon. The frame buffer is cleared by the following sequence of OpenGL function calls:
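    glClearColor(0, 0, 0, 0);        // presumably along these lines: set the clear color to black
    glClear(GL_COLOR_BUFFER_BIT);    // clear the frame buffer (GL_DEPTH_BUFFER_BIT may be OR-ed in as well)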
Figure 2.7. The height plot in (b) is drawn on top of the current screen contents in (a) with additive blending to obtain the half-transparent plot result in (c). A black background is used.
Next, we draw the grid structure. We do not use any blending, since we want the grid itself to be fully opaque. The result is shown in Figure 2.7(a). The final step is to draw the half-transparent height plot on top of this grid. The height plot to be drawn is shown, without the actual transparency, in Figure 2.7(b). For this, we first enable blending, as described previously. Next, we specify the blending factors sf and df . A half-transparent plot is achieved by using a source factor sf < 1. To ensure that, at each pixel, the contribution of the grid and plot objects sums up to one, we use a destination factor df = 1 − sf. This combination is achieved by the OpenGL code
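One way to set these factors, using the mechanisms detailed below (a sketch; the book's code may differ), is

    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);   // sf = alpha, df = 1 - alpha
    glColor4f(0.3f, 0.8f, 0.37f, 0.7f);                  // plot color from Section 2.3, alpha = sf = 0.7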
For the result shown in Figure 2.7(c), the colors r, g, b have the same values as in the example discussed in Section 2.3, and sf has a value of 0.7.
The function glBlendFunc takes two parameters that specify the source and destination blending factors. These parameters are symbolic constants rather than the blending factors themselves, and specify how the actual blending factors sf and df will be computed. The most commonly used values of these constants are as follows. The constant GL_ONE sets the value of the respective blending factor to 1. The constant GL_SRC_ALPHA specifies that the source factor sf will be taken from the fourth component, also called the alpha component, of the color of the drawn primitives. Hence, if we next set the color using glColor4f(r, g, b, sf), sf will be the actual source blending factor used. The constant GL_ONE_MINUS_SRC_ALPHA sets the respective blending factor to 1 − sf, where sf is the alpha value of the drawn primitives, as specified previously.
These parameters allow us to obtain various transparency effects. Using glBlendFunc(GL_SRC_ALPHA, GL_ONE), followed by drawing primitives that have the desired alpha values set via glColor4f, starting with a black frame buffer, effectively adds up the primitives' colors, weighted by their respective alpha values. We can achieve the same effect by using glBlendFunc(GL_ONE, GL_ONE), i.e., setting sf = df = 1, and factoring the source blending factor into the color by using glColor3f(r·sf, g·sf, b·sf). Using glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) realizes the convex combination
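From the description, Equation (2.10) reads

    dst′ = sf · src + (1 − sf) · dst.    (2.10)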
The value of dst′ is always clamped to one by OpenGL, so it is a good idea to use blending factors that will sum up to one when all primitives are drawn. This is achieved automatically if we use the blending factors as set by the convex combination in Equation (2.10).
Finally, there is a fourth constant, GL_ZERO, which sets the respective source or destination blending factor to zero. Hence, calling glBlendFunc(GL_ONE, GL_ZERO) is equivalent to having blending disabled, i.e., the source color will simply overwrite the destination (frame buffer) color.
Figure 2.8 shows the height plot drawn with the same transparency value of 0.7 on top of the domain grid, this time using a white background, obtained by setting glClearColor(1,1,1,0). We leave the construction of the actual OpenGL blending code that achieves this image as an exercise for the reader.
Figure 2.8. The height plot drawn half-transparently on top of the domain grid using a white background.
Besides transparency, blending can be used to obtain many other graphical effects that are useful in visualization applications. Volume visualization, described in Chapter 10, fundamentally relies on blending to display volumetric datasets. Another use of blending is demonstrated in Section 6.6, for the construction of animated flow textures for visualizing vector fields. Yet another application of blending is described in Section 9.4.5, for the computation of distance fields in image data.
All images of our height plot shown so far display the plot as viewed from a certain angle and position in space. To understand how to specify such viewing parameters, we need to understand how OpenGL processes 3D vertex coordinates to create the final image displayed on the screen. This process has several steps, or transforms, which are described next.
The first step in specifying how to view a 3D scene is to specify where from, and in which direction, we want to look at the scene. The easiest way to picture this is to imagine that we have a virtual photo camera that we want to point at the scene in a specific way. In OpenGL, such a camera can be specified by indicating its location (also called the eye position e), a location towards which the camera is pointing (also called the center position c), and a vector u indicating how the camera is rotated around the viewing direction c − e (also called the “up” vector). Figure 2.9 illustrates this. The plane orthogonal to the viewing direction, on which the final image will appear, is also called the view plane.
Figure 2.9. Extrinsic parameters of the OpenGL camera.
In OpenGL, specifying the eye, center, and up vector values for a camera can be done using the gluLookAt function.
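For example (the argument values below are placeholders):

    glMatrixMode(GL_MODELVIEW);     // select the extrinsic (modelview) parameters
    glLoadIdentity();
    gluLookAt(ex, ey, ez,           // eye position e
              cx, cy, cz,           // center position c
              ux, uy, uz);          // up vector u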
The first two function calls (glMatrixMode and glLoadIdentity) ensure that the subsequent gluLookAt call will modify the extrinsic camera parameters of OpenGL, rather than the intrinsic camera parameters, which are discussed in the next section. Extrinsic camera parameters are also called modelview parameters in OpenGL, hence the name GL_MODELVIEW. The up vector u should not be parallel to the viewing direction c − e; otherwise, we cannot unambiguously define the view plane. A simple but effective default setting is to have c equal to the center of the viewed scene, i.e., the average of all vertex coordinates; u equal to the y-axis of the 3D space, i.e., u = (0, 1, 0); and e at a distance from c slightly larger than the size of the scene's bounding box. All three vectors e, c, and u are given in the same coordinate system as the 3D scene to view. These values are also known as the extrinsic camera parameters, since they do not specify how the camera is actually constructed, but only its placement in the 3D space.
After placing the camera in 3D space, we need to specify how the camera itself operates, i.e., its intrinsic parameters. Following the analogy with a physical photo camera, which captures perspective projections of a 3D scene on a planar surface, the intrinsic parameters give the focal length, field-of-view, and aspect ratio of the view area where we wish to capture the image (Figure 2.10). This view area is a rectangle in the view plane whose center lies on the view direction line c − e and whose axes are parallel to u and to (c − e) × u, respectively. The focal length znear gives the distance, along the view direction, from the eye position e to the view plane. No objects closer to e than znear are “seen” by the camera. Hence, the view plane is also called the near clipping plane, as it clips, or removes from viewing, objects in front of it. The field-of-view fov is the angle formed at the eye position e by the top and bottom borders of the view area. The aspect ratio aspect gives the ratio of the horizontal to the vertical size of the view area. Finally, the value zfar gives the distance, along the view direction, from the eye position e to a plane parallel to, and behind, the view plane, beyond which the camera “sees” no objects. This plane is also called the far clipping plane, as it clips, or removes from viewing, objects behind it.
Figure 2.10. Intrinsic parameters of the OpenGL camera.
Given this configuration, the camera will render only objects located inside a pyramid frustum delimited by the view area and its corresponding projection on the far clipping plane. This frustum is drawn in green in Figure 2.10. Specifying the intrinsic camera parameters can be done in OpenGL by using the gluPerspective function.
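For example:

    glMatrixMode(GL_PROJECTION);                // select the intrinsic (projection) parameters
    glLoadIdentity();
    gluPerspective(fov, aspect, znear, zfar);   // field-of-view, aspect ratio, clipping-plane distances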
The first two function calls (glMatrixMode and glLoadIdentity) ensure that the subsequent gluPerspective call will affect the intrinsic, or projection, camera parameters rather than the extrinsic ones, which were discussed in the previous section. The constraints for gluPerspective are that both znear and zfar should be positive values; also, typically znear < zfar. gluPerspective implements the perspective projection explained above. Good default values for fov lie between 45 and 90 degrees. Parallel, or orthographic, projections, which do not make objects farther away from the view plane appear smaller (an effect also called foreshortening), are also supported by OpenGL, by using the gluOrtho2D function.
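That call has the form

    gluOrtho2D(xleft, xright, ybottom, ytop);   // orthographic projection; replaces the gluPerspective call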
The values xleft, xright, ybottom, and ytop define the positions of the four edges of the view area rectangle on the view plane, with respect to the intersection point of the view direction with this plane. This effectively creates a parallelepiped-like view volume instead of a pyramid frustum. Orthographic projections are less frequently used in visualization applications than perspective projections. They are useful in cases when we want equal 3D distances to appear as equal 2D distances on the view plane (no foreshortening).
The projection transform described above effectively specifies how 3D objects inside the view frustum are projected to a 2D rectangle on the view plane. The third and final step of viewing is to map this rectangle to a 2D screen area, called a viewport. This is achieved by the so-called viewport transform. In OpenGL, this transform is specified by the glViewport function.
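That call is

    glViewport(x, y, width, height);   // map the view area to this screen rectangle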
The effect of glViewport is to map the entire view area rectangle to the screen rectangle whose lower-left corner is given by the pixel coordinates x, y and which has dimensions width and height pixels (Figure 2.11). The mapping is linear, i.e., the view area is effectively translated and uniformly stretched or squeezed, along each individual screen axis, to fit the screen rectangle. Typically, given a screen window of W × H pixels, we want to fill the entire window with our drawn scene. Setting x = y = 0, width = W, and height = H fills the entire window with our view area. However, if the aspect ratio aspect (from, e.g., the perspective projection) does not equal W/H, the scene will appear unnaturally stretched or compressed on the screen, as shown by the example in Figure 2.11. In other words, the scaling factors from the view area to the screen window are not equal for the two screen axes. A simple way to fix this is to set aspect to W/H. However, this changes the view area's shape depending on the shape of the screen window, which may clip off interesting portions of our scene. A better way is to compute width and height as the dimensions of the largest rectangle of aspect ratio aspect that fits within the available screen space W × H, and then to compute x and y so that this rectangle is centered within the W × H screen space.
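A minimal sketch of this centering computation (not from the book; glViewport is the only OpenGL call used) could be:

    // Fit the largest viewport of the given aspect ratio inside a W x H window and center it.
    void setCenteredViewport(int W, int H, float aspect)
    {
        int width  = W;
        int height = int(W / aspect);
        if (height > H)                      // too tall: constrain by the window height instead
        {
            height = H;
            width  = int(H * aspect);
        }
        int x = (W - width) / 2;             // center the viewport in the window
        int y = (H - height) / 2;
        glViewport(x, y, width, height);
    }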
Figure 2.11. OpenGL viewport transform from view area on the view plane to a screen area.
We have introduced so far the OpenGL functionality for drawing 3D shapes and specifying the viewing parameters. We next discuss how all these bits are put together to create a full OpenGL application.
The structure of such a minimal application, outlined in Listing 2.7, consists of three main parts: the main function (main, lines 1–10); a function for setting up the viewing parameters (viewing, lines 12–17); and a function for doing the actual drawing (draw, lines 19–31).
The main function takes care of creating a screen window onto which OpenGL will draw our scene. For this, we use the Graphics Utility Toolkit (GLUT), a simple and easy-to-use library that provides a uniform interface between the platform-independent OpenGL library and various so-called windowing systems, such as Windows, OS X, and X Windows. The first step is to initialize GLUT (Listing 2.7, lines 3–4). The second of these lines tells GLUT that we want a screen window using RGB colors (GLUT_RGB), that we will draw 3D shapes for which we want to perform hidden surface removal (GLUT_DEPTH), and that we want to use double buffering for this window (GLUT_DOUBLE). Hidden surface removal and double buffering are explained below. Line 5 specifies the initial size of our upcoming screen window. Line 6 creates the actual window.
We could next proceed by directly adding OpenGL code to set up the viewing parameters and to draw the actual scene. This code is provided by our functions viewing and draw, respectively. However, in a windowing system, users can manipulate such windows, e.g., by minimizing, maximizing, or resizing them. In all such cases, our application needs to redraw the window contents. To do this, GLUT uses an event-based system, in which windows are automatically (re)drawn every time such window changes occur. For this to work, GLUT needs to know which code to execute when a window's size changes and when a window needs to be redrawn, i.e., our viewing and draw functions, respectively. Lines 7 and 8 connect these two functions to GLUT. The final step of main, line 9, passes the application control flow to GLUT: from now on, GLUT will automatically invoke our viewing function whenever our window is resized and, separately, the draw function whenever our window needs to be redrawn.
The viewing function contains all the code required to perform the viewing transforms described in Section 2.6. The first step sets up the extrinsic parameters of the virtual camera (lines 21–23). Of course, to make the code complete, we need to specify actual values for the vectors e, c, and u that define our viewpoint and viewing direction. The second step sets up the projection (lines 25–28). For illustration, we use here a perspective projection that keeps the same aspect ratio between the screen window and the view area. As for the extrinsic camera parameters, we need to specify actual values for the field-of-view fov and the near- and far-plane positions near and far. The third and last step specifies the viewport, which in our case is the entire screen window (line 30). Note that the width W and height H of the screen window are automatically supplied by GLUT as parameters to our function viewing. This way, our code knows the geometry of this window and can adapt the viewing parameters (the view-area aspect ratio aspect and the viewport width and height) accordingly.
The draw function contains all the code that draws the actual scene. GLUT guarantees that this function is always called after viewing has been called upon a window-geometry change. Hence, our viewing parameters are correctly set up. The first step in drawing (line 14) is to clear the window's frame buffer (GL_COLOR_BUFFER_BIT). At the same time, we also clear the so-called depth buffer of our window (GL_DEPTH_BUFFER_BIT). The depth buffer, also called a z-buffer, holds, for each pixel (x, y) of a window, a value indicating the distance from the view plane to the closest pixel of any primitive that covers (x, y). The depth buffer allows OpenGL to draw primitives in any order and still obtain, in the frame buffer, a correct “hiding” of primitives lying farther from the view plane by closer ones, much like the way in which, in reality, we do not see objects lying behind other objects. This process is also known as hidden surface removal. For this, we first need to clear the depth buffer (line 14, GL_DEPTH_BUFFER_BIT), i.e., set its values for all pixels to a value larger than the distance between the view plane and the far clipping plane.
Next, we add our desired scene drawing (line 15). This can be, for instance, the height plot drawing code in Listing 2.4. Besides the actual primitive drawing, this is the place to set the various other parameters of our drawing, such as lighting, textures, and blending. As this code draws its various primitives, OpenGL will only write to screen pixels (x, y) whose depth-buffer values are larger than the depth of the current primitive at (x, y).
Drawing a complex scene containing tens of thousands of primitives is not an instantaneous process, even on modern graphics cards. If the drawing occurs directly on the visible window surface, the user may notice a certain amount of flickering as primitives get incrementally added to the drawing. To remove this problem, we use a technique called double buffering. Two images are kept by OpenGL. The first image, called the onscreen buffer, is the image currently being displayed in the window. The second image, called the offscreen buffer, is the area where the actual drawing occurs. When the drawing is ready, we simply let OpenGL “swap” the onscreen and offscreen buffers, so that the constructed image is displayed at once, thereby removing the flickering. This operation is executed by the glutSwapBuffers call in line 16.
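For reference, a compact sketch in the spirit of Listing 2.7 is given below (line numbers cited in the text refer to the book's own listing, not to this sketch; all viewing values are placeholders):

    #include <GL/glut.h>

    void draw()                                               // display callback
    {
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);   // clear frame and depth buffers
        // ...draw the scene here, e.g., the height plot of Listing 2.4...
        glutSwapBuffers();                                    // show the finished offscreen buffer
    }

    void viewing(int W, int H)                                // reshape callback (window geometry)
    {
        glMatrixMode(GL_MODELVIEW);                           // extrinsic camera parameters
        glLoadIdentity();
        gluLookAt(0, 0, 3,  0, 0, 0,  0, 1, 0);

        glMatrixMode(GL_PROJECTION);                          // intrinsic (projection) parameters
        glLoadIdentity();
        gluPerspective(60, float(W) / H, 0.1, 100);

        glViewport(0, 0, W, H);                               // use the whole window as viewport
    }

    int main(int argc, char* argv[])
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE);
        glutInitWindowSize(500, 500);
        glutCreateWindow("height plot");
        glutReshapeFunc(viewing);                             // called on window-geometry changes
        glutDisplayFunc(draw);                                // called when the window must be redrawn
        glEnable(GL_DEPTH_TEST);                              // enable hidden surface removal
        glutMainLoop();                                       // hand control to GLUT
        return 0;
    }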
Constructing a full-fledged visualization application using OpenGL is more complicated than the basic blueprint shown in the previous section. Issues that need to be addressed include the management of multiple windows, processing input from keyboard and mouse, and integrating user-interface elements such as buttons and sliders. Such issues are beyond the scope of this book. For a more extensive introduction to building OpenGL applications, we refer to the books of [Shirley and Marschner 09, Shreiner et al. 03]. A simple but effective user-interface toolkit that integrates easily with OpenGL and GLUT is the OpenGL User Interface C++ library (GLUI). While less powerful than other user-interface toolkits, GLUI enables the addition of user-interface elements such as buttons, sliders, checkboxes, and menus to an OpenGL application in a portable and quick-to-learn manner.
In this chapter, we introduced the basic structure of the visualization process, using as an example the simple task of visualizing a two-variable real-valued function. The process can be summarized as follows (see also Figure 2.12):
acquire the data of interest into a discrete dataset;
map this dataset to graphics primitives;
render the primitives to obtain the desired image.
Figure 2.12. Visualization process steps for the elevation plot.
The presented example can be, and often is, regarded as a pure graphics application rather than a data-visualization application. However, if we think of the primary goal of this example, conveying insight to the user about the values and variation of a real-valued function defined over a two-dimensional domain, we discover the visualization aspect. Moreover, this simple example allows us to encounter the fundamental concepts and building blocks of the visualization process: datasets, mapping, rendering, and the visualization process, or pipeline. Datasets, the elements used to store and represent data, are discussed further in Chapter 3, together with their main operations: sampling and interpolation. Mapping, the process that produces viewable geometric objects from the abstract datasets, is discussed further in its various guises from Chapter 5 onwards. Rendering, the process that displays a geometric set of objects, has already been discussed in this chapter. Finally, the complete chain of operations and concepts that constitutes the visualization process is discussed in Chapter 4.