WWW To download the web content for this chapter go to the website www.routledge.com/cw/wright, select this book then click on Chapter 8.
The march of technology and the relentless pursuit of lower production costs have stoked the need for 3D compositing as the industry discovers that doing things in 2D is often faster and cheaper than doing them in 3D. Professional compositing software now contains a built-in “3D department” to support the many special demands of 3D compositing. Visual effects artists must be proficient in this powerful new tool, but it does raise the stakes for the artist’s knowledge. Not only do you have to know your stuff for keying greenscreen shots and multipass CGI, but you now have to have one toe in the 3D world and become more comfortable with math.
To assist folks who have little exposure to 3D terms and notions, the first section of this chapter is a short course in 3D. The focus, of course, is on those 3D issues that pertain to 3D compositing. The following section on 3D compositing applies those concepts to setting up various common types of 3D compositing shots. We will see the great and powerful camera-tracking technology used for matchmove shots, the many uses of camera projection, how to set up a pan and tile shot, plus how set extensions are done. The chapter ends with an exploration of Alembic geometry, a key new technology for 3D compositing.
3D compositing has a great many terms and concepts that will be foreign to a straight-up 2D compositor, which can make learning it difficult. In this section we pause for a short course in 3D to build up the vocabulary a compositor will need to take on 3D compositing. Again, 3D is a huge subject, so we will only be covering those topics that are relevant to 3D compositing.
Figure 8.1
3D coordinate system
Most 2D compositors already understand 3D coordinate systems but it would be good to do a quick review to lock in a few key concepts and establish some naming conventions. Figure 8.1 shows a typical 3D coordinate system used by most 3D programs. Y is vertical, with positive Y being up. X is horizontal, with positive X being to the right. Z is towards and away from the screen with positive Z being towards the screen and negative Z being away.
Any point in 3D space (often referred to as “world space”) is located using three floating-point numbers that represent its position in X, Y, and Z. The white dot in Figure 8.1 is located at 0.58 (in X), 1.05 (in Y), and –0.51 (in Z), and is written as a triplet like this: (0.58, 1.05, –0.51).
Figure 8.2 Perspective and orthogonal views
The 3D viewing part of the compositing system will present two types of views of the 3D world (Figure 8.2): perspective and orthogonal, which means “perpendicular”. The perspective view is as seen through a camera viewer, while the orthogonal views can be thought of as “straight on” to the 3D geometry with no camera perspective. These “ortho” views are essential for precisely lining things up. The orthogonal view will further offer three versions: the top, front, and side views. If you have ever taken a drafting class in school this will all be very familiar. If you are the purely artistic type then this will all be quite unnatural, having no perspective at all.
Figure 8.3
Vertices and polygons
Vertices (singular “vertex”) are the 3D points that define the surface of the geometry. In fact, 3D geometry is a list of vertices plus a description of how they are connected – i.e. vertex numbers 1, 2, and 3 are connected to form a triangle, or a “polygon”. Figure 8.3 shows a series of vertices and how they are connected to create polygons. Polygons are joined together to form surfaces, and surfaces are joined together to form objects.
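To make the vertex-and-connectivity idea concrete, here is a minimal sketch in Python – illustrative only, not any real file format – of how a mesh could be stored as a list of vertices plus a list of polygons that index into it:

```python
# A minimal sketch of 3D geometry as data: a list of vertices plus a list of
# polygons, where each polygon is just the indices of the vertices it connects.
# Real formats (OBJ, Alembic, etc.) add far more information than this.

vertices = [
    (0.0, 0.0, 0.0),   # vertex 0: X, Y, Z in world space
    (1.0, 0.0, 0.0),   # vertex 1
    (1.0, 1.0, 0.0),   # vertex 2
    (0.0, 1.0, 0.0),   # vertex 3
]

polygons = [
    (0, 1, 2),         # triangle connecting vertices 0, 1 and 2
    (0, 2, 3),         # a second triangle; together they form a square surface
]
```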
Figure 8.4
The mesh for a 3D hand
When the vertices are connected to define a 3D object the resulting geometry is referred to as a polygonal mesh, or just a mesh. This is the basic form in which complex compound curved objects such as characters or cars are represented in 3D systems such as Maya. In the 3D system the mesh is attached to an articulated “rig”, which is essentially a jointed skeleton. The 3D animator moves the skeleton and the skeleton moves the mesh.
These meshes can be “baked out” of the 3D system then imported into 3D compositing software. When baked out they lose their “skeletons” and can no longer be articulated but they can be resized and repositioned and even retimed – if working with Alembic geometry. More on this later.
Surface normals are also key players in rendering CGI and in 2D compositing. Rendering is the process of calculating the output image based on the 3D geometry, surface attributes, camera, and lights. The brightness of a surface depends on its angle to the light source, so the surface normals are a necessary part of the calculation that determines brightness. We saw a great use for surface normals in Chapter 7: Compositing CGI, where they were used to completely relight a CGI render from the 2D images. Consider Figure 8.5, the flat shaded surface. Each polygon has been rendered as a simple flat shaded element, but the key point here is how each polygon is a slightly different brightness compared to its neighbor. Each polygon’s brightness is calculated based on the angle of its surface relative to the light source. If the surface faces the light directly it is bright, but if it is turned away at an angle it is rendered darker. It is the surface normal that is used to calculate this angle to the light, so it is a fundamental player in the appearance of all 3D objects.
Figure 8.5 Flat shaded surface
Figure 8.6
Surface normal
Figure 8.7
Smooth shaded surface
The sphere in Figure 8.6 is displayed with its surface normals visible. A surface normal is a vector, or little “arrow”, with its base in the center of the polygon face and pointing out perpendicular to its surface. The technical term when one object is perpendicular to another is to say that it is “normal” to that surface. So the surface normals are vectors that are perpendicular to the flat face of each polygon and are essential to computing its brightness. If the surface normals are smoothly interpolated across the surface of the geometry we can simulate a smooth shaded surface like the one shown in Figure 8.7. Note that this is really a rendering trick because the surface looks smooth and continuous when it is not. In fact, if you look closely you can still see the short straight segments of the sphere’s polygons around its outside edge.
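For readers who like to see the math, here is a minimal Python sketch of the brightness calculation described above, assuming a simple Lambert (cosine) model; real renderers add many refinements on top of this:

```python
import numpy as np

def diffuse_brightness(normal, light_dir):
    """Brightness of a polygon from the angle between its surface normal
    and the direction toward the light (a simple Lambert cosine model)."""
    n = np.asarray(normal, dtype=float)
    l = np.asarray(light_dir, dtype=float)
    n = n / np.linalg.norm(n)              # normalize to unit length
    l = l / np.linalg.norm(l)
    # Cosine of the angle between them; clamp so faces turned away go to zero.
    return max(float(np.dot(n, l)), 0.0)

# A polygon facing the light directly is fully bright...
print(diffuse_brightness((0, 1, 0), (0, 1, 0)))         # 1.0
# ...one tilted 60 degrees away renders at half brightness.
print(diffuse_brightness((0, 1, 0), (0.866, 0.5, 0)))   # ~0.5
```

Interpolating the normals across the surface before doing this calculation is exactly the smooth-shading trick shown in Figure 8.7.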
Figure 8.8
A surface normal
Figure 8.9
Surface normals in 3-channel image
In the world of compositing, the surface normal vectors are turned into 2D images that we can use for image processing. How this is done is illustrated starting in Figure 8.8, which shows a surface normal (white arrow). The base is at the center of a polygon and the arrow tip indicates in what direction it points relative to its base. The tip’s direction can be quantified by 3 numbers indicated by the X, Y and Z arrows, which measure the tip’s positional offset relative to its base. We can take these three numbers for each pixel and place them in a 3-channel image, like the example in Figure 8.9, such that the X data is stored in the red channel, Y in the green channel, and Z in blue. In other words, we have an “image” that does not contain a picture but rather data about the picture. We now have the 3D vector information shown in Figure 8.6 converted to the 2D form that we can use in compositing for a multitude of purposes.
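As a rough illustration of that packing, the sketch below assumes unit-length normals and a simple 0–1 remap for display purposes (a float format such as EXR could store the raw –1 to 1 values directly); it places X, Y, and Z into the red, green, and blue channels:

```python
import numpy as np

def normals_to_image(normals):
    """Pack per-pixel surface normals (X, Y, Z) into a 3-channel 'image'.
    normals: array of shape (height, width, 3) holding unit-length vectors.
    Here the -1..1 range is remapped to 0..1 so it can be viewed as color."""
    return (normals + 1.0) * 0.5     # X -> red, Y -> green, Z -> blue

# A pixel whose surface points straight at the camera (0, 0, 1)
# becomes the color (0.5, 0.5, 1.0) -- the familiar blue-ish normals pass.
pixel = normals_to_image(np.array([[[0.0, 0.0, 1.0]]]))
print(pixel)   # [[[0.5 0.5 1. ]]]
```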
Texture mapping is the process of wrapping pictures around 3D geometry like the example in Figure 8.10 and Figure 8.11. The name is a bit odd, however. If you actually want to add a texture to a 3D object (such as tree bark) you use another image called a bump map. So bump maps add texture, and texture maps add pictures. Oh well.
Figure 8.10
Texture map
Figure 8.11
Texture map projected onto geometry
Actually, there are many different kinds of maps that can be applied to 3D geometry. A reflection map puts reflections on a surface. A reflectance map will create shiny and dull regions on the surface that will show more or less of the reflection, and so on. But we are not here to talk about the different kinds of maps that can be applied to geometry, for they are numerous and varied. We are here to talk about how they are all applied to the geometry, for that is a universal concept and what UV coordinates are all about.
The UV coordinates determine how a texture map is fitted onto the geometry. We could not use X and Y for this because those letters are already taken by the 3D coordinate system, so the u coordinate is the horizontal axis and v is the vertical axis of the image. Each vertex in the geometry is assigned a UV coordinate that links that vertex to a specific point in the texture map. Since there is some distance between the vertices in the geometry, the remaining parts of the texture map are estimated or “interpolated” to fill in between the actual vertices. Changing the UV coordinates of the geometry will reposition the texture map on its surface. There are many different methods of assigning the UV coordinates to the geometry, and this process is called “map projection”.
Map projection is the process of fitting the texture map to geometry by assigning UV coordinates to its vertices. It is called “projection” because the texture map starts out as a flat 2D image and to wrap it around the geometry requires it to be “projected” into 3D space. There are a wide variety of ways to project a texture map onto 3D geometry and the key is to select a projection method that suits the shape of the geometry. Some common projections are planar, spherical, and cylindrical.
Figure 8.12 Planar projection
Figure 8.13
Spherical projection
Figure 8.14
Cylindrical projection
Figure 8.12 shows a planar projection of the texture map onto a sphere. You can think of a planar projection as walking face-first into a sticky sheet of Saran Wrap (yikes!). As you can see by the final textured sphere this projection method does not fit well so it stretches and streaks parts of the texture map. A planar projection would be the right choice if you were projecting a texture to the flat front of a building, for example.
Figure 8.13 shows a more promising approach to the sphere called spherical projection. You can think of this as stretching the texture map into a sphere around the geometry then shrink-wrapping it down in all directions. The key here is that the projection method matches the shape of the geometry.
Figure 8.14 shows a cylindrical projection of the texture map onto a sphere. Clearly a cylindrical projection was designed to project a texture map around cylinders, such as a can of soup, not a sphere. But this example demonstrates how the texture map is distorted when the projection method does not match the shape of the geometry.
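To show what a projection actually computes, here is a minimal Python sketch of one common convention for spherical projection; the exact axes and seam placement vary from package to package, so treat this as an illustration rather than any particular program’s formula:

```python
import math

def spherical_uv(x, y, z):
    """Assign a UV coordinate to a vertex by spherical projection.
    (x, y, z) is the vertex position relative to the projection center
    (assumed non-zero). One common convention; others differ."""
    r = math.sqrt(x * x + y * y + z * z)
    u = 0.5 + math.atan2(x, z) / (2.0 * math.pi)   # angle around the vertical axis
    v = 0.5 + math.asin(y / r) / math.pi           # angle up from the equator
    return u, v

# A vertex on the sphere's equator facing +Z maps to the middle of the texture.
print(spherical_uv(0.0, 0.0, 1.0))   # (0.5, 0.5)
```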
Simple geometric shapes can have their maps projected by simple rules, but what about a complex compound curved surface like the wireframe character in Figure 8.15? No simple rule will do. For this kind of complex object a technique called UV projection is used. The texture map is painted in a dedicated 3D paint program designed for this type of work, resulting in the confusing texture map in Figure 8.16. But the 3D geometry knows where all the bits go, and the results are shown in Figure 8.17.
Figure 8.15 Wireframe geometry
Figure 8.16
Texture map
Figure 8.17
Texture map applied to geometry
To understand the confusing texture map in Figure 8.16 we need to understand how the texture map pixels get located onto the 3D geometry. As mentioned above, the texture map has horizontal coordinates labeled “u” (or “U”) and vertical coordinates labeled “v” (or “V”). Any pixel in the texture map can then be located by its UV coordinates. Unsurprisingly there are no firm industry standards, but many systems assign the UV coordinate 0,0 to the lower left corner and 1,1 to the upper right corner of the texture map. So a pixel in the dead center would have UV coordinates of 0.5, 0.5. The basic idea is somehow to assign the UV coordinate of 0.5, 0.5 to a particular vertex of the 3D geometry. This is the job of the 3D texture-painting program.
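A tiny sketch of that UV-to-pixel lookup, assuming the lower-left 0,0 convention mentioned above (your package may use a different one):

```python
def uv_to_pixel(u, v, width, height):
    """Convert a UV coordinate into a pixel position in the texture map,
    assuming (0,0) at the lower-left corner and (1,1) at the upper-right.
    The V axis is flipped because image rows usually count down from the top."""
    x = u * (width - 1)
    y = (1.0 - v) * (height - 1)
    return x, y

# The dead-center UV of (0.5, 0.5) lands in the middle of a 2048x1024 map.
print(uv_to_pixel(0.5, 0.5, 2048, 1024))   # (1023.5, 511.5)
```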
Figure 8.18
The awesome Mari 3D texture painting program
Figure 8.18 is a screen grab of Mari, a very powerful 3D texture-painting program by The Foundry, the makers of Nuke. The 3D geometry is loaded into Mari and the artist paints directly on the geometry. As the paint is applied Mari knows what vertex is under the paint brush, as well as the UV coordinate of the paint pixels it is generating, so it is building the UV list for the geometry as it is painted. The geometry is saved out of Mari with its freshly assigned UV coordinates and the texture map is saved out as an image file like EXR. While the appearance of the texture map image to humans is a bit confusing (see Figure 8.16) to the computer it makes perfect sense. Thanks to the 3D texture-painting program the 3D system knows exactly where to place every pixel of the texture map.
Generally speaking, we do not create the 3D geometry when doing 3D compositing. After all, we have to leave something for the 3D department to do, right? However, we often do need to use simple 3D shapes to project our texture maps so 3D compositing programs come with a small library of “geometric primitives”, which are simple 3D shapes. Figure 8.19 shows a lineup of the usual suspects – a plane, a cube, a sphere, and a cylinder. More complex 3D objects such as the character in Figure 8.15 are modeled in the 3D department then saved out in a file format that can then be read into the 3D compositing program.
Figure 8.19
Geometric primitives
Figure 8.20
Solid, textured, and wireframe displays
Figure 8.20 shows how the geometry may also be displayed in three different ways: solid, where the geometry is rendered with simple lighting only; textured, with texture maps applied; and wireframe. The reason for three different presentations is system response time. 3D is computationally expensive (slow to compute), and if you have a million-polygon database (not that unusual!) then you can wait a long time for the screen to update when you make a change. In wireframe mode the screen will update the quickest. If your workstation has the power or the database is not too large you might switch the 3D viewer to solid display. And of course, the textured display is slowest of all, but also the coolest.
2D images can be transformed in a variety of ways – translate, rotate, scale, etc. – which is explored in detail in Chapter 14: Transforms and Tracking. Of course, 3D objects can be transformed in the same ways. However, in the 2D world the transformation of an image is rarely dependent on or related to the transformation of other images. In the 3D world the transformation of one piece of geometry is very often hierarchically linked to the transformation of another. And it can quickly get quite complex and confusing.
There is one item that would be unfamiliar to a 2D compositor and that is the “null object”. A null object, or axis, is a 3-dimensional pivot point that you can create and assign to 3D objects, including lights and cameras. In 2D you can shift the pivot point of an image, but you cannot create a new pivot point in isolation. In 3D it is done all the time.
Figure 8.21
Axes for the Earth and Moon added for animation
A typical example can be seen in the animation shown in the three frames of Figure 8.21. The Earth must both rotate and orbit around the Sun. The rotation is easily achieved by rotating the sphere around its own pivot point. However, to achieve the orbit around the Sun it needs a separate axis (null object) located at the Sun’s center. The Moon has exactly the same requirement, so it too needs a separate axis to orbit around.
WWW Geometric transformations.mov – view the entire animation used for Figure 8.21 and see the axis in action for the Sun, Moon, and Earth.
One thing you will be asked to do to your 3D geometry is to deform it. Maybe you need an ellipsoid rather than a sphere. Deform the sphere. Maybe the 3D geometry you loaded does not quite fit the live action correctly. Deform it to fit better. There are a great many ways to deform geometry, and they vary from program to program. But here are three fundamental techniques that will likely be in any 3D compositing program you encounter.
Image displacement is definitely one of the coolest tricks in all of 3D-dom (displacement just means to shift something’s position). Simply take an image (see Figure 8.22), lay it down on top of a flat card that has been subdivided into many polygons and command the image to displace (shift) the vertices vertically based on the brightness of the image at each vertex. The result looks like the displaced geometry in Figure 8.22. The bright parts of the image have lifted the vertices up and the dark parts dropped them down resulting in a contoured piece of terrain. Just add a moving camera and render to taste.
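Here is a minimal Python sketch of the idea, assuming a grayscale image and one vertex per pixel purely for simplicity; a real displacement node samples the image at each vertex of whatever mesh you give it:

```python
import numpy as np

def displace_card(image, scale=1.0):
    """Minimal sketch of image displacement: treat each pixel of a grayscale
    image as a vertex on a subdivided flat card and lift it in Y by the image
    brightness. Returns an array of (X, Y, Z) vertex positions."""
    height, width = image.shape
    verts = np.zeros((height, width, 3))
    xs, zs = np.meshgrid(np.arange(width), np.arange(height))
    verts[..., 0] = xs                    # X across the card
    verts[..., 2] = zs                    # Z along the card
    verts[..., 1] = image * scale         # bright pixels lift vertices up
    return verts

# A tiny 2x2 "image": the bright pixel (1.0) rises, the dark ones stay low.
terrain = displace_card(np.array([[0.0, 1.0], [0.2, 0.4]]), scale=10.0)
print(terrain[0, 1])   # [ 1. 10.  0.]
```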
Figure 8.22
Image displacement example
Figure 8.23 shows three frames out of an animation sequence rendered from Figure 8.22. If the camera were simply to slide over the original flat image that is exactly what it would look like – a flat image moving in front of the camera. With image displacement, the variation in height creates parallax and it appears as real terrain with depth variations. You can see what I mean with the movie provided on the website below. And it is very photo-realistic because it uses a photograph. Talk about cheap thrills. Besides terrain, this technique can be used to make clouds, water, and many other 3D effects.
Figure 8.23 Frames from camera flyover displaced terrain
In this example the same image was used for both the displacement map and the texture map. Sometimes a separate image is painted for the displacement map to control it more precisely, and the photograph is used only as the texture map.
WWW Image displacement.mov – this video shows the really cool terrain fly-over created by the simple image displacement shown in Figure 8.23.
Figure 8.24
Geometry displaced with noise
Instead of using an image to displace geometry we can also use “noise” – mathematically generated noise or turbulence. In our 2D world we use 2D noise so in the 3D world we have 3D noise. There also tends to be a greater selection of the types of noise, as well as a greater sophistication of the controls.
Figure 8.24 illustrates a flat piece of subdivided geometry that has been displaced in Y with a robust Perlin noise function. This can be a very efficient (cheap) way to generate acres of terrain. These noise and turbulence functions can also be animated to make clouds or water – perhaps water seen from a great distance.
WWW Noise displacement.mov – this video is an animation showing the use of noise displacement shown in Figure 8.24. Caution – it’s a bit freaky.
The third and final geometric deformation example is the deformation lattice. The concept is to place the geometry inside of a bounding box called a “deformation lattice”, then deform the lattice. The geometry inside the lattice is deformed accordingly. We can see an example starting with the original geometry in Figure 8.25. The deformation lattice is shown surrounding the geometry in Figure 8.26. Note that some programs have more subdivisions and control points than the simple example shown here. The artist then moves the control points at the corners of the lattice and the geometry inside is deformed proportionally as shown in Figure 8.27.
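The sketch below illustrates the core of what a deformation lattice does, assuming the simplest possible lattice of just eight corner control points and trilinear interpolation; production lattices add more subdivisions and smoother interpolation:

```python
import numpy as np

def lattice_deform(point, corners):
    """Minimal sketch of an 8-corner deformation lattice. 'point' is given in
    the lattice's local 0..1 space; 'corners' maps each corner (i, j, k), with
    i, j, k in {0, 1}, to its new, user-moved position. The point is deformed
    by trilinear interpolation of the corners -- the essence of what the
    lattice does to every vertex of the enclosed geometry."""
    u, v, w = point
    result = np.zeros(3)
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                weight = ((u if i else 1 - u) *
                          (v if j else 1 - v) *
                          (w if k else 1 - w))
                result += weight * np.asarray(corners[(i, j, k)], dtype=float)
    return result

# An undeformed unit-cube lattice returns the point unchanged...
unit = {(i, j, k): (i, j, k) for i in (0, 1) for j in (0, 1) for k in (0, 1)}
print(lattice_deform((0.5, 0.5, 0.5), unit))       # [0.5 0.5 0.5]
# ...moving one corner drags nearby geometry toward it.
unit[(1, 1, 1)] = (2.0, 2.0, 2.0)
print(lattice_deform((0.5, 0.5, 0.5), unit))       # [0.625 0.625 0.625]
```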
Figure 8.25
Original geometry
Figure 8.26
Deformation lattice added
Figure 8.27
Deformed geometry
While this example shows an extreme deformation in order to illustrate the process, the main use in the real world of 3D compositing is to make small overall adjustments to the shape of the geometry when it doesn’t quite fit right in the scene.
WWW Deformation lattice.mov – this video shows the deformation lattice from Figure 8.27 in action.
Point clouds are another form of 3D geometry that consist of… a cloud of points. The idea is that we want to represent a 3D object or scene but we don’t have a polygonal model to see a smooth shaded or texture mapped version of it. They are a far lighter load for the computer and are easily moved around the screen in real-time. But how are they made and what are they used for?
Figure 8.28
Live action clip with camera move
Point clouds come from two sources. Most commonly they are produced by a 3D camera tracker: after the scene has been tracked, the artist asks for a point cloud to be generated that represents the surfaces in the scene (see Section 8.4: Camera Tracking). The second source is Lidar (Light Detection And Ranging), which uses a laser to scan a scene and collect location data for millions of points. Figure 8.28 shows a live action clip that was camera tracked to produce the colored point cloud in Figure 8.29.
Figure 8.29
Point cloud from same POV
Figure 8.30
Point cloud with camera
The point cloud is being viewed through the solved camera so it is seen from the same POV as the original live action clip. Figure 8.30 shows the point cloud and solved camera from a different POV. What’s important to note here is what is missing. There are “voids” in the point cloud caused by the fact that the camera didn’t get a good enough look at those areas to calculate points for them. For camera tracking to produce a 3D point the camera has to be moving and view that point from several different angles over several frames. Points with insufficient coverage are discarded, which leaves the voids.
As for what point clouds are good for, they are used primarily to line up 3D elements for a 3D compositing scene. Figure 8.29, for example, could be used to place a 3D object on the floor, such as a character walking across the room. Most high-end compositing programs have a built-in camera tracker so compositors can generate their own point clouds and “solved cameras” for their 3D compositing shots (the reverse-engineered camera and lens is referred to as the “solved camera”).
WWW Point cloud.mov – this movie will give you an orbiting camera view of the point cloud shown in Figure 8.29. You can see how this would be very handy for placing 3D geometry to photograph with the “solved camera”.
3D lights are one of the key features common to all 3D compositing programs. They can be positioned and pointed in 3D space, animated, adjusted in brightness, and even tinted with color. Many even have a falloff setting so the light will get dimmer with distance. Just like the real world.
Figure 8.31
Point light
Figure 8.32
Spotlight
Figure 8.33
Parallel light
Here are the three basic types of lights you are sure to encounter in a 3D compositing program. Figure 8.31 shows a point light, which is like a single bare bulb hanging in the middle of our little 3D “set”. Unlike the real world, 3D lights themselves are never seen because the program does not render them as visible objects. It only renders the effect of the light they emit. Figure 8.32 is a spotlight that is parked off to the right with its adjustable cone aimed at the set. Figure 8.33 is a parallel light that simulates the parallel light rays from a light source at a great distance, like the Sun, streaming in from the right.
Note the lack of shadows. Casting shadows is a computationally expensive feature and is typically reserved for real 3D software. Note also that the computer simulated “light rays” pass right through the 3D objects and will illuminate things that should not be lit, such as the inside of a mouth.
Because 3D lights are virtual and mathematical you can get clever and darken the inside of the mouth by hanging a light inside of it, then give it a negative brightness so it “radiates darkness”. This is perfectly legal in the virtual world of 3D.
After the 3D geometry is created the type of surface it represents must be defined. Is it a hard shiny surface like plastic? Is it a dull matte surface like a pool table? In fact, the surface of 3D geometry can have multiple layers of attributes, and these are created by shaders. Shaders are a hugely complex subject manned by towering specialists whose sole job is to create and refine them. They are called shader writers and are highly paid specialists in the world of CGI.
The shaders in a 3D compositing program will be simple and require no programming, but you will have to adjust their parameters to get realistic surface attributes on your geometry. Here we will take a look at three basic shaders – ambient, diffuse, and specular. Each example shows three intensity levels from dark to bright to help show the effect of the shader. Note that these intensity changes are achieved by increasing the shader values, not by increasing the light source. In the virtual world of CGI you can increase the brightness of objects by either making the light brighter or dialing up the shader. Or both.
Figure 8.34
Ambient shader
Figure 8.35
Diffuse shader
Starting with the ambient shader in Figure 8.34, it simply causes the 3D object to emit light from every pore and is unaffected by any lights. It can be used in two ways. First, it is used to add a low level of ambient light to prevent unlit areas of the geometry from rendering as zero black. Second, by cranking an ambient shader way up it becomes a self-illuminated object, such as the taillight of a car. However, it does not cast any light on other objects. It just gets brighter itself.
Figure 8.35 shows the effect of a diffuse shader, which definitely responds to lights. If you increase the brightness of the light the geometry will get brighter. The diffuse shader darkens the polygons as they turn away from the light source. It uses the surface normals we saw earlier to determine how perpendicular the polygon is to the light source, and then calculates its brightness accordingly.
Figure 8.36
Ambient plus diffuse
Figure 8.37
Specular added
One key concept about shaders is that they can be summed together to build up surface complexity. Figure 8.36 shows the combination of the ambient plus the diffuse shaders at three intensity levels. Figure 8.37 shows the middle sample from Figure 8.36 with three different intensities of a specular shader added to make it look progressively shinier. Again, there are no changes in the light source, just the shader.
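As a rough illustration of summing shaders, the Python sketch below adds an ambient term, a Lambert diffuse term, and a simple Phong-style specular term; the dial values stand in for the shader settings described above and are purely illustrative, not any particular program’s shader:

```python
import numpy as np

def shade(normal, light_dir, view_dir,
          ambient=0.1, diffuse=0.7, specular=0.4, shininess=20):
    """Sum of three simple shaders: ambient (constant), diffuse (Lambert,
    from the surface normal and light direction) and specular (a Phong-style
    highlight using the view direction). Raising the dials brightens the
    surface without touching the light source."""
    n = np.asarray(normal, float); n = n / np.linalg.norm(n)
    l = np.asarray(light_dir, float); l = l / np.linalg.norm(l)
    v = np.asarray(view_dir, float); v = v / np.linalg.norm(v)

    diff = max(float(np.dot(n, l)), 0.0)            # darkens as it turns from the light
    r = 2.0 * np.dot(n, l) * n - l                  # reflected light direction
    spec = max(float(np.dot(r, v)), 0.0) ** shininess   # tight, shiny highlight

    return ambient + diffuse * diff + specular * spec

# Raising 'specular' makes the surface look shinier with the same light.
print(shade((0, 1, 0), (0, 1, 0), (0, 1, 0), specular=0.0))  # 0.8 (ambient + diffuse)
print(shade((0, 1, 0), (0, 1, 0), (0, 1, 0), specular=0.4))  # 1.2
```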
Reflection mapping is an amazing photo-realistic 3D lighting technique that does not use any 3D lights at all. Instead, the entire scene is illuminated with a special photograph taken on the set or on location. This photograph is then (conceptually) wrapped around the entire 3D scene as if on a bubble at infinite distance, then the reflection map is seen on any reflective surface. This is not light projected into the 3D scene, it is just reflections on shiny objects.
Figure 8.38
A light probe image
Figure 8.39
Scene illuminated with reflection map only
Figure 8.38 shows a light probe image, that “special photograph” mentioned above, taken on location – in this case the interior of a cathedral. Figure 8.39 is our little 3D set with some degree of reflectivity enabled for all the geometry. No 3D lights were used. It is as if these objects were built in a model shop then set down on the floor in the actual cathedral.
There are two specific requirements to make the bizarre photograph in Figure 8.38 work as an environment light. The first is that it must be a High Dynamic Range (HDR) image, and the second is that it is a 360-degree photograph made with a “light probe”, a highly reflective chrome sphere. Of course, it needs to be a reasonably high-resolution image as well. The light probe is placed at the center of the location then photographed so the camera sees the entire environment as if it were in the center of the room photographing in all directions at once – minus the scene content obscured by the light probe itself.
This particular image captures the lights in the room as well as the windows with sunlight streaming in at their true brightness levels so that they can then “shine” on the 3D objects in the scene. Of course, the image is badly distorted spherically, but this is trivial for a computer to unwrap back into the original scene – which is exactly what is done as part of the reflection mapping process.
WWW Reflection map.mov – check out this video showing the light probe in Figure 8.38 as a reflection map around a rotating chrome bust. There are no lights, just a reflection map.
You may recall from Section 8.1.12: Lights that we had a little problem with shadows – namely that there weren’t any. That’s because computing shadows takes more work – a lot more work – than simply calculating how bright each polygon is. Note that the scenes in Figure 8.31 to Figure 8.33 are rendered as though the light were passing right through the objects and falling onto the surfaces behind them. That’s because it is. The simple scan-line render algorithm used here just looks at each surface in isolation relative to its light source. If we want to improve the accuracy of the render to reflect more correctly what real light rays would do in a real scene we have to up our game and switch to ray tracing.
Figure 8.40
Ray tracing is required for reflections and refractions
Consider the complex image in Figure 8.40. To render images like this, light rays are traced throughout the scene including all the reflections and refractions. Surprisingly, the ray-tracing process starts at the viewing screen and traces light rays backwards into the scene. For each pixel, a ray is traced into the scene until it hits its first surface, then depending on the nature of that surface, it may bounce away and out of the picture or bounce to another object or even refract through some glass and strike something in the background. Some will stop cold when they hit a non-reflective opaque surface thus casting a shadow behind the object. So ray tracing is how high quality shadows, reflections and refractions are generated. However, there are some cheats to save computing time that will generate pretty good shadows without the expense of a ray tracer, which you may find in your 3D compositing software.
The thing to appreciate about ray tracing is how incredibly expensive it is. When rendering a 2K frame, for example, we start with the fact that there are over two million pixels at the viewing screen, so we will have over two million rays to trace. In reality multiple rays are cast per pixel to avoid aliasing, but that’s another story. The point is these millions of rays are bouncing all over the scene, interacting multiple times with multiple surfaces, so the entire process entails billions, sometimes trillions of calculations. But it sure looks nice.
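A quick back-of-the-envelope calculation makes the cost concrete; the sample and bounce counts below are arbitrary illustrations, not settings from any particular renderer:

```python
# Rough arithmetic on why ray tracing is expensive, using the 2K example
# from the text. All counts here are illustrative assumptions.
width, height = 2048, 1080           # one flavor of a "2K" frame
samples_per_pixel = 16               # multiple rays per pixel to avoid aliasing
max_bounces = 4                      # each ray may bounce several times

primary_rays = width * height * samples_per_pixel
worst_case_segments = primary_rays * max_bounces

print(f"{width * height:,} pixels")                  # 2,211,840 pixels
print(f"{primary_rays:,} primary rays")              # 35,389,440 rays
print(f"up to {worst_case_segments:,} ray segments to intersect with the scene")
```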
Figure 8.41
Photorealism with image-based lighting
The latest trend in CGI rendering techniques is Image-Based Lighting (IBL), which produces astonishingly photorealistic images like the one in Figure 8.41. IBL is actually an extension of the reflection mapping we saw in Section 8.1.14: Reflection Mapping above, which used a light probe to capture a 360-degree HDR image of the scene and then used that to generate reflections off of shiny objects. Only in this case the image is actually used as the light source for the scene. The beauty of this approach is its photorealism, but of course, it too is computationally expensive.
Figure 8.42
Light probe
Figure 8.43
Exposure lowered
Figure 8.42 shows the type of light-probe photo used to illuminate scenes like Figure 8.41, which in this case just happens to be a photo of a kitchen. The light probe need not really be from the actual scene in question just as long as it has useful light sources. The room content can be masked off so that just the light sources are used to illuminate the scene, and the light probe can be rotated around to position the light source wherever needed. Beyond being an HDR image it must also be very high resolution, perhaps 10k or better. Note that the light probe image is not a very pretty picture. It’s not intended to be. It is intended to be a data capture of the scene illumination all the way up to the brightest light sources, hence the HDR. Figure 8.43 illustrates the enormous dynamic range in the photo by knocking the exposure way down so the bright light sources become visible. You can see the ceiling kitchen light and the windows with sky outside. These will become the actual light sources for illuminating the 3D objects in a scene. IBL is often used when a 3D scene needs to be modeled on a real location. The crew will take their light probe and HDR camera to the original location to capture the local lighting, then use that to light the 3D model of the scene. Very photorealistic.
After the geometry, texture maps, and lights are all in place, the next thing we will need to photograph our 3D scene is a camera. 3D cameras work virtually like the real thing (sorry about that “virtually”) with an aperture, focal length (zoom), position in 3D space, and orientation (rotation in X, Y, Z). And of course, all of these parameters can be animated over time. Figure 8.44 illustrates a time-lapse exposure of a moving 3D camera in a scene. A very powerful feature of 3D cameras is that not only can the compositor animate them, but camera data from either the 3D department or the matchmove department can be imported into the compositing software so that our 3D compositing camera exactly matches the cameras in the 3D department or the live action world. This is an immensely important feature and is at the heart of much 3D compositing work.
Figure 8.44
Animated 3D camera
Figure 8.45
Camera projection
An important difference between 3D cameras and real-world cameras is that the 3D camera does not introduce lens distortion to its images but live action cameras with real lenses do. This adds a whole layer of issues when mixing live action with 3D scenes, whether it is in the 3D department or the compositing department. There is a detailed treatment of this important topic in Chapter 11: Camera Effects.
One totally cool trick a 3D camera can do that a real one cannot is a technique called “camera projection”. In this application the camera is turned into a virtual slide projector then pointed at any 3D geometry. Any image in the camera is then projected onto the geometry like the illustration in Figure 8.45. If a movie clip (a scene) is placed in the camera then the clip can be projected onto the geometry. Camera projection is a bit like walking around your office with a running film projector shining it on various objects in the room. It is a very powerful and oft-used technique in 3D compositing and we will see a compelling example of its use shortly.
3D compositing refers to compositing 3D elements and live action clips in a true 3D environment. It may also include importing 3D geometry from the CGI department modeled in a 3D package such as Maya and applying texture maps and lighting to render it. So, modern visual effects compositing software will have a real “3D department” incorporated into it. To be proficient in this important branch of compositing you also need some background in 3D itself, which is why we had the short course in 3D above. Along with having a basic understanding of 3D, the compositor is also now expected to have a stronger math and science background than before. In this section we will explore what 3D compositing is and how it affects the visual effects pipeline and the job of the compositor.
Figure 8.46
Original 2D render
Figure 8.47
Position pass AOV
We start our foray into the wonderful world of 3D compositing with a jaw-dropping example to drive the point home – creating a 3D version of a CGI character just from its 2D renders. We can then use the 3D version to line up any 3D geometry we want to add to the character. All we need is the original 2D render (Figure 8.46) and its position pass (Figure 8.47) plus a magical tool in the compositing software that will make a point cloud and texture map it from these two images. This example is from Nuke, but your software may or may not have such a wondrous tool – yet.
The jaw-dropping starts with the 3D point cloud version of the 2D render in Figure 8.48, showing the point cloud in 3D space from the same camera angle as the original 2D render. Some 3D geometry has been sprinkled around the point cloud to help illustrate the 3D-ness of the setup. The side view in Figure 8.49 starts to show how the 3D point cloud does not capture the entire 3D character. This is because it is made from the position pass, which only contains the pixels that were seen by the camera, so the surfaces around the back are not captured and are missing from the point cloud. Figure 8.50 goes three-quarters of the way around the character so you can really see the missing bits. Even though the point cloud cannot include the back side, in the real world there is usually plenty of 3D information to line up your other 3D elements. And you don’t have to load in the original 3D geometry and texture maps, so it is really “light” on the system.
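Conceptually, the “magical tool” only has to read the position pass and the beauty render side by side; the sketch below shows the principle, not the implementation of any particular node (such as Nuke’s PositionToPoints):

```python
import numpy as np

def points_from_position_pass(position_pass, beauty, alpha):
    """Minimal sketch of building a point cloud from a 2D render: the
    position pass stores the world-space X, Y, Z of the surface seen at each
    pixel, and the beauty render supplies that point's display color. Pixels
    with no alpha (empty background) are skipped."""
    covered = alpha > 0.0
    points = position_pass[covered]        # (N, 3) world positions
    colors = beauty[covered]               # (N, 3) colors for the 3D viewer
    return points, colors

# One covered pixel and one empty pixel -> a one-point "cloud".
pos   = np.array([[[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]])
rgb   = np.array([[[0.8, 0.6, 0.4], [0.0, 0.0, 0.0]]])
alpha = np.array([[1.0, 0.0]])
print(points_from_position_pass(pos, rgb, alpha))
```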
Figure 8.48
Front view
Figure 8.49
Side view
Figure 8.50
¾ rear view
WWW Point cloud orbit.mov – this is a movie of the orbiting camera used to prepare the illustrations in Figure 8.48, so you can see how the point cloud appears in a 3D environment with the missing back side.
The pan and tile shot is a frequently used technique to replace the far background in a live action plate. Let’s say you have just filmed the hot new XZ-9000 sports car with lots of dynamic camera moves. The car and local terrain around the car look great, but the far background is a boring trailer park or bleak desert. It must be replaced with something sexy, but the live action camera is flying around wildly. How can we possibly sync the movement of the replacement background with the live action camera?
In the past we gave the live action plate to the matchmove department where they used powerful camera tracking software to reverse engineer the 3-dimensional position of the moving live action camera over the length of the shot. But today’s professional compositing software comes with camera tracking tools so the compositor now gets to do his own camera tracking.
Figure 8.51 3D scene with tiles
Figure 8.52
Camera view of tiles
This camera information is given to the compositor who then sets up the 3D scene with a 3D camera using the camera track data and some tiles (flat cards) that have the background scene projected on them as shown in Figure 8.51. As the camera moves through the 3D scene it pans across the tiles and re-photographs the pictures on them. One frame of the camera’s view of the tiles is shown in Figure 8.52.
Figure 8.53
Live action plate
Figure 8.54
Composited shot
The sky (or whatever is in the far background) of the original live action plate, shown in Figure 8.53, is keyed out so the XZ-9000 and the foreground can be composited over the pan and tile background. The finished composite is shown in Figure 8.54. As a result of the camera-tracking data used to animate the 3D camera, the background moves in perfect sync with the original live action camera. This we call a matchmove shot and it is an elegant example of 3D compositing.
WWW Pan n tile.mov – this is a little low res movie of the finished comp shown in Figure 8.54.
In the earlier part of this chapter we saw how camera projection works. Here we will see how it is used. What we will see is how an amazingly photorealistic 3D scene can be quickly (and cheaply) made using just a couple of photographs – and an all-powerful 3D compositing program, of course.
Figure 8.55
Animation clip created with camera projection
Figure 8.55 shows the animation clip that was produced using camera projection. The camera starts low and to the left of the towers then swings around the front and climbs high near their tops. A beautiful sky flows past the towers from behind. Let’s take a look at how incredibly simple this shot was to build.
WWW Camera projection.mov – this is a low res movie of the animation clip shown in Figure 8.55.
Figure 8.56
Elements of a camera projection shot
The original building and sky photos in Figure 8.56 are the only elements it took to make this shot. The prepped artwork was created by painting a clean black background around the buildings to make it easier to see the alignment of the artwork on the geometry. Any small misalignment would reveal a black edge that cues the artist to tighten up the alignment.
Figure 8.57 3D setup for camera projection
Figure 8.57 shows the 3D setup for the shot. Two cubes were placed in the scene and scaled vertically to an appropriate height to receive the projected building artwork. The sky photo was texture mapped to a short section of the inside of a cylinder. Note that there are two cameras. The white one projected the prepped artwork onto the cubes while the green one moved around the scene and photographed it. 3D lights were not even needed because the lighting was captured in the original photographs. And it is all wonderfully photo-realistic simply because it was made from real photos.
An important point to keep in mind about camera projection shots is their limitations. You have to keep the camera on the “good side” of the scene at all times. If the camera were to swing too far around the cubes then their un-textured backsides would show. For this reason camera projection shots work best with moderate camera moves.
Multiplane shots break an image into separate layers (usually done in a paint program). The separated layers are then projected onto 3D cards or other simple geometry, and the 3D scene is re-photographed with a moving camera. Why a moving camera? Because if you don’t move the camera then you don’t need to set up a 3D multiplane shot. You could just do a 2D move on the image. Figure 8.58 illustrates the separated layers of the original image for a multiplane shot.
Figure 8.58 Image broken into separate layers for a multiplane shot
Figure 8.59 is the first frame of the multiplane render and shows how closely it matches the original image. Figure 8.60 shows the actual 3D setup. Each image layer was camera projected onto a 3D card that was deformed to roughly approximate the size, distance and shape of the three layers from the camera. The red camera is the projecting camera and the green camera is the 3D rendering camera.
Figure 8.59
Multiplane shot render
Figure 8.60
Multiplane shot 3D setup
These types of multiplane shots are very common and account for a high percentage of the 3D compositing for visual effects. The brilliant thing about them is that they are a very cheap render, usually taking only a few minutes per shot, plus they are photorealistic because they start with a photo. Of course, lots of multiplane shots are done with digital matte paintings as well. The addition of the camera move with all of the correct parallax and motion makes this class of shot very convincing and still inexpensive to do.
Set extensions are another staple of 3D compositing. The basic idea is to avoid building huge, expensive sets for huge, expensive visual effects shots. Instead, the live action is photographed with the characters on a small, relatively inexpensive portion of the final set. This live action plate then goes to camera tracking, then the camera track data goes to the 3D department for modeling the set. Sometimes the rendered CGI set comes to the 2D department for compositing, but other times the 3D database and camera-move data will be delivered. The compositor will then import the 3D database and render the CGI in the 3D compositing program before doing the composite.
Figure 8.61
Live action plate for set extension
Figure 8.61 shows a few key frames of the live action plate filmed for a set extension shot. The camera starts in tight, then pulls back to reveal the edges of the live action set, then continues back and up to show the framing that will later be composited over a 3D castle. This plate is camera-tracked to derive the camera move so the 3D department can match it. The region surrounding the set has been keyed out and an alpha channel added so that the layer is prepped and ready for compositing. It could be keyed out by surrounding it with a bluescreen or it could be rotoscoped.
Figure 8.62
3D setup for set extension
Figure 8.62 shows the 3D compositing setup for the set extension. The 3D geometry and the backdrop for the far background are set up and ready to be photographed. The camera is in the lower left corner (the green wireframe) and will be animated using the camera track data from the live action plate (Figure 8.61). This ensures that the 3D camera replicates the motion of the live action camera so all of the pieces fit together perfectly. Of course, the pieces never fit together perfectly, so one of the key responsibilities of the 3D compositor is to fix things. Being at the end of the production pipeline it falls to us to fix everything that went wrong anywhere else in the production process. Think of it as job security.
Figure 8.63 Set extension animation clip
Figure 8.63 shows a few frames from the final set extension animation clip which you can view in its entirety from the book website (below). At the beginning of the clip the camera is so close to the characters and the live set that they fill the frame. As the camera pulls back and tilts up the live action set shrinks and is extended with the 3D set. It all works because of the camera tracking. Without it none of this would be possible.
WWW Set extension.mov – this is a complete rendered movie of the set extension animation clip illustrated in Figure 8.63.
There is another common technique for adding 3D backgrounds to either live action or CGI elements that uses a method similar to pan and tile. Consider the CGI jet clip rendered in Figure 8.64. The 3D department has modeled the jet, added materials and surface attributes, lights, and a camera. The jet was “flown” from right to left and filmed with an animated camera. The mission for the 3D compositing team is to add a stunning panoramic background behind the jet animation that moves in perfect sync with the original 3D camera.
Figure 8.64
CGI jet animation render
Figure 8.65 shows the stunning panoramic background plate that will be photographed behind the CGI jet while maintaining the exact same camera move that was “baked” into the jet animation. The 3D department has provided the camera animation data from their 3D program. This is the same type of data that might come from the matchmove department after camera tracking. The camera data is entered into the 3D camera of the compositing program so its camera moves in perfect sync with the one in the 3D department.
Figure 8.65 Stunning panoramic background plate
Figure 8.66
3D scene setup
The 3D scene setup is shown in Figure 8.66. The background plate is simply texture mapped onto the inside of a section of a cylinder. The camera pans across the background plate, perfectly replicating the original camera move used in the 3D department for the jet. Notice again that there are no lights. The 3D compositing program renders the background plate from the camera’s point of view, which is then composited behind the CGI jet animation.
While the pan and tile method has a background broken into several separate tiles, this technique uses a single large image. The textured cylinder section is then positioned to fit into the camera’s field of view for the entire shot.
Care must be taken when applying the texture map to the cylinder to not distort it by stretching it in X or Y. Accidentally stretching the image is very easy to do and will annoy the nice client. If you are given multiple separate images for a pan and tile shot it would be much simpler to merge them together into a single panoramic image like this so you can use the inside of a cylinder rather than a series of cards. However, using the pan and tile cards will introduce a bit less distortion. Your call.
Figure 8.67
The finished composite
With the camera data loaded into the 3D camera and the panoramic background in position we are ready to render and composite the finished shot to get the results shown in Figure 8.67.
WWW 3D background.mov – this is the complete low res video of the finished composite shown in Figure 8.67.
3D compositing will continue to expand in importance to visual effects production for the foreseeable future for the simple reason that it is a more efficient and cost-effective way to produce visual effects. Becoming proficient in 3D compositing will be critical to maintaining job security in the future visual effects industry.
Alembic geometry is a 3D scene interchange file format that is rapidly replacing the FBX interchange format. Alembic is a file format for exchanging 3D scene information between different platforms and applications in a very compact and efficient form. 3D apps like Maya can write out their 3D scene content in the Alembic format, which can then be read in by any other app that supports Alembic. Beyond that, Alembic stores 3D geometry in a format that is very efficient with a small file size that is quick to read and unpack. But wait – there’s more! It also allows you to load just the frames you want or just the bits and pieces you want. You don’t have to load the entire database for the whole robot if you just want to deal with the head.
Figure 8.68
Character mesh
Alembic distills the scene to meshes with their vertices (also called a point cache) that can be easily read in by the next application. It is perfectly analogous to lighting and rendering scenes out to an image file that can then be displayed by any application that reads that image file format. A 3D RenderMan scene can only be loaded and viewed in RenderMan, but the rendered EXR file can be loaded and viewed by Nuke, Photoshop, After Effects, or any other app that reads EXR files.
The way Alembic makes it easy to exchange 3D scenes is that it “bakes out” just the vertices of the geometry so you have the results of complex procedural animation without the complex procedures themselves. So a deforming rigged character (Figure 8.68) is baked out to just the character’s mesh for each frame, but without the rig. Included along with the mesh can be materials, lights, cameras, axes, and other nifty bits of the 3D scene.
If the geometry has no deformations, just simple transformations, then only one instance of the geometry need be saved along with the transform information. Very compact. This approach introduces great efficiencies and time-savers in the CGI production pipeline. It creates an asset-based parallel CGI pipeline so that multiple groups can be working on the same scene elements at the same time. It also makes it far simpler to pass work between groups (or facilities), eliminating the need to convert data between platforms.
Figure 8.69
3D geometry with transformations
Figure 8.69 illustrates the Alembic representation of a scene in its simplest form. The top part of the figure shows the shot we want to make: a bouncing soccer ball. The bottom part illustrates the Alembic representation of that scene: just one mesh for the ball plus the transforms required to make it bounce. This is a very compact form for the scene.
Note that the transforms are in a verbose frame-by-frame format rather than just a few keyframes defining a spline path of motion. The reason for this is that every application would draw the spline somewhat differently but verbose frame-by-frame transformations are easily defined and exquisitely repeatable. The geometry and animation will be identical regardless of which app uses it.
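The sketch below captures the “one mesh plus baked transforms” idea from Figure 8.69 in plain Python data, with made-up frame values and no real Alembic API calls; it is only meant to show how compact the representation is:

```python
# One mesh stored a single time, plus a verbose per-frame list of transforms:
# the baked result of whatever animation produced the bounce. No splines,
# no rig, no procedures. All values below are made up for illustration.

ball_mesh = {
    "vertices": [...],        # stored once -- the sphere itself never deforms
    "polygons": [...],
}

baked_transforms = {          # frame number -> (X, Y, Z) translation
    1: (0.0, 0.0, 0.0),
    2: (2.0, 3.5, 0.0),
    3: (4.0, 5.0, 0.0),
    4: (6.0, 3.5, 0.0),
    5: (8.0, 0.0, 0.0),       # ...and so on for every frame of the shot
}

def ball_position(frame):
    """Every application reading the file gets identical motion because the
    samples are explicit -- nothing is left to interpretation."""
    return baked_transforms[frame]
```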
One of the other key features of the Alembic format is illustrated in Figure 8.70, which is the ability to just load selected parts of the database. This dramatically lowers data transfer and load times as well as render times. The wireframe model on the left shows the entire elegant structure we are working with. However, if we don’t need the middle cylinder we can just deselect it and it is never fetched, loaded, or rendered. How this is done is illustrated using Nuke’s implementation called a “Scenegraph”. Each object is listed in the “inventory list” and you just disable those components you don’t want at a particular moment. In this example the second wireframe model is missing the cylinder because it was disabled in the Nuke Scenegraph.
Figure 8.70
Selecting geometry with the Scenegraph
In the real world of production, where you are dealing with more complex 3D structures than my little example here, the gains are huge. Let’s say you are working with an articulated robot. The Scenegraph will display a hierarchical list of all the fiddly bits of your robot so that you could, for example, turn on just the head, which would include the eyes, mouth, and death ray. Or you could click off the head and just turn on the eyes, or just the left eye.
The key here is that you only pay for what you actually use in memory space, file-transfer time, and render time. This is not true of the FBX file format, the other, older 3D scene file interchange format. With FBX you must load the entire database even if you want to deal with just one part. You would then turn off the unneeded parts in the application after having suffered the overhead of loading them all in. Better to not load them in the first place. Further, because FBX supports more sophisticated representations of the 3D scene there are interpretation differences between applications that introduce compatibility problems that Alembic does not have.
Another cool attribute of the Alembic format is subframe support for motion blur. Let’s say the geometry is in one location on frame 3 and another on frame 4. You could, for example, load the geometry on frames 3.2, 3.4, 3.6, and 3.8 to render motion blur. The Alembic format will return these in-between subframe positions automatically without intervention from the application program.
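Conceptually, a subframe position is just an in-between value derived from the stored samples; the sketch below illustrates the idea with simple linear interpolation on made-up values, which is not necessarily how any given Alembic reader does it:

```python
import math

def subframe_sample(samples, time):
    """Conceptual sketch of subframe sampling for motion blur: given one
    stored value per whole frame, return an in-between value by linear
    interpolation between the surrounding frames."""
    lo = math.floor(time)
    hi = math.ceil(time)
    if lo == hi:
        return samples[lo]
    t = (time - lo) / (hi - lo)
    return samples[lo] * (1.0 - t) + samples[hi] * t

positions = {3: 10.0, 4: 20.0}                 # geometry X on frames 3 and 4
for t in (3.2, 3.4, 3.6, 3.8):                 # the motion-blur subframes
    print(t, subframe_sample(positions, t))    # 12.0, 14.0, 16.0, 18.0
```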
Camera tracking is the basis of a 3D matchmove, and, of all the amazing things that computers can do with images, to me this is the most amazing of all. When a CGI element needs to be added to a live action scene it must be rendered frame by frame from exactly the same camera angle and with the same lens as the live action scene. 3D reference points from the live action scene are also needed to help line up the 3D geometry to be added. All of this is made possible by camera tracking. Here is how it works.
The computer can analyze a clip and reverse engineer not only the camera location and its lens distortion throughout the shot, but also a set of points in 3D space that locate landmarks in the scene. These 3D points are called the “point cloud” and the reverse-engineered camera and lens is called the “solved camera”. With a point cloud and a solved camera, 3D objects can be accurately positioned within a live action scene then rendered with a matching camera move. Without them they cannot.
Figure 8.71
Triangulating on a point
Camera tracking works by triangulation. Referring to Figure 8.71, as the camera moves through the scene it “sees” the corner of the cube from four different angles. Using a bit of trigonometry, that point can be located in 3D space. The built-in camera models figure out both the lens and camera location, then triangulation reveals where the point is in 3D space – relative to the camera. It cannot locate objects in world space without being told by the operator where the ground is and other reference points. Camera tracking is the bridge between the 2D and 3D worlds of visual effects and proceeds in three steps – tracking, calculating the solve, and building the scene.
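For the mathematically curious, here is a minimal Python sketch of the triangulation step in isolation, assuming the camera positions and viewing directions are already known; a real solver estimates those at the same time as the points, which is what makes the solve so expensive:

```python
import numpy as np

def triangulate(origins, directions):
    """Given the camera position on several frames (ray origins) and the
    viewing direction toward the same feature on each frame, find the 3D
    point closest to all of the rays in a least-squares sense."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = np.asarray(d, float)
        d = d / np.linalg.norm(d)
        m = np.eye(3) - np.outer(d, d)     # projects onto the plane normal to the ray
        A += m
        b += m @ np.asarray(o, float)
    return np.linalg.solve(A, b)

# Two camera positions looking at the point (0, 0, 5) from different angles.
origins    = [(-2.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
directions = [(2.0, 0.0, 5.0), (-2.0, 0.0, 5.0)]
print(triangulate(origins, directions))    # ~[0. 0. 5.]
```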
Figure 8.72
Original clip
The first step in the camera-tracking workflow is feature tracking. The 3D tracker will look for hundreds of features in the scene to lock onto, then track them over the length of the clip. The tracking phase has but one purpose – to collect hundreds of feature tracks that the 3D tracker can later use to calculate its solve. Better results are achieved if the clip has the lens distortion removed (flattened) first. This topic is covered in detail in Chapter 11: Camera Effects. The original clip in Figure 8.72 is a simple interior shot with a nice camera move and has already been flattened for tracking. Here are some tips about the tracking phase.
Study the clip – loop the clip in your viewer to study it. You are looking for problems in the clip that will spoof (confuse) the tracker and give you bad feature tracks. Moving objects such as a car or person will usually not spoof 3D trackers, as they are looking for static objects so are designed to reject moving ones. When in doubt, mask out anything that might spoof the tracker. Things like reflections in glass windows or the surface of water – because reflections shift with the camera move and we need fixed objects to track. Mask out the sky, even with clouds. The reason is that they are at a great distance, and features very far away from the camera do not help the solve. We will see why shortly.
Pre-process the clip – sometimes the feature trackers cannot get a good lock on the large number of features required, so pre-processing the clip can assist the tracker. Grain is usually not a problem for 3D trackers because there are so many feature trackers that the grain averages out – unless the clip is hideously grainy. The tracker actually makes a luminance version of the clip internally to track on, so you might pre-build a tracking version of the clip that the tracker will like better. Most of the picture detail is in the red and green channels and most of the noise is in the blue, so you could make a one-channel image that is the sum of just the red and green channels. Increasing the contrast may also help the tracker. A small sketch of such a tracking plate appears after these tips.
Set tracking parameters – all 3D trackers have settings of one kind or another to be dialed in by the artist to optimize the tracking phase for each clip. There may be settings to change the number of trackers, sensitivity to noise in the clip, the type of camera motion it should be looking for, lens information (if you have it), and others, depending on the software.
Figure 8.73 Clip with feature trackers
Start tracking – the tracker will first blast hundreds of feature trackers onto the screen like the example in Figure 8.73. It places the feature trackers all around the screen based on its internal rules, as modified by the tracking parameters that you have set. One important requirement is that the feature trackers cover the entire screen fairly uniformly and not be clumped in one or two areas. The tracker then steps through the clip frame-by-frame, moving the feature trackers to keep them locked onto their targets. Some of the trackers will break lock and terminate. New ones will be spawned to replace them to keep the feature tracker count constant – again, using the internal rules plus your parameter settings. The tracking phase can take several minutes.
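As promised above, here is a minimal sketch of building a tracker-friendly plate from the red and green channels with a contrast boost. It assumes the frame has already been loaded as a floating-point numpy array, and the gain and pivot values are just illustrations to be tuned per clip.

```python
# Minimal sketch: build a one-channel "tracking plate" from a float RGB frame.
# Assumes `frame` is a numpy array of shape (height, width, 3) in linear light.
import numpy as np

def tracking_plate(frame, contrast=1.5, pivot=0.18):
    """Sum red and green (where most detail lives), boost contrast, clip."""
    luma = frame[..., 0] + frame[..., 1]          # skip the noisy blue channel
    luma *= 0.5                                   # keep values roughly in range
    boosted = (luma - pivot) * contrast + pivot   # simple contrast gain about a pivot
    return np.clip(boosted, 0.0, 1.0)
```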
After the tracking is completed the next step is "the solve". This is where the camera tracker analyzes all of the tracking data collected and computes the lens, the camera position frame-by-frame, and the point cloud. This step is computationally expensive, so it can take several minutes. To assist the solve, certain data should be collected at the location during principal photography, such as what lens was used, notes on lighting, and key measurements between selected landmarks. However, most of the time you will just be handed the clip and will have to fend for yourself.
The error – the solver compares the tracking data to internal camera motion and lens models to calculate an overall error number so that you can tell whether you have a valid solve. This is where the fun begins – trying to figure out why you have a bad solve so you can correct it, then re-solve to get a lower error number. The problem is that it can be very difficult to figure out what is spoofing the tracker. Trackers are not very good at telling you what ails them.
Cull bad points – some of the feature trackers have generated bad data that is confusing the solver, so those bad trackers need to be identified and eliminated. The trick is knowing which ones to eliminate. The solver calculates an error number for each tracker, so eliminating the high-error trackers is a good start (a minimal culling sketch follows). However, there can also be trackers in the mix with reasonable errors that still contribute to the confusion. This is another reason to mask off problematic areas of the clip before tracking.
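Here is a hedged sketch of that culling idea. Every tracker's API is different, so this simply assumes each feature track has been exported as a dictionary with an error field; the threshold value is purely illustrative.

```python
# Minimal sketch: cull feature tracks whose per-track error is an outlier.
# Assumes tracks were exported as dicts like {"name": "track_042", "error": 1.7};
# real applications expose this data through their own nodes or APIs.
def cull_tracks(tracks, max_error=2.0):
    """Split tracks into keepers and rejects based on an error threshold."""
    keep = [t for t in tracks if t["error"] <= max_error]
    reject = [t for t in tracks if t["error"] > max_error]
    return keep, reject

tracks = [{"name": "track_001", "error": 0.4},
          {"name": "track_002", "error": 5.8},   # likely locked onto a moving object
          {"name": "track_003", "error": 1.1}]
keep, reject = cull_tracks(tracks, max_error=2.0)
print([t["name"] for t in reject])   # ['track_002']
```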
Figure 8.74
Point cloud and camera
Once the solve error has been refined to within acceptable margins the scene can be built. This is the step where the solver produces the solved lens and camera and the 3D point cloud, as illustrated in Figure 8.74 using the Nuke CameraTracker. The point cloud is colored based on the original image to assist in 3D lineup. The camera motion is shown for the first, middle, and last frames as green wireframes. Note the missing areas of the point cloud. These are due to occluded areas of the scene that the camera could not see (see the original clip in Figure 8.72) and therefore could not model. Once the scene is built, here are the next steps:
Setting the ground plane – the tracker has no idea which way is up or where the ground is, so the initial scene build will start at a default orientation that is incorrect. Keeping in mind that the entire purpose of the camera tracking is to place a 3D object within the scene, that job will be a lot harder if the ground is tilted at an odd angle. You must carefully select points from the point cloud that you are certain are on the ground plane, then tell the tracker to re-orient the entire scene accordingly (a sketch of this re-orientation, along with re-scaling, follows these steps). If your ground plane is not level you will suffer greatly trying to position your 3D objects.
Setting the scale – the tracker also does not know the proper scale for your scene, so it will start with some default size that you will have to correct. However, if you are tracking the scene for your own purposes, such as tracking a card with an image into the scene, then you might be able to leave the default scale. If you are importing 3D geometry then you will have to correct the scale of either your 3D scene or the 3D geometry. Assuming that you will be getting more than one piece of 3D geometry from the CGI department, you should rescale your scene to match the geometry so that subsequent deliveries will also line up.
Incorporating location data – in some situations the solved scene must be scaled to exactly match the original location where the scene was shot. This can be done a couple of ways. The big way is to do a location survey that collects data about the location's dimensions during principal photography. This can be done with Lidar, a laser-based point-mapping technology, or with photogrammetry, the science of modeling a scene from a series of photographs. The small way is to take a few measurements at the location with a tape measure. Let's say it is 10 feet between this rock and that tree. Those two objects are identified in the point cloud, then the scene is scaled until they are 10 scale feet apart.
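To make the re-orientation and scaling steps concrete, here is a small numpy sketch under some loud assumptions: the ground points are hand-picked from the point cloud, the ground normal comes from a simple SVD plane fit (one of several reasonable choices), and the 10-foot rock-to-tree measurement is the one from the example above. Real applications do this for you through their scene-build tools; this only shows the arithmetic.

```python
# Minimal sketch: level the ground plane and scale the scene to a measurement.
# Assumes hand-picked ground points and two landmark points from the point cloud.
import numpy as np

def ground_rotation(ground_points):
    """Rotation that maps the fitted ground normal onto +Y (world up)."""
    pts = np.asarray(ground_points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the plane normal.
    normal = np.linalg.svd(centered)[2][-1]
    if normal[1] < 0:                        # make the normal point upward
        normal = -normal
    up = np.array([0.0, 1.0, 0.0])
    v = np.cross(normal, up)
    c = float(np.dot(normal, up))
    if np.linalg.norm(v) < 1e-8:             # ground is already level
        return np.eye(3)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues-style formula for the rotation taking `normal` to `up`.
    return np.eye(3) + vx + vx @ vx * (1.0 / (1.0 + c))

def scene_scale(point_a, point_b, measured_distance):
    """Scale factor so two point-cloud landmarks end up the measured distance apart."""
    return measured_distance / np.linalg.norm(np.asarray(point_a) - np.asarray(point_b))

# The rock and the tree from the example: 10 feet apart on location.
scale = scene_scale([0.2, 0.0, 1.4], [2.6, 0.1, 0.9], measured_distance=10.0)
```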
A strange but important point must be made here. The 3D tracker can create a 3D point for an object in the scene even if that object moves out of frame during the camera move. A feature tracker need only lock onto the spot for several frames to provide enough data for the solver to calculate its location. This means that there are many points in the point cloud that are outside of the camera’s field of view at different times during the shot. This also means that those 3D points can be used to track objects such as a monitor even if it goes completely out of frame, as long as it was in frame long enough to generate its points for the point cloud. Bottom line – a camera tracker can also be used to track objects that go completely out of frame. Try that with your point tracker. To be fair, a planar tracker can track an object partially out of frame, but not completely out of frame.
Figure 8.75
Wireframe in point cloud
With a good camera solve and a correctly scaled and oriented point cloud we are ready to position the 3D geometry in the scene. Using the point cloud as a positioning reference, the 3D geometry is placed in the scene while viewing it in the 3D world, like the character example in Figure 8.75. The 3D lights are set up to match the lighting in the live action scene, the material properties are assigned to the geometry, then the character is rendered using the solved camera. The CGI character will be rendered frame-by-frame with the exact same moving camera as the live action. With some good lighting and color correction the CGI character will fit right into the original live action plate as if it had been photographed there, like the example in Figure 8.76.
Figure 8.76
Final comp
The lens distortion issue must be correctly managed for all of this to fit together properly, because the original live action plate has lens distortion in it but the rendered CGI does not. There is an extensive section on managing lens distortion when mixing CGI with live action in Chapter 11: Camera Effects.
WWW Camera tracking clip.mov – this is the clip used in this camera-tracking section so you can try your hand at camera-tracking the same shot. It’s an easy track, so you should have fun.
If a 3D object needs to be incorporated into a large outdoor matchmove shot there can be a few modifications to the above workflow. Here we will see how, instead of a point cloud with hundreds of randomly located 3D points, we can use a few precisely placed tracking markers that only mark the exact area of interest. We will also see how test patterns are rendered and composited to confirm the correct camera solve and the accuracy of the tracking markers. All of the steps above were done – lens distortion removal, tracking, generating the camera solve, generating a point cloud – but, instead of using the point cloud, a somewhat different technique was used for the lineup.
Figure 8.77
Original plate
Figure 8.78
Tracking markers
Figure 8.77 shows the original plate, an industrial complex where an exciting car crash scene is to be filmed with a wild camera move. The mission is to place a $500,000 Lamborghini roaring out of control into the shot, but no one wants to bend up a real Lamborghini. The original plate is analyzed by the 3D tracker, which produces the 3D tracking markers seen in Figure 8.78 instead of a point cloud. These markers lock onto features in the plate that help the 3D artist identify the ground plane, so the car can be placed on the ground (not below it or floating above it). Further, the solved camera must move exactly like the live action camera to maintain a believable perspective and keep the car from squirming and sliding around in the shot.
Figure 8.79
Checkerboard lineup check
Figure 8.80
3D car with references
Figure 8.79 shows a frequently used lineup check where a simple checkerboard pattern is rendered and lightly composited over the original plate. If the camera solve is correct then the checkered test pattern will lock onto the ground plane over the length of the shot, in spite of the moving camera. In Figure 8.80 the animator is using the tracking markers and the checkerboard lineup reference to make sure that the 3D car is positioned correctly in the scene. So, the tracking markers are used to line up the checkerboard reference, then the car is lined up to the checkerboard reference. With all references removed, the car is rendered in Figure 8.81. The final composite in Figure 8.82 will have a recklessly driven Lamborghini photographed by a wildly moving camera, all locked together into a seamless visual whole.
Figure 8.81
3D car render
Figure 8.82
Final composite
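For the curious, here is a minimal sketch of the "lightly composited" checkerboard check described above. It assumes the checkerboard has already been rendered through the solved camera as a floating-point RGBA image, and simply mixes it over the plate at a reduced opacity so the plate stays visible underneath.

```python
# Minimal sketch: lightly composite a rendered checkerboard over the plate.
# Assumes float numpy arrays: plate (h, w, 3) and checker_rgba (h, w, 4),
# with the checkerboard already rendered through the solved camera.
import numpy as np

def lineup_check(plate, checker_rgba, opacity=0.3):
    """A low-opacity 'over' so the ground-plane lock can be judged by eye."""
    rgb = checker_rgba[..., :3]
    alpha = checker_rgba[..., 3:4] * opacity       # scale the matte down
    return rgb * alpha + plate * (1.0 - alpha)     # standard over, reduced opacity
```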
In this example a live action scene was tracked so a 3D object could be added to it. This same technique can be used to place live action people into a completely computer generated 3D set, often called a "virtual set", with dynamic camera moves. The people are filmed on a greenscreen stage that has tracking markers on the walls. The footage is then camera-tracked so the 3D department knows how the camera moved on the greenscreen set, and that camera data is used to render the 3D set with a matching camera move. The compositor then pulls a key on the original greenscreen and composites the people into the CGI environment.
When CGI meets the real world do not expect things to match up perfectly. Uncorrected lens distortion, small errors in the camera-tracking data, and inappropriately chosen tracking points all conspire to prevent things from fitting together perfectly. It is our job then, as the compositor and shot finisher, to fix it all. That’s why we get the big bucks.
So far we have been focused on 2D and 3D compositing. With the composite completed we are now ready to turn our attention to those things that make the composite more photorealistic – starting with the all-important topic of color correction. The next chapter is all about lighting effects, how color correction operations affect both the picture and its data, plus an extensive section on establishing a good workflow for color-correcting the layers of a composite to integrate them photorealistically.