5 The Race to 3-D

So the grand dimensionalization project began, at Nintendo and everywhere else in the game world. Every 2-D franchise would, via trial and error, see what it would play like when placed in a virtual world. Just about every game franchise would have a stumble or two making this move. They were fundamentally different types of game play, and therefore resulted in different types of games. […] The move to 3-D would be the biggest single design change games had ever seen. (Ryan 2012, 188, 178)

In the previous chapter, we saw how the Super Famicom’s Mode 7 graphics could stand as a shiny piece of silverware, capable of putting twinkles in the eyes of guests at the grand table and enhancing usual meals with a little je ne sais quoi—some spicy rotation here, a pinch of scaling there, the occasional dash of reflection, and a helping of translation. The SFC’s graphics capabilities, however, also sported a more radical innovation, one that was nested in the Mode 7 shearing transformation. To fully appreciate this contribution and properly situate it among the technological landscape of the 1990s requires us to tackle the larger question of 3-D graphics and technological innovation. In this chapter, we will see how the SFC’s Mode 7 infrastructure allowed the console to attain a free-roaming perspective view typically held as the goal of 3-D graphics, even as it remained at heart a 2-D or “2.5-D” system. The chapter will close with a review of how Nintendo’s business decisions kept the firm away from the forward momentum in the later years of the platform.

Techniques and Technologies for 3-D Graphics1

The late 1980s and 1990s in video games can be seen as globally driven by a common goal: the race toward 3-D. As Carl Therrien’s etymological inquiry into the origins of the first-person shooter as a genre reveals, “3-D” was one of the quintessential buzzwords of the decade, present in post-id Software shooters but primarily in role-playing, action, simulation, and racing games alike. Reflecting on his study of video game promotion and reviews (including but not limited to the epitext and peritext of games), he remarks, “Most of the games under scrutiny in this paper have been discussed and/or sold as a three dimensional experience; ‘3-D’ appears to be the most pervasive textual element in our network” (Therrien 2015). Although the drive toward 3-D and its use as a buzzword have almost always been present in video game discourses and marketing (it can be traced back to the 1970s, as we will see in this chapter), there is an intensification of the efforts to pierce the third dimension from the late 1980s and into the late 1990s.

In a way, this was the third frontier to be overcome in video game spatiality, after the first frontier, which had been the move from fixed- to multiscreen games in the 1970s (think of Berzerk, Robotron: 2084, Adventure for the Atari VCS, Pitfall!, etc.), and the second frontier: scrolling screens, first unidirectional (horizontal in Defender and Super Mario Bros., vertical in Xevious and Kid Icarus), then multidirectional (Rally-X, Metroid). A few early pioneers had tackled the third dimension, with some success (Night Driver, Pole Position, Hang-On), but the occasional foray would only become a full-scale offensive push in the late 1980s. This “three-frontier” description, however convenient, faces a bit of irksome counterfactual history: In truth, 3-D had been present for quite some time already within certain video game genres. Indeed, computer games had featured 3-D ever since the 1970s, well before arcades had scrolling. Why (or how)? Because they weren’t going for the same kind of 3-D (or, more precisely, the same graphical regimes) as arcade and console games were.

As we will see, 3-D in itself is not a technology or a “thing” with substance. 3-D is an idea and a problem. To say that a game (or a film or picture) “is 3-D” is to say that the visual qualities of the object are organized in a way that accurately represents or simulates the three dimensions of spatial perception (width, height, and depth, technically abstracted into the x, y, and z axes), although they are, in truth, 2-D objects. Put more bluntly, the goal of “going 3-D” is to take a bidimensional surface (screen, canvas, or paper) and somehow “stick a third dimension in there.” The reason that video game developers strived to make 3-D games (and why video game marketers insisted on it) is that it contributed to a sentiment of “being there” for the gamer, as Therrien (2015) writes—of being more “immersed” (another powerful buzzword) in the virtual world depicted. Achieving 3-D can be done through a variety of graphical techniques or technologies, which can be partitioned into five groups: axonometric projection, stereoscopy, linear perspective, prerendered polygons, and real-time polygons.

Axonometric Projection

Many games sought to achieve 3-D through axonometric graphics, a form of parallel projection that dates back to Zaxxon, Q*Bert, Ant Attack, Knight Lore, and dozens more games. These games present the in-game space (and characters) from an angled “three-quarter view” as 3-D, unlike the side view from Super Mario Bros., which hides depth, or the top-down view in The Legend of Zelda, which renders height difficult to assess. Axonometric games can appropriately render all three dimensions but not according to human perception; objects, buildings, ships, and anything else do not appear smaller to the eye the farther away they are, nor are their proportions distorted as they recede toward the horizon at the back (a distortion known in art as foreshortening). Lines that are parallel in an object (e.g., paved floor tiles, as in figure 5.1) stay parallel in the picture and do not converge toward the horizon. In fact, games rendered with axonometric projection do not feature a horizon at all: Tiles of ground fill the screen as far as the player can see, as if looking down at a chessboard.
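The property described above (parallel lines stay parallel, and size does not shrink with distance) can be checked with a minimal sketch of a parallel projection. The 2:1 tile ratio and the names `axonometric`, `edge_vector`, `tile_w`, and `tile_h` are assumptions chosen for illustration and are not taken from any of the games cited.

```python
def axonometric(x, y, z, tile_w=32, tile_h=16):
    """Project a world point onto the screen with a 2:1 parallel projection.

    x and y are ground coordinates in tiles, z is height in tiles.
    tile_w and tile_h are hypothetical on-screen tile dimensions in pixels.
    """
    screen_x = (x - y) * (tile_w / 2)
    screen_y = (x + y) * (tile_h / 2) - z * tile_h
    return screen_x, screen_y


def edge_vector(p, q):
    """Screen-space vector of the edge running from world point p to world point q."""
    (px, py), (qx, qy) = axonometric(*p), axonometric(*q)
    return qx - px, qy - py


# The same one-tile edge, drawn near the origin and far away from it.
near_edge = edge_vector((0, 0, 0), (1, 0, 0))
far_edge = edge_vector((10, 7, 0), (11, 7, 0))
print(near_edge, far_edge)  # both are (16.0, 8.0)
```

Because the projection is a fixed linear map with no division by distance, every tile edge has the same length and direction on screen wherever it sits, which is why no horizon ever appears.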

9787_005_fig_001.jpg

Figure 5.1 Axonometric (“isometric”) graphics in the SNES titles Shadowrun and Equinox. Emulated on Higan v0.95.

Usually called “isometric” through popular usage and tradition, although the term is technically inaccurate,2 these games can be seen as the ludic legacy to practices focused on spatial measurement, planning, and accounting, including descriptive geometry, technical and architectural drawing, and industrial design and engineering. Because they favor a more rigorous, intellectual, and strategic approach to space, it is no surprise to find many strategy or management games using the view (Age of Empires, Final Fantasy Tactics, SimCity 2000, FarmVille, etc.). These games output the fictional world as a grid of angled square tiles, and they don’t attempt to simulate someone’s gaze from a specific (and human) point of view; they construct space as an abstraction, a map for players to manage, like interactive animated graph paper. Thus, although the graphical technique is indeed a way of achieving tridimensional space, it is not done in the tradition of illusionism in art—the long-sought ideal of mimesis, the imitation of natural phenomena. As such, though axonometric projection is different from the other kinds of 3-D that we will see in the chapter, it is worth keeping in mind that the relationship between 3-D and illusionism is not automatic.

Stereoscopy

Stereoscopy is another case of 3-D that has been experimented with at various stages of video game history, as a constant but always marginal movement. The principle behind stereoscopy relies on the optic phenomenon of convergence in human vision. The human visual system is a set of two eyes, each of which perceives the world separately. When we focus our visual attention on objects situated in the large overlapping area covered by both eyes, they converge toward the point of visual attention. This results in two slightly different images, a few degrees of angle apart, that are mixed in and analytically composed by our brain into a total, unified image. The averaging of disparities between the two images, however small they may seem, allows us to indirectly perceive depth and the volume of objects. The recent surge of 3-D films and video games that have swept movie theaters, home televisions, and even handheld game systems (with the Nintendo 3DS) may use a number of varying technical protocols to achieve the effect, but they always share the same basic principle: two slightly different images are emitted, and our eyes perceive them as one global image with a depth interval.

The 8-bit Sega Master System had its SegaScope 3-D glasses, which were used for games like Space Harrier 3-D. On the 8-bit NES, Rad Racer also used some glasses, as did Jim Power: The Lost Dimension in 3-D on the Super NES. Nintendo went the extra mile in this direction, basing a whole system on the premise of stereoscopy: the Virtual Boy, symptomatically marketed with the slogan “A 3-D Game for a 3-D World.” The Virtual Boy’s failure was unequivocal: Nintendo pulled it from the market within a year, with fewer than 800,000 units sold between the summer of 1995 and the spring of 1996. A monochrome red display on black backgrounds was a difficult sell, even if the binoculars offered working stereoscopic graphics. Ergonomics were a definite issue because the system proved too bulky to be “headheld” or really “portable”; it had to become a “transportable” system to be used as a tabletop device, supported by a tripod (with users typically suffering from back pain, on top of eye strain). In the end, however, the games failed to impress because they did not offer new modes of gameplay. The stereoscopic effect enhanced the depth impression, but ultimately it was the same graphical regime as the 2-D games, with added backgrounds and parallax scrolling seen in chapter 4. The stereoscopic layering of pictures could not produce authentic tridimensionality but rather resulted in an animetic interval between two bidimensional pictures.

In this sense, there is something even more 3-D to be found inside the fabric of stereoscopic (and nonstereoscopic) pictures, something so common that it is often overlooked: the organization of a drawing (or of computer graphics) according to the rules of linear perspective.

Linear Perspective

A long and rich tradition of illusionism exists in the history of Western art, one that revolves heavily around the principles of linear perspective put forth by Leon Battista Alberti in the 1435 treatise De Pictura. Alberti proposed a set of techniques that relied on geometrical and mathematical principles, which artists could use to construct a space on their canvas that accurately represented depth as perceived by humans. The essence of linear perspective lies in a vanishing point, a central focus of visual attention toward which all the lines of objects that stretch in the depth axis should converge. Unlike axonometric projection, this method does not accurately render the dimensions of objects or of the space but rather imitates the view a human subject would have of them. What is rendered is a subjective gaze rather than an objective space. Objects that are farther away from the viewer will be visually situated higher, closer to the horizon line. They will diminish in size as well and appear smaller than they really are, unlike in axonometric projection, where tiles and objects keep their absolute size regardless of where they are on the visual surface. Their shapes and lines will be distorted, with their depth being increasingly compressed through foreshortening. The objects’ colors will progressively lose their saturation and blend together with the background earth and sky into successive strata of green and then blue haze (a technique called atmospheric perspective).
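The convergence toward a vanishing point can be sketched with the standard perspective divide, in which screen position is world position divided by distance. This is a generic textbook formulation, not a reconstruction of Alberti's own construction; the function name `perspective` and the `focal` parameter are illustrative assumptions.

```python
def perspective(x, y, z, focal=1.0):
    """Project a point in viewer space onto the picture plane.

    x is to the right, y is up, z is the distance in front of the viewer.
    focal is a hypothetical distance from the eye to the picture plane.
    """
    if z <= 0:
        raise ValueError("point must be in front of the viewer")
    return focal * x / z, focal * y / z


# Two parallel rails on the ground (one unit below eye level), receding in depth.
for z in (1, 2, 4, 8):
    left = perspective(-1, -1, z)
    right = perspective(1, -1, z)
    print(z, left, right)
```

As z grows, both rails approach (0, 0): they draw closer together and rise toward the horizon line, which is the behavior the paragraph above attributes to objects farther from the viewer.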

The techniques forming linear perspective have been widely adopted by artists throughout the centuries, making it something like a “default” system of representation (Damisch 1987). The earliest video games to have integrated 3-D graphics were, in fact, simply showing game environments according to a system of predefined views made of static pictures that followed (more or less closely) the rules of linear perspective. This method can be found in the 1974 Maze War and the following RPGs that later came to be grouped as the “dungeon crawler” subgenre: Akalabeth: World of Doom, Tunnels of Doom, the Ultima, Wizardry, and Might & Magic series, Dungeon Master, and so on, including the Eye of the Beholder series shown in figure 5.2, next to F-Zero. Because the graphics were structured with lines receding toward a vanishing point and a horizon line, the impression of having a tridimensional space functioned to create maze-like experiences and a type of spatial immersion unlike any other. Gamers could move into a fictional world, turn around and explore other directions, and in general found themselves in the middle of a game-world space, rather than occupying a privileged position of viewing separate from the world (either above it and looking down, in top-down view, or outside a glass window or transparent wall and looking at it from the side, in side-scrolling view).

9787_005_fig_002.jpg

Figure 5.2 Linear perspective in the SNES port of Eye of the Beholder and F-Zero. Emulated on Higan v0.95.

These games instilled a specific graphical regime: the step-based slideshow of linear perspective pictures. Environments were not scrolling by in real time and fluid space but were rather a predetermined set of postcards that provided a fixed “hard space.” As a result, this graphical regime offers 3-D views of game space but does not feature a 3-D space; the pictures result from a tile-based construction of the game world, with player movement limited to going either forward or backward one tile or making 90-degree turns. Logically, the game is partitioned as a grid of x-y coordinates (like graph paper), but visually each time the gamer moves, the game displays a picture of the scene according to the rules of linear perspective, creating the illusion of exploring a tridimensional world. This illusion is easily underestimated nowadays, used as we are to seeing video games rendered in technically correct linear perspective, and can make us forget how impressive it is. Electronic Gaming Monthly reviewer “Major Mike,” discussing the Super NES port of Eye of the Beholder, noted, “A highlight of this one is 3-D graphics” (EGM #59, June 1994, 33). It is easy to understand here that 3-D means “a succession of views organized according to the principles of linear perspective,” for which Eye of the Beholder is not particularly notable. But by 1994, 3-D was all the rage and everywhere to be found in the post-Doom glut of first-person shooters, 3-D graphics cards, and polygons.
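The “hard space” logic described above can be sketched as a small grid model: position is a tile coordinate, facing is one of four directions, and the only moves are one-tile steps and quarter turns. This is a minimal illustration under stated assumptions; the names `Party`, `is_open`, `view_depth`, and the `DUNGEON` layout are hypothetical and not drawn from any of the games mentioned.

```python
from dataclasses import dataclass

# Facing directions in clockwise order: north, east, south, west.
DIRECTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]

# Hypothetical map: "#" is a wall, "." is a walkable tile.
DUNGEON = [
    "#####",
    "#...#",
    "#.#.#",
    "#...#",
    "#####",
]


def is_open(grid, x, y):
    """True when (x, y) lies inside the grid and is a walkable tile."""
    return 0 <= y < len(grid) and 0 <= x < len(grid[y]) and grid[y][x] == "."


@dataclass
class Party:
    x: int
    y: int
    facing: int = 0  # index into DIRECTIONS

    def turn(self, quarter_turns):
        """Rotate by 90-degree increments (positive is clockwise)."""
        self.facing = (self.facing + quarter_turns) % 4

    def step(self, grid, forward=True):
        """Move exactly one tile forward or backward; return False if blocked."""
        dx, dy = DIRECTIONS[self.facing]
        if not forward:
            dx, dy = -dx, -dy
        nx, ny = self.x + dx, self.y + dy
        if is_open(grid, nx, ny):
            self.x, self.y = nx, ny
            return True
        return False


def view_depth(party, grid, max_depth=3):
    """Count open tiles straight ahead; a game could use this to pick a predrawn view."""
    dx, dy = DIRECTIONS[party.facing]
    depth = 0
    while depth < max_depth and is_open(
        grid, party.x + dx * (depth + 1), party.y + dy * (depth + 1)
    ):
        depth += 1
    return depth
```

Nothing in this model is three-dimensional: the state is two integers and a facing index, and a perspective picture would only be looked up from values such as `view_depth` after each move.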

Although linear perspective provided an adequate sense of depth in constructing a 3-D portrait of space, it had to come in static pictures. The next step was to find ways to make 3-D worlds compatible with action gameplay, eliminating the hard spaces of fixed tile distances and making movement through space fluid.

Mode 7 and Perspective

Beyond the special effects it could sprinkle over 2-D worlds, what really set the SFC’s Mode 7 apart was the ability to generate pseudo–3-D experiences through an impressive perspective effect. In this, Nintendo’s innovative contribution to graphical technologies took inspiration from Sega’s Super Scaler Engine. Yu Suzuki of Sega developed the influential arcade games Hang-On, Space Harrier, and OutRun using sprite scaling to have objects smoothly zoom in toward the player in real time as the ground rolled forward underneath. The functioning of Mode 7 differed on a crucial principle: It would apply to a background map (the “ground”) rather than sprites, stretching, twisting, and rotating it as a flexible surface and scrolling it smoothly at high speeds. It functioned by changing the transformation matrix across scanlines: As the picture was processed and displayed line by line on the screen, the PPU could be instructed to draw the next line according to a different transformation. Accordingly, Mode 7 gave game developers the possibility of taking a detailed image of a landscape seen from a bird’s eye view and displaying that “map” by slightly stretching the image more and more as the lines got drawn on the screen, successively widening the ground so that features seemed to recede in the distance and converge toward the horizon, as with a perspective drawing.
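The idea of changing the transformation on every scanline can be sketched in software as follows. This is a floating-point illustration of the general technique, not the console's actual fixed-point register behavior; the function names and the `cam_height`, `focal`, and `horizon` parameters are assumptions introduced for the example.

```python
import math


def scanline_params(row, horizon, cam_height, focal):
    """Ground distance and map-units-per-pixel for one screen row below the horizon."""
    depth = cam_height * focal / (row - horizon)
    scale = depth / focal
    return depth, scale


def render_ground(ground, width, height, horizon, cam_x, cam_y,
                  angle=0.0, cam_height=8.0, focal=16.0):
    """Draw a flat top-down map as a receding plane, one scanline at a time.

    ground is a list of equal-length strings treated as a wrapping tile map.
    At angle 0 the camera looks toward decreasing y ("north" on the map).
    """
    map_h, map_w = len(ground), len(ground[0])
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    lines = []
    for row in range(horizon + 1, height):
        # A different scale (hence a different matrix) applies to each scanline.
        depth, scale = scanline_params(row, horizon, cam_height, focal)
        chars = []
        for col in range(width):
            lateral = (col - width / 2) * scale
            # Rotate the (lateral, depth) offset by the view angle, then translate.
            map_x = cam_x + lateral * cos_a + depth * sin_a
            map_y = cam_y + lateral * sin_a - depth * cos_a
            chars.append(ground[math.floor(map_y) % map_h][math.floor(map_x) % map_w])
        lines.append("".join(chars))
    return lines


# Hypothetical 8x8 tile map, rendered on a 16x12 character "screen".
GROUND = ["ab" * 4, "cd" * 4] * 4
for line in render_ground(GROUND, width=16, height=12, horizon=3, cam_x=4.0, cam_y=4.0):
    print(line)
```

Rows just below the horizon use a large scale (each pixel covers many map units, so the ground looks compressed), while rows near the bottom of the screen use a small scale (the map is stretched wide), which produces the widening described in the text.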

9787_005_fig_003.jpg

Figure 5.3 Illustration of a line-by-line transformation to simulate perspective in Mode 7, exported from F-Zero. Left: top-down view of the ground map. Right: progressive angling to reach the perspective shearing. Emulated on Higan v0.95.

A second background plane, representing a skyline as seen when looking over the horizon line, is drawn in the top portion of the screen to create the illusion of a total unified space. Over the Mode 7 “playfield,” the 2-D sprites (individual movable objects) were superimposed and positioned according to distance (higher toward the horizon the farther away they are). This technique was demonstrated by Nintendo’s F-Zero, and the fragile illusion can be broken down by disabling select background layers in emulators, as I’ve done in figure 5.4.

9787_005_fig_004.jpg

Figure 5.4 F-Zero’s playfield without the background skyline (left), and without the playfield (right). Emulated on Snes9X v1.53 for Windows.

Although the tentative move toward 3-D was important in establishing the Super Famicom’s identity—“Mode 7” was a term thrown around by everyone and understood as “that special thing the SFC can do”—the bulk of the SFC’s specs were designed with 2-D games in mind, which was a perfectly reasonable thing to do back in 1988. Nintendo’s teams were “thinking in 2-D,” contrary to Yu Suzuki (at least if we believe his claims from Mielke 2010, 3); Mode 7 was a way to render visually what was logically computed as a top-down 2-D map, as can be seen in figure 5.3. Because Mode 7 can only project flat surfaces, everything on which cars should bump or race around had to be included as a sprite. In fact, anything meant to stick out of the ground needed to be represented as a sprite laid on top of the right coordinates.

The fragility of Mode 7’s illusion is perhaps best captured if we try to imagine it in physical space instead of in the digital realm. To achieve the Mode 7 effect, we would need a sheet of paper with a detailed ground map in bird’s eye view drawn on it. We would take that sheet and lay it on an angled 45-degree table, one that had been fitted with a special lever-operated treadmill that somehow folded and stretched the sheet of paper, one line at a time from the bottom, until the vertical parallel lines converged toward the center top of the sheet, through an ingenious system of mechanical pegs perhaps—the most difficult operation to realize without a computer to do matrix transformations. With our “background plane” in place, we would then take some characters and objects (drawn in profile view from the side, front, or back) glued on upright cardboard stands and carefully place our cardboard cut-outs on the exact spots they should occupy on our ground map.

Then, finally, we would take a real background picture—a postcard of the sky with a nice mountain range, for instance—and pin it to the wall or corkboard behind our angled table. We would place ourselves in front of the table, in the middle, get down on our knees to have our sight down at the right level, and we’d be graced with an illusionary projected world in perspective. Then, if we turned the lever, our table’s treadmill would have our ground map scroll by (and loop back), but we’d need an assistant with wires or such to move our cardboard cut-outs in synchronicity (and pull them away from the scene if we scroll the “background plane” too much). If we did all this, then we’d have a projected world in perspective and smooth motion—no small feat.

There is still one more problem with Mode 7’s perspective trick: As elements (sprites) got closer to the player, they needed to be enlarged. Unfortunately, sprites could not be scaled, rotated, or otherwise affected by the matrix transformations that Mode 7 permitted because Mode 7 was applied to backgrounds, not sprites (contrary to Sega’s Super Scaler engine). Game developers had to include predetermined renditions of every sprite in multiple “distance copies”; the game checked every object’s distance on the projected landscape and pulled the appropriate smaller or larger presupplied version of the sprite to simulate its growing or shrinking. This made for weird “pop-ins” during the play experience as sprites crossed their distance threshold and grew in size (and sometimes crossed it back as the player slowed down or backpedaled in a dance of jittery metamorphoses). Combined with the SFC’s restriction of having only sprites of two sizes at once, this drastically limited both the raw number of sprites and the quality of the depth illusion in pseudo–3-D Mode 7 environments because the copies of different-sized sprites could tie up precious visual memory. Figure 5.5 illustrates how a single race car in F-Zero could require multiple distance (and angle) copies for the illusion to work satisfactorily.
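The “distance copies” lookup, and the pop-in it causes, can be sketched as a threshold table. The thresholds and sprite names below are invented for illustration and do not reflect the actual values or assets used in F-Zero or any other game.

```python
import bisect

# Hypothetical distance thresholds (in map units) and predrawn sprite copies.
# A distance below 40 uses the largest copy; 320 or more uses the smallest.
DISTANCE_THRESHOLDS = [40, 80, 160, 320]
SPRITE_COPIES = ["car_64px", "car_48px", "car_32px", "car_16px", "car_8px"]


def pick_copy(distance):
    """Return the predrawn copy whose distance band contains the given distance."""
    return SPRITE_COPIES[bisect.bisect_right(DISTANCE_THRESHOLDS, distance)]


# An opponent hovering around the 80-unit threshold flips between two copies.
for distance in (82, 79, 81):
    print(distance, pick_copy(distance))
```

Because there is no intermediate size between two copies, an object oscillating around a threshold visibly jumps between them, which corresponds to the jittery pop-in described above.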

9787_005_fig_005.jpg

Figure 5.5 Racing against the Blue Falcon in F-Zero means seeing the car at various predrawn sizes and angles as the player overtakes it. There are more angles of view when the car is up close (in the top three rows) and far fewer when it is farther away. Spritesheet built by Solink, with contributions from Davias, downloaded from Spriters Resource (http://www.spriters-resource.com/).

The Earth Was Flat

The perspective effect, although fragile and with obvious limitations, was a convincing step in the direction of representing 3-D worlds, thanks to the smoothness and speed at which the ground scrolled and the 360 degrees of freedom of movement afforded to players. This distinguished Mode 7 from the previous depth-scrolling racing games of Sega’s Super Scaler system, which only offered forward movement along a predetermined race track. However, it came with an important and absolute limitation: the flatness of the terrain. Because Mode 7 was, at heart, a trick of 2-D, it presented ground maps with width and depth but not height. The illusion was made salient whenever a player, floating high in the air, like in Final Fantasy III, came closer down to ground level, revealing the mountains below to be just pictures of mountains painted on a flat carpet or map (see figure 5.6). Anything that would normally stand upright had to be represented as a 2-D sprite, carefully positioned on top of the Mode 7 floor map and scaled appropriately using different distance copies to maintain the illusion as the player moved nearer or farther away.

9787_005_fig_006.jpg

Figure 5.6 Flatness of mountains revealed when flying low in Final Fantasy III. Emulated on Higan v0.95.

An equally jarring example of Mode 7’s limits often occurs in The 7th Saga. As the player travels through the overworld in a top-down view, enemy encounters are played out through an impressive and dramatic Mode 7 spiraling zoom, bringing a perspective view down to the exact tile where the player-character was standing. The player’s detailed battle character sprite is shown, and enemies appear in front of them, somewhere along the adjacent north tile. This works wonders when the player is traveling in a plain or another flat surface, but when the player is walking alongside a mountain range, the Mode 7 perspective illusion breaks down as the player-character and enemies battle back and forth over grass and some flat, distorted mountain pictures on the ground. Far from creating an effect of immersion in the game-world, in these cases, the perspective rendering of the background only heightens the artificiality of the game’s visual representation, confirming that what the player is traveling on is a schematic map rather than an actual world. In this respect, Illusion of Gaia made a better-suited use of the perspective effect because it presented the characters in “travel sequences” that displayed an old and stylized map acknowledged as such, instead of going for a convincing illusion of a world.

9787_005_fig_007.jpg

Figure 5.7 The 7th Saga’s transition from top-down overworld traveling (left) and perspectival fight scene (right) reveals the artificial construction of height as the mountains are reduced to a flat picture on the ground. Emulated on Higan v0.95.

A more serious issue for The 7th Saga arises when the player is traveling in an indoor environment. These environments depend on the 2-D top-down tiles that make up the background “floor” being carefully crafted with angles and colors that attempt to depict walls and create the illusion of castle halls, or cave and dungeon walls. If encounters in these tight quarters used the same perspectival effect as on the overworld, then the illusion of walls and ceilings would be shattered as the player’s character and monsters would fight atop flat wall pictures spread on the ground. Instead, the game simply has the map fade out to a generic fight scene with the background completely dark, an arrangement that works regardless of the actual location where monsters are encountered, and that preserves the fragile illusion of space.

One game that solved the height issue elegantly was Secret of Mana, pictured in figure 5.8. When riding the dragon Flammie high over the world, the game offered the perspective view, but when the player went too low in altitude in preparing for landing, it switched dynamically to a top-down view of the ground map. Presenting the top-down view allowed the player to aim more precisely at the spot on which to land but also helped to mask the artificial flatness of the Mode 7 plane.

9787_005_fig_008.jpg

Figure 5.8 Inspired alternating between perspective and top-down views in Secret of Mana means the flatness of mountains is deemphasized by the top-down view. Emulated on Higan v0.95.

Many sports games went with the Mode 7 perspective effect: They had to represent a finite number of players on the playing field (often well within the sprite limits of the console), a ball, and two nets or goals, and that was it. The rest was just lines on the ground, drawn on a 2-D plane, an ideal fit reinforced by the convenient fact that just about any sport needs to be practiced on flat ground. The audience, rink, walls, and any large number (or large size) of upright objects were, however, problematic. Figure 5.9 demonstrates two approaches to the issue. NHL Stanley Cup went for an uneasy mash-up by plastering rows of fans seated in a fixed frontal view at the top of the screen (as a background skyline) while the hockey rink rotated around, suspended in empty space. NCAA Basketball resolved the problem by doing away with the audience altogether, and the playing field just floated among a background of blue Nether.

9787_005_fig_009.jpg

Figure 5.9 NHL Stanley Cup and NCAA Basketball. Emulated on Higan v0.95.

Many SFC/SNES games included a Mode 7 effect that did not offer full 360-degree movement but kept to a graphical regime of top-down or side-scrolling view, enhanced by a slight angle to give some additional volume to the graphics. Brett Hull Hockey ’95 offered some Mode 7 shearing that significantly angled the view, as did Super Soccer and other sports games. Many games from other genres offered such aesthetic Mode 7 treatments. In Final Fantasy II, when the player pilots the Big Whale, the world map is angled further back from the usual top-down view to give a sense of majestic scale to the Lunar spaceship. In Final Fantasy III, the overworld map is slightly angled to give the illusion of a round earth receding away, on a larger scale than the typical top-down view for the villages and dungeons players must explore.

The crucial difference between the top-down or side-view graphical regime with Mode 7 angling sprinkled on top, and the newer graphical regime of 360-degree freedom of movement, was whether the viewpoint was mobile. A mobile viewpoint, as in NHL Stanley Cup, meant objects had to be redrawn dynamically to have their size and angle of view consistent with the viewpoint’s potentially new position every step of the way. They needed to be sprites, but Mode 7 applied only to backgrounds, and the SNES had more trouble handling a high number of sprites simultaneously than its rival Genesis, let alone keeping multiple copies of sprites for different sizes and different angles of view as well. Mode 7 couldn’t represent indoor environments because walls would have been made of sprites: enormous sprites that would have exceeded any limit of visual memory, metatiles, or sprites per scanline. Even the outdoor environments suffered from the lack of a properly simulated height dimension, as the flat mountains of The 7th Saga and Final Fantasy III have shown.

The Super Star Wars trilogy should be noted for going beyond the call of duty, especially Super Star Wars: The Empire Strikes Back, in the Battle of Hoth level. The game managed to offer hills and slopes to players as they flew their snowspeeder around in Mode 7, which was no small feat, as Nintendo Power described the stage: “Now the action takes to the air in some of the coolest flight combat in any video game as you fly over 360° of 3-D terrain” (Nintendo Power #53, October 1993, 11). This impressive technical achievement was hyped to gamers in Nintendo Power’s “making of” article dedicated to the game, which described the functioning of Mode 7 in a refreshing display of technoliteracy among the realm of technobabble and buzzwords:

Other technical wonders are found in the speeder stages and when the X-wing flies over the clouds. The sense of speed is imparted from splitting the screen and scrolling two different images. The background (above the horizon) scrolls conventionally left and right. The foreground (below the horizon) is created from a topographical map. Using Mode 7, the map is tilted sideways and the 3-D textures look like surface features as it scrolls beneath you. In Empire, these maps also rise and fall, giving the illusion of passing over hills and valleys. (Nintendo Power #52, September 1993, 85)

Super Empire was pushing against the limits of “faking” 3-D space in a platform made for 2-D games. But to the discerning gamer (or to any gamer, really), something was off. Its impressive hills and valleys were, in fact, randomly generated (more accurately, rhythmically generated) as the player went forward, instead of being located in certain defined spots to make up a virtual topography. Players needed only drive in a certain direction, see how the landscape’s height rose and fell, and then reverse course and drive back the same way to realize the game had no memory of the slopes and valleys they had just passed, and that the hills were generated in a rhythmically regular but spatially inconsistent way. Further compounding the problem were the level’s objectives: Because players had to circle three or four times around the legs of AT-AT walkers, they were bound to see hills, flat lands, and valleys generated successively while circling around the same spot.
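The distinction at stake, between elevation generated rhythmically as play advances and elevation tied to map coordinates, can be made concrete with a small contrast. This is purely an illustration of the two approaches and makes no claim about how the game itself was programmed; the sine wave, the `period` and `amplitude` values, and the sample `HEIGHTMAP` are all assumptions.

```python
import math


def time_driven_height(frame, period=120, amplitude=10.0):
    """Elevation that depends only on elapsed frames, not on where the player is."""
    return amplitude * math.sin(2 * math.pi * frame / period)


def map_driven_height(x, y, heightmap):
    """Elevation stored per map coordinate, so the same spot always has the same height."""
    return heightmap[y % len(heightmap)][x % len(heightmap[0])]


# Hypothetical 3x2 heightmap.
HEIGHTMAP = [
    [0, 1, 2],
    [3, 4, 5],
]

# The player passes the same spot (2, 1) at frame 30 and again at frame 90.
for frame in (30, 90):
    print(frame, round(time_driven_height(frame), 3), map_driven_height(2, 1, HEIGHTMAP))
```

In the time-driven version, returning to a spot shows whatever phase of the rhythm happens to be current, so a hill can become a valley; in the map-driven version the terrain has a memory because the height is a function of position.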

All in all, the idea of generating varying height slopes was great, but the impossibility of tying the variations in elevation to specific coordinates on the map means it would have been perfect for generating waves on the sea rather than an illusory pseudo-topography sure to break down under scrutiny. Still, Sculptured Software’s creative take on the limits of Mode 7 testifies to both the developer’s technical skill and the increasingly tight technological quarters that Mode 7 was growing into—not at all unlike the Famicom a few years before. 3-D was making headway into video games through a much more powerful yet disarmingly new way: polygons.

And Then There Were Polygons

The computation of polygons in real time had appeared in video games through a plurality of entryways. I, Robot, developed in 1983, is often credited as the first game to have used 3-D polygonal graphics. Five years later, in 1988, Namco’s System 21 and Taito’s Air System were released in the arcades, and polygons were showcased in the former’s Winning Run and the latter’s Top Landing. Flight simulators were perfect candidates for polygonization after all, given that they relied on the accurate simulation of all three dimensions. In the early 1990s, polygons made up the hallways and “sectors” of Doom, into which 2-D sprites would move and shoot each other, not unlike the cardboard cut-out sprites of Mode 7, with an important addition that Mode 7 could never do: walls. Alone in the Dark reversed the concept, placing characters and objects made of 3-D polygons over backgrounds that were fixed graphics. Sports games also got their polygons, with Stunts and then the hyperbolically named 4D Sports Boxing. Strategy games, but most of all the multiple descendants of Doom that crystallized the first-person shooter genre and ultimately everything else, would eventually follow suit.

Polygons were here to stay, and many PC gamers invested in a new technology for their computer: 3-D accelerated graphics cards, an additional piece of hardware solely dedicated to the specialized function of calculating and rendering polygons. A convergence in marketing united games and 3-D accelerated graphics cards; personal computers were on the road to technological supremacy and brought complex, intricate gaming experiences with them. The hard spaces of discrete tiles in dungeon crawlers evolved into fluid tridimensional worlds, a transition perfectly seen in DreamForge Intertainment’s RPGs Ravenloft: Strahd’s Possession, Ravenloft: Stone Prophet, and Menzoberranzan. These games let the player toggle between the two graphical regimes of step-based and 360-degree free-form movement. Players could travel through most of the games in steps, which accelerated movement, but occasionally activate the fluid 3-D engine and gain finer, smoother control, instead of being locked into the tiles’ discrete spatial organization.

Many games did not use real-time polygons, that is, polygons rendered on the fly by the host computer and that could be dynamically redrawn to accommodate movement of the viewpoint by the player. Aided by the extra storage capacity of CD-ROM drives, many of them used prerendered polygons to present impressively detailed, sprawling virtual worlds to their players, stirring immersion. By rendering the views of polygons ahead of time, powerful computers could work for an extensive amount of time to produce high-quality visuals, which would then be recorded and played back as movies. The method was essentially what the animation industry used to render 3-D animated movies, with the same drawback: Once the images and movements had been computed, they could not be altered.

Myst reprised the postcard-style static screens that rendered in perspective drawing the 2-D “graph paper” corridors of dungeon crawlers, this time organizing the visualizations in an irregular suite of creative, “cinematic” (or more accurately “photographic”) views. The 7th Guest provided the same postcard type of views but articulated transitions between postcards with an animated traveling sequence, linking together the static screens by dynamic movement. These transitory animated sequences, however, were not enough to alter the graphical regime of the step-based slideshow that was common to both of them, because players did not interact with the transition to explore or perform other actions. Star Wars: Rebel Assault pushed movement as its strategy: Every level had been constructed as virtual scenes in 3-D polygons, and LucasArts had predefined a path of movement through the level, which they rendered as a movie. The gamer steered a starship or aimed at targets on the screen while the movie of forward movement through the set of polygons played on, thus bringing the game into the same graphical regime of on-rails shooters as the Sega-CD live-action games Sewer Shark and Tomcat Alley.

The Super Nintendo was ill equipped to deal with polygons. Its slow CPU and limited data buses left it with little raw power to work with things that did not fall neatly into the corporation’s predefined graphics modes. Multiple scrolling backgrounds and parallax effects, high-resolution graphics modes, or matrix transformations applied to 2-D backgrounds couldn’t do a thing for polygons. Lots of flashy 2-D would never result in 3-D—it could at most give “2.5-D,” as Nintendo liked to describe Super Mario World 2: Yoshi’s Island.

A number of games managed to create polygons (or derivatives, such as vectors with color fill-ins) on the Super NES, but they all did so at the expense of a plodding frame rate, a reduced number of colors, or a windowed screen surface that limited the display area. Dragon View is a good example of the last technique because its impressive real-time 3-D polygonal overworld traveling mode is confined to a window roughly half the size of the screen, and even then it suffers significant slowdowns when multiple objects appear on screen. Another World was made of filled vectors (2-D polygons, like the Drakkhen overworld) and, as a result, is one of the few SNES games to have loading times between areas, on top of a choppy frame rate during action sequences (although the introduction cut scene to Flashback: The Quest for Identity makes anything seem silky smooth by comparison, really).

That polygons could appear in 2-D games under controlled conditions wasn’t a particularly shining achievement. Nintendo would not sit by on its flat Mode 7 background plane and watch polygons take off from the sidelines. The specialized silverware and decentralized architecture of the SFC/SNES could accommodate an additional coprocessor dedicated to managing polygons. Instead of marketing that additional computing power in a risky add-on, with the associated problems of installed base and submarket segmentation that Sega was wrestling with thanks to its CD-ROM and 32X expansions, Nintendo would follow the way it had charted out with the Famicom and rely on expansion chips set in game cartridges.

Expansion Chips: “Now they’re playing with effects … Super FX!”

Although every platform offers a set of possibilities to game developers by facilitating certain aspects of game making, each of them also has an internal history: the early days and years of a system see a lot of software experimentation from designers trying to maximize the console’s potential and go beyond the original hardware limitations, which is gradually accomplished throughout the platform’s life. Although this applies to all platforms, it takes on a whole new dimension with Nintendo’s cartridge-based consoles because new technology is also added into cartridges as the platform advances through time.

One of the defining features that had made the Famicom and NES so adaptable to the ever-shifting nature of the games business, as well as keeping the initial cost of the system down, was the concept of having special enhancement chips inserted in particular cartridges to expand on the system’s base specifications. Masayuki Uemura’s expandable and flexible engineering solution was repeated and taken to a new level with the Super Famicom, as the number of cartridge chips and the impact they had in shaping their games really pushed beyond the limits of the platform.

Some of the earlier expansion chips had modest effects because it wasn’t visually clear how the chip contributed to the game, and so many gamers did not even know about these expansion chips. Pilotwings and Super Mario Kart, for instance, are usually presented as strong examples of Mode 7 graphics, but they actually benefitted from an onboard digital signal processor (DSP) chip, the DSP-1, which assisted with various math functions to accurately track coordinates in space, render images, and scale or rotate them in Mode 7 perspective view. What the chip was doing was assumed to be the working of regular Mode 7 technology, and Nintendo of course focused Super Mario Kart advertisements on pushing Mode 7, so that the SNES as a whole would look all the better (Harris 2014, 312).

The early DSP-1 still wasn’t enough for Nintendo’s ambitions, however. Therefore, in 1990, the venerable Japanese giant consulted a team of hotshot British programmers known as Argonaut Software. They had managed to perform two equally impressive feats: getting 3-D polygons in a functional NES prototype version of their computer game Starglider (renamed NesGlider), and getting 3-D polygons in a functional Game Boy prototype for a game that Nintendo picked up, produced, and published in 1992 under the name X. When Nintendo called them in, they needed a way to make the Pilotwings plane rotate in real-time 3-D instead of displaying multiple versions of its sprite that had been predrawn for a selection of predefined angles. Although the allotted three-month time frame was too short for that, the Argonauts set to work on a powerful graphical accelerator expansion chip that would eventually become the Super FX chip, to be demonstrated with great fanfare in Nintendo’s 1993 original game property (the first since the SNES launch in 1990): Star Fox, a game jointly developed by the Argonaut team from an SNES prototype of NesGlider, and Nintendo developers led by Shigeru Miyamoto. Thanks to the chip, the game would play out almost entirely in real-time 3-D polygons.

The Super FX chip would be the one to bring chips into the spotlight, getting mentions in magazine previews and articles left and right, and even appearing on Star Fox’s game box: “Revolutionary Super FX Micro Chip Creates Special Effects Like Never Before!” This short text, like the chip’s name itself (FX for Effects, with the Super that had practically become a Nintendo trademark by then), embodies the logic of “special effects” that I positioned as Nintendo’s approach to graphical technologies in the 16-bit era.

Chips became a technological argument for developers; Capcom promoted its CX4 chip (specially designed to integrate 3-D wireframe meshes in their 2-D platformer) on the back of the Mega Man X2 box: “Enhanced realism and 3-D effects with the new CAPCOM C4 graphics chip!” Square’s Super Mario RPG: Legend of the Seven Stars and Nintendo’s Kirby Super Star used another chip, the SA-1 (for Super Accelerator-1), which housed a 65C816 microprocessor with a clock speed of 10.74 MHz (three times the SNES’s 65C816 “quick access” speed of 3.58 MHz or four times the 2.68 MHz “slow access”) and an array of enhancements, including faster and additional RAM, memory mapping, and math functions. Essentially, this made the SNES hardware little more than a box to house the real brains behind the game: the chip nested in the cartridge.
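The clock-speed multiples quoted for the SA-1 can be checked with a few lines of arithmetic. The sketch below uses only the three rounded figures given in the text; the variable names are illustrative.

```python
# Clock speeds as quoted in the text, in MHz (rounded figures).
SA1_MHZ = 10.74
SNES_FAST_MHZ = 3.58   # "quick access" speed of the console's own 65C816
SNES_SLOW_MHZ = 2.68   # "slow access" speed

fast_ratio = SA1_MHZ / SNES_FAST_MHZ
slow_ratio = SA1_MHZ / SNES_SLOW_MHZ

print(f"{fast_ratio:.2f}x the quick-access speed")  # 3.00x
print(f"{slow_ratio:.2f}x the slow-access speed")   # 4.01x
```

With the rounded figures, the “four times” multiple comes out at about 4.01, which is close enough for the comparison to hold.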

Nintendo’s commercial push for the chip was of course taken up by the games press. The Super FX chip was revealed by Nintendo on August 26, 1992, at the Shoshinkai trade show. Electronic Gaming Monthly covered it in a one-page article titled “Super-FX Chip Brings 3-D to Super NES” (EGM #40, November 1992, 48). It is worth noting that EGM partitioned the article’s page with a text box explaining that Nintendo wouldn’t be releasing a CD-ROM attachment for the Super NES anytime soon, the project having been seemingly abandoned. EGM causally links the two events, seeing in the Super FX chip a way for Nintendo to skip CD-ROM technology and have the SNES compete favorably against its competitors—and, most notably, the threatening Atari Jaguar (cue chuckle from contemporary readers who know from their retrospective vantage point how the roaring Jaguar fizzled away in the end).

The linking of the two news items in the EGM feature encapsulates the interplay between technological trajectories that took place over the late 1980s and early 1990s: FMV proved to be a dead end because it sacrificed interactivity in trying to achieve photorealism. What the gaming world needed was not more graphical fidelity but more graphical regimes; polygons provided a way to achieve realism in building a fully realized virtual world while also creating new gameplay possibilities and situations for gamers to enjoy. The Super FX chip could do so, to an extent, but was merely a ticket to the defining technological trajectory for the future of games, a preview of things to come. The train would be Nintendo’s next console, announced during the 1993 Shoshinkai, code-named “Project Reality,” and codeveloped with Silicon Graphics, the firm that was on everyone’s minds for computer-generated special effects, responsible for the T-1000 in Terminator 2 and the T-Rex in Jurassic Park. Even as it had the ticket in hand and the train coming up, however, Nintendo was working on a different tangent in the trajectory of 3-D: the previously mentioned monochromatic, stereoscopic wireframe graphics “headheld” system, the Virtual Boy. Nintendo had bet on the wrong horse, and the Virtual Boy died before the train even got to the station. Stereoscopic wireframe graphics were tied to the old (graphical) regimes and weren’t the way to go; the brave new world of polygons lay ahead.

Like any king or emperor facing newfound expanses of democracy, Nintendo wasn’t in a hurry to jump into tridimensional polygonal games, a new way of making games that required “thinking in 3-D” (Mielke 2010, 3) and could topple the established order by invalidating competent game developers’ expertise in 2-D games overnight. Many histories of the period claim that Shigeru Miyamoto was developing a 3-D Mario game at the time of Star Fox, a project referred to as “Mario FX” (cf. Ryan 2012, 165). This is incorrect. As Evan Gowan (2012) shows, Miyamoto had the idea of making a 3-D Mario game, but nothing was materializing at the time. “Super Mario FX” was the codename given to the Super FX chip during development by Argonaut Games (Mathematical Argonaut Rotation I/O). There never was a question of producing a 3-D Mario game on the SNES. Instead, Nintendo was actively fighting against the inclusion of 3-D games on its 16-bit console.

A Stubborn Gorilla Goes 3-D: Donkey Kong Country

True to itself, Nintendo adopted polygonal 3-D for its surface-level graphics and treated it as a way to raise graphical fidelity while pursuing games that conformed to traditional graphical regimes and gameplay genres. It was as if Nintendo, the giant gorilla of the video game business, was too stubborn to adapt to the changing reality of video games. Another technically strong British developer was going to help Nintendo push further in the polygonal 3-D trajectory. Rare had been developing games for the NES since 1986, contributing more than 45 titles to the platform’s library, including original series like Battletoads and Wizards & Warriors. It was one of the only developers to have invested a substantial amount of money in getting equipped with Silicon Graphics workstations to model 3-D characters or environments. 3-D graphics had an instantly recognizable aesthetic, one that was au goût du jour at a time when society was obsessed with dinosaurs thanks to the computer-generated imagery in Jurassic Park. In 1994, Nintendo purchased a 49% stake in the studio, making it a second-party developer.

Rare married technological innovation with conservative design, which would result in Donkey Kong Country and Killer Instinct. Both featured 3-D graphics in the clear, familiar frame of well-known 2-D genres (the platformer and fighting game, respectively): graphical upgrades within familiar graphical regimes. When Nintendo presented footage from its upcoming game at the Summer 1994 CES, the press assumed it was a preview of a game for the firm’s announced Project Reality, a dual arcade-and-home all-encompassing system. Everyone was taken by surprise when it was revealed to be a game coming out for the baseline Super NES. 3-D characters (and way better looking than the blocky humans of Sega’s Virtua Fighter that had impressed people the world over in arcades) running and jumping at perfectly fluid frame rates on a 16-bit console? What was this new devilry?

There was a secret: The polygonal characters had their animations prerendered and stored as individual frames in the game’s memory. Technically, they were played out exactly like any other game’s character animations, highlighting the fundamental difference between modeling and animating that constituted the foundation of polygonal 3-D graphics. Modeling the character could be done with any materials: The classic case of digital pictorials in pixel-drawing, hand-drawn pictures scanned into still images, individual photographs of people who had been filmed and then digitized (as we will see in the next chapter), clay models hand-animated with stop motion techniques (as in Clay Fighter), or characters modeled with 3-D polygons and then animated frame by frame were all equal in the animation process executed by the SNES’s PPU; it was simply a matter of displaying individual frames one after the other. In that way, the grain and visual signature of 3-D computer-generated polygonal graphics was present, but it was all surface; the deep functioning of these graphics was still prerendered, inflexible 2-D sprites.
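That indifference to the origin of the frames can be shown in a short sketch. It is a generic flipbook player and not the SNES PPU’s actual sprite pipeline; the class name, the tick-based timing, and the placeholder frame labels are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Animation:
    """A looping flipbook: a list of finished images and a display rate."""
    frames: list          # prerendered images of any origin (strings stand in here)
    ticks_per_frame: int  # how many display ticks each frame stays on screen

    def frame_at(self, tick):
        """Return the frame to show at a given tick, looping forever."""
        index = (tick // self.ticks_per_frame) % len(self.frames)
        return self.frames[index]

# Two hypothetical characters: one drawn pixel by pixel, one exported from a 3-D package.
hand_drawn = Animation(["pixel_0", "pixel_1", "pixel_2"], ticks_per_frame=4)
prerendered_3d = Animation(["render_0", "render_1", "render_2"], ticks_per_frame=4)

for tick in (0, 4, 8, 12):
    print(tick, hand_drawn.frame_at(tick), prerendered_3d.frame_at(tick))
```

The playback code is the same for both characters. Nothing in it can rotate the model or change the lighting, because by the time the frames reach the console they are only finished pictures.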

Donkey Kong’s comeback (which I covered in chapter 3) wasn’t only a cultural statement by Nintendo (which, always reiterating, christened the Japanese game Super Donkey Kong); it was also, and more bluntly so, a technological statement. Promotion around Donkey Kong Country revolved heavily around the technological advancements it was bringing to the table, in a notable departure from the typical technology-avoiding trend that characterized promotional discourses from 1994 onward. Hence, Donkey Kong Country represents an interesting site of tension between the two approaches of technological promotion. The Advanced Computer Modeling (ACM), listed as “obviously cool stuff” in a Nintendo promotional poster that downplayed the technological discourse, was flanked by “old-fashioned” technological flaunting in another untitled two-page spread by Rare for United Kingdom magazines of the time:

It’s taken 22 man years, 32 megs, 32,768 colours and 1 super computer to make him look this gruesome. You’ve never seen anything like this before. Donkey Kong Country is the world’s first fully-rendered video game. To produce it took 22 years work on 6 SGI work stations and one XL Super Computer. The graphics are 3-D. The playing arena is 32 megabit. The levels number 111. (No, that’s not a misprint—one hundred and eleven). But the most amazing aspect of Donkey Kong Country is that you don’t need a 32 bit machine or a CD-ROM system to play it. Because Donkey Kong Country is only on the Super NES. So go and grab one now. You’ll go absolutely ape.

Donkey Kong Country sold a whopping 6 million copies in the first 45 days from launch, proving that something worked somewhere along the way, whether it was the popularity of the classic video game character, the technological marketing, or simply (and regrettably) the amazing surface-level graphics. Shigeru Miyamoto, who had helped Rare finish Donkey Kong Country by making various design touch-ups to improve the gameplay, reportedly said, “Donkey Kong Country proves that players will put up with mediocre gameplay as long as the art is good” (Kent 2001, 518). It did more than sell 6 million copies over the 1994 holidays (Buchanan 2009), however: It helped Nintendo seize back the market lead from Sega’s Genesis (Schilling 2003a, 11). Miyamoto’s answer to that would come with the next flagship game for Nintendo: Super Mario World 2: Yoshi’s Island. When he proposed the game to Nintendo’s marketing, they turned it down, asking for projects that featured impressive visuals, like the 3-D prerendered graphics of DKC. Miyamoto retaliated by pushing Yoshi’s Island’s child-crayon art style further, which gave the game a distinctive visual signature that satisfied marketing.

As child-like and innocent as it may have looked on the surface, however, Yoshi’s Island was an impressive technological beast at its core, in a new spin on Nintendo’s characteristic duality. It was equipped with Argonaut’s latest version of the Super FX coprocessor, astutely dubbed the Super FX-2 (having letter-and-number combinations was always a sign of technological complexity and an easy way for the public to perceive something as improved). That chip provided advanced sprite manipulation possibilities, advertised by Nintendo as “morphmation” technology: Sprites of enormous sizes were one option, but more important, the chip allowed sprites to be scaled and rotated, effectively making the Mode 7 operations available for sprites rather than only for background planes.
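What scaling and rotating a sprite involves can be sketched with a generic affine transform. This is not the Super FX-2’s actual method, which is not described here; the function name, the nearest-neighbour sampling, and the tiny test sprite are assumptions made for illustration. For each pixel of the output, the inverse transform finds which pixel of the source sprite to copy.

```python
import math

def transform_sprite(sprite, angle_deg, scale, out_size):
    """Rotate and scale a sprite about its centre (nearest-neighbour sampling).

    `sprite` is a list of rows of pixel values; the result is an
    out_size x out_size grid, with 0 wherever no source pixel lands.
    """
    src_h, src_w = len(sprite), len(sprite[0])
    cx_src, cy_src = (src_w - 1) / 2, (src_h - 1) / 2
    cx_out = cy_out = (out_size - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))

    out = [[0] * out_size for _ in range(out_size)]
    for y in range(out_size):
        for x in range(out_size):
            # Undo the scaling, then undo the rotation, to find the source pixel.
            dx, dy = (x - cx_out) / scale, (y - cy_out) / scale
            sx = cos_a * dx + sin_a * dy + cx_src
            sy = -sin_a * dx + cos_a * dy + cy_src
            ix, iy = round(sx), round(sy)
            if 0 <= ix < src_w and 0 <= iy < src_h:
                out[y][x] = sprite[iy][ix]
    return out

tiny = [[1, 2],
        [3, 4]]
doubled = transform_sprite(tiny, angle_deg=0, scale=2, out_size=4)
flipped = transform_sprite(tiny, angle_deg=180, scale=1, out_size=2)
```

The same arithmetic underlies the rotation and scaling of a Mode 7 background plane; the sketch simply applies it to a small object instead of a full background layer.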

Aside from technology, there also was creative evolution. As we’ve seen in chapter 2, when the SFC was launched, Miyamoto had promised, “Wait, and I will learn more about the limits of this machine” (Sheff 1993, 231). The more inspired techniques found in later SFC games, such as Secret of Mana’s point-of-view switch to alleviate Mode 7 flatness, Chrono Trigger’s time-traveling Mode 7 sequence, and finely tuned Yoshi’s Island moments (the Fuzzy-dizzy LSD trips, the rotation of the entire sky when running across a small moon to battle Raphael the Raven, the gigantic Baby Bowser walking forward in the background), all fulfill the promise and show how platform mastery increases over time thanks to both technological advances and creative experience.

Nintendo’s choice in sticking with 2-D sprites and classic gameplay is representative of its cautious treading on the grounds of innovation. It wasn’t for lack of polygons and 3-D games with new types of gameplay around Nintendo. Argonaut had developed a Super FX game of its own, Vortex (1994), and assisted Nintendo in creating Stunt Race FX—granted, a conventional racing game in its gameplay, but still one made of polygons. Sadly, the more innovative games on which Argonaut worked had their release canceled by Nintendo after substantial development effort.

Inches Away from the Finish

Star Fox 2 was not only “fully completed,” according to Argonaut programmer Dylan Cuthbert (in Gowan 2010), it had even been promoted by Nintendo, who disclosed screenshots and ran previews in Nintendo Power (cf. #69 in February 1995 and #76 in September 1995), as well as having the prototype on display at the Winter 1995 Consumer Electronics Show (Gowan 2010). Nintendo’s decision to cancel its release may be explained by the recent release of the PlayStation and its more advanced polygonal 3-D graphics, which would have made the game look crude by comparison. Most accounts, however, link that decision with the impending release of the Nintendo 64 and the choice to have 3-D games coming out only on the N64 (in the end, as Cuthbert notes, the N64 would get delayed for so long that Star Fox 2 wouldn’t have hurt anything). Gowan’s assessment of the game contextualizes Nintendo’s resistance to innovation: “The final beta of Star Fox 2 is the culmination of a nearly two and a half year development process to take the game from an on-the-rails shooter to a fully 3-D experience” (Gowan 2010). In other words, it was not simply a change in graphical technology but a more significant change in graphical regime.

Many unique elements in Star Fox 2 can be characterized as full innovations in terms of genre conventions, bringing together the genres of real-time strategy and shoot ’em up. A main map screen displays units advancing toward each base, and the player must chart a course to intercept them, somewhat like Ogre Battle: The March of the Black Queen or the older 1979 Space Battle for the Intellivision, or, more troublingly, like Argonaut’s 1992 Game Boy game X, to which Star Fox (and especially the Star Fox 2 prototype) bears more than a passing resemblance. Planet and battleship levels offer mazes that the player must navigate, in addition to things to be shot. None of these features would make it into the next Star Fox game, Star Fox 64, which in comparison looks like an enhanced remake of the original Star Fox that conforms to the logic of reiteration rather than innovation. Dylan Cuthbert expressed it so: “Star Fox 64 incorporated a lot of the newer ideas we created in Star Fox 2 but it didn’t, in my view, take the genre a full step forward. Star Fox 2 really was a different direction of gameplay” (Cuthbert in Gowan 2010).

The other missed opportunity to innovate was FX Fighter, a 3-D fighting game Argonaut was developing to compete with Sega’s Virtua Fighter, which had been making a killing in the arcades since its release in late 1993. However, the game had been dethroned by Killer Instinct, an arcade game released in late 1994, jointly published by Nintendo and Williams and developed by Rare. The game was announced to be running off Project Reality hardware, and a port for the Nintendo 64 would come; the home version would use the same technology, bridging the gap that separated home and arcade hardware, as the Neo Geo had attempted to do. However, as the N64 was delayed, Killer Instinct was ported to the Super NES instead, hitting store shelves in August 1995. It is hard not to consider the implications this had for FX Fighter. Like Star Fox 2, the polygonal fighting game was presented at the January 1995 Winter Consumer Electronics Show and previewed in Nintendo Power #69 in February. Like Star Fox 2, it was canceled by Nintendo.

Piecing together these different events gives a clear and easy line of reasoning for Nintendo to have acted in such a way: Rare’s Donkey Kong Country had received enormous praise and sold millions of copies in record time, thanks to prerendered 3-D graphics integrated in tried-and-true 2-D platformer gameplay. Sega’s Virtua Fighter had made a splash in the arcade, but Rare’s Killer Instinct had taken up the mantle by integrating 3-D prerendered graphics in tried-and-true 2-D fighter gameplay; ergo, there was no need for FX Fighter. Gamers were satisfied with games that had novel and impressive graphics in refined and familiar gameplay situations. That approach would prove its worth with Yoshi’s Island as well. Known formulas allowed Nintendo to leverage its accumulated expertise in game crafting, rather than risking ventures in new game genres that slipped outside its control. It was a simple restating of what the platform had been about all along: The Super Famicom was, after all, a “Super” version of the Famicom that favored incremental improvements on known genres and special effects. Likewise, the additional hardware chips in late Super Famicom games would provide “Super” effects that tied into the same game experiences in a renewed display of “lateral thinking.” Sega might have kept repeating its slogan, “Welcome to the Next Level,” but Nintendo would insist on sticking to the basics (super basics), as if enough polish could make the Silver Age last forever.

Notes