In 2005, Web 2.0 gained a new standard-bearer. It was based on hot new technology, attracted people and venture capital dollars, and above all, it was viral. Centers for Disease Control and Prevention viral.
And yet, the technology that drove this company wasn’t Ajax. What made YouTube an instant hit was web video. YouTube took advantage of the video functionality found in Flash Player, not to mention gigabits of bandwidth, to deliver one of the most addictive experiences on the Web.
If you’re looking to be the next YouTube, this book won’t help you. In fact, maybe nothing will, unless you’re a CEO of a Fortune 100 company and you’re looking to unload a few billion dollars on hosting, content monitoring and filtering, and defending against international regulatory and civil actions. But video is now a first-class citizen of the modern Web, and everyone from designers to developers to marketing departments to educators should understand when, how, and why to use it.
Though video on the Web came into its own only within the last few years, computing and video have a long history. If you know a former Amiga owner, this will be obvious to you, as they will have described its role in vivid, excruciating detail. From the release of the Amiga 1000 in 1985, it was one of the premier tools for editing video in production environments. However, in those days, a 7 MHz Motorola 68000 processor wasn’t quite up to the task of processing 525 interlaced lines; the system offered only overlaid graphics and control over external videotape recorders (VTRs).
The first mass-market video application was QuickTime, which debuted in 1991. Used primarily in “multimedia” CDs, QuickTime videos were often chiclet-sized 160x120 movies, to accommodate the slow CPUs and low-resolution displays of the day. (By way of comparison, the iPhone and iPod touch could display 8 of those 160x120 movies simultaneously on their 480x320 display. The iPhone’s 620 MHz ARM CPU and 16 GB of storage wouldn’t have been anything to sneeze at in 1991, either.) Still, in applications such as educational materials, video in any form was a huge step forward for what was mostly a text-centric medium.
While the Web was still taking shape, small numbers of users had begun working with video online. They were divided into two constituencies: the Usenet file sharers, perhaps best known as the folks who popularized MP3s; and the video chatters, using early programs such as CU-SeeMe to connect with one another. The Usenet alt.binaries groups experimented with a few different formats, but, early on at least, most settled on some variant of the Moving Picture Experts Group’s MPEG-1 format. Mind you, back then, you needed a lot of things to be able to watch anything good: a high-quality Usenet newsreader application—which could read multipart, uuencoded messages and decode them automatically—as well as a set of codecs, a player application, and all the time in the world, as these huge videos came down over 14.4 kbps or 28.8 kbps modems. YouTube, it wasn’t. But it did, along with live video chat technology, lay the groundwork for the next wave of applications to take the technology mainstream.
RealNetworks released the RealVideo player in 1997. Already popular for introducing streaming radio with its RealAudio format, Real’s first step into video codecs made live net video a reality. It would take a number of years before these technologies saw widespread use, but by the early 2000s, most major media outlets worldwide were producing either simulcast or recorded streams of their content via the Web, using the Real, QuickTime, or Microsoft Windows Media formats.
The next change in the landscape came in 2002 with the addition of video capability in Macromedia Flash Player 6. (Macromedia was subsequently purchased by Adobe in 2005.) Flash was already widely deployed, enjoying more installs than any of the existing video plug-ins, and as its installed base upgraded to newer versions, Flash quickly had an impact on the video market.
Suddenly, the potential for web video had increased. Flash, which had evolved from a platform for animation into a reasonably full-featured multimedia platform, was now capable not only of integrating video but also of overlaying and synchronizing that video with other graphics. Designers reacted tentatively at first to the addition of video; it would take at least a couple more years for most Flash projects to contemplate planning a shoot, producing a final package, and hosting large video files for their clients.
Then came the rise of YouTube. YouTube was not the first aggregator of small clips of web video: Real, Microsoft, and Apple had offered free content to users through their respective players. Real even offered its paid RealOne service to provide more streams at higher quality. But the walled-garden approach each provider took meant users were limited to what was in each company’s directory and were strongly persuaded to use the player application to find that content, leaving them little opportunity to share that content with others. With Flash video, on the other hand, the browser was the player, and user-generated content was easy to find. In fact, by reducing a video producer’s hosting costs to zero (or even paying them a cut of advertising revenue, as Revver did, with YouTube following), the web video sites fundamentally solved one of the biggest barriers to the proliferation of video content.
What may be the final piece to the puzzle (at least for now) is high-definition, or HD, content. The vast majority of the content published to the Web has been at a lower resolution than that of standard-definition television. But HD capabilities exploded onto the scene in 2007, thanks to the rollout of the H.264 codec and an arms race between Adobe, Microsoft, and Apple. H.264 (also known as MPEG-4 AVC, for Advanced Video Coding) is a video compression standard that offers high-quality, high-resolution video with relatively low bandwidth requirements. In other words, H.264 is the silver bullet for HD over the Web. Apple was the first to release major player support for H.264, integrating it into QuickTime, and by extension iTunes. It also built support into the fifth-generation iPod, and H.264 has been the underlying technology for the video content available for sale via iTunes.
The next move was Microsoft’s. The first preview release of Silverlight included support for Windows Media videos up to a resolution of 720p (1280x720). Adobe soon thereafter released a version of Flash Player with a codec for H.264 and support for Full HD, or 1080p (1920x1080) resolution. Given that no broadcast network has even begun transmitting in 1080p, this effectively marks the end of the resolution debate until standards and content for the upcoming 2160- and 4320-line specifications start to appear, which is several years off from the time of this book’s publication.
This history matters to content creators because web video has always come down to four issues that constantly need to be watched:
Bandwidth
CPU usage
Screen resolution
Player support
Table 7-1 compares the technologies discussed so far along these four dimensions. QuickTime, when it was first released, enjoyed more bandwidth for its content than any other technology listed here for at least the next 10 years, as most QuickTime content was served up at 150–300 KB/s (or, in networking terms, 1.2–2.4 Mbit/s). But it was still shackled by CPU power and an audience that was just in the process of moving up from VGA (640x480). It was also virtually the only game in town, so player support was hardly a consideration. Real’s first video player had better CPUs to rely on and a healthy 1024x768 on many displays, but Internet bandwidth was still on the order of 28.8 kbps to 56 kbps on the vast majority of clients, and Real’s content was still proprietary.
Table 7-1. Overview of differences between video formats
Format | Bandwidth | CPU | Output Quality | Player support
---|---|---|---|---
QuickTime 1.0 | High | Very low | Low | Very low
RealVideo 1.0 | Very low | Low | Moderate | Very low
MPEG-1 | Low | Low | Moderate | High
MPEG-4 | Low | Moderate | High | Very high
H.264 | Low | High | Very high | High
All three major players (iTunes, Silverlight, and Flash Player) support full-screen viewing.
Codecs are codecs are codecs. They come around. People argue about them. They get poor implementations, then pretty decent ones, and the content flows from there. But H.264 broke the mold, going from zero to near-complete support in the industry in an extremely short period of time. A properly encoded H.264 video can be played in QuickTime (and therefore iTunes), RealPlayer, or Flash, at HD-quality resolutions, and on the iPod at 640x480. It also offers a high degree of compression, which means less of a hit on bandwidth.
So, what’s the catch? Higher compression means more CPU cycles are needed to decode a stream. That means older machines, as well as most devices, may struggle to play back larger H.264-encoded files. When decoding can’t keep up with the incoming stream, players typically respond by “dropping” frames during playback. This is considered bad form, as it results in a jerky visual appearance at best, and irritating distortion or choppy audio can also occur. If you’re just playing back a small 320x240 video, it won’t matter much whether you use MPEG-4 or H.264, but as files get larger, the CPU-versus-bandwidth tradeoff becomes significant.
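To make the tradeoff concrete, here is a back-of-the-envelope calculation. The bitrates below are illustrative assumptions for a 10-minute clip at comparable visual quality, not measured figures; the point is only that the more CPU-hungry codec buys a proportionally smaller file:

```python
def stream_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """File size in megabytes for a constant-bitrate stream of the given length."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000

# Hypothetical bitrates for the same 10-minute (600 s) clip:
mpeg4_mb = stream_size_mb(800, 600)  # older MPEG-4 Part 2: cheaper to decode
h264_mb = stream_size_mb(500, 600)   # H.264: smaller file, more CPU per frame

print(f"MPEG-4: {mpeg4_mb:.1f} MB, H.264: {h264_mb:.1f} MB")
# -> MPEG-4: 60.0 MB, H.264: 37.5 MB
```

The bandwidth saved by H.264 is paid for at decode time, which is exactly why older machines drop frames on larger files.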
As video becomes a more central feature, though, a single format or size won’t be enough to satisfy every user, making a dedicated video server more and more of a necessity. Many video server products can perform on-the-fly transcoding of source material and content negotiation to determine which format and bit rate best suit the current user. A Windows Media client could receive an HD stream in WMV-HD format, for example, while someone on a mobile phone could get a 20 kbps feed in mobile-friendly 3GPP.
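The negotiation step can be sketched roughly as follows. This is a hypothetical simplification for illustration: the format names, bandwidth thresholds, and the `pick_stream` function itself are made up, and real servers inspect protocol-level capability exchanges rather than just the User-Agent string:

```python
def pick_stream(user_agent: str, bandwidth_kbps: int) -> tuple[str, int]:
    """Choose a (format, bitrate_kbps) pair for a client.

    A toy sketch of server-side content negotiation; thresholds are
    illustrative assumptions, not values from any real product.
    """
    # Constrained clients get a low-bitrate mobile profile.
    if "Mobile" in user_agent or bandwidth_kbps < 100:
        return ("3GPP", min(bandwidth_kbps, 64))
    # A capable Windows Media client on a fat pipe gets the HD stream.
    if "Windows-Media-Player" in user_agent and bandwidth_kbps >= 4000:
        return ("WMV-HD", 4000)
    # Everyone else gets a broadly playable default.
    return ("H.264", min(bandwidth_kbps, 1500))

print(pick_stream("Mozilla/5.0 (Mobile)", 20))        # -> ('3GPP', 20)
print(pick_stream("Windows-Media-Player/11", 8000))   # -> ('WMV-HD', 4000)
```

The design point is simply that the server, not the viewer, resolves the format question, which is what makes a heterogeneous audience manageable.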