Chapter 6. Naturalistic Driving Studies and Data Coding and Analysis Techniques
Sheila G. Klauer, Miguel Perez and Julie McClafferty
Virginia Tech Transportation Institute, Blacksburg, VA, USA
A better understanding of driver behavior is a critical component for future safety improvements on the roadways of the world. Naturalistic driving studies offer unique insight into precise driver behavior, not only under normal driving conditions but also in the critical seconds leading up to crashes and near-crashes. This chapter presents the life cycle of naturalistic driving studies with an emphasis on the coding procedures required to accurately and precisely capture these critical driving behaviors. This technique will be important for safety researchers worldwide to understand, use, and assess as large naturalistic driving study databases become publicly available. These publicly available databases will be a very rich source of driver behavior information that should be used by safety researchers to better understand, model, and develop effective crash mitigation technologies that will dramatically improve safety on the world’s roadways in the near future.

1. Introduction

Although roadway fatality and injury rates have dropped significantly during the past 50 years, these reductions have been primarily the result of improved safety belt use, air bag technology, improved crashworthiness of automobiles, and improved infrastructure (e.g., better guardrail design and roadway lighting). These improvements have had a major impact on fatality and injury rates; however, it is generally acknowledged that any further reduction in fatality and injury rates will be due to modifications in driver behavior.
It is widely accepted that driver error contributes to more than 90% of all automobile crashes (Lum & Reagan, 1995). To better understand human errors made while driving, traffic safety professionals have used either epidemiological research methods or controlled experimentation. Large crash databases based on information gleaned from police accident reports have been useful for broad questions but lack sufficient detail to study driver behavior that results in a crash. Although empirical research possesses sufficient detail, these data are collected in contrived environments and are unable to adequately capture normal driving environments and/or real crash situations.
Technological improvements have enabled traffic safety researchers to better study driver behavior in situ or in real-world traffic environments. Improvements in computer processing speed and data storage coupled with the reduction in physical size of these components have not only allowed instrumented vehicle studies to gather more parametric data but also resulted in vast improvements in video data collection. These improvements have not only allowed safety professionals to retrofit vehicles with state-of-the-art eye tracking systems, physiological monitoring equipment, and collision warning systems but also allowed for the large-scale collection of driving performance data for long periods of time (e.g., 100 vehicles for 1 year).
It is these technological improvements that will help to increase the amount of data regarding driver behavior in the seconds leading up to crashes. Thus, safety professionals will be able to assess data on actual crashes (like epidemiological databases) with high-resolution, detailed driving performance data (like empirical studies). These instrumented vehicle studies are an important tool for researchers to add to their safety tool box to improve safety on our roadways.
This chapter describes the traffic conflict technique and the theory behind the power of instrumented vehicle or naturalistic driving studies, the life cycle of naturalistic driving studies, and powerful analytic techniques that can and have been used with these data. Although instrumented vehicle studies range from one vehicle for a 30-min test period with an experimenter present to large-scale deployment of instrumented vehicles with data collected over a long period of time, this chapter focuses on the larger scale deployment studies, also known as naturalistic driving studies.

2. Traffic Conflict Technique

Many industrial safety researchers face challenges similar to those faced by transportation researchers when attempting to directly measure safety or predict the probability that an accident will occur given certain circumstances. In most settings, accidents that lead to injury or death are fairly rare events, and therefore any corrective action is reactive rather than proactive. Thus, it would be very beneficial to safety engineers if they were able to be more proactive in their abilities to determine unsafe acts that may eventually lead to injury or death.
Heinrich, Petersen, and Roos (1980) developed a hazard analysis technique based on the underlying premise that for every injury accident, there are many similar accidents in which no injury occurs. For example, for a unit group of 330 accidents of similar type and involving the same person, approximately 300 of these accidents would result in no injury, 29 would result in minor injuries, and only 1 may result in a major injury. The theory suggests that the same contributing factors occur for the no injury and minor injury accidents as for the major injury accidents. Thus, if the industrial engineer can identify contributing factors and reduce the number of no injury and minor injury accidents, it is possible to prevent the major injury accidents. The relationship among major injury accidents, minor injury accidents, and no injury accidents is called Heinrich’s triangle (Figure 6.1).
Following this premise, transportation researchers from GM developed a method using cameras and observing traffic conflicts at intersections (Parker & Zegeer, 1989). Their general definition of a traffic conflict is as follows: “An event involving two or more road users, in which the action of one user causes the other user to make an evasive maneuver to avoid a collision” (p. 4).
In the field of transportation research, this hazard analysis method has become known as the traffic conflict technique. This method has been used to estimate crash risk at intersections using a count of traffic conflicts rather than crashes. Wierwille et al. (2002) employed the traffic conflict technique by unobtrusively videotaping traffic at intersections to identify causes of driver errors (critical incidents), near-crashes, and crashes. They chose rural, suburban, and urban intersections that had a high percentage of collisions.
Variations of the traffic conflict technique have been developed for use on an instrumented vehicle. This modification involves cameras being strategically placed on one vehicle to determine the number of traffic conflict involvements for a particular driver (Dingus et al., 2001; Hanowski et al., 2000; Mollenhauer, 1998; Wierwille et al., 2001). Dingus et al. (2006) used this modified version of the traffic conflict technique in their “100-Car Study” by videotaping a single driver and the environment surrounding a single vehicle to identify driver errors (critical incidents), near-crashes, and crashes that impacted the instrumented vehicle. This technique proved valuable in identifying the impacts of fatigue and distraction on light vehicle drivers and how fatigue and distraction increase crash risk among light vehicle drivers.

3. Philosophy of Large-Scale Instrumented Vehicle Studies

Large-scale instrumented vehicle studies can be conducted to assess the safety aspects of a particular in-vehicle system (these types of studies are also called field operational tests) or to better understand the driver behaviors that result in crashes. For both, it is important to note that it is the driver behavior in real-world conditions that is of the utmost importance for all resulting analyses. Instrumented vehicle studies are conducted to assess driver behavior under normal, daily pressures, on normal routes, and under normal traffic conditions. Thus, external validity will remain very high at the cost of internal validity or experimental control.
Although the experimenter cannot control the type of traffic patterns, environmental conditions, or driver state, the researcher will collect large amounts of data to assess and categorize these variables of interest. Thus, researchers recruit large numbers of participants and each participant experiences a long, continuous data collection period to ensure that adequate data are collected in all types of traffic patterns, environmental conditions, and driver states to make statistically valid assessments. The belief that large amounts of data will provide the critical amount of data is paramount in epidemiological research and is also true for large-scale instrumented vehicle studies.
One of the primary strengths of naturalistic driving studies is that they provide high-resolution driving performance data (parametric data) coupled with video data. These high-resolution data provide a rich and precise source of information about a driver’s behavior in a normal driving environment. From these data sources, very large data sets (i.e., multiple terabytes) can also be produced that yield hundreds of near-crashes and tens of crashes. For example, in Dingus et al.’s (2006) 100-Car Study, continuous data were collected on 109 vehicles for a minimum of 12 months. The resulting data set was slightly more than 42,000 hours of driving data, more than 6 terabytes (TB) of video, and it included 69 crashes, 761 near-crashes, and 8295 incidents. Depending on one’s background, this is either a large data set (6 TB) or a small data set (i.e., only 69 crashes). The important aspect is the precision of these data, whether they contribute to the large 6-TB data set or the 69 crashes. In both cases, the video of the driver’s behavior with the corresponding vehicle parametric data (e.g., vehicle speed, deceleration, Global Positioning System (GPS) location, lane position, and lane deviation) yields a very rich data set.

4. Life Cycle of Naturalistic Vehicle Studies

The life cycle of naturalistic driving studies includes the following:
1. Study design and data collection
2. Data preparation and storage
3. Data coding
4. Data analysis

4.1. Study Design and Data Collection

For the completion of a successful naturalistic driving study, well-defined research questions should guide the selection of the critical data elements collected in the naturalistic study, the appropriate data analysis plan, and the successful design of the study. Although these steps may be critical for all research studies, the versatility and flexibility of naturalistic driving studies make it tempting to collect a large volume of data that do not directly relate to the primary research objectives. This is sometimes advantageous, but it will typically result in more expense and consumption of resources. Thus, researchers must carefully weigh the advantages and disadvantages of adding additional variables. Second, given the large volume of data collected, it is even more important to understand and plan for the analyses as part of the study design. If the data analysis plan truly dictates the data needs, the analyses will be successful. If this step is not thoughtfully considered, the data collected will not appropriately answer the research questions or will require extensive and resource-intensive data processing to produce the data to answer the research questions.
The appropriate study design will lead to the selection of the appropriate participant population as well as the collection of the appropriate video views and parametric data. Although there are a wide variety of potential data elements, commonly used elements are presented in Table 6.1.
TABLE 6.1 Description of Potential Sensor Technologies Used in Naturalistic Driving Studies
Sensor Component | Description
Vehicle network box | Collection of data directly from the in-vehicle network box. Data include vehicle speed, brake application, percentage throttle, use of turn signals, and seat belt use.
Accelerometer | Collection of lateral and longitudinal acceleration and gyro data.
Forward headway detection | Collection of radar data (range, range-rate, azimuth, etc.) to indicate the presence of up to seven targets in front of the vehicle.
Rear headway detection | Collection of radar data (range, range-rate, azimuth, etc.) to indicate the presence of up to seven targets behind the vehicle.
Side vehicle detection | Collection of radar data indicating the presence of a vehicle on the sides of the vehicle.
Global Positioning System (GPS) | Collection of latitude, longitude, and horizontal velocity as well as other GPS-related variables.
Automatic collision notification system | High bandwidth collection of acceleration to detect a severe crash.
Cellular communications | Communication system designed for vehicle tracking and system diagnostics.
Driver-identified events/glare sensor | Collection of lux value (for nighttime conditions only) as well as event button.
Lane position | Collection or processing of video data to identify the vehicle position in between lane markings.
Video data | Multiple video views are typically collected. These might include forward view, rear view, driver’s face, over the driver’s shoulder, rearward from the passenger side, interior cabin view, driver’s foot/pedal view, and dashboard/instrument panel view. Infrared lighting is used in the vehicle cabin to ensure visibility of driver’s behavior at night.
Driver’s head position | Collection of driver’s head position that can serve as a gross measure of driver distraction.
Driver eye glance | Collection of driver’s eye glance location. These typically require calibration.
Drowsy driver detector | Collection of driver’s eye glance patterns that signal when a driver’s eyelids are closing or are closed.
Passenger detector | Uses sensors in the seat to detect if there is sufficient weight in the seat to suggest passenger presence.
System initialization | Overall system operation.
As part of the study design, participants must be selected and recruited who are of the appropriate age, gender, and demographic distribution. Participants must also be protected because naturalistic driving procedures collect large amounts of identifying data. Identifying data include not only the driver’s face video data but also the GPS data that could indicate the location of residence, workplace, etc. Sensitive handling and strict data collection procedures must be followed to ensure that participants’ privacy is protected.
For researchers in the United States, applying for a Certificate of Confidentiality from the National Institutes of Health is a key step in protecting participant privacy. This certificate ensures that participants’ identifying data cannot be subpoenaed for use in legal proceedings due to their participation in the research program. For example, if a participant made a serious error while on camera that resulted in an injury or fatal crash, the participant’s data should be protected from subpoena for use in court proceedings or insurance negotiations. Without this protection, safety research would prove very difficult because recruitment of driving study participants would be severely restricted.

4.2. Data Preparation and Storage

After the key data elements have been selected and incorporated into the data acquisition system (DAS), the appropriately sized data storage device must also be selected for the DAS. The collection of continuous data from vehicles, both video and parametric data, is not trivial. Video of reasonable quality requires 6–8 megabytes per minute, which results in a passenger vehicle collecting approximately 20 gigabytes of data per month. The video data typically comprise 85–95% of the total data collected, compared to 5–15% for the vehicle parametric data.
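The storage estimate above can be checked with simple arithmetic. The following is a minimal sketch, not a sizing tool: the 6–8 megabytes per minute rate comes from the text, whereas the 1.5 h of driving per day, the 30-day month, the 90% video share, and the function name are illustrative assumptions.

```python
def monthly_storage_gb(video_mb_per_min, driving_hours_per_day,
                       days_per_month=30, video_share=0.90):
    """Estimate total DAS storage per vehicle per month, in gigabytes.

    video_share is the ASSUMED fraction of all collected data that is
    video; the remainder is vehicle parametric data.
    """
    if not 0 < video_share <= 1:
        raise ValueError("video_share must be in (0, 1]")
    minutes = driving_hours_per_day * 60 * days_per_month
    video_gb = video_mb_per_min * minutes / 1000  # decimal GB (1 GB = 1000 MB)
    return video_gb / video_share


# Assumed exposure of 1.5 h of driving per day (hypothetical value).
for rate in (6, 8):
    print(f"{rate} MB/min -> {monthly_storage_gb(rate, 1.5):.1f} GB/month")
```

Under these assumptions, the two video rates bracket the figure of roughly 20 gigabytes per month; a driver with substantially more daily exposure would require a proportionally larger storage device.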

4.3. Data Coding

The research objectives and research questions will dictate the types of data triggering and coding that are required for each project. The researcher will then develop the triggers and the coding protocols for the trained data coders to follow to ensure that the coders record the appropriate information. Data coding quality and control is a critical element and is the topic of the next section.
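One simple form a trigger can take is a threshold applied to a kinematic channel, with the flagged epochs then passed to coders for video review. The sketch below illustrates this idea for longitudinal deceleration. The threshold of -0.6 g, the 10-Hz sample rate, the 2-s review window, and the function name are all illustrative assumptions and are not the values of any particular study.

```python
def find_trigger_epochs(long_accel_g, threshold_g=-0.6, sample_hz=10,
                        window_s=2.0):
    """Return (start_s, end_s) review windows around hard-braking samples.

    long_accel_g: longitudinal acceleration samples in g (negative = braking).
    Overlapping windows are merged so each event is reviewed only once.
    All default values are hypothetical.
    """
    half = window_s / 2
    epochs = []
    for i, accel in enumerate(long_accel_g):
        if accel <= threshold_g:
            t = i / sample_hz
            start, end = max(0.0, t - half), t + half
            if epochs and start <= epochs[-1][1]:
                epochs[-1] = (epochs[-1][0], end)  # merge with previous window
            else:
                epochs.append((start, end))
    return epochs


# Synthetic trace: one braking event near 0.2-0.3 s and another at 3.6 s.
trace = [0.0, -0.1, -0.65, -0.7, -0.2, 0.0] + [0.0] * 30 + [-0.8]
print(find_trigger_epochs(trace))
```

Merging overlapping windows keeps a single sustained braking maneuver from being presented to coders as several separate events.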

4.3.1. Coder Training and Quality Control Policies

The following four roles are critical to the data reduction process:
1. The researcher or research manager (either internal or external) oversees the research project from research design to data collection, coding, analysis, and reporting. At the data coding step, the researcher takes the lead in protocol and data dictionary development and provides input and feedback throughout all four phases.
2. The data coding manager serves as the direct liaison between the researcher and the data coding team and oversees all QA/QC steps. Most questions from coders can be fielded by the data coding manager; those that cannot are taken to the researcher.
3. Senior data coders (or lab proctors) are generally experienced, high-quality data coders who monitor a project’s progression through the QA/QC workflow, assist the data coding manager with coder training, test new protocols before coding work begins, create and score tests to formally measure coder reliability, and monitor the workflow to coders.
4. Data coders, of course, perform the bulk of the data coding. They also participate in the QA/QC process by completing required tests and assisting with spot checks. Data coders are ideally limited to working no more than 4 or 5 h per day.
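The reliability tests scored by senior coders can be summarized numerically. The chapter does not specify which statistic is used, so the following sketch is an assumption: it scores one coder against an answer key using simple percent agreement and Cohen's kappa, which corrects agreement for chance. The category labels are hypothetical.

```python
from collections import Counter


def percent_agreement(coder, key):
    """Fraction of events where the coder's label matches the answer key."""
    if len(coder) != len(key) or not key:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(c == k for c, k in zip(coder, key)) / len(key)


def cohens_kappa(coder, key):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(key)
    p_o = percent_agreement(coder, key)
    freq_c, freq_k = Counter(coder), Counter(key)
    # Chance agreement: sum over categories of the product of marginal rates.
    p_e = sum(freq_c[cat] * freq_k[cat] for cat in freq_k) / (n * n)
    if p_e == 1.0:
        return 1.0  # both label lists use a single identical category
    return (p_o - p_e) / (1 - p_e)


# Hypothetical test of six events coded for a Distractions-style variable.
key = ["cell phone", "none", "none", "eating", "cell phone", "none"]
coder = ["cell phone", "none", "eating", "eating", "none", "none"]
print(f"agreement = {percent_agreement(coder, key):.2f}, "
      f"kappa = {cohens_kappa(coder, key):.2f}")
```

Kappa is lower than raw agreement whenever some matches would be expected by chance alone, which is why it is often preferred when one category (such as "none") dominates the data.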

4.3.1.1. Phase 1: Protocol Development

QA/QC of manual data coding actually begins before coders ever see a protocol. After the researcher has drafted a preliminary protocol and data dictionary based on the research questions, it usually goes through several rounds of review with the data coding manager. Due to the reduction-intensive and exclusive nature of the data coding manager’s work, this person often has more experience in finding potential ambiguities, knowing when categories may be missing from certain variables, and adapting new protocols to be consistent (if possible) with previous protocols for later cross-analysis. After both the researcher and the data coding manager are satisfied with the draft protocol, it enters the first loop to be tested. The senior coder takes the lead in this testing by viewing a variety of events and completing the coding for those events based on the draft protocol. As the senior coder works, he or she takes notes about where uncertainties arose, what types of events were difficult to categorize, other variables that may seem important to the coding, and areas where the written protocol may need to be elaborated further. These comments are then reviewed by the data coding manager and the researcher.
Discussions at this point should address how well the protocol performed in answering the research question, whether the data have been coded as intended, and whether this coding provides the information required. If changes are significant, a second round of testing is recommended. Once the protocol is satisfactory, it enters the second phase, coder training. However, when the protocol leaves this stage, there may still be changes made to it later in the process. Depending on the complexity of the protocol (and the time required to reduce each event), phase 1 may require from 1 or 2 days to 1 week or more to complete.
As with any research, unexpected scenarios, driver behaviors, or environmental conditions often arise during data coding and/or protocol development, and this sometimes results in a need to edit, append, or further clarify the working data dictionary. For instance, if a road type, secondary task, or conflict type is observed in the video that does not clearly fit into the categories provided by the dictionary, then the data coding manager, in conjunction with the researchers, must decide and thoroughly document how to code that scenario for immediate and future reference. When this occurs, it is imperative that a formal process for updating the data dictionary and/or data reduction manual be followed.

4.3.1.2. Phase 2: Coder Training

4.3.1.3. Phase 3: Data Coding

4.3.1.4. Phase 4: Data Delivery

In phase 4, the data coding team works to prepare the data set for delivery back to the researcher so that statistical analysis can begin. First, any remaining spot checks are completed, and any remaining discrepancies between original coder and reviewer are resolved. Then, based on the spot check review and pragmatic review of the protocol, all known errors or potential inconsistencies are reviewed. This is the data verification step. Examples of these issues include a known confusion between the coding of a particular location as interstate or open country in the Locality variable or a potential inconsistency between two variables that should be internally consistent or coded similarly (e.g., if the subject is using a cell phone during an event, it should be marked as “cell phone” in the Distractions variable and “distracted or inattentive” in the Driver Behavior variable). With all of the QC measures taken during the first three phases, this step should be minimal, but it is still critical. The data coding manager and senior coders should identify these issues and then work with the coders to resolve them. Data verification should be completed until the data are internally consistent with the data dictionary. As a final review before data delivery, pulling all the data out into a spreadsheet and checking the relationship between events and variables is a good QA step. Any questionable events should be flagged for a final review. These events may be ones that are coded with either unexpected or rarely used categories that the data coding manager would like verified, or they may be clear random errors such as a subject ID recorded that was never in that vehicle. The amount of time required for data verification depends on the complexity of the protocol (as with phases 1 and 2), in addition to the number of events included in the coding protocol and how consistently the QA/QC workflow was followed during the first three phases. Thus, data verification can take from 1 or 2 days to several weeks.
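Cross-variable consistency checks of the kind described in the data verification step lend themselves to automation. The sketch below implements only the cell phone example given in the text. The record layout and field names (event_id, distractions, driver_behavior) are hypothetical and would need to match the actual data dictionary.

```python
def find_inconsistent_events(events):
    """Return IDs of events whose Distractions and Driver Behavior codes disagree.

    Rule (taken from the example in the text): an event coded "cell phone"
    under Distractions should also be coded "distracted or inattentive"
    under Driver Behavior. Field names are hypothetical.
    """
    flagged = []
    for ev in events:
        if (ev["distractions"] == "cell phone"
                and ev["driver_behavior"] != "distracted or inattentive"):
            flagged.append(ev["event_id"])
    return flagged


# Hypothetical coded events; event 102 violates the rule.
coded_events = [
    {"event_id": 101, "distractions": "cell phone",
     "driver_behavior": "distracted or inattentive"},
    {"event_id": 102, "distractions": "cell phone",
     "driver_behavior": "none apparent"},
    {"event_id": 103, "distractions": "none",
     "driver_behavior": "none apparent"},
]
print(find_inconsistent_events(coded_events))
```

Flagged events would still go back to a coder or the data coding manager for review; the check identifies candidates for correction rather than correcting them.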

4.4. Data Analysis

4.4.1. Assessing Crash Risk

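The odds ratio is defined as shown in Eq. (3.1):
$$\theta = \frac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)} \qquad (3.1)$$
where $\pi_i$ is the probability of success in row $i$ of a 2 × 2 contingency table. The odds ratio is thus a comparison of the odds of success in row 1 versus the odds of success in row 2 of the table.
Algebraically, this equation can be rewritten with crude odds ratios calculated as shown in Eq. (3.2):
$$\text{Odds ratio} = \frac{A \times D}{B \times C} \qquad (3.2)$$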
where A is the number of crashes/near-crashes where <inattention type> was present without any other type of inattention; B is the number of baseline epochs where <inattention type> was present without any other type of inattention; C is the number of crashes/near-crashes where <inattention type> was not present or was present but in combination with other types of inattention; and D is the number of baseline epochs where <inattention type> was not present or was present but in combination with other types of inattention.
To interpret an odds ratio, a value of 1.0 indicates no difference in risk relative to normal driving. An odds ratio less than 1.0 indicates that an activity is safer than normal driving or creates a protective effect. An odds ratio greater than 1.0 indicates that an activity increases one’s relative risk by a factor approximately equal to the odds ratio. For example, if reading while driving obtained an odds ratio of 3.0, then this indicates that a driver is approximately three times as likely (i.e., 200% more likely) to be involved in a crash or near-crash while reading and driving than if just driving the vehicle.
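A crude odds ratio can be computed directly from the four cell counts A, B, C, and D defined above, using the standard cross-product form (A × D)/(B × C). The sketch below also reports an approximate 95% confidence interval using the log (Woolf) method; the interval method is an assumption, because the chapter does not state one, and the counts are hypothetical.

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio (A*D)/(B*C) with an approximate 95% confidence interval.

    a: crashes/near-crashes with the inattention type present
    b: baseline epochs with the inattention type present
    c: crashes/near-crashes without it
    d: baseline epochs without it
    The interval uses the log (Woolf) method, which is an ASSUMPTION here.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = crude_odds_ratio(a=30, b=100, c=200, d=2000)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

With these hypothetical counts the odds ratio is 3.0, matching the reading example. An interval that excludes 1.0 suggests that the elevated risk is unlikely to be due to chance alone.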
Results from our previous naturalistic studies suggested that tasks that take the driver’s eyes off the forward roadway, such as reaching for a moving object, applying makeup, dialing a cell phone, text messaging on a cell phone, and reading, significantly increase crash risk (Hickman et al., 2010; Klauer et al., 2006). Those activities that did not take the driver’s eyes off the forward roadway for extended periods of time showed no increase in crash risk. These activities included talking to passengers (for adult drivers), adjusting radio/HVAC, talking on a CB, talking on a cell phone, and drinking.

4.4.2. Prevalence or Driving Exposure

Naturalistic driving studies can provide precise driving exposure information for the sample of drivers involved in the study. Although crash risk is an important assessment, it must be weighed in comparison to the prevalence in which drivers engage in the “risky” behavior. For example, dialing a cell phone is considered high risk; however, the time required to complete this task is relatively short compared to that for text messaging. Thus, safety professionals who observe the increased frequency of text messaging while driving combined with the observed increased crash risk find that this behavior is quite worrisome compared to eating or drinking while driving.
In the field of transportation safety, exposure measures are typically limited to drivers’ self-reported vehicle miles traveled, the number of licensed motorists, or highway vehicle counts for a specific location. Although naturalistic driving studies typically recruit fewer participants than do survey- or questionnaire-based studies, the exposure information is much more precise than self-report. This precision encompasses exposure to specific risk factors (e.g., driving at night) that can be more precisely measured using naturalistic driving data than in a self-report or highway vehicle count.
One of the key findings of the original Dingus et al. (2006) 100-Car Naturalistic Driving Study was the much higher frequency of crashes compared to police-reported crashes, from which most of the traffic safety analyses were previously conducted. Drivers were involved in crashes with their vehicles four times more frequently than they reported to police. This is a rich source of data and rich source of exposure that was completely unavailable prior to the 100-Car Study.

4.4.3. Contributing Factors for Crashes and Near-Crashes

Results from the 100-Car Study indicate that there are typically multiple causal factors for crashes and near-crashes and only one or perhaps two causal factors for most critical incidents. Comparisons have also been performed between crashes and near-crashes that indicate that the most dramatic difference between crashes and near-crashes does not reside in the type of causal factors. Rather, it is whether an evasive maneuver was initiated or not, where no evasive maneuver is initiated for crashes and an abrupt evasive maneuver is initiated for near-crashes. These results support the use of near-crashes as a safety surrogate (Guo, Klauer, Hankey, & Dingus, 2010).

4.4.4. Advanced Product Testing

Due to the nature of drivers being studied/observed in their normal environment, the naturalistic driving study presents a unique opportunity for both auto manufacturers and nomadic device manufacturers to study product use. Specifically, naturalistic driving studies provide manufacturers with an in-depth view of how drivers actually use and interact with their products under normal, daily driving scenarios. Although the recruitment for these types of studies may originate with the original equipment manufacturers providing a list of potential participants who own the desired system, the data can be very valuable for future system design and also for an understanding of the potential unintended consequences or driver misuse of specific devices.
Naturalistic driving studies can be designed to assess various products, but post hoc analyses can also be performed using data already collected. For example, McLaughlin, Hankey, Dingus, and Klauer (2009) performed a study that examined different forward collision warning device algorithms using Dingus et al.’s (2006) 100-Car data. Different algorithms were compared to actual driving data to assess the number of potential crashes and near-crashes that may have been avoided if drivers had been given an alert that they were about to collide with an obstacle.
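A post hoc evaluation of this kind replays recorded sensor data through candidate warning logic and counts the alerts that would have been issued. The sketch below uses time-to-collision (TTC), computed from the radar range and range-rate channels listed in Table 6.1, as a simple stand-in. It is not one of the algorithms examined by McLaughlin et al. (2009); the 2.0-s threshold and function names are illustrative assumptions.

```python
def time_to_collision(range_m, range_rate_mps):
    """TTC in seconds, or None when the gap to the lead vehicle is not closing.

    range_rate_mps is negative when the gap is shrinking.
    """
    if range_rate_mps >= 0:
        return None
    return range_m / -range_rate_mps


def count_alerts(samples, ttc_threshold_s=2.0):
    """Count distinct alert onsets in a sequence of (range, range_rate) samples.

    An alert begins when TTC first drops below the (hypothetical) threshold
    and is not counted again until TTC has recovered.
    """
    alerts, alerting = 0, False
    for rng, rate in samples:
        ttc = time_to_collision(rng, rate)
        now = ttc is not None and ttc < ttc_threshold_s
        if now and not alerting:
            alerts += 1
        alerting = now
    return alerts


# Synthetic samples: (range in m, range-rate in m/s).
samples = [(30, -5), (20, -8), (12, -8), (6, -8), (10, 0), (5, -5)]
print(count_alerts(samples))
```

Comparing alert counts and alert timing across thresholds or algorithms against the coded crashes and near-crashes gives a rough picture of how many events each candidate might have addressed and how many nuisance alerts it would have generated.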
Many naturalistic driving studies have been conducted to assess the feasibility of various safety warning devices. These studies are typically referred to as field operational tests (FOTs). Some of the more recent FOTs that have been performed in the United States include the Road Departure Crash Warning System FOT (LeBlanc et al., 2006), the Drowsy Driver Warning System FOT (Blanco et al., 2009), and the Integrated Vehicle-Based Safety System Heavy-Truck Field Operational Test (Sayer et al., 2010), which evaluated combined warning system alerts to drivers.

5. Conclusions

Naturalistic driving data provide powerful tools for safety researchers that combine some characteristics of epidemiological data analysis techniques with those of empirical data analysis techniques. Beyond these benefits, they also provide novel data and analytic methods with which to explore and study driver safety, specifically driver behavior.
The life cycle of naturalistic driving studies includes the following:
1. Study design and data collection
2. Data preparation and storage
3. Data coding
4. Data analysis
Each of these steps is complex primarily due to the size and extent of the data being collected. As stated previously, naturalistic driving studies typically collect 6–8 megabytes of video per minute, which can easily result in thousands of hours of video collected, and 6–10 TB of data that must be prepared, stored, coded, and analyzed.
Naturalistic driving studies are typically lengthy and resource-intensive but worth the rich, detailed data that can be collected. These types of studies are complex and require extensive planning both prior to data collection and through the entire life cycle of the study to ensure that the initial research objectives are appropriately evaluated. Detailed planning at every step in the life cycle will result in a much easier and more efficient data analysis phase of the project. Inefficient and/or minimal planning can easily result in a failed project that cannot evaluate the original research objectives.
Results from previous naturalistic driving studies have quantified the inherent dangers in driving drowsy and driving while engaging in text messaging, cell phone dialing, applying makeup, and any other task that requires more than 2 s total time of eyes off the forward roadway. Future studies may provide answers to even more complex issues regarding driver age, geographic location, and vehicle type. The Strategic Highway Research Program 2 (SHRP 2) Naturalistic Driving Study will be a national resource for traffic safety professionals, with preliminary data sets available as early as 2012.
Naturalistic driving study databases from past and future studies will be available to the safety research community. For example, Dingus et al.’s (2006) 100-Car Naturalistic Driving Study data are already available online at http://www.vtti.vt.edu. Given this accessibility, this chapter focused primarily on the data reduction and analysis of naturalistic data because these steps will be critical to researchers who want to use these data sets for safety research. Although the data reduction and analysis task is critical, researchers also need to have a clear understanding of how the DAS worked and the limitations of the data collection system. All phases of the naturalistic data study life cycle are important to understand in order to effectively and accurately analyze the data.
The SHRP 2 Naturalistic Driving Data, as well as the European and Canadian naturalistic driving studies that are being planned, will provide extensive driver behavior databases. The power of naturalistic driving studies and the more in-depth analyses of driver behavior that they allow are an important step toward achieving greater reductions in driver injuries and fatalities on our roadways. The safety research community must become adept at using naturalistic driving data and must develop improved analytic techniques for them. With this additional tool in the safety researcher tool box, there is hope of making great strides toward zero deaths on our roadways.