Employee surveys have a long history dating back to the 1920s when psychologist J. David Houser had interviewers ask employees a set of standardized questions that were then graded on a 1 to 5 scale (Jacoby, 1988). From this data Houser was able to derive a “morale” score that could be used for comparisons between departments and organizations. Houser's work spurred the activity of academic researchers who conducted employee attitude surveys in the 1920s and 1930s.
Despite this activity, the use of employee surveys by organizations was not widespread until after World War II. Organizations, consultants, and academics recognized the value of surveys, and survey work burgeoned in the 1950s. Like the early work of Houser, this resurgence focused on employee morale.
By the late 1950s researchers began to address the definition or meaning of “morale.” One article from this period noted that the literature on morale “yields definitions which are as varied as they are numerous” (Baehr and Renck, 1958). Another paper of the same period by Guba (1958) defined morale as follows:
Morale is a predisposition on the part of persons engaged in an enterprise to put forth extra effort in the achievement of group goals or objectives.
If this sounds familiar, it should. One of the recent trends in employee surveys is a focus on employee engagement, not so very different from what surveys have been trying to measure for nearly a hundred years.
Employee surveys can be designed for a variety of purposes, but most surveys have the end goal of improving organizational effectiveness. Employee surveys are often listed as a top HR practice (e.g. Huselid, 1995). Indeed, well-designed and executed surveys can tell us a lot about how employees view the organization, management, their coworkers, and themselves.
With the advent of computers and internet-based tools, survey researchers are spared much of the work and time that used to be part of the survey process. Sending and receiving paper surveys, entering data, and conducting analyses are no longer major considerations. While this facilitates the collection and analysis of survey data, it does not lessen the key requirements for rigorous survey research. Sound survey methodology can tell us a lot about a team, group, or organization. In the best case, survey information can help us understand where the key challenges are and what we need to do to meet them. In the worst case, survey results can be misleading or meaningless. Planning and executing good organizational surveys involves a number of key considerations, chiefly: defining the construct, measuring it reliably, establishing its validity, and sampling appropriately.
Having a clear definition of what we are trying to measure may seem like an obvious step but the labels we attach to our measurements can be misleading or confusing. We distinguish between a construct – the hypothetical, conceptual definition of the variable – and an operational definition – the actual procedures that we use to measure the construct. Constructs should be grounded in a theoretical or rational model that specifies how a specific construct is different from other constructs. An example would be the constructs job satisfaction and employee engagement. Although both are related, the construct definitions should be clear enough to distinguish the two.
Once we have defined our construct we have to develop a way to measure the construct – the operational definition. For organizational surveys this usually means a standardized set of questions designed to assess the construct. The reliability of our measure indicates how likely it is that we would get the same results if we were able to administer the same measure multiple times.
The most frequently used index of reliability is the alpha coefficient (Cronbach, 1951), which reflects the extent to which multiple measures (e.g. survey items) of the same construct are correlated. If we assume that each item is a separate measure of the same construct, we can plot the increase in reliability as we add more items to a scale. Figure B.1 shows the relationship between the number of items and reliability. Assuming a single-item reliability of .3, Figure B.1 shows how reliability increases as we add more items of similar quality to the scale. Reliability is important because without it we are merely measuring random variation. By conventional standards, a set of items or questions should have a Cronbach's alpha of at least .6–.7; anything less would not meet most academic thresholds to be considered reliable.
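To make this concrete, the sketch below (in Python, with made-up item responses; not taken from the book's own analyses) computes Cronbach's alpha for a small set of survey items and uses the Spearman-Brown projection to reproduce the kind of curve described for Figure B.1, starting from a single-item reliability of .3.

```python
# A minimal sketch of two standard reliability calculations:
# Cronbach's alpha for a set of survey items, and the Spearman-Brown
# projection underlying the curve described for Figure B.1.
# The item responses below are invented for illustration.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the scale total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def spearman_brown(single_item_rel: float, k: int) -> float:
    """Projected reliability of a k-item scale built from items of equal quality."""
    return k * single_item_rel / (1 + (k - 1) * single_item_rel)

# Hypothetical responses: 6 respondents answering 4 items on a 1-5 scale.
responses = np.array([
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
])
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")

# The Figure B.1 relationship: start from a single-item reliability of .3.
for k in (1, 2, 4, 8, 12):
    print(f"{k:2d} items -> projected reliability {spearman_brown(0.3, k):.2f}")
```

The projection assumes each added item is of similar quality to the existing ones; in practice, adding weaker items can lower alpha rather than raise it.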
The next consideration for survey research is validity. Validity requires evidence that we are measuring what we purport to measure.
Once we define our construct we need to show that what we are measuring (i.e. the operational definition) is a measure of that same construct and not anything else. As we just noted, our measure must be reliable; in fact, reliability places an upper limit on the validity of a measure.
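This upper limit is the standard psychometric correction-for-attenuation bound: the observed correlation between a measure and a criterion cannot exceed the square root of the product of their reliabilities. The symbols below are generic, not specific to any Virtual Distance measure.

```latex
% r_xy: observed validity coefficient; r_xx, r_yy: reliabilities of the measure and the criterion.
\[
  r_{xy} \;\le\; \sqrt{r_{xx}\, r_{yy}}
\]
```

For example, a scale with reliability .49 could never show a validity coefficient above .7, even against a perfectly measured criterion.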
Once we have determined that our measure is reliable we can gather evidence to show that our measure is valid. There are several ways that we can support the validity of a measure. First, we can examine the content of our measure (i.e. the items) to see whether it fits with the definition of the construct that we are using. Ideally, this is done by having experts in the field review the items for their relevance to the construct.
A second way to assess validity is by testing whether our measure is distinct from other measures that might be in the same general domain. In the psychometric literature this is referred to as discriminant validity. Discriminant validity requires showing that a measure of a particular construct is capturing unique variance, i.e. variance not captured by other measures. This is usually done by examining a pattern of correlations. Ideally, correlations should be low with constructs logically unrelated to our measure and higher with constructs that we expect to be related to our measure. More sophisticated procedures, such as confirmatory factor analysis (Thompson, 2004), can also be used to assess discriminant validity.
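A minimal sketch of this correlation-pattern logic, using simulated scores rather than any data from the Virtual Distance studies, might look like the following; the construct labels and effect sizes are purely illustrative.

```python
# Illustrative discriminant-validity check with simulated scale scores:
# a new scale should correlate more strongly with a conceptually related
# construct than with a logically unrelated one.
import numpy as np

rng = np.random.default_rng(0)
n = 200

related = rng.normal(size=n)                                # conceptually related construct
unrelated = rng.normal(size=n)                              # logically unrelated construct
new_scale = 0.6 * related + rng.normal(scale=0.8, size=n)   # our hypothetical new measure

r_related = np.corrcoef(new_scale, related)[0, 1]
r_unrelated = np.corrcoef(new_scale, unrelated)[0, 1]
print(f"correlation with related construct:   {r_related:.2f}")    # expected to be higher
print(f"correlation with unrelated construct: {r_unrelated:.2f}")  # expected to be near zero
```

A confirmatory factor analysis would go further, testing whether the items themselves load on distinct factors, but the underlying question is the same.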
A third way to assess validity involves collecting empirical data to show that the measure is related to a desired outcome. In the case of measures of Virtual Distance, for example, we collected data on a variety of outcome measures that should be influenced by Virtual Distance, such as trust, organizational citizenship behaviors, and job satisfaction. This third method is referred to as predictive validity, or criterion-related validity. It should be noted that when we assess the criterion-related validity of a measure we should also ensure that our measure adds something unique to the prediction of the outcome. This is particularly important for newly developed measures. We must answer the question, “Does our new measure add significantly to the prediction that we can get with other existing measures?”
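One common way to answer that question is a hierarchical (incremental) regression: predict the outcome from the existing measures, then add the new measure and see how much the explained variance improves. The sketch below uses simulated data and ordinary least squares; none of the variables or coefficients come from the book's studies.

```python
# Illustrative incremental-validity check: compare R^2 with and without the new measure.
import numpy as np

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R^2 from an ordinary least squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 300
existing = rng.normal(size=(n, 2))     # two hypothetical established predictors
new_measure = rng.normal(size=n)       # the newly developed measure
outcome = existing @ [0.4, 0.3] + 0.5 * new_measure + rng.normal(size=n)

r2_existing = r_squared(existing, outcome)
r2_full = r_squared(np.column_stack([existing, new_measure]), outcome)
print(f"R^2, existing measures only: {r2_existing:.2f}")
print(f"R^2, adding the new measure: {r2_full:.2f}")
print(f"incremental R^2:             {r2_full - r2_existing:.2f}")
```

In practice the significance of the R² change would also be tested (e.g. with an F test), which the sketch omits for brevity.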
A final consideration is sampling. If done well, our selection of survey participants can make our results meaningful; done poorly, our results can be misleading or invalid. For organizational surveys our objective is to obtain accurate information about the attitudes or behavior of a particular group.
Several factors can bias or distort our results. One question that we should ask is whether the sample is representative. In sampling terms, we strive to have a sample that is representative of the population of interest. In the case of organizational surveys, our population of interest might be the entire organization or specific groups within the organization. Most organizational surveys are voluntary, which can result in unrepresentative samples. Survey respondents might differ from non-respondents and this can give us misleading results. Though it is often difficult to know how biased our samples are, differences between respondents and nonrespondents are sometimes tracked with interesting results.
For example, nonrespondents to Facebook's annual survey were 2.6 times more likely to leave within six months than were respondents (Judd et al., 2018). A second issue has to do with respondent motivation. For example, high performing employees may be too busy to respond, with the result that surveys are completed by average or below average employees (Wilkie, 2018). Another factor affecting motivation is the perceived lack of anonymity. Respondents who are suspicious that survey results might influence their standing in the organization might alter their response so they do not reflect their true attitudes, or they may not respond at all.
Borg et al. (2008) found that employees with low commitment, low job satisfaction, and, most importantly, a negative attitude toward the company's leadership were less likely to respond to certain items. Another, subtler set of issues is referred to as demand characteristics. Demand characteristics were identified originally as artifacts that distort the results of psychological experiments (Orne, 1962): experimental subjects perceived the purpose of an experiment and behaved in a way that they thought the experimenter desired. Podsakoff et al. (2003) suggest that demand characteristics also operate at the survey level. Respondents may answer in a way that conforms to the desired responses of the organization or survey administrator.
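As a concrete illustration of the representativeness question raised above, one basic check is to compare the composition of the respondent sample with the composition of the population it is meant to represent. The sketch below uses a chi-square goodness-of-fit test on hypothetical departmental headcounts; the numbers and department names are invented.

```python
# Illustrative representativeness check: does the departmental mix of respondents
# match the departmental mix of the organization?
from scipy.stats import chisquare

# Hypothetical organization: headcount and respondent counts by department.
population = {"Engineering": 400, "Sales": 300, "Operations": 200, "HR": 100}
respondents = {"Engineering": 180, "Sales": 90, "Operations": 50, "HR": 40}

total_pop = sum(population.values())
total_resp = sum(respondents.values())

observed = [respondents[d] for d in population]
expected = [total_resp * population[d] / total_pop for d in population]

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.1f}, p = {p:.3f}")
# A small p-value suggests the respondent mix differs from the organization's mix,
# i.e. the sample may not be representative on this dimension.
```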
Employee surveys can be considered the first step in an organizational intervention aimed at effecting some positive change. But in order to make real change, more than just raw survey results are needed. As Murphy (2018) notes:
If you ask questions like “I trust my boss” and you have no idea how you would actually improve trust, you're better off not asking the question. Because if you ask a question and you don't have any way to fix it, it won't be long before you go from static scores to declining ones.
Virtual Distance offers a comprehensive framework that includes a set of validated constructs assessing the distance between individuals and teams or other pairs of groups. In this book we have established an impressive set of relationships between Virtual Distance and organizational outcomes.
The data collected in the Virtual Distance Index Survey enables a deeper dive to understand what is driving these outcomes and to what extent each outcome might be related to other outcomes (for more on the specifics of the statistics, please refer to any of our many academic papers or the original dissertation for some of the more technical methods involved).
The Virtual Distance Framework allows us to go beyond merely reporting the level of a single variable (e.g. engagement) so that we can understand both the antecedents and consequences of a particular construct. The implications for practice are more consequential when we can provide a full picture of employee attitudes and behaviors. This allows for targeted interventions that can focus on the key areas where there are deficiencies or even dysfunction. We would be naïve to think that improving organizational effectiveness can be simple. Improvement requires a deeper understanding of all of the key behaviors and attitudes that drive employee performance.