“You can rely on the data scientist for the details of the statistics. However, it’s important that everyone in workforce analytics has a core understanding, as it is easier to manage your project if you know the basics of analysis.”
—Peter Hartmann
Director, Performance, Analytics, and HRIS, Getinge Group
If you are not a data scientist or statistician yourself—and perhaps even if you are—refreshing or familiarizing yourself with the basic analytical methods used in workforce analytics is helpful. The array of statistical methods for exploring data or testing a given hypothesis is not limitless, but the number of methods available can certainly make it seem that way.
Despite the potentially dizzying complexity, many methods aim to achieve conceptually similar results. Focusing on the objectives of the methods rather than the methods themselves is a sensible starting point for becoming conversant in statistical analysis in workforce analytics.
This chapter discusses the following topics:
• The importance of strong research design
• The objectives of quantitative analysis
• The objectives of qualitative analysis
• Traditional statistics versus machine learning
• Bias and fairness in analyses
Before quantitative or qualitative analyses can occur, decisions must be made regarding what data will be collected, when it will be collected, how it will be collected, and from whom it will be collected. These questions fall under the topic of research design. The research design you apply determines how rigorously you can reach conclusions about causes and effects following your analysis. Identifying cause and effect is not the only goal in workforce analytics, but it is a common one. You are unlikely to unequivocally show causal effects from single research studies. However, designs that allow more confidence in causality increase the probability of creating successful interventions.
Research designs can be categorized by an effectiveness hierarchy, summarized in Table 5.1. The strongest research designs are randomized experiments, followed by quasi-experimental designs, observational or correlational designs, and, finally, qualitative designs.
Without a thoughtful research design, even the most sophisticated analyses will likely prove fruitless. Work with your data scientists to select a strong research design before you begin your analyses.
In a randomized experiment, you generally have two groups: one that receives an intervention (the experimental group) and one that does not (the control group). Employees are randomly allocated to either group. The randomization means that the groups are, on average, equivalent in every way at the start of the experiment.
After an intervention with the experimental group, the groups are measured on the outcome under study. Consider, for example, a project that examines the effect on engagement after giving employees more autonomy. In an experimental approach, any difference in engagement between the groups must have been caused by the intervention, assuming there are no confounding factors (such as groups becoming aware of the goals of the experiment). This is because before the experiment, the groups had equivalent engagement due to randomization. In addition, the control group did not receive the intervention that increased autonomy. A key point to note is that variables under study are manipulated.
Randomized experiments, therefore, meet the three criteria for showing a causal effect:
• The cause happens before the effect.
• A relationship is shown between the cause and the effect (a change in the cause is accompanied by a change in the effect).
• Other possible causes are ruled out due to randomization.
The design can be further strengthened by statistically controlling for a range of other plausible causal variables. This can help rule out alternative explanations of findings, in case the randomization was not perfect. For example, continuing with the autonomy and engagement example, if the randomization did not lead to equivalent personality types across groups, the results could be jeopardized because different personality profiles might cause different preferences for autonomy across groups. In these cases, worker personality could be measured and the results of the analyses adjusted for any differences between the groups.
Conducting randomized experiments in everyday working life is often not possible. We cannot always isolate the variables we want to study, and randomly allocating workers to different conditions is often impossible. As a result, other designs are chosen that still allow some confidence in identifying a causal relationship, even though they are not as strong as in a randomized experiment.
The quasi-experimental design is similar to the randomized experiment, aside from one key feature: The employees are not randomly allocated to the experimental and control conditions. This limits the strength of inferences that can be drawn from analyses when compared to randomized experimental designs because the lack of randomization makes it impossible to confidently declare that the two groups were equal, on average, before the experiment.
Despite these limitations, robust conclusions about effective interventions can still be drawn from quasi-experimental designs in workforce analytics, particularly if results replicate across multiple studies. A quasi-experimental approach to the autonomy and engagement topic might involve delivering the intervention to one business unit and using a similar unit as a comparison group, because randomization is often unfeasible; treating two people differently is difficult if they are working in the same team or unit. With these designs, it is not possible to account for other factors, such as different managers giving different levels of recognition; those factors might indeed have caused differences in engagement levels. Nevertheless, this design often provides enough confidence to decide whether to continue the intervention.
Designs that do not involve randomization and manipulation, or a control group, are referred to as correlational or observational. In these studies, analytics professionals observe the way variables relate to one another without inferring a causal connection between variables. Strong dependable correlations (for example, sizable correlations based on large random samples) can still be useful in workforce analytics, but it is important to understand when a causal association cannot be confirmed.
Whereas the experimental and quasi-experimental approaches to the autonomy and engagement example included a control group, correlational designs typically have no control group. Instead, the existing levels of autonomy and engagement are assessed for all individuals, and the strength of the association is studied. Inferring causal associations from these designs is difficult, but they may be useful in identifying variables for further study using experimental or quasi-experimental designs. These designs lead to stronger conclusions than when relying on intuition. Advanced techniques can help identify causal relationships by analyzing data from correlational designs (for example, propensity scores and instrumental variable techniques), although their effectiveness varies depending on the situation.
In contrast to the designs discussed so far, which all involve numerical assessments of the relationships between variables, the hallmark of qualitative studies is that they try to understand organizational phenomena from the perspective of workers without using quantitative methods. Examples include ethnographic studies and focus groups. A qualitative study that examines the association between autonomy and engagement might involve interviews or focus groups with workers to discuss their experiences of autonomy and engagement. An ethnographic approach might have the researcher work for a period of time in the job, side by side with actual workers, to understand how autonomy relates to engagement through the eyes of the workers. It is possible to use qualitative and quantitative methods to complement one another, a topic we discuss in the section, “Qualitative Analysis,” later in this chapter.
On their own, qualitative methodologies do not provide the level of confidence regarding correlational or causal effects that many business leaders and analysts want to see before they act on recommendations from workforce analytics. For this reason, we place qualitative studies at the lowest level of the hierarchy of research designs for causal inference (see Table 5.1).
All of the designs discussed in the research hierarchy can be strengthened when the variables being studied are measured repeatedly at multiple points in time. This is referred to as a longitudinal study. To understand how repeated measurement of variables such as autonomy and engagement leads to a clearer understanding of the effects of interventions, consider the following: An experimental intervention might show an effect on an outcome variable, but because the outcome of interest has been measured only once, we cannot determine how long lasting the effect is. Furthermore, the study does not show whether the effect becomes weaker or stronger over time, or whether the effect is reversible. To understand these issues, it is important to measure repeatedly over a period of time; that is, to undertake a longitudinal study.
The overview of analysis objectives (quantitative and qualitative) presented in the following sections and in Figure 5.1 represents the level of detail that the workforce analytics team should be comfortable discussing when it comes to types of analyses.
In workforce analytics, most quantitative analysis (that is, statistical analysis of numerical data) aims to do one of the following: explore, associate, predict, classify, reduce, or segment information about employees and organizations.
• Explore. Exploratory analysis helps understand your variables. Analysis might involve summarizing the data with a statistic such as the average, or looking at how spread out the values of the variable are, using a statistic known as the variance (that is, the variability) of the variable. Exploring might involve identifying extreme cases or determining the extent of missing data. Analysis focused on exploration often uses simple graphing techniques to reveal the distribution of the data (for more on these topics, see Chapter 10, “Know Your Data”). More advanced exploratory analysis might check differences in scores on variables for known groups, such as men versus women, or ethnic minority versus majority groups. Exploratory analysis is particularly useful for monitoring and reporting demographic trends, as well as for preparing data for more advanced analyses. Approaches include techniques such as t-tests or analysis of variance, referred to as ANOVA. Exploring data involves understanding your data, preparing your data for more advanced analysis, and testing simple hypotheses.
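For readers who like to see the mechanics, the basic exploratory steps can be sketched in a few lines of Python. The engagement scores and group data here are invented purely for illustration:

```python
from statistics import mean, pvariance

# Hypothetical engagement scores (1-5 scale); None marks missing data.
scores = [4.1, 3.8, None, 4.5, 2.9, 3.7, None, 4.0]
women = [4.2, 3.9, 4.4]
men = [3.6, 3.8, 3.5]

observed = [s for s in scores if s is not None]

summary = {
    "n_missing": scores.count(None),           # extent of missing data
    "mean": round(mean(observed), 2),          # central tendency
    "variance": round(pvariance(observed), 2)  # spread of the values
}

# Simple known-groups comparison: difference in mean engagement.
group_gap = round(mean(women) - mean(men), 2)
print(summary, group_gap)
```

In practice, a t-test or ANOVA would be used to judge whether a group difference like this is larger than chance would produce.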
• Associate. One goal of analytics is to look at the relationship, or association, between variables that can take on any value between their minimum and maximum possible values (for example, age or tenure), with a higher score indicating more of the variable. An example might be the relationship between extroversion and sales performance. Sales performance can likely take on any value within a plausible minimum and maximum, so it is considered a continuous variable. Rating scales common in surveys (for example, strongly disagree to strongly agree) are often analyzed as continuous.
Relationships between these types of variables are usually studied using methods that estimate correlations. The most common correlation, Pearson’s product-moment correlation, has a possible range from –1 to +1. A value of –1 means that two variables are perfectly negatively related (that is, as one increases, the other decreases by an equivalent amount). A value of 0 indicates no relationship, and a value of +1 means the variables are perfectly positively related (that is, as one increases, the other increases by an equivalent amount).
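Pearson’s correlation can be computed directly from its definition, as the sketch below shows. The extroversion and sales figures are hypothetical:

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical extroversion scores and sales figures for six employees.
extroversion = [2.0, 3.1, 3.5, 4.0, 4.4, 4.9]
sales        = [40, 52, 49, 61, 64, 72]
r = pearson(extroversion, sales)
print(round(r, 2))  # a strong positive association, close to +1
```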
• Predict. When analyses are used to make forecasts about the future, such as estimating the value of a continuous variable of interest (referred to as an outcome variable or dependent variable), the analyses are referred to as predictions.
Given what we currently know about employees, the general pattern in predictive analysis is to estimate their future behavior at work. For example, if we know that the relationship between levels of extroversion and sales performance is strong, sales performance should improve if we hire people who score high on an extroversion assessment. Methods used for making predictions include linear and multiple regression, regression trees, neural nets, support vector machines, and time series analysis.
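As a minimal illustration of prediction, a one-predictor linear regression can be fit by ordinary least squares. The assessment scores and sales figures below are invented:

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

# Hypothetical data: extroversion assessment score vs. quarterly sales ($k).
extroversion = [2.0, 3.0, 3.5, 4.0, 4.5]
sales        = [41, 50, 56, 59, 66]

intercept, slope = fit_line(extroversion, sales)

def predict(score):
    """Forecast sales for a new candidate from their assessment score."""
    return intercept + slope * score

print(round(predict(5.0), 1))  # forecast for a candidate scoring 5.0
```

Multiple regression, regression trees, and the other methods listed above follow the same pattern with more predictors and more flexible functional forms.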
• Classify. Classification analyses can be thought of as analogous to association, but this analysis focuses on associations when the outcome variable is discrete. A discrete variable has a limited number of possible categories and does not have an intrinsic ordering. It is sometimes called a nominal variable. An example of a discrete variable might be whether a worker receives a promotion or whether an employee leaves the business. In such cases, the outcome is referred to as a discrete or categorical outcome (not a continuous outcome).
Methods such as correlation and regression have been adapted to deal with these types of variables. Instead of finding correlations, chi-squared tests examine associations for categorical variables; instead of using regressions to make predictions, techniques such as logistic regression are used to make classifications. For example, the risk of an employee leaving (which has only two possible values, stay or leave) can be expressed as a probability; employees with a risk of leaving can be looked at more closely for some sort of intervention.
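The logic of a logistic-regression classification can be sketched as follows. The coefficients here are hypothetical placeholders, not estimates from real data; in practice they would be fit from historical stay/leave records:

```python
from math import exp

# Hypothetical coefficients for attrition risk (for illustration only).
INTERCEPT = -2.0
COEF_OVERTIME_HRS = 0.15   # weekly overtime hours
COEF_TENURE_YRS = -0.30    # years with the company

def leave_probability(overtime_hrs, tenure_yrs):
    """Convert a linear score into a probability via the logistic function."""
    score = (INTERCEPT + COEF_OVERTIME_HRS * overtime_hrs
             + COEF_TENURE_YRS * tenure_yrs)
    return 1 / (1 + exp(-score))

def classify(overtime_hrs, tenure_yrs, threshold=0.5):
    """Flag an employee as at risk if the probability exceeds the threshold."""
    p = leave_probability(overtime_hrs, tenure_yrs)
    return "at risk" if p > threshold else "likely to stay"

p = leave_probability(overtime_hrs=20, tenure_yrs=1)
print(round(p, 2), classify(20, 1))
```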
• Reduce. Workforce analytics teams analyze data sets that often contain hundreds or thousands of variables. This is too much information to make sense of without combining some variables. The primary goal of some statistical techniques, such as principal components analysis (PCA) and factor analysis (FA), is to summarize (reduce) the information in many different variables to create a smaller number of variables. Having fewer variables makes analysis and interpretation more manageable.
The basic principle in reduction analyses is to aggregate the similar variables into fewer variables. Say, for instance, that you have three variables that measure employee performance: a manager performance rating, data on whether the employee met critical milestones, and the success that the employee demonstrated in training. Which of these variables should you focus on predicting? Statistical analyses such as PCA and FA can create, if appropriate, a single performance variable out of other similar performance variables for prediction. This is usually appropriate only if you can show that all three together are strong measures of performance, which PCA and FA check. The new variable(s) can then be explored or predicted without losing too much information.
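A simplified sketch of the reduction idea appears below. True PCA or FA would weight each measure by how much shared variance it carries; standardizing the three performance measures and averaging them, as here, is a deliberately simplified stand-in using invented data:

```python
from statistics import mean, pstdev

# Three hypothetical performance measures for five employees.
manager_rating = [3.2, 4.1, 2.8, 4.6, 3.9]
milestones_met = [0.60, 0.85, 0.50, 0.95, 0.80]  # share of milestones hit
training_score = [71, 88, 65, 93, 84]

def zscores(xs):
    """Standardize a variable to mean 0, standard deviation 1."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

# Average the standardized measures into one composite performance variable.
composite = [mean(triple) for triple in zip(zscores(manager_rating),
                                            zscores(milestones_met),
                                            zscores(training_score))]
print([round(c, 2) for c in composite])
```

The single composite can then be explored or predicted in place of the three original variables.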
• Segment. Although the focus of reduction analyses is to group a large number of variables into a smaller number for exploration or further analysis, the goal with segmentation is to group the number of cases (for example, workers) in your data set into a smaller, more manageable number that provides a meaningful representation of groups in your data.
Techniques in the segment category fall under the general label of clustering. Cluster analysis puts similar cases into groups, for either exploration or further statistical analysis. An example application might involve using cluster analysis to identify subgroups of employees who have similar responses to a variety of survey questions indicating high stress so that some form of intervention can be applied to improve their ability to manage stress.
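The clustering idea can be illustrated with a tiny one-dimensional k-means on invented stress survey scores:

```python
from statistics import mean

def kmeans_1d(values, centers, iterations=20):
    """Tiny one-dimensional k-means: assign each value to its nearest
    center, then move each center to the mean of its assigned values."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical stress scores (1-10 scale): two apparent subgroups.
stress = [2.1, 2.8, 3.0, 2.5, 7.9, 8.4, 8.8, 7.5]
centers, clusters = kmeans_1d(stress, centers=[1.0, 10.0])
print(centers)  # one low-stress and one high-stress group center
```

Real applications cluster on many survey questions at once, but the mechanics of assigning cases to the nearest group and updating the groups are the same.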
When you understand that any statistical analysis has only a few basic goals, you can see possibilities for combining objectives in a single analysis. For instance, certain techniques enable you to perform analyses that reduce and segment data sets in a single analysis, segment and associate in a single analysis, and so on. Some variations examine relationships between many different variables simultaneously, and even on many levels (for example, the individual worker level and the team level). These analyses have names such as mixture modeling, structural equation modeling, and multilevel modeling.
Consider decisions about how to combine the different analytic methods to achieve analysis objectives in relation to specific analysis problems. This is best carried out with input from your team’s data scientists. Still, the six analytics objectives provide you with the basic building blocks to understand conceptually what the analyses are doing. This makes you more knowledgeable and conversant on the topic of workforce analytics and enables you to both appropriately challenge and accurately represent the workforce analytics team’s work.
The goals of qualitative research are to understand organizational phenomena without using quantitative data. Techniques include ethnography, focus groups, and detailed case studies. Objectives of qualitative research in workforce analytics include hypothesis formulation, interpretation, and contextualization.
• Hypothesize. Qualitative analyses can be particularly valuable in generating hypotheses when no theories or quantitative data exist. The resulting hypotheses can then be tested using quantitative methods. For example, interpretation and discussion about the reasons people leave a firm can help in creating a hypothesis that can be tested with quantitative data. In this example, a qualitative analysis of exit interview data might suggest that the physical office environment was a factor in decisions of employees who quit. A quantitative study might test the hypothesis that employees are more likely to stay if they work in buildings that have been recently refurbished.
• Interpret. Another objective of qualitative research is to help interpret quantitative results. Qualitative studies explain why events occur instead of simply describing the strength of relationships in quantitative terms. For example, if quantitative data reveal that new hires are taking too long to get up to speed, qualitative research might explain why that is the case. In this example, focus groups might reveal that managers are not meeting with new hires in the first week.
• Contextualize. Michael Pratt, a professor at Boston College, notes that qualitative approaches are good for contextualizing quantitative results. Contextualizing aims to explain technical quantitative findings to HR and business leaders by adding color and depth to illustrate a point. For example, including verbatim quotes from interviewees can bring to life the results of quantitative analyses. The explanations can then help turn the analytics project into a story, emphasizing the perspective of those impacted by the recommended actions. For an excellent introduction to contextualization in qualitative research, see Chapter 3, “Qualitative Research Strategies in Industrial and Organizational Psychology,” by Thomas Lee and colleagues in the Handbook of Industrial and Organizational Psychology (American Psychological Association, 2010).
Increasingly, data that workforce analytics professionals encounter will be unstructured, or difficult to store in rows and columns of a “flat” data file. Examples include video, audio, and the vast amount of text and language that is available via the Internet.
In the past, much of this information was used qualitatively, to help form hypotheses, interpret findings, and contextualize results for audiences. Today these data sets are commonly analyzed quantitatively. Perhaps most advanced is text-mining of data sets to assess the sentiment of text passages, such as comments in open-ended surveys. Sadat Shami, Director of the Center for Engagement & Social Analytics at IBM, describes the potential of text analytics. “Developments in managing and analyzing unstructured data, such as text from social media, are really showing the value of social media as a data source for signals that we can use to infer, for example, the level of an employee’s engagement.”
Methods for quantitatively analyzing qualitative and unstructured data generally focus on converting data that were captured as language or images into a numerical representation. After that point, the objectives of analytical techniques are the same as those already mentioned for quantitative analysis. Examples of these techniques are rare event detection, sentiment analysis, and trending topics.
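As a minimal sketch of converting language into numbers, a lexicon-based sentiment score might look like the following. The word list is invented for illustration; production sentiment tools use far richer lexicons or trained models:

```python
# Map words to sentiment scores, then summarize each comment numerically.
LEXICON = {"great": 1, "love": 1, "helpful": 1,
           "slow": -1, "frustrating": -1, "overworked": -1}

def sentiment(comment):
    """Sum the lexicon scores of the words in a comment."""
    words = comment.lower().replace(".", "").split()
    return sum(LEXICON.get(w, 0) for w in words)

comments = [
    "I love the team and my manager is helpful.",
    "The tools are slow and frustrating.",
]
print([sentiment(c) for c in comments])
```

Once each comment carries a number, the quantitative objectives described earlier (explore, associate, predict, and so on) apply unchanged.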
A noteworthy area of innovation in quantitative analyses is machine learning methods, which are increasingly common in organizations today. Workforce analytics techniques can often be distinguished by whether the methods emerged from a tradition of statistical modeling or computer science (from which machine learning emerged).
In statistical modeling, the focus is often on accurately describing associations between variables in order to describe the world as it is and to identify causal relationships. In machine learning, the focus is often on making the most accurate possible classifications or predictions, regardless of whether causal mechanisms are identified. For example, a traditional statistician might use logistic regression for classification.
Machine learning professionals (that is, highly trained data scientists from a computer science background) might also try logistic regression, but they will likely adopt a wide array of other techniques as well, to try to improve the accuracy of the model. For example, their approach might involve comparing the results of different techniques, such as a logistic regression, classification trees, and support vector machines. Despite the different origins of the traditions, the objectives of analysis in machine learning and statistical modeling are similar and fit into the six objectives presented earlier.
Machine learning techniques tend to perform better when building models to predict outcomes from many variables and when examining complex relationships. Machine learning experts also tend to apply more rigorous approaches to evaluating whether models will be valid when applied to new samples. These models are often deployed in an iterative manner, with the models updated in real time and with results deployed into operational business intelligence systems.
Statistical methods, on the other hand, tend to perform better when data meet the assumptions required for the statistical method. In many cases, however, the methods can produce very similar results. For this reason, and also because your team should ideally have access to both types of skills, this is a distinction with more relevance for the data scientists in a workforce analytics team. Today’s data scientists are usually skilled in both approaches.
The decision of whether to use machine learning or traditional statistical approaches for any given situation is best left for data scientists. They should have a clear understanding of what the analysis must achieve.
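The out-of-sample validation discipline associated with machine learning can be sketched with a simple holdout split. The data and the one-parameter “model” below are synthetic and deliberately trivial:

```python
import random

# Synthetic labeled data: (weekly overtime hours, left_company flag).
data = [(h, 1 if h > 12 else 0) for h in range(0, 40, 2)]
random.seed(7)
random.shuffle(data)

# Hold out 30% of cases: fit on the training set, evaluate on unseen data.
split = int(len(data) * 0.7)
train, test = data[:split], data[split:]

def accuracy(rows, threshold):
    """Share of cases correctly classified by the rule 'hours > threshold'."""
    return sum((h > threshold) == bool(left) for h, left in rows) / len(rows)

# "Fit" the model: pick the overtime threshold that best separates
# stayers from leavers in the training set only.
best = max(range(0, 40), key=lambda t: accuracy(train, t))
print("holdout accuracy:", round(accuracy(test, best), 2))
```

Only the holdout accuracy, never the training accuracy, indicates how the model will behave on new samples.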
Data-based decision making in workforce analytics comes with a responsibility to ensure that decisions made on the basis of algorithms are statistically, legally, and socially appropriate. This is clearly illustrated in Cathy O’Neil’s book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Crown, 2016). It is a particularly important consideration in workforce analytics because work is increasingly becoming more automated and the nature of decision making is evolving. In high-volume recruitment, for example, algorithms are often used to filter candidates; in the past, humans made those decisions. Our role in evaluating the consequences of the decision-making algorithms in workforce analytics is critical to ensure that the outcomes they produce are consistent with the organization’s values and expectations.
Another reason to give attention to the appropriateness of outcomes is that workforce analytics is becoming populated with professionals who have strong mathematical, statistical, and computer science skills, but less knowledge and experience in managing the legal and social consequences of analytics in employment contexts. The computer science field refers to this area of study as fairness-aware data mining. The topic of fairness is still garnering attention in computer science, but it is important to note that the realm of industrial psychology already has directly portable concepts in place that can be implemented to ensure equitable outcomes for individuals from different groups. Here we introduce three of these concepts: impact, bias, and fairness.
• Impact. In the context of workforce analytics, impact refers to differences in the rates of job selection, promotion, or other employment decisions that disadvantage members of a particular group, such as women or ethnic minorities. For example, if selection decisions are made on the basis of a pre-hire employment test of integrity, impact against women can result if males score higher than females on average. Impact can occur due to bias, discussed in the next point, but it can also result from genuine differences between groups.
The issue might not necessarily result from measurement accuracy; it might stem from the choice of the variable itself. To consider another example, if a candidate’s postcode were adopted as a selection measure, the measurement of postcode itself would be equally accurate for every demographic group (that is, postcode is a highly accurate indicator of where a person lives, regardless of group). However, postcode can be a proxy for socioeconomic status, which could result in adverse impact.
The situation is more complex when a variable is measured imprecisely, which is often the case for psychological concepts. In this case, impact can be the result of genuine differences between groups or bias on the selection test. Simply observing impact does not allow someone to differentiate between true differences and bias on the test.
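One widely used statistic for detecting impact is the adverse impact ratio, which the EEOC’s “four-fifths rule” flags when it falls below 0.8. The applicant and hiring counts below are hypothetical:

```python
# Selection outcomes by group (hypothetical counts).
applicants = {"men": 200, "women": 180}
hired      = {"men": 40,  "women": 18}

rates = {g: hired[g] / applicants[g] for g in applicants}

# Adverse impact ratio: lowest selection rate divided by the highest.
# A ratio below 0.8 is treated as evidence of adverse impact that
# warrants closer scrutiny of the selection procedure.
impact_ratio = min(rates.values()) / max(rates.values())
print(rates, round(impact_ratio, 2))
```

A low ratio signals impact; as the text notes, it does not by itself reveal whether the cause is a genuine group difference or bias in the measure.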
• Bias. The notion of bias can be explained in the context of the previous integrity example: the effect of a pre-hire personality test to measure integrity on male and female job selection rates. The test is free of measurement bias if a randomly chosen man and woman of equal integrity would receive the same integrity score. If they would not, the test exhibits measurement bias and should not be used.
Analyses can be undertaken to check whether selection scores relate to performance scores in the same way for all groups. In the integrity example, a test could examine whether integrity scores are equally predictive of performance for men and women. If they are not equally predictive, the test produces predictive bias, so alternative selection variables might be preferred.
• Fairness. Bias and impact are statistical phenomena. Whether the process and outcomes are “fair” is a social judgment. One interpretation of fairness relevant to workforce analytics is that all workers should receive consistent treatment. In the context of employment testing, for example, this means candidates should experience the same testing conditions, have access to the same practice materials, get the same chances to retake questionnaires, and, where appropriate, experience accommodation for disabilities.
Although much of this discussion has focused on detecting instances in which algorithmic decision making can create unfairness, it is also possible for algorithms to create outcomes that are considered “more fair.” For example, algorithms can eliminate factors such as favoritism and other biases that often enter into decisions about people, especially when individual managers are making decisions. It is important to recognize that perspectives on fairness vary among organizations and people, as well as across time and cultures.
Impact, bias, and fairness are closely intertwined concepts that are nevertheless important to differentiate. Doing so helps clarify the appropriate course of action when bias, impact, or unfairness might exist.
The glossary in this book provides definitions of technical terms used in this chapter and is a useful resource for readers who are not familiar with statistical terms. For those interested in more depth on statistical analysis, Predictive HR Analytics, by Martin Edwards of King’s College London and Kirsten Edwards of Pearn Kandola (Kogan Page, 2016), is an excellent entry point. For advanced discussion, The Elements of Statistical Learning, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (Springer, 2011), is a key reference.
Analytically curious HR professionals, and certainly all members of the workforce analytics team, should understand the basics of analysis so they can have sensible and informed conversations.
• Select research designs that are appropriate for the business questions you are trying to answer and know how to interpret the results.
• Learn the objectives that quantitative analyses aim to achieve: explore, associate, predict, classify, reduce, and segment.
• Use qualitative analysis to generate a hypothesis, interpret results, and contextualize findings by adding color to quantitative analysis.
• Consider the use of unstructured data for both qualitative and quantitative analysis.
• Familiarize yourself with conditions under which statistical modeling or machine learning is more appropriate.
• Be clear about the tools and skills you have at your disposal for advanced analyses such as analyzing unstructured data.
• Scrutinize your analyses for their potential to produce adverse impact, bias, or unfairness, and take corrective action as needed.