A/B testing. A randomized experiment with two groups: a control group that experiences the current or standard treatment, and an experimental group that receives some variation of the standard treatment. Because assignment is random, observed differences between the groups can, within the limits of chance variation, be attributed to the treatment.
Algorithm. A step-by-step set of rules to follow in calculations to meet analytical objectives such as prediction or classification.
Analysis of variance (ANOVA). A statistical method for testing whether the means of a quantitative variable differ between two or more groups.
Analytics. The discovery, interpretation, and communication of meaningful patterns in data to inform decision making and improve performance.
Application programming interface (API). A set of definitions, protocols, and tools for building software applications, and for allowing software components from different sources to communicate with each other.
Bias. Statistical differences in scores for majority and minority groups that are unrelated to the underlying concept you are trying to measure. This occurs either in measuring a variable (measurement bias) or applying the measure to predict outcomes (predictive bias).
Big Data. Datasets of structured and unstructured information that are so large and complex that they cannot be adequately processed and analyzed with traditional data tools and applications.
Business acumen. A keenness and agility in understanding, interpreting, and dealing with business situations.
Business glossary. See data dictionary.
Business intelligence tools. Software for generating insights and reports from data stored in a data warehouse.
Causality. The effect of one variable on another (cause and effect). Two variables have a causal relationship if changes in one variable produce changes in the other.
Center of Excellence (CoE) or Center of Competence (CoC). A team or entity that provides leadership, best practices, research, support, and training for a focus area.
Change data capture (CDC). An automated approach for ensuring that data changes are synchronized across an enterprise by replicating data changes from a source system to other systems.
Change management. The process, tools, and techniques to manage the people side of change to achieve a required business outcome.
Chartered Institute of Personnel and Development (CIPD). A professional body for human resources and people development that has a worldwide community of members committed to championing better work and working lives. It is headquartered in London, United Kingdom.
Chief human resources officer (CHRO). The most senior person in an organization responsible for overseeing all aspects of the strategies, policies, practices, and operations of human resource management.
Classification tree. A machine learning approach that uses training data to create a model that can then be used for assigning cases (for example, workers) in a dataset to different possible groupings (for example, leavers or stayers).
Click-path data. The tracking of web page viewing behavior, such as where and when people click on a web page, the time they spend on a web page, and their viewing patterns. Also known as click-stream data.
Cloud computing. A type of Internet-based technology in which different services (such as servers, storage, and applications) are delivered to an organization’s or an individual’s computers and devices through the Internet.
Cluster analysis. A statistical technique for finding natural groupings in data; it can also be used to assign new cases to groupings or categories.
Cognitive assistant. A technology application that interacts with users through natural language, providing levels of confidence in its answers. A cognitive assistant continuously learns and improves.
Cognitive computing. Systems that understand, learn, and reason as they interact with humans using natural language to mimic the way the human brain works and enhance human performance.
Confounding factor. A variable other than the one you’re interested in that might affect the outcome variable and lead to incorrect conclusions.
Consumerization of HR. A term referring to employees’ expectations that technology experiences at work will be similar to technology experiences as consumers.
Continuous variable. A variable that can take on any value between a minimum and maximum (for example, age or tenure), where a higher score indicates more of the variable.
Control group. A group in a research study that does not experience treatment, but instead acts as a baseline against which change in the experimental group is compared.
Correlation (Pearson product–moment correlation). A statistical measure, ranging from −1 to +1, that indicates the extent to which two variables are linearly related. A positive correlation indicates that, as one variable increases, the other increases as well. For a negative correlation, as one variable increases, the other decreases.
Correlational design. A research design that does not involve randomization or manipulation of who receives treatments.
Dashboard. A data display tool that provides at-a-glance views of key performance indicators relevant to a particular objective or business process.
Data (plural); datum (singular). Facts, information, and statistics collected together for reference or analysis.
Data analysis. A process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful insights, suggesting conclusions, and supporting decision making.
Data analyst. A person whose job is to collect and study data to reveal meaningful patterns and insights.
Data architecture. Models, policies, and guidelines that structure how data are collected, stored, used, managed, and integrated within an organization.
Data dictionary. A comprehensive record of business and technical definitions of the elements within a dataset. Also referred to as a business glossary.
Data ethics. The fundamental legal and moral principles of right and wrong that govern the collection, storage, use, and dissemination of data in analytics.
Data governance. The overall management of the availability, usability, integrity, and security of the data employed in an organization.
Data mart. A subset of a data warehouse that allows data to be accessed and customized by specific business functions.
Data mining. The process of collecting, searching through, and analyzing a large amount of data in a database to discover patterns or relationships.
Data privacy. The legal, political, and ethical issues surrounding the collection and dissemination of data, the technology used, and the expectations of what information is shared with whom.
Data profiling. Checking datasets for allowable values, logic, and consistency.
Data scientist. A person whose job is to perform statistical analysis, data mining, and retrieval processes on a large amount of data to identify trends and other relevant information.
Data steward. A person responsible for managing data content, quality, standards, and controls within an organization or function.
Data visualization. The representation of quantitative information in a pictorial or graphic format so that an audience can easily grasp difficult concepts or patterns.
Data warehouse. A repository for storing business-relevant data.
Database. A collection of information that is organized so that it can be easily accessed, managed, and updated.
Dataset. A collection of variables or information that is composed of separate elements but can be managed as a single entity for analysis.
Democratization of HR. A term referring to making the information HR holds about employees, policies, and programs more readily available; for example, information about a manager’s team is provided by HR applications without the need to request it.
Digital footprint. The personal electronic trace or trail left from the use of Internet-connected devices.
Disappearables. Wearable devices that will become so small due to technological advances that they will almost disappear from view.
Discrete variable. A variable with a limited number of possible categories, often with no intrinsic ordering.
Dynamic visualization. A display of an analytics message that is animated, interactive, and contains live data so that the image changes as the information refreshes.
Ethnographic study. A qualitative, small-group research design that attempts to understand organizational events from the perspective of those experiencing the events.
Experiment. A scientific procedure undertaken to make a discovery, test a hypothesis, or demonstrate a known fact, with participants randomly assigned to groups so that every participant has an equal chance of being in the experimental group or the control group.
Experimental group. A group in a research design that experiences a new treatment or intervention intended to improve an individual or organizational outcome. The effects on this group are often contrasted with a control group that does not receive treatment.
Factor analysis. A statistical technique for summarizing many variables with fewer variables, with particular applications to measuring psychological attributes.
Fairness. The social evaluation of whether decisions are free from discrimination.
Fairness-aware data mining. The science of applying statistical techniques while managing the social consequences of analytics.
Future of work. A term referring to how work will develop and be delivered in a globally interconnected world in which almost all work can be done anywhere and the processing power of computers will enable machines to outperform humans for an ever-increasing scope of work.
Gig economy. The freelance economy, in which workers support themselves with a variety of part-time jobs, or gigs, that do not provide traditional employment-style benefits such as healthcare.
Governance. A broad term referring to the establishment of policies and guidelines, along with continuous monitoring of their proper implementation, by the members of the governing body of an organization.
Human resources business partner (HRBP). An HR professional who works closely with an organization’s senior leaders to develop an agenda for managing people that supports the overall aims of the organization. HRBPs are generalists and do not normally specialize in any subfunction of HR.
Human resources information system (HRIS). Software that provides a single, centralized view of data that a human resources management group requires for executing HR processes.
Human resources information technology (HRIT). A subfunction within the human resources function that is responsible for selecting, implementing, and maintaining the HR technology systems for an organization.
Hypothesis. A proposed explanation, in the form of a testable and falsifiable statement, often informed by observations and previous research.
Impact. Differences in the rates of job selection, promotion, or other employment decisions that disadvantage members of a particular group, such as women or ethnic minorities.
Infographic. A short-form, visual representation of information, data, or knowledge presented through simple images that highlight patterns, trends, or insights. Simplified from the term information graphic.
Insight. A deep and clear understanding derived from analysis.
Instrumental variable approach. A statistical technique common in economics that attempts to assess causal effects from correlational data when randomized experiments are not possible.
Internet of Things (IoT). An interconnected network of physical devices, vehicles, buildings, and other items embedded with sensors that gather and share data.
Intervention. An action taken with the intent of producing a specific outcome or result.
Key performance indicator (KPI). A variable or metric against which the success of a function or business is judged.
Kurtosis. A numerical indicator of how heavy the tails (the extreme low and high values of a variable) of a data distribution are; zero excess kurtosis (kurtosis measured relative to a normal distribution) indicates tails that are neither heavier nor lighter than would be expected in a normal distribution.
Machine learning. A subdiscipline of computer science that addresses similar challenges to traditional statistical modeling, but with different techniques and a stronger focus on predictive accuracy.
Metrics. Facts and figures representing the effectiveness of business processes that organizations track and monitor to assess the state of the company.
Mission statement. A description of an organization’s or function’s business, its objectives, and its approach to reaching those objectives.
Nanotechnology. The application of technology at such a small scale that devices and sensors can be implanted into articles such as clothing.
Neural net model. A machine learning technique for making forecasts and classifications from many predictors, suitable when the relationships between predictors and outcomes are too complex to be modeled with traditional statistical methods such as regression.
Normal distribution. Also known as a bell-shaped curve or Gaussian curve, this is a distribution of data that is symmetrical around the mean: The mean, median, and mode are all equal, with more density in the center and less in the tails.
Observational study. A study that examines how variables relate to one another in their natural environment, without randomizing subjects to conditions or manipulating who receives an intervention.
On-premise technology. Software installed and run on computers physically located on-site (on the premises) at an organization.
Open standard. A public technology standard that allows interoperability and communication between technology systems.
Operating model. Describes how a group will conduct its business within the larger organization and external environment in which it resides. It is useful for defining working relationships, resolving issues and conflicts, and guiding decision making.
Outcome metric. The measurable result (financial or nonfinancial) of an action, program, or project—an indicator of the extent to which objectives have been met.
Outlier. An observed value that falls outside the overall pattern of an expected data distribution.
Predictive analytics. A branch of advanced analytics that is used to make forecasts about future events.
Principal component analysis. A statistical technique used to reduce the number of variables in a dataset to a smaller number while preserving as much as possible of the information (variance) in the larger dataset. It is closely related to factor analysis and is often used as a first step to make further analyses more manageable.
Proximal metric. An indicator of progress toward a desired outcome, reflecting observable results closer in time to when an action is taken.
Proxy variable. An alternative measure of a variable of interest, when the desired variable is unavailable or of insufficient quality for analysis.
Qualitative analysis. An approach to studying phenomena when the data collection, analysis, and interpretation do not involve statistics.
Quantitative analysis. An approach to studying phenomena when the data collection, analysis, and interpretation are based on statistics.
Quasi-experiment. A research design in which the effects of an experimental intervention are compared to the effects of no intervention on a control group, without the benefit of randomizing participants to conditions.
Randomization. The process of allocating research participants across conditions of an experiment by chance, so that the groups do not differ systematically before the treatment is applied.
Regression analysis. A statistical process for estimating the relationships between variables, often used to forecast the change in a variable based on changes in other variables. Linear regression is used when the outcome variable is continuous, and logistic regression is used when the outcome variable is discrete (for example, leaver or stayer).
Regression tree. A machine learning method for making predictions about a continuous outcome variable (such as job performance) from one predictor or a series of predictors.
Reporting. The function or activity for generating documents that contain information organized in a narrative, graphic, or tabular form, often in a repeatable and regular fashion.
Research design. A research plan regarding what data will be collected, when they will be collected, how they will be collected, and from what or whom they will be collected.
Return on investment (ROI). The net benefit of an investment (benefit minus cost) divided by the cost of the investment, usually expressed as a percentage and often converted to a monetary value.
Sensitivity analysis. A technique used to determine how different values of an independent variable will impact a particular dependent variable under a given set of assumptions. It allows an analyst to determine whether a statistical finding will remain consistent under a variety of conditions.
Sensor. An object designed to detect and record data, and provide the information back to a central database.
Skewness. A numerical indicator of lack of symmetry in a data distribution. Zero skewness indicates perfect symmetry, as would be expected in a normal distribution.
Snowball strategy. An approach for identifying potential stakeholders by interviewing select individuals and requesting recommendations of additional people to interview.
Society for Human Resource Management (SHRM). The world’s largest HR professional society and leading provider of resources serving the needs of HR professionals and advancing the practice of human resource management. It is headquartered in Alexandria, Virginia, United States.
Software as a Service (SaaS). An approach to software licensing and delivery in which software is hosted remotely in the cloud and accessed via an Internet browser.
Sponsor. A person or group providing support for a project or activity through financial means or personal endorsements.
Stakeholder. A person in the organization who has a vested interest in a project or activity and its outcomes.
Static visualization. An image to communicate an analytics message that is based on a snapshot of data at a point in time (that is, the image is still).
Statistical modeling. The use of mathematical equations, based on a set of assumptions, intended to predict or explain relationships among variables.
Statistics. The organization, analysis, interpretation, and presentation of quantifiable data.
Storytelling. A method of explaining a series of events through narrative.
Strategic workforce planning. A process used to align the needs and priorities of the organization with those of its workforce, through understanding labor supply and demand and the long-term objectives of both the organization and its competitive environment.
Support vector machine. A machine learning technique used to make predictions of continuous variables and classifications of categorical variables based on patterns and relationships in a set of training data for which the values of predictors and outcomes for all cases are known.
t-test. A statistical method for testing whether the difference between the means of two groups is larger than would be expected by chance.
Text analytics. The process of deriving insights from large volumes of text, typically through the use of specialized software to identify patterns, trends, and sentiment.
Time series analysis. A class of statistical methods used for studying how values of a variable or a group of variables change over time.
Triangulation. A method for establishing the validity of research findings by using multiple approaches and techniques and looking for convergence.
Variance. A statistical measure of how spread out the values of a variable are around the mean, calculated as the average squared deviation from the mean.
Vision statement. A description of the desired future impact of your function on the organization.
Wearables. Devices worn on the body to gather and provide information to the user through advanced technology, such as smart watches, activity trackers, and smart glasses.
Workforce analytics. The discovery, interpretation, and communication of meaningful patterns in workforce-related data to inform decision making and improve performance.
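Worked illustration. Several of the statistical terms defined above (variance, skewness, kurtosis, and correlation) are easier to grasp with numbers. The following is a minimal sketch in Python using only the standard library. The function names and the tenure and performance values are hypothetical and chosen purely for illustration. The sketch uses population formulas (dividing by the number of cases) and reports excess kurtosis, the convention under which a normal distribution scores zero.

```python
import math


def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)


def central_moment(xs, k):
    """Average of the k-th power of deviations from the mean."""
    m = mean(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def variance(xs):
    """Population variance: the second central moment."""
    return central_moment(xs, 2)


def skewness(xs):
    """Zero for a perfectly symmetric distribution."""
    return central_moment(xs, 3) / variance(xs) ** 1.5


def excess_kurtosis(xs):
    """Kurtosis minus 3, so that a normal distribution scores zero."""
    return central_moment(xs, 4) / variance(xs) ** 2 - 3


def pearson(xs, ys):
    """Pearson correlation: covariance scaled to the range -1 to +1."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(variance(xs) * variance(ys))


# Hypothetical data: years of tenure and a made-up performance score.
tenure = [1, 2, 3, 4, 5]
performance = [2, 4, 6, 8, 10]

print("variance:", variance(tenure))
print("skewness:", skewness(tenure))
print("excess kurtosis:", excess_kurtosis(tenure))
print("correlation:", pearson(tenure, performance))
```

In this made-up example the tenure values are symmetric around their mean, so the skewness is zero, and performance rises in exact proportion to tenure, so the correlation is +1. Statistical packages often apply sample corrections (dividing by one less than the number of cases), so their results for variance, skewness, and kurtosis can differ slightly from these population versions.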