[CHAPTER 8]

When Well-Being Becomes a Number

Anna Alexandrova and Ramandeep Singh

Quantifying well-being is an old ambition. Proposals for how to measure happiness were made in the Age of Enlightenment by utilitarian philosophers, throughout the nineteenth century by classical economists, and in the twentieth century by social scientists of many stripes. Nevertheless, these initiatives remained the province of the quirky theoretician and utopian, while well-being and happiness carried on being principally subjects of art, literature, philosophy, religion, and personal reflection, where measurement was not the point. This began to change in the late twentieth century.

The comfort with quantification of what is perhaps the ultimate personal phenomenon was an outcome of several trends (see Angner 2011; Davis 2015; Gere 2017). First of all, it is a culmination of decades of academic work at universities and commercial laboratories refining measurements of psychological traits, emotions, and attitudes. Questionnaires and psychometric scales of positive states such as happiness and satisfaction proliferated and a new identity of “positive psychologist,” or “well-being scientist,” emerged. Central to this identity is the development and validation of such scales and their use in experimental and statistical studies of the determinants of well-being. The second contributing trend was the self-help movement that aligned itself with experimental psychology rather than the earlier humanistic tradition of psychotherapy and psychoanalysis. This movement encouraged production of popular materials such as books, training programs, and, lately, digital apps, which eventually made their way into management, human resources, and life coaching. Finally, the last decades of the twentieth century saw the rise to prominence of critiques of orthodox economics and growing demands that evidence-based policy be responsive to more than just growth of gross domestic product (GDP), consumption, and income.

The late 1990s and the early 2000s saw high-profile conferences and publications in which eminent US economists and psychologists such as Daniel Kahneman, Ed Diener, and Martin Seligman touted the optimistic new science of the good life.1 In 2009, three famous economists, Joseph Stiglitz, Amartya Sen, and Jean-Paul Fitoussi, produced a report commissioned by then French president Nicolas Sarkozy outlining the importance of well-being in national accounting (Stiglitz et al. 2009). Guidelines for measurement were endorsed internationally and intersectorally by governments and nongovernmental organizations (NGOs), reaching even traditional economic tools of cost-benefit analysis (Fujiwara and Campbell 2011; Organisation for Economic Co-operation and Development 2013). The arguments used against GDP and in favor of richer measures were often old, resurrected, for example, from Robert F. Kennedy’s 1968 speech that indicators such as gross national product measure everything “except that which makes life worthwhile.” By 2012 such sentiments no longer sounded utopian, and the UK economist Richard Layard radiated confidence in The Guardian: “If you go back 30 or 40 years, people said you couldn’t measure depression. But eventually the measurement of depression became uncontroversial. I think the same will happen with happiness” (Rustin 2012).

Today there is a critical mass of consensus that well-being is quantifiable, or at least that it can and should be represented quantitatively. Our focus here is the right attitude to this consensus. Should well-being quantification be exposed, opposed, and discouraged as distorting the true nature of this complex phenomenon and as giving in to the late capitalist dream of the numerical self? Or should it be tolerated and even celebrated as a scientific achievement centuries in the making? The answer, perhaps predictably, is neither. The triumphalist narratives miss the mark because the controversies about constructing a scale of so slippery a phenomenon never got resolved, only forgotten. Categorical disenchantment with well-being quantification is problematic too: while it is tempting to point to the mismatch between well-being properly understood in the light of some philosophical theory and the current measures, such criticism commits a category mistake. Well-being “properly understood” is not the target of these measures. Instead, their proponents redefine well-being in a way that builds quantifiability into the very concept. They sacrifice theoretical validity for the sake of making well-being a viable object of public debate.

The better question is whether this move of construing well-being as a quantitative phenomenon for pragmatic reasons is defensible all things considered. The issue as we see it rides on whether it is appropriate to trade off theoretical validity against effectiveness in public debate, by using a quantitative measure for a qualitative phenomenon. Because this conflict was articulated recently by one of the architects of British well-being measurement, politician Oliver Letwin, we call it “Letwin’s dilemma.” Each type of well-being quantification makes a trade-off between theoretical and practical demands, and different trade-offs are justifiable in different contexts. We therefore urge that quantification of well-being should be neither cheered nor denounced as a whole, but instead evaluated on a case-by-case basis. As an illustration of our strategy, we discuss two such cases. In the first one—the measure of UK national well-being by the Office for National Statistics (ONS)—well-being is quantified by a rich table of indicators which strikes, in our view, a defensible balance between validity and practicality. In the second—The Origins of Happiness, a report by economists at the London School of Economics—quantification runs amok, making too great a sacrifice of richness and complexity for the sake of political goals that are themselves questionable.

8.1. Diversity of Quantifications: A Survey

What does it mean to quantify well-being? The first step is to adopt a definition of well-being; the second is to put forward a scale that captures variations in well-being according to this definition. There are many such definitions available and several possible scales for each, so options multiply quickly. Table 8.1 summarizes what we see as the main traditions.2 Each row presents a different bundle of measures that correspond to different answers to the initial question “What is well-being?” Very roughly, the first three rows come from the psychological sciences. While they represent distinct philosophical traditions, all elicit self-reports of well-being with questionnaires. Psychologists in the first row identify well-being with felt experiences of positive and negative emotions, or happiness, and they trace their intellectual roots to hedonism. They favor experiential measures of happiness that ask whether respondents feel a given emotion such as sadness or joy. These ratings are then aggregated into a “hedonic profile” that represents a time slice of the individual respondent. Those in the second row see well-being instead as an individual’s judgment about her life as a whole, or her life satisfaction, and hence adopt short evaluation questionnaires that invite subjects to agree or disagree with statements such as “All things considered my life is going well.” Their philosophical heritage is probably closest to subjectivism—the theory that grounds well-being in the fulfillment of the individual’s priorities. Finally, the advocates of flourishing, in the third row, trace their roots to Aristotle (and classical eudaimonism more broadly) as well as to twentieth-century humanistic psychology: the good life is a life of maximal functioning and actualization of personal potential. Although operationalizing this theory is undoubtedly hard, psychologists typically articulate several “virtues,” such as autonomy, connectedness, and sense of purpose, and ask respondents to answer questionnaires corresponding to each.

Table 8.1. Definitions and measures of well-being

Definition | Measure
Happiness | Experience sampling; U-Index; Positive and Negative Affect Schedule; SPANE; Subjective Happiness Scale; Affect Intensity Measure
Life satisfaction | Satisfaction with Life Scale; Cantril ladder; domain satisfaction
Flourishing | PERMA; Psychological General Well-Being Index; Flourishing Scale; Warwick and Edinburgh Mental Well-Being Scale
Preference satisfaction | GDP; GNP; household income and consumption; stated satisfaction surveys
Quality of life | Human Development Index (capabilities); UK Office for National Statistics Measure of National Well-Being; Legatum Prosperity Index; Social Progress Index; OECD Better Life Index; Nottingham Health Profile; Sickness Impact Profile; World Health Organization Quality of Life; Health-Related Quality of Life

The definition of well-being as “preference satisfaction” in the fourth row comes from economics. The economic tradition of welfare measurement builds upon the view that well-being consists in satisfaction of the individual’s preferences as expressed in their choices. Adding the assumptions that money is a measure of the individual’s ability to satisfy their preferences, and that individuals make choices rationally, income (or consumption) becomes a proxy for well-being.

“Quality of life,” in the last row, includes several different traditions reflecting different understandings of this concept. First, we have the “social indicators” tradition in sociology from the 1970s, which sought to enrich the statistics collected by governments and NGOs beyond the most basic ones. Second is development economics with its capabilities approach (and various related proposals), which sets out to capture sets of goods that matter for the progress of poor countries beyond mere economic growth. Finally, quality-of-life measures focusing specifically on health are common in medical and public health research. In all these cases, a measure is basically a collection of indicators all thought relevant to quality of life (including sometimes economic and subjective indicators) plus a rule about how to aggregate these indicators into a single number, if necessary.
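To make the idea of "a collection of indicators plus an aggregation rule" concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the community names, the three indicators, their scores, and the weights are invented for illustration and are not drawn from any of the measures in table 8.1. It assumes the indicators have already been rescaled to a common 0-1 range and uses a weighted mean as the rule.

```python
# Hypothetical indicator scores, assumed to be already rescaled to the 0-1 range.
communities = {
    "Northtown": {"income": 0.9, "health": 0.5, "life_satisfaction": 0.4},
    "Southtown": {"income": 0.5, "health": 0.7, "life_satisfaction": 0.8},
}


def composite_index(scores, weights):
    """Weighted mean of the indicators: one possible aggregation rule."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def ranking(weights):
    """Community names ordered from highest to lowest composite index."""
    return sorted(
        communities,
        key=lambda name: composite_index(communities[name], weights),
        reverse=True,
    )


# Two illustrative (assumed) weightings of the same indicators.
equal_weights = {"income": 1, "health": 1, "life_satisfaction": 1}
income_heavy = {"income": 4, "health": 1, "life_satisfaction": 1}

print(ranking(equal_weights))  # Southtown ranks first
print(ranking(income_heavy))   # Northtown ranks first
```

With these made-up numbers the same indicators yield opposite rankings under the two weightings, which suggests why the choice of aggregation rule is itself a substantive decision about what matters for quality of life.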

To take stock, the table shows distinct traditions of defining and of quantifying well-being. Crucially, these traditions do not represent different ways of measuring the same phenomenon. Rather they conceptualize well-being differently. They disagree as to what phenomenon we are talking about when we talk about well-being. Emotional states are one thing, broader quality of life quite another, and there is no sense in claiming that a measure of a person’s emotional state is superior, as an indicator, to a measure of their quality of life. Any choice between the rows of our table will be made on the grounds that one of these phenomena is considered the right focus of well-being research. The next two sections of this chapter report how one of these rows—namely life satisfaction—is in the process of becoming predominant.

8.2. In Search of the Right Quantity

From the perspective of advocates of subjective well-being (the first three rows of table 8.1), the villain was and remains the income and consumption measures of orthodox economics. Economists call these measures “indicators of well-being,” and they come with a straightforward and well-developed story about quantification.3 But the public face of the science of well-being is typically associated with a rejection of, or at least an attempt to marginalize, the economic definition. “Beyond Money: Toward an Economy of Well-Being” was the title of a seminal 2004 article by US psychologists Ed Diener and Martin Seligman, wherein the new field of research was explicitly positioned in opposition to economic measures, whose problematic assumptions were also exposed.

The standard criticism seizes upon the “Easterlin paradox,” named, ironically, after the economist Richard Easterlin, who first articulated it in the 1970s. He juxtaposed two facts: at any given time and within any country, income predicts self-reported happiness, but over time, as income increases, happiness does not rise correspondingly. To resolve this tension, Easterlin hypothesized that beyond a certain minimum, people’s judgment of their well-being is indexed to their income relative to that of others—not in absolute terms (relative to their previous income level). Therefore income will fail to track subjective well-being over time. In the 1990s and early 2000s, the Easterlin paradox acted as a powerful motivator for research on the relationship between objective circumstances and life evaluation, which in turn helped establish the basics of well-being quantification. Now it is far less clear whether the paradox even exists, as new research does not find evidence for the second supposed fact—increase in absolute income does after all seem to predict increases in subjective well-being over time and hence it is unclear that money and happiness come apart as Easterlin (1974) claimed they do. But these findings have not managed to dampen the interest in self-reported happiness. Even if, on average, absolute income and subjective well-being rise and fall together, there are still striking cases of divergence, for example the steady growth of GDP coupled with a steady fall in life satisfaction in Egypt and Tunisia during the Arab Spring.4 So there is still room for attending to subjective well-being as independent of income.

However, to make subjective well-being a genuine alternative for policymaking and evaluation, its advocates needed to turn it into a quantity just as manageable as the traditional indicators. Otherwise, in response to the passionate calls to make policy accountable to people’s priorities, the economists could easily retort: “Well-being is all nice and good, but how will we plug it into budgets, spreadsheets, and cost-benefit analyses?” This demand for quantifiability undoubtedly stems from presuppositions about the nature of proper scientific evidence, presuppositions that can be easily challenged on philosophical grounds. But, as we shall see, there are political grounds for it too—considerations of democracy.

As table 8.1 shows, however, there is no single way of quantifying even subjective well-being, let alone well-being under other conceptions. If you are a hedonist, quantification of happiness will take the form of turning momentary reports of emotional state into a single rating, and that single rating into a dot on a curve that represents a subject’s emotional state over time. Such “experience sampling” takes time and resources, and despite its faithfulness to classical Benthamite utilitarianism and some attempts to implement it on a large scale, even psychologists who are sympathetic to this philosophy agree that it is not practical. Although it continues to be used in research, it is not part of official statistics of the sort that well-being enthusiasts call for.5 By and large, such statistics are based on questionnaires, especially those that Easterlin himself used—the life-satisfaction questionnaire.
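The two steps described for the hedonist, from momentary reports to a single rating and from ratings to a curve over time, can be sketched as follows. This is an illustrative simplification rather than any published scoring procedure: the emotions, the 0-6 rating scale, the sampled hours, and the "net affect" rule (mean positive rating minus mean negative rating) are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical momentary reports: (hour of day, positive ratings, negative ratings),
# each emotion rated on an assumed 0-6 intensity scale.
reports = [
    (9, {"joy": 4, "calm": 5}, {"sadness": 1, "stress": 2}),
    (13, {"joy": 2, "calm": 2}, {"sadness": 3, "stress": 5}),
    (20, {"joy": 5, "calm": 4}, {"sadness": 0, "stress": 1}),
]


def net_affect(positive, negative):
    """Collapse one momentary report into a single rating."""
    return mean(positive.values()) - mean(negative.values())


# Each single rating becomes one dot on the respondent's curve over time.
curve = [(hour, net_affect(pos, neg)) for hour, pos, neg in reports]

for hour, rating in curve:
    print(f"{hour:02d}:00  net affect = {rating}")
```

Even this toy version hints at the practical burden noted above: every dot on the curve requires interrupting the respondent for a fresh set of ratings.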

8.3. Life Satisfaction as a Quantity of Well-Being

Life-satisfaction questionnaires quantify well-being by asking respondents to agree or disagree (plus strongly agree, strongly disagree, or neither, and so on) with statements such as: “In most ways my life is close to the ideal” or “I am satisfied with my life.”6 When large samples of people take these questionnaires, the result is a large body of data finely differentiated by the degree to which people endorse a given statement. If these questions exhibit the right psychometric properties of reliability and predictability among the right populations of subjects, they are typically declared valid. It thus becomes possible to talk about the differences in life satisfaction between different groups (Finns are happier than Russians) and about life satisfaction coefficients (living in a safe community raises life satisfaction by that much). This is a major step toward making subjective well-being a quantity.

That life-satisfaction questionnaires can clear these psychometric hurdles does not make them immune from criticism. The criticisms are of roughly three types. The first comes from the point of view of ethics and axiology. What does life satisfaction have to do with well-being? Philosopher Daniel Haybron (2008) argues that to be satisfied with life is by and large to endorse certain values such as gratitude, modesty, and determination. The extent to which we are satisfied with our lives is reflected in how thankful we think we should be for what we have or whether we think we should not rest on our laurels. Life satisfaction reflects “one’s stance towards one’s life” (89). But the stance we adopt toward our life on reflection is one thing, he maintains, and how we actually feel—our emotions in daily life—is another. A great deal of misery in daily life is compatible with high life satisfaction, which makes it implausible as a reflection of a person’s well-being.

A second worry comes from the point of view of measurement theory. Ratings on a Likert scale strictly speaking justify only an ordering: if I rated myself as strongly agreeing with a statement about my life satisfaction and you rated yourself as agreeing only somewhat, then (assuming our ratings are comparable—more on that shortly) it is permissible to claim that I have a higher satisfaction with life than you. But that’s all an ordering justifies. The conventions of metrology do not permit the more specific claim that I am more satisfied than you by x points; nor is it permissible to average our ratings in order to make population-level comparisons. To use technical language, ordinal scales should not be treated as cardinal ones, but it is hard to see how the strand of well-being science that is intent on providing an alternative to economic indicators can make do only with ordinal comparisons.7
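The force of this worry can be seen in a small numerical sketch. The responses and both numeric codings below are hypothetical. The point is only that an ordinal scale fixes the order of the answer categories, not the distances between them, so any strictly increasing coding is equally faithful to the data, and a comparison of group averages may depend on which coding is chosen.

```python
from statistics import mean

# Hypothetical answers on a 5-point agreement scale
# (1 = strongly disagree ... 5 = strongly agree).
group_a = [1, 1, 5, 5, 5]
group_b = [3, 3, 3, 3, 4]

# Two assumed codings. Both preserve the order of the five categories.
equal_steps = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
unequal_steps = {1: 1, 2: 5, 3: 6, 4: 7, 5: 8}


def mean_score(responses, coding):
    """Average of the responses after translating categories into numbers."""
    return mean(coding[r] for r in responses)


def higher_mean(coding):
    """Name of the group with the higher average under a given coding."""
    a = mean_score(group_a, coding)
    b = mean_score(group_b, coding)
    if a == b:
        return "tie"
    return "A" if a > b else "B"


print(higher_mean(equal_steps))    # A: 3.4 against 3.2
print(higher_mean(unequal_steps))  # B: 6.2 against 5.2
```

Under the first coding group A looks more satisfied on average, and under the second group B does, although nobody's answers have changed. This is one way of seeing why averaging ordinal ratings requires a further assumption about the spacing of the categories.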

Finally there are worries from the point of view of psychology. What sort of judgment is a judgment of life satisfaction? On the face of it, it requires an aggregation of a great deal of information on the part of the respondents: here are all the things in life I value, here’s how well I think I am faring on each of those values, here’s how all these assessments tote up when I think about my life as a whole. But do people actually perform such complex evaluations? There is a long-standing concern with the alleged fickleness of life-satisfaction judgments: apparently finding a coin, or seeing a person in a wheelchair, or being reminded of the weather, can drastically change a person’s evaluation of their well-being. Such effects suggest alternative explanations for how these judgments are formed (perhaps they are made on the spot and are deeply susceptible to mood) and make their replicability dubious (Schwarz and Strack 1991, 1999). And here we have not even broached another psychological controversy—whether life-satisfaction ratings are comparable between individuals or across cultures.

None of the three angles of attack have stopped life satisfaction from becoming the most popular measure of well-being today. Its advocates dismiss the philosophical worries on the grounds that ratings of life satisfaction correlate decently with longer and richer questionnaires and that respondents’ own judgments about how to evaluate their lives should be respected (Cheung and Lucas 2014; Diener et al. 2013). The worries of measurement theorists are harder to dismiss but scientists disagree on their severity, with Ferrer-i-Carbonell and Frijters (2004) claiming that treating ordinal data as cardinal “does not generally bias the results obtained,” while Schroeder and Yitzhaki (2017) argue to the contrary.

It is the psychological objections that scientists have taken most seriously and to which they have found relatively convincing replies. New experiments reveal that judgments of life satisfaction are actually quite robust. The finding about the weather/coin/wheelchair effect on life-satisfaction reports has not been replicated (Lucas 2013). The context in which people are asked to judge their life satisfaction—what they are thinking and experiencing at the moment and in what circumstances—clearly affects this judgment. But whether these context effects make these measures unusable and uninformative is far less clear.8 There is also evidence of the interpersonal and cross-cultural comparability of life satisfaction (Diener and Suh 2003).

To take stock: life-satisfaction questionnaires are controversial but hard to abandon. While the theoretical case against them is strong, they are also widely available and generate interesting data. For practical reasons, none of the philosophical and methodological objections raised against them has been able to dislodge them from their dominant status in the science of well-being. In the OECD (2013) report with official guidelines for collecting well-being statistics, life satisfaction has a central place. Is this consensus a case of dangerous ignorance of facts for the sake of expediency?

8.4. Letwin’s Dilemma

So far, we have seen two facts that pull in different directions: on the one hand, there is a huge diversity in well-being concepts and measures, with no decisive theoretical reason to settle for one rather than another; on the other hand, life satisfaction is emerging as a winner on grounds that are often pragmatic (this will become even clearer in section 8.6). In this section, we complete the story of how this tension can be resolved by introducing the figure of the “political sponsor.” While our example of such a sponsor is from recent British history, the tropes this figure uses illuminate the predicament more generally. The sponsor sidesteps the critiques we have mentioned so far by dismissing the assumption that for a given number to become an adequate indicator of well-being, this number has to be philosophically grounded and empirically valid in the light of the best available standards in philosophy, measurement theory, and psychology.

In the academy, both the cheerleaders of life satisfaction and its critics go along with this assumption. For the cheerleaders, it is an ideal that provides intellectual legitimacy to the whole enterprise, though they add that the bar for validity should not be set too high. For them, life satisfaction meets a certain minimal theoretical standard—it is after all not completely ridiculous to suppose that well-being is a matter of how well we judge our life to be going. The critics do not buy this and demand a rigorous justification of the very idea of well-being quantification. Philosopher Daniel Hausman (2015), for example, rejects all the existing measures simply because well-being is too complex and too person-specific to be captured by any population-level scale. For both sides, however, the matter is intellectual.

For the political sponsor, however, the arguments that measures of well-being do or do not get at “the real thing” miss the mark. Science and policy are not after the real thing, but seek a redefinition of well-being that, while preserving something of the parent-concept, picks out a phenomenon in the world that is measurable, rendering it comparable and transferable. This is the argument that comes out most clearly in a testimony by Oliver Letwin, the minister for government policy under David Cameron and the latter’s influential adviser. As the Conservative government came to power in 2010, well-being was a centerpiece of its agenda for the “social revival” of the United Kingdom. Ironically, this agenda was coupled with austerity measures, and was arguably cast in the role of a warm and fuzzy distraction from the slashing of public budgets.9

In 2016 and now out of power, Letwin recollected the challenge of a politician interested in well-being thus: “If you talk in a debate about things like beauty, or happiness, or life satisfaction, or well-being, it does not take more than fifteen seconds before you are dismissed as an eccentric lunatic” (Centre for Economic Performance 2016). This, he continues, is in sharp contrast with values that are nicely quantifiable such as “weapons, hospitals, rail lines.” Letwin and his Tory colleagues wanted to change the terms of this conversation by making well-being a legitimate subject to be brought up at town-hall meetings, media interviews, and election debates. Letwin freely admitted that this required representing well-being as a quantity even though in fact, according to him, it is a quality. (Letwin has a PhD in philosophy, hence his comfort with these distinctions.) And it required not just saying that well-being can be represented quantitatively but also institutionalizing this view with statistics collected by dispassionate bureaucrats, reports, graphs, and other essential accouterments of authority. Thus was born the Measuring National Well-Being program in the Office for National Statistics (ONS), which since 2011 has continuously collected relevant statistics and regularly issued reports on well-being along with other uncontroversial numbers like crimes, births, and so on. Letwin’s objective was to obtain “a set of data which may be naïve, but also respectable and internationally comparable, and [which] can be used in political debate.”10 For bureaucrats and politicians like him, measurable and quantifiable well-being became an essential ingredient of policy debates.

The conflict Letwin describes—“Letwin’s dilemma”—is real and generalizes beyond this episode of British political history. Well-being numbers are artificial and distort the real thing, and they can have unforeseen consequences; but without these numbers public debate focuses only on conventional economic indicators, which have these failings too. As a political sponsor of well-being measurement, Letwin clearly picked one horn of this dilemma. He set in motion the operation that now reliably produces numbers about British national well-being and that we describe in the next section.

Let us take stock of the argument so far. Quantifying well-being is a messy business. There is no unique and obviously correct definition and any choice of indicators will be controversial—even the most established measures such as life satisfaction. But there is a strong practical impetus to produce some credible numbers about well-being. Many scientists are enthusiastic about challenging what they perceive as economics’ unfair domination in policy; politicians like Letwin are motivated by the desire to show a commitment to fundamental values and to redirect public spending accordingly. With this story in hand, we are in a position to move into the more critical territory. Well-being measurement always requires a compromise. Are some compromises better than others?

Let us first preempt one possible reaction—that of rejecting the framing of the dilemma entirely: “Who says that the subjects of political debate and public policy must be quantitative? Why can’t well-being be part of the conversation without being numerical? Who gets to set these terms and why?” We recognize that there is room for different argumentative strategies and, as Chatterjee and Newfield demonstrate in their respective chapters in the present volume (chapters 1 and 2), the numerical can, as a matter of fact, be dislodged. Still, we agree with John (chapter 6) and Badano (chapter 7) that it is too rash to dismiss the advantages of quantification in democratic politics. When politicians make numerically precise promises, it is easier for the public to assess how they deliver on those promises. Numbers provide a means of speaking truth to power for those who are less articulate with compelling narratives and beautiful stories. Statistics may be fickle, but numbers make debates concrete, and can be disputed on scientific grounds in a way that qualitative concepts and narratives cannot. So the dilemma stands.

Which horn is preferable? Is it better to settle for distortion or for irrelevance? We submit that there is no straightforward resolution of this dilemma. There is no argument powerful enough to show that well-being numbers are always preferable to other numbers, nor that they are always inferior. Rather it depends on the specifics and on the context: which well-being measures are used, what they are used for, and what alternatives are available. To illustrate the complexity of the matter, we will consider two examples from public policy. In our view, the first one represents a defensible generation of numbers about well-being, while the second is less so.

8.5. Incorporating Well-Being into National Statistics

The ONS project to monitor the United Kingdom’s national well-being spurred a regular production of rich and diverse statistics describing all relevant spheres of life. (Indeed unlike almost everything else Cameron’s government did, the well-being measurement initiative received no criticism from other political parties.) Its main virtues are its comprehensiveness and its legitimacy. Both are a result of the seriousness with which the ONS approached the task of drafting an inclusive list of indicators that are meaningful to the public. To secure this, the ONS conducted a countrywide consultation called “What Matters to You?” between 2010 and 2012, soliciting views and recommendations from the public, experts, and communities all across the United Kingdom (Office for National Statistics 2012). Potential measures of well-being were released to the public and then respondents were queried about their suitability.

The outcome of this exercise is a measure that contains both subjective indicators—happiness, life satisfaction, sense of meaning—and objective indicators, such as economy, work, health, education, safety, housing, and recycling (Office for National Statistics 2019). The ONS settled the seemingly intractable debates involving the experts and various groups of the public by including as many items in its final measure as practically possible and also by publicly vetting this measure. Their approach is similar to schemes in France, Italy, Canada, and New Zealand, as well as the German initiative “Gut Leben in Deutschland,” which used town-hall discussions in 2013 to arrive at a list of twelve groups of indicators against which government policies were to be judged. (Interestingly, none of them concerned subjective well-being.)

This model of producing well-being statistics can be criticized using both horns of Letwin’s dilemma. The table of indicators is rich in information, thus reflecting the complex and pluralistic nature of well-being, but this very richness hides deep disagreements about which indicators are more central than others. Is access to recycling on the same plane as freedom from anxiety? How do all these indicators work together? Is ONS National Well-Being “everything but the kitchen sink”? Whether these questions are answerable to the critic’s satisfaction or instead expose fundamental problems depends on how the data collected will be used. When Cameron’s cabinet kick-started the project, they had great ambitions for judging policies and spending priorities against the ONS data. This did not come to pass, since after the Brexit referendum the well-being agenda died, at least in the upper echelons of the government. Now the good work of the ONS serves mostly a representational function: their state-of-the-art statistics paint a multifaceted picture of a community’s life according to standards that this community itself endorses. Granted, the picture could be even richer. For example, the ONS could collect qualitative data about well-being in the form of narratives and interviews. They could also take a cue from the growing movement of citizen science and more systematically involve citizens in generating data that they see as reflecting their well-being. But this might be too much to ask from a government department. So given the ONS position and constraints, their statistics do the job. In this sense the ONS found a defensible middle road out of Letwin’s dilemma.

8.6. Life Satisfaction as the Master Number

A very different project is undertaken by economists who wish to do a lot more than just reflect the variety of well-being-related priorities in national statistics. To them it is not enough just to measure well-being; they want to incorporate it into the very heart of policy evaluation, the cost-benefit analysis, so that well-being becomes the benchmark against which each item of public spending is judged. For this to happen, tables of varied indicators like those collected by the ONS are entirely unsuitable. Rather, these economists are after a single quantity whose variation in response to changes in policies and circumstances can be observed, thus enabling the identification of bundles of policies that maximize overall happiness.

The economists advancing this vision most explicitly are Andrew Clark, Sarah Flèche, Richard Layard, Nattavudh Powdthavee, and George Ward, of the Wellbeing Programme at the LSE’s Centre for Economic Performance, whose manifesto was published as the 2018 book The Origins of Happiness: The Science of Well-Being over the Life Course. There are others who subscribe to this vision (De Neve et al. 2020; Frijters et al. 2020), but our main focus will be on the Origins. The book received ringing endorsements from the most prominent well-being scientists in the United States, Canada, and elsewhere in Europe, and in December 2016 a high-profile launch event was held at the LSE to unveil the key findings of the project. It was attended by national and international powerbrokers and widely covered by the media.

It was at this event that Letwin articulated his dilemma, and it was clear which horn the authors of Origins preferred. By 2016, Richard Layard had spent decades popularizing the science of well-being and devising ways to deploy it in policy. He is explicit in endorsing the utilitarian goal of furthering subjective well-being, and for that, life-satisfaction data are plenty good enough (Layard 2005). None of the objections against them, or the agenda as a whole, appear significant to him. When introducing Origins at the launch he briefly went over the worries about validity of life-satisfaction reports, retorting that they correlate well enough with brain scans and longer questionnaires. Even more briefly he noted that some critics find that there is more to life and good governance than happiness, namely justice, rights, and fairness. However, he dismissed these worries as “puritanical.” The strategy of the book’s authors appears to emphasize the strength and high profile of their allies rather than to engage with their academic detractors. For example, they dedicated the book to the patron of happiness economics in the UK policy world, Gus O’Donnell; gave a shout-out to Tony Blair in a chapter epigraph; and, finally, invited Ohood Al Roumi, the minister of state for happiness from the United Arab Emirates, to share her experiences implementing happiness policies there. When a troublemaker in the audience asked about rights of foreign workers in the UAE, she inevitably had to ignore him. The overall impression of the authors of this chapter, who were present at that launch event, is that for Layard and his team the moral and intellectual complications were a small price to pay for the potential practical significance of their conclusions.

The heart of their proposal is that the data available from various national and international panels (such as the British Household Panel Survey, the German Socio-Economic Panel, the Household Income and Labour Dynamics in Australia, and the Avon Longitudinal Study of Parents and Children) enables fairly precise inferences about how much a given set of social, demographic, and economic circumstances boosts or impedes happiness, defined as life satisfaction. And these inferences can be made over the life course, starting from childhood all the way to old age. For adults, mental health (self-assessed) emerges as the single biggest predictor of individual happiness, and similarly for children, though in their case it’s assessed by the mother, and the mother’s own mental health is the best predictor of the child’s. Other factors are also considered, such as poverty, education, parenting styles, school, employment, partnership, social norms, and so on, but none makes as strong a statistical contribution as mental health. For example, having a diagnosed depression or anxiety disorder explains twice as much variation in life satisfaction as income (R² = 0.19 for mental health and only 0.09 for income). A person’s education has even less effect than income (R² = 0.02), whereas the education of others in one’s surroundings has a measurably negative effect on individual life satisfaction. This is part of a well-documented effect known as “social comparison,” where the self-estimated value of your income or education depends crucially on how much of those goods others around you possess. Another phenomenon such analysis reveals is adaptation, that is, returning to the previous level of life satisfaction after a positive or negative shock. This, however, does not hold for unemployment, loss of partner, and mental illness.

The authors document these facts with great care, reporting the relative and the absolute quantitative effects of each specific factor in life satisfaction using coefficients. These coefficients play an essential role in the “revolution in policymaking” that Layard and his colleagues advocate. They argue that government spending should be evaluated pretty much exclusively using a method of cost-effectiveness in which benefits are measured in units of happiness. In this vision, each item of government spending must pass a test: does it increase happiness as efficiently as possible? Such an exercise requires a threshold of cost-per-unit-of-happiness, above which programs and services should not be funded. The authors see quality-adjusted life years (QALY), currently used by the National Institute for Health and Care Excellence (NICE) and discussed also by Badano (chapter 7), as the obvious model. As Badano explains, NICE recommends against public provision of drugs that cost more than £30,000 per QALY and this number, although “spurious” in John’s sense (chapter 6), plays a legitimate political role. Layard and coauthors propose to extend this process from health to well-being. They hypothesize that it is not efficient for the Treasury to recommend spending that costs more than, say, £3,500 per unit of happiness. This is why the estimation of absolute effect coefficients is so important to the authors of Origins. Once we know how much extra happiness income, unemployment, health, or what have you, buys, we will be able to compare the cost-effectiveness of different policies. Public moneys will go toward the happiest possible bundle of services.

What are we to make of this example of well-being quantification? Commentators from different fields will raise different objections. The critics of life satisfaction discussed in section 8.4 will worry about the exclusive reliance on this indicator (none of the other well-being indicators collected by the ONS features in Origins). Ethicists and political philosophers, on the other hand, will ask how rights, obligations, and constitutional constraints will feature in the proposed cost-effectiveness analysis (see Fabian 2018). But we want to focus on the specific problem of the use and misuse of numbers. In Origins, numbers serve to erase social, cultural, and historical context and to turn well-being into a simple object with universal determinants discoverable by statistics alone. Badano and John each make a strong case that spurious numbers may play legitimate political roles. But we doubt that such a justification is available in this case. Two instances of context erasure are particularly vivid: how the authors treat mental health and how they treat public goods.

In Origins, mental health is measured largely by brief standardized self-reports, and these reports, it turns out, explain a large chunk of variation in life satisfaction, more than poverty and inequality do. So the authors tout as their big result the idea that mental illness is the biggest cause of misery, and that intervening on it, rather than on poverty, is the most efficient way of raising happiness. An obvious circularity arises here because questions about life satisfaction are very similar to questions of self-reported mental health. But more significantly, as the network of activists Psychologists for Social Change argues, had the authors used more than simple regression modeling—had they attended to the rich tradition of qualitative research in this area—they would not treat mental health and poverty as noninteracting variables that each make a separable contribution to well-being, one large and one small. Just as intersectional feminists worry that gender and race cannot easily be decomposed into distinct causes of oppression, so poverty and mental illness should be studied together as coproducers of misery by a combination of qualitative and quantitative methods. In this case, Origins puts forward a number with questionable validity and misleading precision and then uses this number to support the consequential conclusion that poverty should be less of a priority to policymakers than mental health.

Likewise, in the analysis of the relationship between public goods and happiness, Origins seeks to explain variation in happiness across 126 countries using social variables such as trust, generosity, freedom, and social support. A cross-sectional regression based on Gallup World Poll data seemingly provides enough evidence for the authors to make universal causal pronouncements such as, “If we go from the lowest levels of trust (7% in Brazil) to the highest levels of trust (64% in Norway), this raises average life satisfaction by 57%” (Clark et al. 2018, 229). Our worry here is not the limitations of this exclusively cross-sectional analysis (that it fails to tackle the question of whether trust influences happiness or happiness influences trust). Nor is it that the trust variable is defined narrowly as the proportion of people who say “yes” to the single question, “In general, do you think that most people can be trusted?” Rather, the problem is the assumption that well-being determinants have a universal noncontextual effect that coefficients measure: x for mental health, y for education, z for trust. Of course, it is possible to disaggregate the statistics by populations and hence recognize the differences (which many happiness economists do). But recall that the authors of Origins are seeking a number relative to which policies should or should not be funded. To settle on such a number you need to treat the effect of a variable identified by statistical analysis as a uniform contribution of this variable always and everywhere. This uniformity is extremely implausible. The strong statistical effect of mothers’ mental health on children may have something to do with the gender politics of the surveys, or it may stem from the fact that mothers did most of the care work at the time when the data was available. Might fathers’ mental health become equally important as they do more of the childcare? Why present this effect as a stable cause of child well-being, as if mandated by nature? The data reveal plenty of fascinating differences among countries in the way that, say, loss of employment or disability affects life satisfaction, but these contextual effects do not sit well with the ambition to estimate quantities that can be plugged into cost-effectiveness analysis. The authors’ lack of interest in the local, the variable, the historical is motivated by the self-imposed demand for uncontroversial, stable, and tractable input into cost-effectiveness analysis. In this project, as they see it, there is no room for qualitative data, for rich but local ethnographies, nor indeed for participatory policymaking.

This example of generating and using well-being numbers, we think, is very different from the multitudinous and evolving tables of indicators produced by the ONS, even considering that those also lack qualitative information. The project of the Origins is far more controversial, because far more audacious in the way it transforms well-being into an object of quantification. We see here a metamorphosis of life satisfaction from an academic indicator that challenged economists into a master number that has stable determinants measured by coefficients. This is in stark contrast with the conclusions of the aforementioned Stiglitz, Sen, and Fitoussi report so often cited as an inspiration for happiness economics. That report’s vision of well-being is nothing like that of Origins: it argues, for example, that “no single measure can summarize something as complex as well-being” (Stiglitz et al. 2009, 12). Perhaps this recognition is nothing more than lip service, similar to Letwin’s blithe remark that well-being is “of course” a quality rather than a quantity. Perhaps once quantification of well-being gets underway, there is an inevitable tendency toward its reduction to the master number of life satisfaction. But these possibilities should not stop us from ringing alarm bells.

Letwin’s dilemma invites us to imagine just how bad it would be to use “naïve” well-being numbers as compared to standard economic indicators. We can’t make a watertight case as to which is worse: the proposal in the Origins or the existing model of representing benefit and evaluating policies. We have given some reasons to think that Layard and coauthors distort well-being in the direction that makes it hardly recognizable and do so without properly engaging with their critics. In their hands, well-being becomes a monistic quantity that reacts mechanically to changes in circumstances as if it were a Newtonian system with forces that combine by vector addition. Even those open to quantification of well-being for the sake of democratic policy deliberations, along the lines of John and Badano, should balk at such a radical transformation. Well-being may well be a pliable concept, but is it that pliable?

8.7. Advice for the Critic

What lessons do our two stories carry for the task of analyzing quantification more generally? We emphasized that when it comes to quantifying well-being there is no master measure. There is, however, a very popular measure—life satisfaction ratings—and this quantity, in virtue of being easily available and unidimensional, has made it further than other measures into the world of evidence-based policy. Whether this is a good thing is not so much a question of whether it, or any other measure, represents well-being properly, but rather of what numbers we compare it to. Well-being numbers in the newly updated national statistics (which include life satisfaction among other indicators) seem a huge improvement over the limited and narrow previous data. However, in the hands of the authors of Origins urging a new form of cost-effectiveness analysis, the notion of “life satisfaction” is much less innocent because it is paired up with a grossly implausible methodology of social evaluation as well as a larger technocratic model of governance.11

Although we concentrated only on public policy, such pairings of more and less controversial uses can also be found in the use of happiness numbers in self-help and management. So our guess is that the lessons generalize. Responsible criticism has to take seriously considerations of validity, that is, whether the measure in question is an adequate representation of well-being. For this purpose we need to wear the hat of a philosopher-scientist who articulates and endorses a certain minimal standard of well-being measurement. Yet this is not enough. The philosopher-scientist then needs to put on the hat of a social scientist who acknowledges the rhetorical and pragmatic role this number plays in politics, governance, and public debate, and judges how well it plays this role as compared to other numbers. The critic should be ready to accept certain trade-offs between usefulness and validity, because there is no point in holding well-being measures to an impossible ideal. The scope of considerations justifying these trade-offs should be wide, encompassing the moral and political work these numbers do.

Notes

1. Diener and Seligman (2004); Huppert et al. (2005); Kahneman and Krueger (2006); Kahneman et al. (1999); Layard (2005); Seligman (2004); Seligman and Csikszentmihalyi (2000).

2. This table is an abbreviated version. For a full version and references, see Alexandrova (2017).

3. For example, development economist Angus Deaton describes his Nobel Prize–winning research as concerning “wellbeing, what was once called welfare, and uses market and survey data to measure the behavior of individuals and groups and to make inferences about wellbeing” (Deaton 2016, 1221).

4. Stevenson and Wolfers (2008) articulated an influential critique. Clark et al. (2012) present the state of the art, and OECD (2013) defends the continued relevance of well-being research.

5. Kahneman et al. (2004a, 2004b) explain the virtues of experience sampling for science and national accounting; the work of Stone et al. (2016) is an example of its use in research.

6. There are different ways of measuring life satisfaction. The Satisfaction with Life Scale (SWLS) is one popular five-item Likert scale. Its prominence is due to it being short and to the fact that it has by now been used in hundreds of studies, especially by its originator, the prolific psychologist Ed Diener and his colleagues and students (Diener et al. 1985, 2008). Questions about life satisfaction also figure in all the main large-scale surveys and panel datasets worldwide, such as the German Socio-Economic Panel, the UK British Household Panel Survey, and the Australian HILDA Survey.

7. Though see Larroulet-Philippi (2021) and Vessonen (2019) for worries about these standard conventions regarding validity.

8. See Lucas and Lawless (2013) and Oishi et al. (2003) for a defense of life satisfaction judgments, and Deaton and Stone (2016) and Lucas et al. (2016) for the latest debate on context effects.

9. See Davies (2015) for a critical commentary. For a sympathetic one, see Express KCS (2015).

10. Centre for Economic Performance (2016).

11. We develop this argument further in Singh and Alexandrova (2020). For more on whether the practicality of the life satisfaction measure justifies its dominance, see Mitchell and Alexandrova (2020).