The Tyranny of the Ideal: Justice in a Diverse Society

The Elusive Ideal

Searching under a Single Perspective

“Would you tell me, please, which way I ought to go from here?”

“That depends a good deal on where you want to get to,” said the Cat.

—LEWIS CARROLL

1.1 Evaluative Perspectives and the Social Realizations Condition

In his Lectures on the History of Political Philosophy, Rawls tells us that “a normalization of interests attributed to the parties” is “common to social contract doctrines.”¹ This remark is made in the context of discussing Rousseau’s notion of the general will, which is also said to require a shared “point of view.”² On Rawls’s reading of Rousseau, private individuals are characterized by a variety of different interests that are magnified by self-bias and selfishness. Such individuals can live together under freely endorsed common laws only if they “share a conception of the common good.”³ This shared conception, in turn, is generated from individuals’ shared fundamental interests and capacities, which derive from their shared human nature. As Rawls sees it, these common fundamental interests allow individuals to abstract from their differences and occupy a shared legislative point of view, based on a shared conception of the common good.⁴ Furthermore, when occupying this common view the parties all have the same basis for their deliberations, and so everyone wills the same laws, and this is what allows them to live together under freely endorsed common laws.

Rawls is informally articulating the concept of a perspective—or, as I shall say, an evaluative perspective—which has been more thoroughly explored and formally modeled in the last decade.⁵ The rest of this book develops and analyzes the idea of an evaluative perspective on ideal justice. Although we shall see that evaluative perspectives are implicit in much political theory—especially in thinking about ideal justice—they are unfamiliar to most political philosophers. I try to render some formal ideas intuitive as far as is possible, without sacrificing too much rigor. More formal issues are dealt with in appendixes. It still takes a bit of work to understand the idea of a perspective and how it enters into political philosophy, but the work will pay off: features of theories of justice that are not obvious come to the forefront once we focus on perspectives.

As I shall understand it, an evaluative perspective, Σ, includes three fundamental elements (we shall see in §II.1.2 that two additional elements are required for ideal theory):

(ES) A set of evaluative standards or criteria by which alternative social worlds in a domain {X} are to be evaluated.

(WF) For all worlds i in the domain {X}, a specification of the world features of i that are relevant to evaluation according to ES, the evaluative standards. Specifying the relevant world features (WF) requires a set of categorizations that constitute the justice-relevant description of world i on perspective Σ. A description of a social world includes its institutions, social and economic dynamics, and relevant background conditions, including relevant economic and psychological facts (§§I.2.1–2, I.3.1). The domain {X} includes the current social world and the ideal, as well as other nonideal worlds in the option space.

(MP) A mapping function takes the evaluative standards (ES) and applies them to a social world, i, as specified by WF, yielding what I shall call a justice score for world i, the social world described by world features WF_i. The mapping function has two roles. (i) The mapping function must employ a model or models that predict how the justice-relevant features of a social world (WF_i) will interact to produce a social realization. This is the modeling task of the mapping function (§I.3.1). (ii) The mapping function must take the set of evaluative standards (ES) and determine their relative importance in such a way that they provide a single evaluation of this social realization. This might be called the overall evaluation task of the mapping function. Given (i) and (ii), a mapping function takes ES and maps it on to a realization based on WF and in so doing generates a justice score for a social world.

Rawls’s description of the normalized interests supposed by all social contracts seems to refer only to the set of evaluative standards (ES)—the parties’ shared interests. But without specifying the world’s features to be evaluated (WF) and a mapping function (MP), a shared set of evaluative standards will not suffice for a shared evaluation of a social world’s justice. If individuals are going to adopt a shared point of view they must also agree about precisely what they are evaluating (the basic structure? the family? churches?),⁶ their ultimate social realizations, and how to apply their shared standards to that which is being evaluated. It is important to stress that these elements of a perspective are not arbitrary, nor simply an artifact of our model: together they satisfy requirements for an ideal theory that we uncovered in chapter I. A theory of ideal justice, T, that includes ES, WF, and MP satisfies the Social Realizations Condition (§I.4), namely:

T must evaluate a set (or domain) of social worlds {X}. For each social world i, that is a member of {X}, T evaluates i in terms of its realization of justice (or, more broadly, relevant evaluative standards). This must yield a consistent comparative ranking of the members of {X}, which must include the present social world and the ideal, in terms of their justice.

A theory meeting ES, WF, and MP evaluates a set (or domain) of social worlds {X} in terms of their realization of justice. The cardinal justice score guarantees a comparative (noncyclical) ranking of the members of {X}, which includes the present social world and the ideal.
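The way ES, WF, and MP jointly yield a ranking can be illustrated with a minimal sketch. Everything in it is an assumption made for the example: the three world names, the three standards, the numeric feature values, the weights, and above all the weighted-sum form of the mapping function. The modeling task of MP (predicting social realizations) is left out, and nothing in the account requires that a perspective take this linear form.

```python
# Illustrative sketch only: every name, weight, and number here is an assumption.

# WF: the justice-relevant description of each social world in the domain {X}.
# Each feature is assumed to be already expressed as a degree in [0, 1].
WORLD_FEATURES = {
    "current": {"liberty": 0.6, "equality": 0.4, "prosperity": 0.7},
    "reform":  {"liberty": 0.7, "equality": 0.6, "prosperity": 0.6},
    "ideal":   {"liberty": 0.9, "equality": 0.9, "prosperity": 0.8},
}

# ES: the evaluative standards. The weights stand in for the "overall
# evaluation" role of the mapping function (MP).
WEIGHTS = {"liberty": 0.5, "equality": 0.3, "prosperity": 0.2}


def justice_score(features, weights=WEIGHTS):
    """MP (evaluation role only): map one world's features to a 0-100 score."""
    return 100 * sum(weights[s] * features[s] for s in weights)


def rank_worlds(domain):
    """Order the domain from most to least just; the first entry is the optimum."""
    return sorted(domain, key=lambda w: justice_score(domain[w]), reverse=True)


ranking = rank_worlds(WORLD_FEATURES)
print(ranking)  # ['ideal', 'reform', 'current']
```

Because each world receives a single number, the resulting ranking cannot cycle, and the domain has a best element.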

Some explanatory comments about these elements of an evaluative perspective are in order. Consider first the world features (WF). I suppose that in perspective Σ, no two social worlds share the exact same justice-relevant features. Social world a may have justice-relevant features {f, g, h}, while b has some, but not all of these features (e.g., f and g, but not h), and c, a still different set of features (e.g., f and h, but not g). Social worlds are thus individuated by their justice-relevant features. Note that condition WF refers only to the justice-relevant features of a social world and their realization, since only these are strictly necessary to generate the evaluations required by ideal theory.⁷ We should not think, though, that our model implies that an ideal theorist starts out by knowing which features of the world are relevant to justice. In the development of its perspective, Σ would no doubt start by identifying a large set of features, and then, given its evaluative standards (ES), would determine which are relevant to justice. Indeed, later on (§§III.2.4–5), I shall stress how the evaluative standards and world features are interconnected within a perspective—a perspective’s evaluative standards affect its view of what features of the world are relevant to evaluation and vice versa. Certainly no assumption is made here that these features of a perspective are independent of each other—indeed we shall see quite the opposite obtains. All the model requires is that before a final judgment of the justice of any social world i can be generated, perspective Σ must have settled on the criteria of evaluation (ES) and the features to be evaluated (WF).

To begin to clarify the crucial, but typically overlooked, mapping relation (MP) between the evaluative standards and the justice-relevant features of the social worlds, consider three different procedures that might be employed. The first we might call “categorical” and is probably the most common type of evaluation in philosophy. Categorical judgments are concerned with whether or not a social world is just; employing this procedure would yield a series of yes/no judgments.⁸ Whatever their attractions in other contexts, such judgments are of little use for an ideal theory that seeks to orient our quest for justice by guiding us to better (i.e., more just) worlds, short of the ideal. According to the Social Realizations Condition, we need an evaluative function capable of generating a range of justice judgments. John Broome contrasts the philosopher’s use of “categorical” judgments with the economist’s focus on “comparative” judgments. Whereas the philosopher asks “Is a just or unjust?” the economist asks “Is a more or less just than b?”⁹

The idea of a comparative judgment, however, is itself ambiguous between strictly comparative and scalar interpretations. On the strictly comparative interpretation one can form a judgment about a given social world, a, only by comparing it to another social world, b. On this reading, judgments of justice have the same logical structure as preferences: they are primitive binary relations. Although philosophers often conceive of preferences, at their most basic level, as unitary states such as desires, so that one can have simply “a preference for x,” in decision theory they are inherently binary: one can only have a preference for x over something else, say y. Thus in decision theory a preference for x over y is not a comparison of, say, the strength of one’s preference for x and of one’s preference for y (concepts that do not enter into decision theory); “x is preferred to y” is the most primitive preference judgment. Similarly, then, a comparative notion of justice might say that the most primitive justice judgment is binary: “a is more just than b.” Such a comparative judgment would be primitive as it is not a comparison of two judgments—of a’s and b’s justice. Furthermore, on this understanding of the comparative nature of justice, because all judgments of justice share this primitive binary comparative structure with preference, just as we could have intransitive preferences (x is preferred to y, y is preferred to z, yet z is preferred to x), we could also have intransitive judgments of justice (a is more just than b, b is more just than c, yet c is more just than a).¹⁰ As is well known, unless we impose some sort of transitivity axiom (as does Sen’s social choice theory approach; §I.1.3) pairwise judgments made on the basis of N-dimensional underlying considerations can lead to a host of such pathologies.¹¹
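One familiar way such a cycle can arise is sketched below. The three worlds, their scores on three underlying considerations, and the rule that one world is "more just" than another when it does better on most considerations are all assumptions chosen for the illustration; they are not drawn from any particular theory of justice.

```python
from itertools import permutations

# Hypothetical scores of three worlds on three underlying considerations
# (higher is better on each dimension).
WORLDS = {
    "a": (3, 1, 2),
    "b": (2, 3, 1),
    "c": (1, 2, 3),
}


def more_just(x, y):
    """Primitive binary judgment: x beats y if x is better on more dimensions."""
    pairs = list(zip(WORLDS[x], WORLDS[y]))
    wins = sum(1 for xi, yi in pairs if xi > yi)
    losses = sum(1 for xi, yi in pairs if xi < yi)
    return wins > losses


def find_cycles():
    """Return every ordered triple (x, y, z) with x > y, y > z, and z > x."""
    return [
        (x, y, z)
        for x, y, z in permutations(WORLDS, 3)
        if more_just(x, y) and more_just(y, z) and more_just(z, x)
    ]


cycles = find_cycles()
print(cycles)  # [('a', 'b', 'c'), ('b', 'c', 'a'), ('c', 'a', 'b')]
```

Each pairwise judgment looks reasonable taken alone, yet together they leave the set {a, b, c} with no best element.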

Because of this, without modification (such as imposing consistency conditions) such a strictly comparative approach is inappropriate as a model of ideal theorizing: the Social Realizations Condition may well not be met. Very roughly, unless we impose some sort of transitivity on the set of binary judgments in {X} (thereby creating an ordering of {X}) we cannot be assured that there will be a best element in the set—an ideal. A set with cycles at the top of the ordering of {X} would leave us without an ideal, and cycles involving many elements in the middle could lead us around in circles in our quest for greater justice.¹² To avoid these problems, I model the evaluation of the justice of social worlds as based on a scalar function jointly defined by the evaluative standards, the world features and the mapping function, which is not inherently comparative (though, importantly, neither is it categorical). On this model, an evaluative perspective Σ has a set of evaluative standards (ES) and a consistent way of applying them to social worlds (MP), which generate for any social world a justice score that concerns that world alone, without comparison to others. Comparing the justice of two social worlds is thus not a primitive binary relation, but a comparison of two different justice evaluations. Some understanding of comparative justice along these lines is necessary if we are to model the Social Realizations Condition: our ideal theory requires an optimal element (the ideal) in domain {X}, as well as a consistent set of comparative rankings of less-than-optimal elements. And if we are to make sense of the metaphor of a mountain range with “high peaks” we would do best to suppose some sort of cardinal-like scale, if only to understand the pure logic of ideal theory.

A quintessential philosopher may insist that something is lost in this comparative (noncategorical) approach: we have only comparisons of degrees of justice rather than two categories, “just” and “unjust,” which call for very different responses. Thus the quintessential philosopher may wish to at least introduce a “positive” and “negative” range in the justice scores, identifying classes of “just” and “unjust” societies. Now while in some contexts a binary contrast between justice and injustice may be critical, in a theory of the ideal we are seeking to chart a course from where we are to increasingly just social states. Along the way, we wish to avoid getting much less just than we are presently, while finding a way to reach the heights of justice. To understand this quest for the ideal, a distinction between the areas of the unjust and of the just is not necessary. That said, within a strictly scalar approach (ranging, say, from 0 to 100), a theory of the ideal can have a mapping function such that, unless evaluative criteria are met to some threshold level, very low overall scores of justice result, while above a certain threshold the scores become very high and perhaps compressed.¹³ Our conception of a perspective on ideal justice allows, but does not require, this type of evaluation.

The mapping function has a number of critical tasks. Supposing that there are multiple evaluative standards, the mapping function must specify some sort of weighting system; it takes these multiple criteria and employs them to evaluate a social world.¹⁴ In Rawls’s theory the mapping function can be understood as the original position.¹⁵ Recall that in chapter I (§§2.1–2) I was skeptical of principle-defined ideals: even Rawls, who was so devoted to his two principles of justice, thought that the General Conception was more appropriate in some social worlds. Suppose, then, that we wish to evaluate two social worlds: a, whose features and their realization include severe economic deprivation of the entire society and b, with far more affluence. Rawls suggests that the justice of a should be determined by the General Conception while b should be evaluated in terms of the Special Conception. As both are part of the same theory of justice, there must be a set of underlying values (liberty, equality, reciprocity, quality of life prospects) that regulates this choice; the parties to the original position, in deciding that world a is to be evaluated by one principle and b another, are, in effect, applying different trade-off rates between liberty and income in the two worlds: in world b no inequality of liberty for the sake of greater resources is allowed, while in world a such trade-offs are allowed. The model device of the original position thus is part of what I have called the “mapping function,” taking a set of underlying values and applying them to the evaluation of different social worlds.

That said, the model to be developed here does not depend on taking the set of underlying values and ideals—part of what Hamlin and Stemplowska call a “theory of ideals”—as more basic than the principles of justice. A theory might take the opposite approach. Instead of seeing principles as ways of commensurating underlying values in different social worlds, a perspective could insist that “in political philosophy as much as in aesthetics, the comparison of certain objects regarding their value (such as societies or paintings) depends on certain principles.”¹⁶ Such a perspective could put principles of justice at the foundation of the set of evaluative standards. So long as the principles were specified in a way such that they apply to all social worlds in the domain {X} to be evaluated,¹⁷ and the mapping function could score social worlds on how well they satisfied these principles, a principle-based perspective can be modeled. What is required is that, in some way, the perspective seeks to optimize the satisfaction of the evaluative standards, with the global optimum (the best in {X}), identified as the ideal. The evaluative standards, as weighted by the mapping function, yield a justice score for social worlds and thus define the ideal (as the optimum), as well as allowing comparisons of less-than-ideal conditions.¹⁸ Something like optimization is required to meet the Social Realizations Condition.

We should not think of this as a “mechanical” optimization according to some obvious formulae. The mapping function can include weights and lexical principles, can allow for a number of ties, and can be context sensitive.¹⁹ Different weights can be used in different circumstances (think again of Rawls’s General Conception).

Also critical to a mapping relation must be a predictive model or models (part [i]), which yield an estimate of the social realization of the set of justice-relevant features in a social world. As we have seen (§I.3.1), every set of world features will be associated, within a given model or models, with a probability distribution of social realizations. It is simply implausible to suppose that our understanding of nonexistent social worlds is so refined that for each set of features (WF), there will be one and only one possible justice-relevant social realization.²⁰ That is why we suppose that each well-developed perspective on ideal justice must include a model or models of the social realizations of a set of world features. Supposing this, the mapping function must take the range of possible social realizations for any given set of relevant features (WF), and generate some sort of expected justice score. We can imagine optimistic functions that identify the justice of a set of features (a social world) with the best social realizations that might arise from it,²¹ and pessimistic ones that focus on the nastiest ones.²² And of course some theories seek to weigh the risks.²³
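The three attitudes toward uncertainty just described can be put side by side in a short sketch. The probability distribution and the justice scores of the individual realizations are invented for the example, and reading "weighing the risks" as a probability-weighted average is only one possible interpretation, adopted here as an assumption.

```python
# Hypothetical distribution of social realizations for one set of world
# features (WF): each entry is (probability, justice score of that realization).
REALIZATIONS = [(0.2, 90.0), (0.5, 60.0), (0.3, 20.0)]


def optimistic(realizations):
    """Identify the world's justice with its best possible realization."""
    return max(score for _, score in realizations)


def pessimistic(realizations):
    """Identify the world's justice with its worst possible realization."""
    return min(score for _, score in realizations)


def risk_weighted(realizations):
    """Assumed reading of 'weighing the risks': probability-weighted average."""
    return sum(p * score for p, score in realizations)


print(optimistic(REALIZATIONS), pessimistic(REALIZATIONS))  # 90.0 20.0
```

The same set of world features thus receives a different justice score depending on which of these functions the perspective builds into its mapping relation.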

1.2 Meaningful Structures and the Orientation Condition

Recall the Orientation Condition:

T’s overall evaluation of nonideal members of {X} must necessarily refer to their “proximity” to the ideal social world, u, which is a member of {X}. This proximity measure cannot be simply reduced to an ordering of the members of {X} in terms of their inherent justice.

The Orientation Condition requires that a theory of the ideal must make sense of the idea of the “proximity” of social worlds a and b, and how far they are from the global optimum, u, and this proximity cannot simply express their overall inherent justice scores. As I have been stressing, unless the Orientation Condition is met, T could meet the Social Realizations Condition and yet be a simple “climbing theory” à la Sen (§I.1.3), generating an ordering of social states (and in this sense proximity judgments) purely in terms of their inherent justice; the world closest to a would be the world in the domain {X} with the closest justice score. The Orientation Condition requires that we make sense of proximity judgments of social worlds in a way that does not simply reflect their inherent justice.

I believe that by far the best way to think about the “distance” between two social worlds is in terms of how similar their underlying features are (see further §II.1.3). When I say that social world a is “very close” to social world b, I shall mean that it is very similar to it. This makes sense of the critical idea that the ideal orients our pursuit of justice by providing a direction of our endeavors. We move toward the ideal by making our world more like it, by changing our institutions and background facts so that they better align with the ideal world. The insight that the ideal is needed to give direction to our endeavors for reform cannot, as I have been arguing, simply be that we should reform so as to make our society more just; no doubt that is true, but, once again, Sen’s climbing model can perfectly accommodate that. To say that the ideal gives direction for reform is to say that we know the sort of society (its world features) that characterizes the ideal, and this knowledge should help us in reforming the characteristics of our society.

Consequently, we need to add a fourth element to a perspective on justice that underlies a theory in which the ideal is ineliminable: the perspective must be able to generate consistent judgments of the form that, with respect to the justice-relevant features, “social world a is more similar to b than a is to c,” which I shall denote as [(a∼b)>(a∼c)].²⁴ Given the underlying features (WF) of worlds a, b, and the ideal, u, an ineliminably ideal perspective must be able to judge whether a or b is more similar to u in its defining features. This seems intuitive. Those worlds that, according to the world feature (WF) element of perspective Σ, have nearly identical justice-relevant properties will be seen by Σ as very similar worlds; if the justice-relevant properties of a are almost identical to those of b, worlds a and b will be very close; if the justice-relevant properties of world a are very different from u, it will be very far from u. To say that utopia is far from our current world is to stress how different its features are from our own; if utopia (or dystopia) is close, only a few modifications of our present world would bring it about. This is intuitively obvious at the limit: a is maximally close to itself because it has the precise same justice-relevant features as itself. The world closest to a will be that member of the domain {X} the justice-relevant properties of which are most similar to a. It is essential to stress that these similarity judgments are internal to a perspective: one of the things that defines a perspective on justice is not only what features of worlds it identifies as relevant to justice (WF), but which worlds it sees as very similar to others.

Understanding “distance” as an indication of likeness on pairwise comparisons conforms to the general literature on measuring “distance” between different systems. As Martin L. Weitzman stresses, “Distance is such an absolutely fundamental concept in the measurement of dissimilarity that it must play an essential role in any meaningful theory of diversity or classification. Therefore, it seems to me, the focus of theoretical discussion must be about whether or not a particular set of distances is appropriate for the measurement of pairwise dissimilarity in a particular context, not about whether or not such distances exist in the first place.”²⁵ I assume that pairwise similarity is the basic relation, on which more complex measures of distance build. For simplicity’s sake, I shall suppose that a theory of the ideal can array the domain of social worlds to be evaluated, {X}, as a consistent similarity ordering where the end points are most dissimilar from each other. (There are several ways in which such an ordering could be generated, and each has its distinct formal characteristics. For those who are interested in these matters, appendix A explores different approaches and some problems that arise.) I shall call this the Similarity Ordering (SO) of a perspective on ideal justice, Σ.

It might seem that this idea of an ordering of social worlds in terms of their overall similarity is artificial, and simply a result of accepting the Orientation Condition. Not so. In addition to identifying the relevant features of a social world and evaluating them (the tasks of evaluative standards, world features, and the mapping function of a perspective), a perspective on any optimization problem over a domain is useful only if it creates what Scott Page calls a “meaningful structure” or “meaningful relatedness” for that domain.²⁶ A perspective on a problem like seeking justice does not simply see the option space as a random collection of social worlds with diverse justice-relevant features, but rather as a set of options differentiated by systematic variations in their underlying properties. The organization of the option space in terms of systematic variation in underlying structures is how a perspective makes sense of the domain; given that certain properties are relevant to justice (WF), Σ then arranges the options in terms of their variations of these fundamental properties. And so it makes sense of the optimization problem confronting the theory. It is thus part and parcel of this view of the meaningful structure of the domain {X} that Σ be able to make the sort of similarity judgments that the Orientation Condition requires.

The Orientation Condition refers to “proximity” to the ideal social world, u. An overall similarity ordering is a rough notion of “distance,” but only very rough. If we have a set of five ordered social worlds {a, b, c, d, u} in one sense we can say that b and u are far from each other, yet it may also be the case that the relations in figure 2-1 obtain. I thus suppose that a sophisticated perspective on ideal justice enriches its similarity ordering (SO) over the domain {X} by applying a distance metric. Recall Weitzman’s observation that “distance is such an absolutely fundamental concept in the measurement of dissimilarity that it must play an essential role in any meaningful theory of diversity or classification.”²⁷ As he remarks, the real dispute between theories that seek to model similarity is not whether a distance metric will be used, but which one it will be. Different perspectives will, Weitzman suggests, arrive at different distance metrics; thus I do not suppose any specific metric for a perspective. However, to help fix the general idea of a distance metric, we can give a formal characterization. Let us say that Σ defines a metric space—an ordered pair (X, d), where {X} is the domain of social worlds and d is a function on {X} that defines the distance between all points in {X}. For now, at least, we assume that the distance metric (DM) is constrained by the prior complete similarity ordering (SO) of {X}. Again following Weitzman, we can say that such distances (X, d) must satisfy three core conditions. ∀i, j ∈ {X}: (1) d(i, j) ≥ 0;²⁸ (2) d(i, i) = 0; (3) d(i, j) = d(j, i). To which we add (4) ∀i, j, k ∈ {X}: [(i∼j) > (i∼k)] ⇔ [d(i, j) < d(i, k)].²⁹
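Conditions (1) through (4) can be checked mechanically for any finite domain, as the following sketch shows. The five worlds are placed at invented positions on a single line, with u set apart from the other four; the positions, the absolute-difference distance, and the stand-in for the similarity judgment [(i∼j) > (i∼k)] are all assumptions made for the illustration and are not meant to reproduce figure 2-1.

```python
from itertools import product

# Hypothetical positions of five worlds on a single similarity dimension.
POSITION = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 3.0, "u": 10.0}
WORLDS = list(POSITION)


def d(i, j):
    """Assumed distance function: absolute difference of positions."""
    return abs(POSITION[i] - POSITION[j])


def more_similar(i, j, k):
    """Stand-in for the judgment [(i~j) > (i~k)], stipulated for the example."""
    return abs(POSITION[i] - POSITION[j]) < abs(POSITION[i] - POSITION[k])


def satisfies_conditions(worlds, dist, similar):
    """Check conditions (1)-(4) over every pair and triple in the domain."""
    for i, j in product(worlds, repeat=2):
        if dist(i, j) < 0:            # (1) non-negativity
            return False
        if dist(i, j) != dist(j, i):  # (3) symmetry
            return False
    if any(dist(i, i) != 0 for i in worlds):  # (2) zero self-distance
        return False
    for i, j, k in product(worlds, repeat=3):  # (4) agreement with similarity
        if similar(i, j, k) != (dist(i, j) < dist(i, k)):
            return False
    return True


print(satisfies_conditions(WORLDS, d, more_similar))  # True
```

A signed difference such as `POSITION[i] - POSITION[j]` would fail the check, since it violates both non-negativity and symmetry.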

Figure 2-1. Orderings and metrics

With these last two features of a perspective (the complete similarity ordering and distance metric) we now can fully meet the Orientation Condition. A perspective containing the final two elements can generate an overall evaluation of nonideal members of {X}, not simply by referring to their individual justice scores (as the Social Realizations Condition requires), but also by their similarity-distance to the ideal social world, u, which is a member of {X}. Social world u, we can say, is the global optimum in {X}—it has the greatest justice score. But the distance metric implies that some world i, whose justice score is close to that of u might not be close in this second sense—its underlying structure could be far from u’s. Once again, it is important to stress that introducing the distance metric is by no means an artificial idea simply motivated by my statement of the Orientation Condition. If we seek to understand how our social world is similar to others, and which are most similar to the ideal, and this is not to collapse into Sen’s nonideal climbing model (§I.1.3), some understanding of overall structural similarity is critical, and distance metrics are standard in modeling the similarity (or diversity) of different elements in a given domain (e.g., ecosystems).
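The contrast between the two senses of closeness can be made concrete with a small sketch. The four worlds, their justice scores, and their positions in a one-dimensional similarity space are invented for the example; the only point is that the world nearest to u in justice score need not be the world nearest to u in structure.

```python
# Hypothetical worlds: name -> (justice score, position in similarity space).
WORLDS = {
    "current":     (50.0, 0.0),
    "near_reform": (55.0, 1.0),
    "rival_peak":  (88.0, 9.0),
    "u":           (90.0, 4.0),
}


def closest_by_score(target, worlds):
    """The world whose justice score is nearest to the target's score."""
    others = [w for w in worlds if w != target]
    return min(others, key=lambda w: abs(worlds[w][0] - worlds[target][0]))


def closest_by_structure(target, worlds):
    """The world whose position in similarity space is nearest to the target's."""
    others = [w for w in worlds if w != target]
    return min(others, key=lambda w: abs(worlds[w][1] - worlds[target][1]))


print(closest_by_score("u", WORLDS))      # rival_peak
print(closest_by_structure("u", WORLDS))  # near_reform
```

A pure climbing model sees only the first of these two answers; a perspective with SO and DM can also give the second.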

We now have specified five elements of any evaluative perspective suited to ideal justice:

(ES) A set of evaluative standards;

(WF) An identification of the relevant features of social worlds;

(MP) A mapping relation from evaluative standards to the justice-relevant features of social worlds, yielding a justice score for each social world;

(SO) An ordering of the underlying structures as identified in WF that relates the social worlds in {X} in terms of a complete, consistent similarity ordering;

(DM) A distance metric applied to this similarity ordering.

Together ES, WF, and MP satisfy the Social Realizations Condition, while SO and DM satisfy the Orientation Condition.

1.3 Why Not Feasibility?

I have interpreted the Orientation Condition as requiring a second dimension of evaluation (in addition to the intrinsic justice of a social world), that of similarity of justice-relevant structures and background facts. No doubt some readers have been puzzled, assuming that this second dimension must be some sort of feasibility metric or space. Indeed, in Simmons’s rejoinder to Sen’s “climbing model,” he insists that a mere ordering of states of affairs in terms of their intrinsic justice will not suffice because, in addition to the height (justice score) of a location, we are interested in “feasible paths to the highest peak of perfect justice” (§I.1.3).³⁰ So: why not feasibility rather than similarity?

I certainly do not deny that, in many respects and in many contexts, feasibility is critical. I have insisted that some notion of feasibility—no matter how optimistic—is always part of modeling a possible world and its justice (§I.3.1); because the mapping function includes such models of worlds, a notion of feasibility is intrinsic to our account of perspectives on justice. And, of course, to make sound action recommendations to a person, a political entity, or a collectivity, a different sort of feasibility consideration is critical—the feasibility of moves from one social world to another. The importance of feasibility in many contexts is undeniable.³¹ Our present concern, however, is whether something like a “feasibility metric” can satisfy the Orientation Condition in ideal theories of justice.

I think it is quite clear that, while recent work in political theory has focused on feasibility as central to the “ideal/nonideal debate,” it is not an appropriate metric by which to satisfy the Orientation Condition. Gilabert and Lawford-Smith analyze feasibility as a four-placed predicate of the form: “It is feasible for X to ϕ to bring about O in Z,” where X is an agent, ϕ a set of actions, O a set of outcomes, and Z a context, which “might be broad, ranging over all of human history, or narrow, limited to a particular time and a particular place.”³² Notice that feasibility is indexed to agents, time spans, and contexts. Thus outcome O may be feasible for Alf at time t₁ in circumstances C₁, but not at time t₂, but might be again feasible at t₃, though now the circumstances have changed. Perhaps it was feasible for Betty only at t₂ in circumstances C₂. For a theory of the ideal to specify a plausible feasibility space to orient our quest for justice, it would have to specify not only the agent (for example, feasibility could be defined in terms of the US Congress, the American people, Western society), but a time period. Suppose then a theory specifies Z as “any time within the next decade” and X = “US Congress and president.” So only things that are feasible throughout the entire decade for the American Congress and president are in the feasibility space. But again, even this claim must be indexed to time. At the beginning of the decade, time t₀, it may be true that (i) Congress along with the president can ϕ to achieve outcome O any time during the time span t₀–t₁₀. But what Congress can achieve any time in the decade is apt to change as the decade proceeds, and Congress and the president take other actions that impact on whether ϕ can still achieve outcome O. Thus at time t₃, it may (ii) be false that Congress can ϕ to achieve outcome O during the remaining time span t₃–t₁₀, which would contradict claim (i), unless each is further indexed to a certain time for which it is true.

For example, in the American Civil War many in the North commenced the war to restore the union to something like its antebellum form, with slavery permitted in the southern states (but not federal territories). This was also the initial aim of Lincoln, and it certainly seemed feasible at the outbreak of the war; it is plausible to say that for most in the North the war was undertaken to achieve this end. But Lincoln decided that the Emancipation Proclamation was needed for successful prosecution of the war; after it, and after the years of bloody battles, it was very likely no longer feasible in 1863 that the union could be restored to its antebellum form, even though the Democratic Party fought the 1864 election on a platform basically devoted to achieving it. In this case an aim O that was perfectly feasible in 1861 (and in 1861 would seem feasible throughout the next five years) became infeasible in 1864, interestingly because of events that were instigated to achieve O. Given the way feasibility changes as actors proceed through time, it seems important to always index feasibility judgments to the time of evaluation.³³ But, of course, then “feasibility space” is highly unstable, and that seems inappropriate for a theory of justice.

Additionally, feasibility judgments have many different dimensions, which operate over differing time spans, agents, and contexts. Wiens nicely sketches the many different dimensions of feasibility:

We can make our assessment more tractable by grouping the relevant facts into analytically useful general categories. Some straightforwardly rigid constraints are logical consistency, the laws of nature and human biology. But we should also attend to less rigid, more malleable constraints. I only present a few salient examples here, leaving development of a full list as a practical exercise: ability constraints, which comprise facts about human abilities; cognitive constraints, which comprise facts about our cognitive capacity, including cognitive biases and computational limitations; economic constraints, which—if taken broadly—comprise facts about possible allocations of money, labour power and time; institutional constraints, which include facts about institutional structure and capacity (for example, the number and distribution of veto points in a collective decision procedure and the ways in which political officials are selected); and technological constraints, which include facts about the tools, techniques and organizational schemes available for bringing about new states of affairs. I [include] … motivational constraints, which identify the limits of what people can be motivated to do given intrinsic features of human agents that affect motivation (including affective biases, prejudices and fears), as well as the extrinsic features of an agent’s environment that interface with her intrinsic motivational capacities (including social norms and incentives).³⁴

Wiens argues that different features can be aggregated into a general notion of feasibility (though he is somewhat skeptical about how accurate such general feasibility judgments might be).³⁵ He proposes aggregating all these facets into an overall “feasibility frontier,” modeled on the economists’ “production possibility frontier.” According to Wiens, “we can say that a possible world is a member of the feasible set only if that world is circumstantially accessible from the actual world. … It follows that realizing a state is feasible only if there is at least one world at which the state is realized that is circumstantially accessible from the actual world; realizing the state is otherwise infeasible.”³⁶ The idea of “circumstantially accessible” does not mean simply that no constraints exclude the move, but that given the resources and causal process obtaining in the present world, the feasible state of affairs can be brought about, much like the production possibility frontier tells us what level of production can be brought about in a given economy. Others are skeptical whether the many different layers of feasibility, differing according to time spans, and contexts, can be coherently brought together in this way.³⁷ Again, even if all this could be done, feasibility judgments must be time and agent indexed. Because the conditions that underlie feasibility are constantly shifting, the landscape of justice would be as well. While we may have to take account of these constant shifts for policy purposes, such a landscape would hardly orient our thinking about justice, providing what Rawls called “a long-term goal of political endeavor” (§I.1.2).³⁸

Perhaps, it might be thought, a feasibility space that could satisfy the Orientation Condition could be constructed with some more fixed notion of feasibility, say “social engineering feasibility”: social world i is close to j just in case “we” (the agents still must be defined) have a social technology that could (under some parameters) reform i into j. I shall not explore this idea. But note that any space defined in terms of feasibility will behave oddly. Recall (§II.1.2) that we adopted a constraint on distance metrics: d(i, j) = d(j, i), which requires that the distance between worlds i and j is the same as that between j and i. Our similarity metric (DM) satisfies this general condition. A feasibility-based similarity metric would not. As Geoffrey Brennan and his coauthors point out, some social states are “absorbing” in the sense that once a society is in that state, it may be very difficult to get out—such states are “sink holes.”³⁹ Thus a move from i into the “sinkhole” j may be highly feasible (so, not distant in feasibility space), but moving back from j is not at all feasible, thus undermining symmetry: i is close to j, but j is not close to i. Note also that “feasible to move” is, at least in one sense, not a transitive relation: that it is feasible to move from i to j, and that it is feasible to move from j to k, does not mean that it is feasible to move directly from i to k; in fact that might be impossible.⁴⁰ Jon Elster believes that a utopian theory must advocate such transitivity. “Utopianism says that what can be done in two steps can also be done in one step.”⁴¹ Elster thinks this is false on its face, but unless some sort of transitivity holds an ideal theory may well be committed to the much-dreaded “incrementalism”;⁴² in order to get to the ideal, the theory may require that we proceed through the worlds “on the way” to the ideal, in something like a step-by-step fashion. 
Moreover, the time indexicality of feasibility leads to other relations that look intransitive. If at time t, it is feasible to move from i to j, and at time t, it is feasible to move from j to k, it does not follow that at time t it is feasible to move from i to k, either directly (as noted above) or indirectly (because by the time we move from i to j, time t will no longer obtain, and we may not know whether a move from j to k is feasible at time t₂). Whatever else may be the case with this complicated idea, feasibility space is certainly oddly behaved and shifting—indeed rather disorienting.
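The asymmetry and the failure of one-step transitivity can be made concrete with a small sketch. It assumes, purely for illustration, a hypothetical set of four worlds (i, j, k, and a “sinkhole” s) and an invented table of which single moves are feasible; none of these specifics come from the text. Distance in “feasibility space” is taken here to be the fewest feasible moves between two worlds, which is one possible reading among several.

```python
from collections import deque

# Hypothetical one-step feasible moves (illustrative assumption, not from the text).
# s is an absorbing "sinkhole": it can be entered from i but never left.
FEASIBLE_MOVES = {
    "i": {"j", "s"},
    "j": {"i", "k"},
    "k": {"j"},
    "s": set(),
}

def feasibility_distance(moves, start, goal):
    """Fewest feasible moves from start to goal, or None if goal is unreachable."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        world, steps = frontier.popleft()
        if world == goal:
            return steps
        for nxt in moves[world]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + 1))
    return None

d_i_to_s = feasibility_distance(FEASIBLE_MOVES, "i", "s")   # into the sinkhole
d_s_to_i = feasibility_distance(FEASIBLE_MOVES, "s", "i")   # back out of it
direct_i_to_k = "k" in FEASIBLE_MOVES["i"]                  # one-step move i -> k?
d_i_to_k = feasibility_distance(FEASIBLE_MOVES, "i", "k")   # via j
```

On these assumed moves the distance from i to s is one step while the distance from s to i is undefined, so d(i, s) ≠ d(s, i); and although i to j and j to k are each feasible in one step, i to k is reachable only through j.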

For a number of reasons, then, the Orientation Condition should not be interpreted in terms of feasibility. As Juha Räikkä observes, “To construct a political theory is not necessarily to engage in politics. … A political theory concerns the issue of what one is justified in thinking about the moral status of societal institutions, and it does not follow that what one is justified in thinking, one should do at the moment.”⁴³ All this is entirely right, so long as we stress his final words—“at the moment.” An ideal theory of justice seeks to orient our long-term quest for justice, not its time/agent-indexed best feasible moves. But this is not to say that even an ideal theory is about only what we should think, not what we should do. They are not ultimately separable, for to think about justice is to think about where we should move, and how to engage in this quest (§I.1.5). Note that our similarity ordering (SO) is by no means irrelevant to a type of feasibility: we would expect a reasonably high correlation between similarity as I have characterized it and many judgments of feasibility. If a world is very close to our own, and only a few changes are needed to bring it about, it is likely to be feasible, at least in the middle term, for political institutions and social movements to bring it about. I hasten to add that this will not always be the case, but it often will be, and in most cases we can readily see how we possess the “social technology” to bring the new state about—we can identify a modest number of changes that would need to be made.

2 RUGGED LANDSCAPE MODELS OF IDEAL JUSTICE

2.1 Smooth v. Rugged Optimization

An evaluative perspective, then, allows us to make judgments about the justice and structural similarity of a set of social worlds. Sometimes, as figure 2-2 indicates, a perspective can show us that our search for the ideal will be easy. Here the x-axis represents Σ’s understanding of the underlying structure of social worlds a through n in {X}, based on their similarity (as Σ sees it), as required by the Orientation Condition. The y-axis represents Σ’s evaluation of the inherent justice of these worlds satisfying the Social Realizations Condition. On this fortunate perspective, often called a “Mount Fuji” perspective, marginal changes in the underlying structure are always associated with marginal changes in their justice. As we move from social world a toward u, every small change in social structure leads to a small increase in justice. Similarly, as we move from u toward n, each small change yields a small loss in justice. Finding the ideal, u, is theoretically simple. First move from where you are. If you get to a more just social world, keep going in that direction. If and when you get to a less just social world, stop, and move back in the opposite direction: keep on moving in that direction until a marginal change yields a less just world. Finally, move one step back and you will have arrived at the ideal, the most just social world!
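The search procedure just described can be sketched in a few lines. The justice scores below are hypothetical numbers chosen only to produce a single-peaked profile; the worlds are indexed by position in the similarity ordering rather than by letter, and the function name is an illustrative choice.

```python
# Hypothetical single-peaked ("Mount Fuji") justice scores, ordered by similarity.
JUSTICE = [1, 2, 3, 5, 8, 12, 15, 13, 9, 6, 4, 2]

def climb(justice, start):
    """Keep stepping in a direction while each step is more just; then try the other."""
    position = start
    for direction in (+1, -1):
        while True:
            nxt = position + direction
            # Stop as soon as a marginal change would not yield a more just world.
            if 0 <= nxt < len(justice) and justice[nxt] > justice[position]:
                position = nxt
            else:
                break
    return position

# From any starting world the climber ends at the same place.
endpoints = {climb(JUSTICE, start) for start in range(len(JUSTICE))}
ideal = max(range(len(JUSTICE)), key=lambda i: JUSTICE[i])
```

Because every local improvement on such a landscape is also a step toward the single peak, the climber never needs to know in advance where the peak is.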

Figure 2-2. A Mount Fuji optimization landscape

Notice that under these conditions, even if we add the similarity and distance metrics to our model so as to satisfy the Orientation Condition, securing justice is essentially captured by Sen’s “climbing” model (§I.1.3). We really do not have to know that the highest “peak” is world u; all we need to know is which way is “up.” There really is no important sense in which the ideal orients our efforts to seek more justice. Thus we see that a model that includes the Orientation Condition does not prejudge the dispute between Sen and Simmons (§I.1.3); it merely allows us to make sense of it. We should read Simmons as maintaining that the ideal theorist seldom faces such a straightforward optimization problem.

Now it may be thought that, pace Simmons, ideal theory does, in the end, confront relatively “Mt. Fuji-ish” problems. The thought is this: as we have seen, an evaluative perspective arrays social worlds in terms of their justice-relevant properties—world a is closer to world b than to c if and only if its justice-relevant features are more similar to b than to c. Suppose that b’s justice-relevant features are very much like a’s; we would expect that the justice score of b would, then, be very much like a’s. As we move further away from a, we might expect the justice score of c would be closer to b’s than a’s, while the justice score of d, the next world out, would be closer to c’s than to b’s, and closer to b’s score than to a’s. And so on. If so, it looks like a smooth justice score curve up (or down) from a, perhaps peaking at world u. On the face of it, minor changes in the justice-relevant features of the social world (WF) should be closely correlated with the social world’s justice as measured by the evaluative standards (ES) and mapping function (MP).

Alas, this attractive picture is misleading. As I pointed out above, Rawls insisted that the “idea of realistic Utopia is importantly institutional,”⁴⁴ and indeed the importance of institutions is a theme throughout utopian thought (§I.1.2). This is crucial: the justice of an institution, practice, or policy can be dependent on what other institutions or policies are in effect, as shown in figure 2-3. Here I consider what might be described as a “bleeding-heart libertarian perspective”—that is, a perspective on justice that places great weight on free markets and small states, but also values basic government aid to the less well-off.⁴⁵ Now:

Let x = Prohibition of deficits;

Let y = Prohibition of tax increases;

Let z = Prohibition on cutting vital services.

Figure 2-3. The interactions of policies and resulting justice

Suppose we start out in social world a, which has limits on deficits. Our libertarian may judge this world to be reasonably just because current generations cannot push the costs of their consumption on to the future, and so will be apt to be more cautious about governmental expenditure. But recall that our libertarian is concerned with the less well-off members of society. The libertarian may therefore judge society b to be more just than a, since b protects vital services on which the least well-off depend. However, suppose we move to world c that keeps the prohibition on cutting vital services, but drops the prohibition on deficits. On this perspective, the social realization of world c is a less just world than either a or b (with a score of 4 compared to 10 or 12), as the prohibition on cutting vital services is likely (given the model used by the mapping function) to inflate the size of the state whose costs will either be pushed onto future generations or funded through increases in taxation. Introducing a limit on taxation in world d at least mitigates some potential injustices, raising the justice of d compared to c (8 compared to 4), but still leaving d less just than a or b. Now suppose that in e, as in world d, there is a prohibition on increased taxation, but also, as in world b, prohibitions both on cutting vital services and on deficits. One might think that the libertarian would judge this to be the best of all worlds. However, now the libertarian’s model of how this set of institutions will work out may lead to a “California syndrome,” in which expenses can neither be cut nor paid for, giving rise to the real possibility of default on the state’s obligations, which may pose the greatest threat to justice of all, leaving e the least just social world.⁴⁶
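The toy example can be tabulated to make the interaction visible. In this sketch the policy sets are one reading of the passage (a = {x}, b = {x, z}, c = {z}, d = {y, z}, e = {x, y, z}), the scores for a through d are those stated above, and the score of 2 for e is an assumption: the passage says only that e is the least just world.

```python
# x = prohibition of deficits, y = prohibition of tax increases,
# z = prohibition on cutting vital services.
WORLDS = {
    "a": ({"x"}, 10),
    "b": ({"x", "z"}, 12),
    "c": ({"z"}, 4),
    "d": ({"y", "z"}, 8),
    "e": ({"x", "y", "z"}, 2),  # ASSUMED score; the text gives no number for e
}
ORDER = ["a", "b", "c", "d", "e"]

def local_peaks(order, worlds):
    """Worlds whose score exceeds that of every adjacent world in the ordering."""
    peaks = []
    for idx, name in enumerate(order):
        score = worlds[name][1]
        adjacent = order[max(idx - 1, 0):idx] + order[idx + 1:idx + 2]
        if all(score > worlds[other][1] for other in adjacent):
            peaks.append(name)
    return peaks

def effect_of_adding(policy, base, worlds):
    """Change in justice when `policy` is added to the world with policy set `base`."""
    by_policies = {frozenset(p): s for p, s in worlds.values()}
    return by_policies[frozenset(base | {policy})] - by_policies[frozenset(base)]

peaks = local_peaks(ORDER, WORLDS)
x_added_to_z = effect_of_adding("x", {"z"}, WORLDS)        # c -> b
x_added_to_yz = effect_of_adding("x", {"y", "z"}, WORLDS)  # d -> e
```

Adding the deficit prohibition raises justice when only vital services are protected (c to b) but lowers it when taxes are also capped (d to e). The sign of the second effect holds for any score of e below that of d, so it does not depend on the assumed value.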

Note that in figure 2-3, as we move from a to e, we have “justice peaks” at b and d, with “gullies” in between. Accordingly, the perspective generates an optimization problem more akin to the rugged landscape of figure 2-4 than to the Mount Fuji landscape of figure 2-2. If the various dimensions (institutions, rules, etc.) of the social world that a perspective is judging in terms of justice interact (on the perspective’s preferred model or models, included in MP) as in our bleeding-heart libertarian toy example, then we are confronted with an NK optimization problem—one in which we are optimizing over N dimensions with K interdependencies among them.⁴⁷ When K = 0, that is, when there are no interdependencies between the justice scores of the individual institutions, we are apt to face a simple sort of optimization problem depicted in figure 2-2. When we face a simple optimization problem, the more of each element the better, and each act of local optimization puts us on a path toward global optimization, or the realization of an ideal. Not so when K begins to increase (as in evolutionary adaptation). When multiple dimensions (in our example, institutions) interact in complex ways to produce varying justice scores, as we saw in figure 2-3, we are faced with a rugged landscape in which optimization is much more difficult.
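A minimal sketch of an NK landscape may help fix ideas. It assumes, for simplicity, that each of the N dimensions is binary (an institution is either present or absent), that each dimension's contribution is a random number depending on its own state and the states of K other randomly chosen dimensions, and that overall justice is the average contribution. These modeling choices and the names used are illustrative assumptions in the spirit of Kauffman's model, not the author's own formalization.

```python
import itertools
import random

def make_nk_fitness(n, k, seed=0):
    """Build a score function over n-bit tuples with k interdependencies per dimension."""
    rng = random.Random(seed)
    # Dimension i is linked to itself plus k other randomly chosen dimensions.
    links = [[i] + rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    tables = [{} for _ in range(n)]  # random contributions, filled in on first use

    def fitness(state):
        total = 0.0
        for i in range(n):
            key = tuple(state[j] for j in links[i])
            if key not in tables[i]:
                tables[i][key] = rng.random()
            total += tables[i][key]
        return total / n

    return fitness

def count_local_optima(n, fitness):
    """Count states that score higher than every state one change away."""
    count = 0
    for state in itertools.product((0, 1), repeat=n):
        here = fitness(state)
        one_flip = [state[:i] + (1 - state[i],) + state[i + 1:] for i in range(n)]
        if all(fitness(other) < here for other in one_flip):
            count += 1
    return count

N = 8
peaks_by_k = {k: count_local_optima(N, make_nk_fitness(N, k)) for k in (0, 3, N - 1)}
```

With K = 0 each dimension can be optimized separately, so there is a single peak; with K = N − 1 every change reshuffles every contribution and the landscape typically has many peaks.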

I believe that the NK characteristics of the justice of social institutions—that they have multiple dimensions on which they are evaluated and these display interdependencies as in figure 2-3—are critical in creating the rugged justice landscape of figure 2-4. However, this “justice as an NK problem” hypothesis is (roughly) sufficient to create rugged optimization problems, but by no means necessary. Even if justice were a simple unidimensional criterion (e.g., justice depended on only one institution and was itself a simple criterion), searching for the global optimum would still confront a perspective with a rugged landscape if the underlying structure that the perspective discerns in the similarity ordering of social worlds (the x-axis) is not well correlated with y-axis scores. To see this, take the simple unidimensional y-axis value of height, and arrange a group of one hundred people according to the similarity criterion of alphabetical ordering of first names. If we array our one hundred people on a line, their heights will be rugged indeed—the diminutive Willamina may be standing between Big Wallace and the 7′1″ Wilt the Stilt. Whenever a perspective arrays on the x-axis the elements in the domain in a way that is badly correlated with the relevant y-axis scores of the elements, that perspective will generate a rugged optimization landscape (§III.1.2 considers three examples of such ruggedness relating to the pursuit of justice). However, in our model of a perspective on justice, the way a perspective Σ arrays social worlds—the structure it discerns among them—is based solely on the worlds’ justice-relevant features; it is the evaluation of these and only these features that gives rise to justice evaluations.
Consequently—in marked contrast to our height example in which the scores and underlying structure are not remotely related—if we assume a simple theory of justice (whether there is only one factor in determining justice, or several factors that can be simply aggregated because they do not manifest interdependencies) we would expect the array of social worlds to be correlated with their justice, and so the justice optimization problem to be relatively smooth. That is, we would expect the variation in the total justice scores of social worlds to be fairly well correlated with variation in their justice-relevant features, approaching figure 2-2. However, when our modeling of the social realizations has significant NK properties (such as in figure 2-3), the correlation of similarity with inherent justice is almost certain to be highly imperfect, and so we are almost certain to confront a rugged optimization problem (but see §III.2.3). Of course, that a perspective judges that the features of a social world (WF) have interdependencies is a consequence of how it models the interaction of those features; thus we again see the central importance of the often-overlooked mapping function (MP). The NK features of institutional justice help explain why it is so difficult to avoid seeing the pursuit of ideal justice as a rugged optimization problem. Nevertheless, much of what I say in the remainder of this book applies even if one rejects the justice as an NK problem supposition, so long as the underlying x-axis structure of the social worlds in domain {X} is not highly correlated with their y-axis justice scores.
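The height example can be simulated directly. The sketch below assumes one hundred hypothetical people with randomly drawn heights and placeholder names assigned independently of height; it then counts the local peaks produced by two different x-axis orderings of the very same heights.

```python
import random

def count_peaks(values):
    """Count entries that are higher than every adjacent entry."""
    peaks = 0
    for i, v in enumerate(values):
        adjacent = values[max(i - 1, 0):i] + values[i + 1:i + 2]
        if all(v > other for other in adjacent):
            peaks += 1
    return peaks

rng = random.Random(1)
heights = [rng.gauss(170, 10) for _ in range(100)]  # hypothetical heights in cm
names = [f"name{idx:03d}" for idx in range(100)]     # placeholder first names
people = list(zip(names, heights))                   # names carry no height information

alphabetical = [h for _, h in sorted(people)]  # x-axis = alphabetical order of names
by_height = sorted(heights)                    # x-axis = an ordering correlated with height

peaks_alphabetical = count_peaks(alphabetical)
peaks_by_height = count_peaks(by_height)
```

The y-axis values are identical in both cases; only the underlying structure differs. The ordering that tracks height yields a single peak, while the alphabetical ordering yields many.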

Figure 2-4. A rugged optimization landscape

Having now set aside the suggestion that the core orientation point is about feasibility (§II.1.3), Simmons can be reinterpreted as making a point about justice as a rugged optimization problem. Pace Sen, in rugged landscapes such as figure 2-4 a constant series of pairwise improvements can (i) lead to a local optimum (a low peak on figure 2-4) that is far inferior to the global optimum and (ii) lead us away (on the x-axis) from the globally optimal social world. If we are at world h in figure 2-4 we could move toward the nearby peak (a higher justice score) at world b, but this would take us further away from the ideal social world, u. Whether theories of justice are tasked with solving optimization problems in rugged or smooth landscapes is, then, the point on which a critical issue in ideal theorizing turns—whether the climbing or orientation model is most appropriate (§I.1.3). When landscapes are smooth the Orientation Condition is essentially otiose; Sen’s insistence that improvement does not require knowledge of the ideal is then sound.
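Both failures of pairwise climbing can be shown on a small hypothetical landscape. The scores below are invented for illustration, with a low peak on the left and the global optimum on the right; worlds are identified by their position in the similarity ordering.

```python
# Hypothetical rugged landscape: a low peak at position 2, the ideal at position 11.
SCORES = [3, 6, 9, 7, 5, 4, 3, 2, 4, 7, 11, 15, 12, 8]

def climb_pairwise(scores, start):
    """Repeatedly move to the better adjacent world; stop when neither is an improvement."""
    position = start
    while True:
        options = [p for p in (position - 1, position + 1) if 0 <= p < len(scores)]
        best = max(options, key=lambda p: scores[p])
        if scores[best] <= scores[position]:
            return position
        position = best

start = 5
ideal = max(range(len(SCORES)), key=lambda p: SCORES[p])
end = climb_pairwise(SCORES, start)

distance_before = abs(start - ideal)
distance_after = abs(end - ideal)
```

Every step the climber takes is a genuine pairwise improvement, yet it ends on the low peak with a score well below the ideal's, and further from the ideal on the x-axis than where it began.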

2.2 How Rugged? High-Dimensional Landscapes and the Social Realizations Condition

I am supposing, then, that rugged landscapes are created by NK features of the pursuit of just institutions; a theory of justice is seeking to optimize over N dimensions with K interdependencies between the dimensions. Recall that if K = 0, so that the N dimensions are independent, the theory is faced with a simple aggregation problem: as we increase our success on any dimension we move higher on the landscape. However, as Stuart Kauffman stressed in his groundbreaking analysis, if there are many dimensions and interdependencies are very high, the landscape will be fully random.⁴⁸ Let us use the term high-dimensional optimization landscape for one in which many dimensions display a large number of interdependencies; at the limit each dimension is affected by all others. In terms of our ideal theory model, in a maximally high-dimensional landscape there is no systematic relation between the justice of social world i and the justice of the worlds that are adjacent to it. Note that in such a landscape there is no point in getting close to the ideal point, u, but not achieving it: its near neighbors may not be at all just.⁴⁹ Any change of any institution (or rule) in any given world i produces a new social world, the justice of which has no systematic relation to the justice of i. Such landscapes have a very large number of poor local optima.⁵⁰ The crux of maximally high-dimensional landscapes is that the justice of any one rule or institution is a function of all others, producing what Kauffman called “a complexity catastrophe.”⁵¹

Because many philosophers are committed to a type of holism, they seem committed to modeling the Social Realizations and Orientation Conditions in a way that results in high-dimensional optimization problems. “A sensible contractualism,” writes T. M. Scanlon, “like most other plausible views, will involve holism about moral justification.”⁵² According to holist views, the justification of every element of a system of values or beliefs is dependent on many others—such systems are often depicted as “webs,” indicating a very high degree of interdependence among many elements. At the limit, the value of every element depends on the values of all other elements. It is precisely such systems that give rise to complexity catastrophes; a variation in the value of one element jumps the system to a radically different state.

Some models of evolutionary adaptation show how such high-dimensional landscapes can be successfully traversed (a species can avoid getting stuck at one of the numerous poor local optima—low peaks on a rugged landscape).⁵³ However, our concern here is a political theory that seeks to judge the justice of various social worlds, and recommends moves based on its evaluations of these worlds. In this context, the idea of a complexity catastrophe is entirely apropos, for the system will be too complex—really chaotic—for the theory to generate helpful judgments and recommendations.⁵⁴

Again, it should be pointed out that while the high dimensionality of an optimization problem can be the basis of a maximally rugged landscape—and so can help us to understand why maximally rugged optimization landscapes are so difficult to avoid for some perspectives—it is by no means necessary. Recall our example of arraying people’s heights by the alphabetical order of their first names; here the root of the problem is not the high dimensionality of our concept of height, but the fact that the perspective’s underlying structure (alphabetical ordering of first names) is entirely uncorrelated with height values. Whenever the underlying structural array of a perspective that orders the domain {X} is entirely uncorrelated with the justice values (whatever the root explanation) of the members of {X}, a perspective will face a maximally rugged optimization problem. For any element i ∈ {X}, its place in the perspective’s underlying structure tells us nothing about its score (on the y-axis) relative to its x-axis neighbors.

For the purposes of political theorizing, the problem such systems pose can be expressed in terms of:

The Maximal Precision Requirement: A political theory T, employing perspective Σ, can meet the Social Realizations Condition in a maximally rugged optimization landscape only if Σ is maximally precise (and accurate) in its judgments of the justice of social worlds.

Let us say that a judgment of a social world i is maximally precise (and accurate) if and only if that judgment correctly and precisely distinguishes the justice of i from proximate social worlds. A straightforward if somewhat rough way of interpreting this requirement is that a judgment that world i is just to level α is maximally precise only if Σ’s judgment of world i does not attribute to it any features that proximate social worlds possess (say i±1), but which i does not possess. But while this is the basic idea, appealing directly to the features i “truly has” begs an important question, for a critical function of a perspective is to determine the relevant classificatory scheme—what the relevant features of each social world are. To say that world i truly has justice-relevant feature f would be to adopt a certain perspective, and this perspective may clash with Σ, which denies that f is a relevant classification applying to world i. As we shall see in the next chapter, different perspectives endorse different classificatory schemes, each of which insists that its own scheme is superior. If we could directly determine which is true, we would have no need to adopt a perspective on the world; we could simply report the truth about it (§I.3.3).

Rather than defining a maximally precise judgment of perspective Σ about i in terms of one that truly identifies the features of i, we can formulate a criterion that is internal to Σ. There is nothing more common in social theory than that our predictions about how a social world will function and its resulting justice end up disappointing us—we who made the prediction. Using Σ, we evaluated social world i as having features {f, g, h} with a resulting justice of α; when we actually sought to bring about that social world, we found either that all these features did not cohere (say, h was inconsistent with f and g, so we ended up with f, h*, g), or else our efforts to bring about i went astray, and we actually ended up with a neighboring social world (with f, h, g*) with justice β. In this case Σ’s evaluation of i (or the move to i it recommended) was not stable before and after the move; on Σ’s own lights, it was wrong about i. Let us, then, say that Σ is maximally precise and accurate in its judgments about a social world if its judgments would be stable after moves to that world (or, we can say, Σ’s predictions would be precisely confirmed by Σ after the move).⁵⁵

Now because in a maximally rugged landscape the justice value of any social world (as measured on the y-axis) is uncorrelated with the justice of its neighbors (as measured on the x-axis), unless a perspective’s judgments of social worlds are maximally accurate and precise, they do not convey useful (reliable) information. Suppose that a perspective’s judgment of a given social world is precise and accurate (in the way we have defined it) plus or minus one social world on the x-axis. Its error is in a very tiny range; it is accurate to plus or minus one feature (g⁻ or g⁺ rather than g). This would imply that, given this reliability range, the justice of the world could be of any value in the entire range of justice (i.e., y-axis scores). But this, in turn, implies that the perspective cannot generate a useful ordering of the justice of the social worlds in the domain, and so a theory employing this perspective would not meet the critical Social Realizations Condition of an ideal theory (§I.4). More generally, even if we relax the assumption of maximal ruggedness, it remains the case that in high-dimensional landscapes the (x-axis) areas in which proximate social worlds have correlated justice will be very small, and so useful judgments of justice will require great, if not maximal, precision. For reasons to be explored presently (§II.3), I take it that maximally (or approaching maximally) precise and accurate judgments of near, much less far-off, alternative social worlds are a will-o’-the-wisp; if so, a plausible ideal theory cannot suppose that the quest for justice is a high-dimensional optimization problem.
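The cost of a one-world error can be illustrated numerically. The sketch assumes a hypothetical justice scale from 0 to 100 and compares two invented landscapes over one thousand worlds: one whose scores are drawn independently of position (maximally rugged) and one with a single smooth peak. The quantity measured is how far off a justice estimate is, on average, when world i is mistaken for its immediate neighbor.

```python
import random

def mean_neighbor_gap(scores):
    """Average absolute difference in justice between adjacent worlds."""
    gaps = [abs(a - b) for a, b in zip(scores, scores[1:])]
    return sum(gaps) / len(gaps)

rng = random.Random(0)
n = 1000

# Maximally rugged: a world's score is unrelated to its place on the x-axis.
rugged = [rng.uniform(0, 100) for _ in range(n)]

# Smooth: scores rise steadily to one peak in the middle and then fall.
smooth = [100 - abs(i - n / 2) * (100 / (n / 2)) for i in range(n)]

gap_rugged = mean_neighbor_gap(rugged)
gap_smooth = mean_neighbor_gap(smooth)
```

For independent uniform scores the expected gap between neighbors is a third of the whole scale, so an error of plus or minus one world leaves the estimate close to uninformative; on the smooth landscape the same error shifts the estimate by a small fraction of a point.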

This is not an inconsequential result.⁵⁶ Philosophers often combine commitments to justificatory holism with the aim of working to an ideal through a series of improvements. We now see that these two commitments do not cohere (at least not without a very complicated story). Consider, for example, the recent interest in so-called property-owning democracy as a core of a more just social world.⁵⁷ Suppose that a perspective’s modeling of how such an economy might work is almost spot-on, but misses one significant institutional fact or relevant psychological consideration; if the optimization landscape is maximally rugged (holistic), then the perspective’s evaluation, however sophisticated it may seem, tells us nothing about the justice of property-owning democracy.

2.3 How Rugged? Low-Dimensional Landscapes and the Orientation Condition

As K (the interdependencies between the dimensions to be evaluated) decreases (i) the number of local optima (peaks) decreases, (ii) the slopes lessen, so that the basins of attraction of the optima are wider (the same optimum is reached from a wider array of starting points), and (iii) the peaks are higher.⁵⁸ Additionally, in low-dimensional landscapes (iv) the highest optima tend to be near each other⁵⁹ and (v) the highest optima tend to have the largest basins of attraction.⁶⁰ As K decreases the landscape becomes correlated within itself. More generally, in smoother optimization landscapes the underlying structure (x-axis) is correlated with the (y-axis) values of any element. In a smooth optimization landscape slight variants in current institutional structures (neighbors along the similarity array) produce new social worlds the justice of which is highly correlated with the current social order. As can be inferred from what has been said about smooth landscapes (§II.2.1), as the landscape approaches an entirely smooth optimization problem, Sen’s climbing model is adequate. That is, we do not really need an ideal to orient our improvements, for our underlying similarity ordering (SO) of the alternatives is an excellent indication of their justice: we are, essentially, always simply climbing gradients. Solving very smooth optimization problems does not require meeting the Orientation Condition, in which case Sen is right: we can do well without knowledge of the ideal.

2.4 Ideal Theory: Rugged, but Not Too Rugged, Landscapes

Formalizing the pursuit of the ideally just society as a complex optimization problem leads to an insight: ideal theory has appeal only if this pursuit poses a problem of a certain level of complexity. This point is, I think, barely recognized in the current literature, which supposes that whatever attractions “ideal theorizing” might have are independent of the complexity of the pursuit of justice.⁶¹ Recall Rawls’s key claim: “by showing how the social world may realize the features of a realistic Utopia, political philosophy provides a long-term goal of political endeavor, and in working toward it gives meaning to what we can do today.”⁶² If the problem of achieving justice is not sufficiently complex, Sen is right: all we need is to make the best pairwise choices we can, and we do not need to identify our long-term goal. If the problem is too complex, the ideal will not help, because any move “working toward” it is essentially a leap into the dark, which is not apt to provide much meaning. In these chaotic, high-dimensional landscapes a fear of movement is as reasonable as a relentless quest for the ideal.

In the remainder of this book, then, I suppose that a theory of ideal justice confronts a moderately rugged landscape. More specifically I assume there are a number of optima with significant basins of attraction—so that in a significant proportion of the option space there are gradients to be climbed. Thus, also, a significant proportion of the option space is correlated within itself; within a certain significant space, the justice of a social world is correlated with the justice of other near social worlds. However, we must suppose that the landscape is sufficiently rugged such that the Orientation Condition is well grounded: all the high optima (which would include the ideal) are not closely related to each other, so we really do need to locate the ideal before we can arrive at confident all-things-considered recommendations about which social worlds we should move to.

3 THE NEIGHBORHOOD CONSTRAINT AND THE IDEAL

3.1 Rawls’s Idea of a Neighborhood

In a seldom-noticed discussion responding to Derek Parfit’s objection to the difference principle, Rawls advances a conception of alternative social worlds in a “neighborhood.” Parfit’s objection is based on the example in figure 2-5.⁶³ The difference principle selects distribution (3) because the least well-off do best. But Rawls claims that a justification of the difference principle is that the shares of the better-off are not gained at the expense of the least well-off. As Rawls stresses in his later work, the difference principle expresses reciprocity, a commitment of the better-off not to gain at the expense of those who are already less well-off. Yet we see that under distribution (2) the Indians do better than they do under (3), so it would seem that the gains for the British under distribution (3) do after all come at the expense of the Indians, who “lose” 5 units.

Rawls’s reply is multifaceted. He insists that the difference principle does not refer to rigid designators such as “Indians” and “British” but to whoever makes up the least well-off class. However, he continues:

Ignoring the matter of names for a moment, consider what can be said to the Indians in favor of (3). Accepting the conditions of the example, we cannot say that the Indians would do no better under any alternative arrangement. Rather, we say that, in the neighborhood of (3), there is no alternative arrangement that by making the British worse off could make the Indians better off. The inequality in (3) is justified because in that neighborhood the advantages to the British do contribute to the advantages of the Indians. The conditions of the Indians’ being as well off as they are (in that neighborhood) is that the British are better off.

Figure 2-5. A “counterexample” to the difference principle

This reply depends, as does the difference principle itself, on there being a rough continuum of basic structures, each very close (practically speaking) to some others in the aspects along which these structures are varied as available systems of social cooperation. (Those close to one another are in the same neighborhood). The main question is not (3) against (2), but (3) against (1). If the Indians ask why there are inequalities at all, the reply focuses on (3) in relation to reasonably close and available alternatives in the neighborhood. It is in this neighborhood that reciprocity is thought to hold.⁶⁴

In explaining the idea of a neighborhood Rawls explicitly relies on a distance metric—“a rough continuum of basic structures, each very close (practically speaking) to some others in the aspects along which these structures are varied as available systems of social cooperation.” We might take the idea of being “close (practically speaking)” as simply about feasibility, but that interpretation would suggest that distribution (2) is irrelevant simply because it is infeasible, but Rawls never defends that claim (also, the idea of “close and available alternatives in the neighborhood” would be redundant if the neighborhood is simply defined by the available alternatives). Rawls’s basic claim is that there is a continuum of structures that are close in terms of the variance in their structure, but not necessarily from the perspective of their inherent justice. A natural interpretation of this idea, given the analysis thus far, is that there is a continuum of basic structures (the x-axis in figure 2-4), on which some are close to others, but we should not confuse this type of practical closeness with close in terms of justice scores (as shown by the very different y-axis scores of many close x-axis worlds). This makes perfect sense if we model the problem of justice in terms of moderately rugged landscapes, in which it does not follow that those that are close in terms of basic structures are necessarily also close in terms of justice.

Rawls thus paints an especially interesting picture: there is a continuum of basic structures, somewhere on this continuum we can locate an ideal that orients our quest for justice, but at any given time, the recommendations of the principles of justice are confined to a neighborhood within this continuum. I believe that this idea of a neighborhood of related social worlds is fundamental to political philosophy and to the evaluation of ideal theory, as I shall now endeavor to show.

3.2 The Social Worlds We Know Best

The initial presentation of an evaluative perspective Σ (§II.1.1) and an array such as figure 2-4 presupposed that evaluative perspective Σ yields an evaluation of all social worlds in the domain {X}, which according to the Social Realizations Condition must include the ideal. The discussion of the Maximal Precision Requirement (§II.2.2) introduced another variable: the precision (and accuracy) of Σ’s judgment for any social world. Now on what we might call the Comprehensive Knowledge Assumption, whatever level of precision (and accuracy) Σ’s judgments possess is, roughly, invariant across all social worlds. Given any two social worlds (i and m) in the domain {X}, Σ’s judgments are (again, approximately) equally precise and accurate. But this simply cannot be right: our current social world is in the domain, and the evidential basis for judgments about the justice of the world we actually live in must be greater than judgments about merely possible worlds. For all nonexistent social worlds, we must rely for the most part (but see below) on predictive models to judge their social realizations; for our current world we can employ our best models to understand it, but we also have masses of direct evidence as to its realization. Indeed, our models are often developed from our current data, or at least with the constraint that they must cohere with what we know about our social world (think how important reports of the Factory Commission were to Marx’s theory in Capital).⁶⁵

I realize that some deny this: there is a persistent strain in political philosophy that the ideal world would be organized along straightforward and simple lines. In his utopian novel Looking Backward Edward Bellamy described “a social order at once so simple and logical that it seems but the triumph of common sense.”⁶⁶ Or, as Cohen seems to suggest, the motivational structure underlying a socialist economy in an advanced technological society can be crystallized in the ethos of a friendly camping trip.⁶⁷ The supposition that the social institutions of the ideal will be simple and predictable is by no means restricted to socialist utopias—anarcho-capitalists seem to truly believe that actual societies will function as predicted by relatively straightforward microeconomics and the theory of the firm.⁶⁸ We must not confuse simple models of the ideal (they are extraordinarily easy to create) with plausible predictions of the social realizations of a set of institutions for large-scale societies. Those confident that they know the “simple and logical” workings of ideal mass societies should, perhaps, reflect on the surprising intractability of social norms in small-scale societies in the face of concerted, well thought out, and well-funded interventions by the United Nations and other agencies. While there have been some notable and important successes in altering specific norms such as female genital cutting in some locations, in other places these interventions have not met with success, and sometimes initial success has faded as targeted norms were readopted.⁶⁹ And this concerns a few specific norms in villages whose population is measured in the thousands. And as we shall see in more detail presently, actual socialist utopian experiments were unable to achieve their sought-after social realizations (§II.4.1). 
When we realize new social worlds we are always struck by features we did not quite anticipate; important causal relations emerge that even our best models did not include.⁷⁰ This is not to dismiss the pursuit of ideals; it is, however, to dismiss the claim that we can be confident about a social realization of a far-off ideal because it will be such a simple and predictable world. And even if we granted this outlandish claim, it surely could not be said that all the worlds on the way to the ideal are likewise simple and knowable (unlike the world we actually live in, which is far less knowable because we have so much more information about it?). Moreover if we granted all this—if it were simple worlds all the way from here to utopia—then we would be facing a simple, not a complex, optimization problem, and the ideal would not be necessary (§II.2.3). It is the very complexity of the interactions of justice-relevant institutions that is the most plausible basis of the rugged optimization problem, which, in turn, requires orientation by an ideal.

I take it as a given, then, that the precision and accuracy of judgments of justice of our current social world are greater than those of yet-to-be-realized worlds, most especially ones that are far off. We also know that in the sort of moderately rugged landscapes presupposed by ideal theory, the justice value (as measured on the y-axis) of a world is correlated with its x-axis (i.e., similar) neighbors; simply knowing the justice of world i is informative about the justice of worlds plus or minus some x-axis distance ∂ (the smoother the landscape, the larger ∂, §II.2.3). Take, then, our current social world, j, and consider social worlds j ± distance ∂ (we also assume that the terrain is moderately rugged, so that ∂ is considerably less than the range within the domain {X})—the entire landscape is not correlated throughout as it is on the climbing model.⁷¹ For these possible social worlds, not only do we have modeling information as to the extent they would realize justice, but we have correlation information based on our great knowledge of our present world, j; the justice of our present world’s neighbors (as determined by SO, the similarity ordering) is correlated with the justice of our present world, j. Only in high-dimensional landscapes is this not the case (§II.2.2). So we have a larger evidentiary base for conclusions about the justice of j ± ∂ than for social worlds a greater distance than ∂ from j. “Experience and information,” Xueguang Zhou concludes, “gained in the past decrease the cost of learning in the neighborhood of the familiar area and put a higher price tag on explorations into unfamiliar territory.”⁷² Again, this provides powerful evidence that our judgments about the social worlds j ± ∂ have greater precision and accuracy than those outside our neighborhood, where we do not possess correlation information. In a moderately rugged landscape the area outside our neighborhood (i.e., j ± ∂) almost surely encompasses many social worlds.

Another consideration should lead us to reject the Comprehensive Knowledge Assumption. As I have said, our models of social worlds are attuned to our current world, in which they have been developed. Given this, a natural way to predict other social worlds is to predict the nature of the most proximate social world, which, by the very nature of a perspective, is the world with the relevant features most similar to our current world (§II.1.2). A model attuned to our j world will have to be only minimally adjusted when applied to an almost identical world; its reliability in worlds j ± 1 thus should not be terribly far from j. Having done this, we can then apply the revised model to worlds that are j ± 2, and so on, each time adjusting for some slightly different features of the new social world. Now despite its obvious attractions, this procedure leads to rather quickly dropping reliabilities as we move away from j. Again, it needs to be stressed that for ideal theory to be a plausible alternative to Sen’s climbing model, determining the justice of social worlds must be a modestly complex problem insofar as the relevant dimensions of evaluation are interconnected.⁷³ This means, though, that as the mapping function understands them, the justice of different features is coupled; varying one will result in changes in the way other features contribute to the overall justice score.

Our models of such complex, interconnected systems are characterized by error inflation.⁷⁴ An error in predicting the workings of one feature will spread to errors in predicting the justice-relevant workings of interconnected features, magnifying the original error. As this new erroneous model is used as the basis for understanding yet further social worlds, the magnified errors become part of the new model, which is then itself subject to the same dynamic. In complex systems small errors in predicting one variable at an early application of the model lead to drastic errors in predicting the overall system state a rather small number of iterations out (depending on the complexity—ruggedness—of the system), as errors in the initial estimate of one variable both propagate to other variables and become magnified in subsequent periods (i.e., further-out social worlds). The quintessential example of this is weather forecasting. Our predictive models of weather systems ten days out are drastically inferior to our models predicting tomorrow’s weather (which in turn are much inferior to looking outside and observing the current weather). It is crucial to stress that this problem of error inflation is part and parcel of the very complex interdependencies that create rugged optimization landscapes, and only if we have such a rugged landscape is there good reason to move beyond Sen’s climbing model (§II.2.3). So the problem of error inflation is intrinsic to the ideal theorizing project. The only way to avoid it is to have a simple, aggregative view of the features related to justice, but then Sen is entirely right—in such smooth optimization landscapes the ideal is otiose.
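The dynamic of error inflation can be illustrated with a deliberately crude toy calculation. The sketch below is an assumption of mine for illustration, not a model drawn from the literature cited: it supposes a linear rule on which, each time the model is reapplied to a further-out world, the error about a feature carries over and also absorbs a fixed fraction (`coupling`) of the error about every other feature. The function name, the linear rule, and the parameter values are all hypothetical.

```python
def propagate_error(n_features, coupling, steps, initial_error=0.01):
    """Total prediction error after each reapplication of the model.

    Assumed toy dynamics: the error about a feature carries over to the next
    world, and it also picks up `coupling` times the error about each of the
    other features.
    """
    errors = [0.0] * n_features
    errors[0] = initial_error  # a single small mistake about one feature
    history = [sum(abs(e) for e in errors)]
    for _ in range(steps):
        total = sum(errors)
        errors = [e + coupling * (total - e) for e in errors]
        history.append(sum(abs(e) for e in errors))
    return history


if __name__ == "__main__":
    print(propagate_error(5, 0.0, 10)[-1])  # uncoupled features
    print(propagate_error(5, 0.2, 10)[-1])  # coupled features
```

On this rule, with no coupling the original mistake stays put and stays small, whereas with five features and a coupling of 0.2 the total error is multiplied by 1.8 at every step. The point is only qualitative: the same interdependence that makes the landscape rugged is what makes the errors compound.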

3.3 The Neighborhood Constraint and the Ideal

We are now in a better position to understand the importance of a neighborhood of basic social structures or, as I have been saying, of social worlds. A neighborhood delimits a set of nearby social worlds characterized by relatively similar justice-relevant social structures. In this rough continuum of social worlds some are in the neighborhood of our own social world (and many are not); our understanding of the justice of alternative social worlds in the neighborhood of our own social world is far deeper than outside it. This neighborhood will include those social worlds whose justice is significantly correlated with that of our current world; if our predictive models are powerful, it may extend somewhat further. For simplicity, I assume that there is a clear boundary between the worlds that are in our neighborhood and those that are too dissimilar for us to make as firm judgments about, though of course this is an idealization (§I.3.3), which we will relax (§II.4.2).

Figure 2-6 incorporates the idea of a neighborhood into a rugged landscape model, with some indication in the shaded areas that our knowledge fades (and does not abruptly halt) when we leave our neighborhood, as our models become increasingly error prone. Here, our current world is j, our neighborhood runs from b to d, and b is the “local optimum” (LO)—the most just alternative in our neighborhood. We can immediately see the difficulty of pursuit of the ideal given the Neighborhood Constraint. While moving from j to b takes us to a more just social world, it also moves us further away from the global optimum (GO). So we face a dilemma. On the one hand, our understanding of the alternatives to our present world is limited. As we leave our neighborhood the precision and accuracy of our estimations of the justice of social worlds drops off sharply: we must rely solely on our predictive models (since outside our neighborhood the justice of other social worlds is not correlated with our present justice), and the reliability of these models rapidly decreases as we move to increasingly unfamiliar worlds. In contrast, within our neighborhood there may be relatively obvious local optima, about which our judgments are reasonably reliable, and we are in a position to make well-grounded recommendations that moves to them will increase justice.⁷⁵ And this is manifestly important. To make a reasonable recommendation that a society or polity should work toward creating a new social world, a theory must have reasonable grounds for predicting what that world would be like.⁷⁶ On the other hand, ideal theory is intended to orient our quest for justice, but if the ideal (i.e., global optimum) lies outside our neighborhood, we do not know a great deal about it. 
To be sure, we may have some suspicion as to the direction in which the ideal lies, but we must remember that judgments outside of our neighborhood are not very reliable; if the ideal is not near (and almost all ideal theorists suppose it is not)⁷⁷ we are apt to have only rather vague ideas as to how it will work. But then an ideal theory is faced with what I shall call

The Choice: In cases where there is a clear optimum within our neighborhood that requires movement away from our understanding of the ideal, we often must choose between relatively certain (perhaps large) local improvements in justice and pursuit of a considerably less certain ideal, which would yield optimal justice.

Figure 2-6. An idealized neighborhood

It is important to stress that The Choice is not an outlying hard case for the ideal theorist; it is precisely the sort of situation with which ideal theory is designed to deal. If local improvements never led us away from the ideal, Sen’s climbing model would be adequate; essentially, all we would need to do is to move toward the global optimum. But as Simmons’s reply to Sen (§I.1.3) makes clear, the Orientation Condition—and so ideal theory—gets traction when local climbing can move us away from the ideal. Nevertheless, though this is manifest, implicit in the writings of some ideal theorists is the assumption that we never face a significant instance of The Choice. Recall Rawls’s conviction that ideal justice provides guidance for thinking about justice in our nonideal societies, assisting to “clarify difficult cases of how to deal with existing injustices” and to orient the “goal of reform,” helping us to see “which wrongs are more grievous and hence more urgent to correct” (§I.1.2). But implicit in The Choice is that to pursue the goal of the ideal we must forgo some obvious increases in justice in our neighborhood; rather than the ideal informing us about urgent matters that we must correct, it must sometimes encourage us to turn our backs on some increases in justice—to recommend that we do not move in that direction, as it will take us further from the ideal. This is inherent in the very idea of an ideal theory that is distinct from Sen’s climbing model. If alleviating the most pressing problems of justice were always part of moving toward the ideal, we would not need the Orientation Condition, because we would not require the ideal to orient our search for justice: the ordering of social states by the Social Realizations Condition would suffice. Our model allows us to see that the ideal theorist cannot have it both ways: that we must orient ourselves by the ideal yet never forgo local opportunities for significant, perhaps great, increases in justice.
It is inherent in the project of ideal theory that we must confront The Choice.
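The structure of The Choice can be displayed in a few lines of code. The sketch is illustrative only: the list of justice scores is invented for the example (it is not the data behind figure 2-6), worlds are simply positions along a one-dimensional similarity ordering, and the neighborhood is assumed to have a sharp boundary at a fixed radius around the current world.

```python
def the_choice(justice, current, radius):
    """Return (local optimum, global optimum, whether The Choice arises).

    `justice` is a hypothetical list of scores indexed by position on the
    similarity ordering; `radius` fixes an idealized, sharp neighborhood.
    """
    low = max(0, current - radius)
    high = min(len(justice) - 1, current + radius)
    local_opt = max(range(low, high + 1), key=lambda i: justice[i])
    global_opt = max(range(len(justice)), key=lambda i: justice[i])
    # The Choice arises when the best nearby world is an improvement but
    # lies further from the global optimum than where we now stand.
    faces_choice = (
        justice[local_opt] > justice[current]
        and abs(global_opt - local_opt) > abs(global_opt - current)
    )
    return local_opt, global_opt, faces_choice


# Invented scores: a modest peak on the left, the highest peak far to the right.
scores = [3, 5, 6, 4, 3, 2, 4, 5, 3, 6, 8, 10, 7]

print(the_choice(scores, current=4, radius=2))  # narrow neighborhood
print(the_choice(scores, current=4, radius=7))  # neighborhood reaches the peak
```

With the narrow neighborhood the best reachable world sits at position 2 while the highest peak sits at position 11, so improving locally means moving away from the ideal. Once the neighborhood is wide enough to contain the highest peak, the two recommendations coincide and the dilemma disappears, though only on the assumption that the scores far from the current world are known as reliably as those nearby.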

Many of the remarks made by ideal theorists strongly suggest that The Choice is relatively easy—to achieve “reconciliation[,] … to calm our frustration and rage against society and history,” “to accept and affirm our social world positively, not merely be resigned to it,”⁷⁸ we should orient ourselves by the ideal. But now we see that this is not obvious; by its very nature, when ideal theory is distinctive and plausible, it requires us to forgo local optimization, which typically has much clearer consequences. In the sort of moderately rugged landscapes that ideal theory implicitly assumes, there are highish peaks throughout the landscape (§II.2.4); so confronting The Choice should be understood as the standard case with which ideal theory is designed to cope. To forgo relatively clear and perhaps great improvements in justice so that we can seek out an ideal that one’s theory tells us lies away from these improvements is by no means an obvious way to reconcile ourselves to our social world. The Choice is by no means a trivial one.

In figure 2-6 the ideal is outside of our neighborhood, and it lies in the opposite direction of the local optimum. This may seem simply an unfortunate case. Suppose instead that the global optimum is identical with the local optimum. Now the rankings derived from the Social Realizations Condition converge with the advice of the Orientation Condition. Surely this is the happiest of all cases for ideal theory.⁷⁹ But a worry remains even here: how can we be confident that what we take as the global optimum is truly the global optimum? We know that we have a local optimum that seems ideal—presumably it scores very high on justice—but until we have good knowledge of the entire landscape, it will be exceedingly difficult to determine whether it is truly the global optimum, or whether it is a high local optimum that we have mistaken for the ideal, since the true global optimum lies beyond our ken. True utopia may well lie just beyond the horizon; as long as we are significantly constrained by our own neighborhood, we might never know.

3.4 Progressive v. Wandering Utopianism

Karl Popper recognized this problem and claimed it undermined utopian thought. If we accept that any perspective’s vision of an ideal that lies outside our current neighborhood is vague and subject to revision as we move toward it, we are apt to find that utopia moves as we approach it. The social world that we thought was the global optimum turns out to be on a gradient, and so we must revise our judgment as to where the ideal lies. But, Popper continues, as we learn more about the ideal we might well conclude that we must “change our direction.”⁸⁰ Elster advances a similar criticism, implicitly drawing on the idea of a perspective. “From my present point of view,” he writes, “I have a full awareness of the front of all the objects in my visual field, but only a formal and empty awareness of the side that is hidden from my view. I know, that is, that they must have a backside, but I do not know how it appears in specific detail. Also from my present point of view I know that if I were to deplace [sic] myself to some other point in the field, new objects would become visible to me, even if I do not know which objects.”⁸¹ Applying this to the utopian analysis of seeking ideal possible worlds, Elster argues that one may know that worlds even better than the currently postulated ideal world most likely exist, but this fact is graspable only “in an empty and general manner.” Once we reach our current ideal, however, we will have a new horizon of possibility, with new possible ideals accessible to us. Thus, he argues in support of the inherent intransitivity of reform (§II.1.3): we cannot jump directly to ideal u* from our current world, because until we get to the “proximate ideal” u, which we now appreciate given our current perspective as a describable ideal, we cannot even really see the better-than-that ideal u*. Thus we need to pass through u to even know what u* is.⁸²

Some utopians have accepted this and thus upheld what we might call progressive ideal theory. To complete chapter I’s epigraph from Oscar Wilde, “A map of the world that does not include Utopia is not even worth glancing at, for it leaves out the one country at which Humanity is always landing. And when Humanity lands there, it sees a better country, and sets sail. Progress is the realization of Utopias.”⁸³ This image of progress qua going from one ideal to the next makes sense if the search meets two constraints. (i) The new, even better ideal, must be visible once we reach our current ideal. In my explication of Elster’s analysis in the previous paragraph, I supposed that once we reach our current ideal, u, the next ideal, u*, will be graspable. Here progress from one utopia to the next seems intelligible. But it also could be that u* is not graspable from u; perhaps, say, our current ideal stressed social harmony to such a point that intellectual disagreement was lessened, and so we will miss the better ideal, u*, if—to employ the standard voyage imagery—we first “land” at u. If we fell short of u, our current ideal, and landed instead at some less-than-ideal world, perhaps we could glimpse u* from it. How, though, could we know this when we set out for u?

(ii) The idea of “progress,” as opposed to wandering, requires that the new and better ideals are in some sense in the same direction as the ideal we are presently seeking. Again, the conception of a perspective on ideal justice is enlightening. If the new horizons of ideals are worlds where the world’s features (WF) are developments of the institutions and other features that we have brought about in our current quest for ideal justice, then Wilde’s image of setting sail again looks like “pushing onward.”⁸⁴ But if, from our new vantage point, we see that we misunderstood the structures that we earlier rejected, which now must be reinstituted, our quest for the ideal looks more like wanderers searching back and forth across the landscape for the Holy Grail. In Sidgwick’s imagery, the pursuit of the ideal confronts us “with an illimitable cloudland surrounding us on all sides, in which we may construct any variety of pattern states.”⁸⁵ It can pull us first in this way, and then in that, as we change our orientation to the ideal. This is no mere theoretical possibility. Up to the middle of the twentieth century mainstream socialism resolutely rejected markets: states that moved toward that socialist ideal developed state structures with tremendous authority and shrank markets as far as was consistent with medium-term economic viability. When analytically and economically informed socialists (such as Elster) rediscovered market processes, two facts became evident. First, the states that had continued furthest along the old socialist path were less likely to appreciate these new insights, having systematically trained generations that they were illusions (point [i] above). Second, when they did appreciate economic analysis, these states were committed to reinstituting many of the features of market systems that the Soviet and Eastern European “People’s” regimes spent so much time and effort destroying.
An ideal theory must be able to identify with great confidence the neighborhood in which the ideal lies. If it cannot do so, then we must wonder why, when we confront The Choice, we should turn our back on relatively clear local optimization to pursue what may well be a wandering search for the ideal—perhaps in the end the global optimum lies in the opposite direction from the one we initially supposed, and so toward, not away from, our local optimum. Making The Choice to pursue the ideal looks irresponsible.

This line of analysis led Popper to conclude that “the Utopian approach can be saved only by the Platonic belief in one absolute and unchanging ideal, together with two further assumptions, namely (a) that there are rational methods to determine once and for all what the ideal is and (b) what the best means of its realization are. Only such far-reaching assumptions could prevent us from declaring the Utopian methodology to be utterly futile.”⁸⁶ This, I think, is rather too strong: a theory of the ideal need not identify at the outset a specific, unchanging, destination. Simmons is entirely correct that “for a while we can just aim ourselves in the general direction of the Himalayas, adjusting our paths more finely—between Everest and K2, say—only when we arrive in India” (§I.1.3).⁸⁷ We do not need the precise location of the ideal before we set out, nor do we need to know every one of its features. However, a theory of the ideal that accepts the revisability of the ideal, but avoids wandering utopianism, must give strong grounds for believing that there is some neighborhood that contains the global optimum in which further searching should be concentrated, and that this will be so for a considerable period of time.

Popper is, though, correct that all too often political theorists have been insufficiently attentive to integrating revisability into their theories of the ideal, being consistently attracted to principles and institutional schemes that settle matters of justice “once and for all.”⁸⁸ The claim to accurate and precise knowledge of an unchanging ideal struck Popper as both absurd and dangerous—absurd because our limited knowledge of the workings of social institutions is always open to revision and what is best depends on circumstances;⁸⁹ dangerous because those who are convinced that they have a perfect vision of an unchanging utopia are all too likely to give in to the temptation to march us toward their promised land of justice, their “Paradise Island.” John Stuart Mill, another philosopher keenly sensitive to the limits of our knowledge and the need for experimentation, also was deeply wary of such utopian projects. In referring to the “revolutionary socialists” of the nineteenth century, possessed of a clear vision of the ideal that they sought to immediately implement (i.e., without tentative experimentation), Mill writes:

It must be acknowledged that those who would play this game on the strength of their own private opinion, unconfirmed as yet by any experimental verification—who would forcibly deprive all who have now a comfortable physical existence of their only present means of preserving it, and would brave the frightful bloodshed and misery that would ensue if the attempt was resisted—must have a serene confidence in their own wisdom on the one hand and a recklessness of other people’s sufferings on the other, which Robespierre and St. Just, hitherto the typical instances of those united attributes, scarcely came up to.⁹⁰

A century witnessing Stalin, Mao, and Pol Pot disastrously confirmed Mill’s judgment; Robespierre is insignificant compared to this utopian trinity. The worry that certain judgments of the unchanging ideal will give rise to recommendations for immediate implementation is by no means a “utopophobia” of the liberal fallibilist.

4 INCREASING KNOWLEDGE OF THE LANDSCAPE AND EXPANDING THE NEIGHBORHOOD

4.1 Experiments in Just Social Worlds

The problem confronting epistemically bounded creatures (and even ideal theorists are so bounded) is to devise ways to explore different parts of their perspective’s justice landscape without actually setting real societies and their populations on potentially wandering and destructive searches. A perspective worth taking seriously seeks reliable information about the justice of as many different social worlds as possible and hopefully gives good grounds for identifying the ideal or its neighborhood. The most straightforward method—one favored by Mill—is actual social experimentation. Given the variety of social worlds in the domain {X}, small-scale experiments might be conducted that seek empirical information about the ways in which principles, sets of rules and institutions, work out under certain background conditions.

We might think of social experiments as starting with an initial situation: a set of initial parameters within which the experiment commences (§I.3.1). This can be seen as setting up a small-scale social world. This initial situation need not be in our current neighborhood: we might commence our experiment seeking to institute a rather distant social world (e.g., a social system designed along the lines of Cohen’s camping trip). Commencing from this initial situation, different experimental social systems (e.g., utopian communities) might then search alternative social worlds in the neighborhood of the initial situation and, perhaps, beyond, and so together, effectively explore significant areas of the perspective’s justice landscape. Each group may employ different procedures for selecting what they see as viable changes in the initial situation, which, they believe, will lead closer to the perspective’s ideal.

Mill was a strong advocate of these sorts of social experiments in living. In contrast to his condemnation of “revolutionary” socialism (§II.3.4), Mill was supportive of socialist experiments along the lines of Robert Owen’s New Lanark community, and those inspired by Charles Fourier. All these proposals, he stressed, had the advantage of being subject to relatively small-scale experiments: they can be “tried first on a select population and extended to others as their education and cultivation permit.”⁹¹ Ideal theorists who appreciate the difficulties of knowing their entire justice landscape, and who agree with Mill that actual experiments are useful, will come to think of ideal theory as less of a political program than as a research agenda. Coping with the Neighborhood Constraint requires a diversity of communities exploring different parts of a perspective’s landscape and sharing their results. Here ideal theory does not suppose a fixed point providing a beacon in orienting improvements in justice, but conceives of itself as a quest to discover what and where the ideal—or at least the distinctly better—might be. If successful, over time Σ adherents’ knowledge of the terrain of justice could be greatly enhanced as different groups see how different social worlds might work.

Despite Mill’s—and many current libertarians’—great attraction to local experimentation, conducting and then drawing inferences from these “social experiments” is extraordinarily difficult. Consider, for example, the fascinating experimental efforts of Robert Owen and his followers. New Lanark was a village in Scotland, consisting of mills and workers’ dormitories, founded in 1786 by David Dale. Owen, Dale’s son-in-law, became manager and part owner by 1810 and set about reforming the community on something akin to socialist principles.⁹² Owen’s evaluative standards (ES) stressed rationality, cooperation, and minimizing competition, and his conceptions of the relevant features (WF) focused on the importance of social institutions in shaping personality as well as in producing more cooperative people and, especially, educating citizens to make them more rational, which, in turn, would render them more social. Owen’s fame from this initial experiment led to the establishment of a number of experimental Owenite communities in Great Britain and the United States—something like twenty-three in all.⁹³ The great American experiment was New Harmony, Indiana, founded in 1825. Like most Owenite communities it was characterized by internal disputes, effectively breaking into three communities in 1826, with the experiment effectively ending in 1827. The Lanarkshire community lasted from 1825 to 1827; the one in Ralahine in Ireland from 1831 to 1833; and the one in Hampshire from 1839 to 1845, but it was riven with division by the end.⁹⁴ Many other communities expired very quickly.

That the communities did not persist by no means shows that they were not valuable as experiments; we might think that a good deal was discovered as to how not to organize an Owenite social world.⁹⁵ (Recall Thomas Edison’s remark about his long search for the incandescent light bulb: “I have not failed. I’ve just found 10,000 ways that won’t work.”) The problem with drawing inferences from the communities’ failures is rather subtler. In order to view these as experiments seeking to fill out the Owenite perspective’s justice landscape, we have to see them as all seeking to base their pursuit of justice on Owenite evaluative standards and stressing the features of social worlds that Owen thought were relevant to achieving a just society. Thus the experimental aim would be to vary, say, the rules by which the various communities lived and the way they educated members, and then observe how well these realizations scored on Owenite standards. Owen clearly recognized that the variations and changes within communities needed to be constrained to social rules and not basic matters of justice (i.e., Owenite evaluative standards could not be challenged), and so he sought to restrict the ambit of committees running the experiments to nonbasic changes. But this restriction of the decision-making powers of the community—which was necessary if the experiments were to genuinely explore the Owenite perspective—was a critical cause of disputes that unraveled the communities, as some members sought to revise the evaluative standards (as in New Harmony, where inequality was a cause of dispute, leading to a splinter community, the Community of Equality).⁹⁶ Thus in the end, restricting communities to exploring only the Owenite perspective, which was necessary to see them as actual Owenite experiments, was itself a critical source of failure. Residents rebelled against the limitations on their decision making that the very nature of a true experiment required.
Perhaps Owen was not merely making excuses when, in 1840, he declared, “My principles have never been carried out.”⁹⁷

To generalize the lessons of the Owenite tale, in order for actual small-scale experiments to help the proponents of perspective Σ to better know the social worlds in domain {X} almost all the elements of Σ’s perspective would have to characterize each experimental group. They would have to concur on evaluative standards, on an understanding of the relevant features of the social world, on the trade-offs of values characterizing the (second part of the) mapping relation, on an understanding of the similarity of the underlying structures, and on the distance metric (§II.1).⁹⁸ They would, presumably, omit the modeling part of the mapping function (element [i]), as they are engaging in actual experiments to discover the social realizations of these worlds. Enthusiasts of social experiments often fail to perceive that the very value of the experiment depends on not permitting a number of variables to be altered; if the basic features of the perspective can be altered, the results of the experiment will not be helpful in orienting a particular perspective’s pursuit of justice. It is noteworthy that Owenite communities such as New Lanark and Ralahine were owned by proprietors or subscribers; this sought to convert the social experiment into a management exercise, where the manager (owner) sets the criteria for success and leaves the employee teams to find the optimal solutions. Owen was prescient: these are precisely the cases where it can be shown that different teams exploring the same problem can yield real benefits.⁹⁹ The great barrier to social experimentation, as the fate of the Owenite communities suggests, is to stably maintain this model when the experiment is about the pursuit of justice.

Needless to say, the problems of small-scale social experimentation are even more daunting if it is supposed that these small-scale experiments are informative about the application of the perspective to large-scale societies. If they are so intended, it will often be doubtful that their results scale up; large-scale societies will have features that small-scale societies lack, so it will be uncertain how well their lessons apply to large-scale societies. Moreover, unless we have a rather large number of experiments, we will not be able to employ statistical techniques for judging our inferences, which will be based more on hunch. If we try to avoid the problem of inferences from small-scale experiments by conducting large-scale ones, then we know that the number of experiments will be very limited, and so drawing inferences from such small-n experiments will generally be uncertain.¹⁰⁰

4.2 Improving Predictions: Diversity within, and the Seeds of It between, Perspectives

Perhaps Owen’s son had it right: “the enjoyment of a reformer is … much more in contemplation than reality.”¹⁰¹ Not all investigations of alternative social worlds need to engage in actual social experimentation. As one scholar has put it, “Just as the artist invents imaginary worlds, so the social theorist invents pure states of society.”¹⁰² Of course our problem is that, while in one sense the social theorist is “inventing” a social world in the model, the theorist is also trying to discover its justice so as to make sound recommendations about where utopia lies—and to do that, the theorist needs to figure out how the recommended social world will function. An ideal theorist seeks to understand far-off social worlds and then report back to the rest of us on how they function, and how just they are.

We saw that actual social experiments more or less bracket one part of a perspective on justice—the modeling of worlds and their social realizations—which is replaced with social experimentation. This suggests a way forward for the utopian: an ideal theory might employ multiple predictive models, and see when these different models agree and when they diverge. In this sense we can think of an internally diverse perspective, one that adopts a variety of ways of modeling social worlds, and which seeks to combine them to arrive at an overall prediction of the way a world might work. One case seems especially clear: namely, when the perspective’s models almost converge on estimates in the same small range. To be sure, even here we can go wrong, but our confidence in our estimates will be much higher.¹⁰³ If we suppose the approach to understanding complex systems that we considered earlier (§II.3.2), convergence of models is most expected up to the borders of our neighborhood. Take again a case drawn from meteorology—hurricane prediction. Predictions typically draw on a number of models based on very different methods and assumptions. As those of us who have lived in New Orleans and have closely followed the models in the summer months know, they usually have markedly diverging results three to five days ahead but converge on one-to-two-day predictions. This is precisely what we would expect in modeling complex systems: the further we get from the observed system the more even our best models diverge.

However, an internally diverse perspective can improve the reliability of its predictions even without such convergence. Scott E. Page has stressed what he calls the Diversity Prediction Theorem, according to which collective predictive error = average individual predictive error minus predictive diversity. The upshot of the theorem (explained in appendix B) is that “individual ability (the first term on the right-hand side) and collective diversity (the second term) contribute equally to collective predictive ability. Being different is as important as being good. Increasing diversity by a unit results in the same reduction in collective error as does increasing average ability by a unit.”¹⁰⁴ Although an excellent predictive model can still beat a collective prediction, the theorem tells us that we can compensate for an error in our predictive model by employing a greater diversity of models, and essentially averaging the result. This is an important theorem: even if our predictive models are not very good, a perspective that draws on diverse predictive models can significantly enhance its confidence in its estimates of the justice of alternative social worlds. Predictive diversity thus can expand the neighborhood by expanding the range of sound predictions, and so mitigate the problems posed by the Neighborhood Constraint. By drawing on a variety of models, diverse information can be put together to form a more adequate composite prediction.¹⁰⁵ Any ideal theory committed to the Social Realizations Condition and cognizant of the Neighborhood Constraint must value predictive diversity.
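The arithmetic behind the theorem can be checked with a small numerical sketch. It assumes, as in Page's formulation, that error is measured as squared error and that the collective prediction is the simple average of the individual predictions; the three "model" predictions and the true value are invented purely for illustration, and the function name is hypothetical.

```python
def diversity_prediction(predictions, truth):
    """Return (collective_error, average_individual_error, diversity).

    Assumes squared-error measures and a simple average as the
    collective prediction.
    """
    n = len(predictions)
    collective = sum(predictions) / n
    # Squared error of the averaged (collective) prediction.
    collective_error = (collective - truth) ** 2
    # Mean squared error of the individual models.
    average_individual_error = sum((p - truth) ** 2 for p in predictions) / n
    # Mean squared deviation of the models from their own average.
    diversity = sum((p - collective) ** 2 for p in predictions) / n
    return collective_error, average_individual_error, diversity


# Hypothetical justice scores predicted by three models for one social world.
model_predictions = [2.0, 6.0, 10.0]
true_score = 5.0

ce, aie, div = diversity_prediction(model_predictions, true_score)
print(f"collective error          = {ce:.4f}")
print(f"average individual error  = {aie:.4f}")
print(f"predictive diversity      = {div:.4f}")
print(f"average error - diversity = {aie - div:.4f}")
```

In this illustration each model is individually rather poor, yet the averaged prediction does much better, because the models err in different directions. The identity holds for any set of numerical predictions, which is why diversity and individual ability enter on equal terms.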

The ability of predictive diversity to expand our neighborhood by improving our predictions is of real importance, but it is in one respect critically limited. It remains the case that an excellent predictive model can beat the average of a mediocre collection, but how can we know which models are especially powerful and which are mediocre? “Finding out about” the terrain of justice and the social realizations of other social worlds is not so much about making a prediction about the justice landscape that a perspective can subsequently check as it is about, in a very real sense, constructing that landscape. Recall the idea with which we began this section—“the social theorist invents” social worlds. This is not to say that we are making justice up or constructing the principles of justice (though we could be), but that our only knowledge of a far-off social world is our models of it. We have no independent measurement techniques to determine when a model has gone astray, or to decide what model performs best. We cannot, at the end of the day, compare the results of the model to how the world really is—at least, not until we have actually brought it about. As far as our theory right now is concerned, we are making up the landscape while we are investigating it.

So how should we “explore” landscapes where our exploration via a model in some sense also constructs them? Once we see that the social worlds are “being made up as we go, we can see, clearly, that there is nothing interesting to be said about how the space should be explored, except to say that it should be explored (as it is made up) in the various ways in which various enquiry teams think best. We should, in other words, devolve decision-making about enquiry to the enquiry teams and let them get on with it.”¹⁰⁶ D’Agostino identifies this with a “liberal solution to the problem of enquiry in complex environments. Each team will construct and traverse that region of the space which they find interesting,” using the tools and models they think best.¹⁰⁷

This liberal solution is apt to encourage maximum discovery, as each team (or ideal theorist) seeks to model the possible social worlds it is studying in a way that it deems most fruitful. The likelihood that the most appropriate tools and information ultimately will be used is thus greatly enhanced. The drawback is that each “team” judges for itself whether it has been successful; in the absence of shared standards of evaluation, a genuine insight of one investigator is not apt to be taken up by others. In the absence of shared standards of success we cannot suppose that once an inquirer has modeled a far-flung part of the landscape, and announces the resulting heavily model-dependent results, the rest take the report as veridical. Thus, the liberal approach cannot be well integrated into a single perspective. Those using diverse models tend to go their own way. Given the controversies surrounding which models are most appropriate, rather than seeing modeling diversity as occurring within a perspective on justice, it seems more a diversity among those who understand the problem in different ways. We can discern here the seeds of a dynamic that we will observe in the rest of this, and the coming, chapter: as we seek to take increasing advantage of the fruits of diversity we find that we introduce diversity not simply within a perspective, but between perspectives. The way we see the world tends to influence the models that we think are best to understand it, and so one perspective has some tendency to see some models as more sensible and reliable than do other perspectives. As Page recognizes, a perspective encourages its proponents to employ a specific set of tools for understanding its social worlds and its problems.¹⁰⁸ To the extent this is so, maximum predictive diversity is apt to occur when many different perspectives (“crowds”) interact, and bring very different tools to bear on a predictive problem.
Overall, “the logic of collective intelligence is that different individuals will apply different ‘theories,’ or more appropriately heuristics, to the guessing task, the aggregate of which results in a highly precise estimate of the variable in question. While each ‘theory’ would only be able to predict part of the variance in the observed outcome, the collection of theories brought together can explain much of the variance and lead to a highly precise result.”¹⁰⁹ As we move from a single perspective employing diverse models, to the interaction of diverse individuals with different perspectives, we move from the importance of internally diverse perspectives to the diversity of perspectives.

The alternative to the liberal, individualistic, and diversity-maximizing approach is what D’Agostino deems the “republican approach,” where inquirers possess common standards of assessment.¹¹⁰ Here we would expect agreement as to what constitutes a good model and how to interpret its results. The “republican” approach seems consistent with diversity within a perspective: when one announces to one’s Σ-perspective colleagues that one has expanded our neighborhood by identifying the working of a new social world at its edges, they are apt to see this as a real advance, as they embrace the assumptions of one’s search. Or, when one group develops a new model, others will grasp its usefulness, and so add it to the basket of models used in the perspective. But while the republican approach enhances communication of results and mutual comprehension, in our case it does so by restricting the lessons that can be drawn from different models of any given social world: only models embraced by the republican community count as informative. On the liberal approach, in a rugged landscape rugged individualist investigators can use innovative techniques to understand the justice of far-off possible social worlds; in contrast, in a republican community that commences with agreement as to what constitutes the correct range of approaches and subjects of inquiry, insightful and highly innovative approaches may be excluded. Thus an exploration of a rugged landscape that is being constructed as we explore it (in the sense that our best models of a social world are the only way to know it—until we arrive at it) confronts the critical trade-off between innovation and communication, a theme that we will explore in some depth in the coming chapter. The greater the diversity of inquiry that is embraced, the more apt we are to actually uncover the best insights; but many techniques might not be accepted as reliable by others, and so these insights may not be accepted as veridical.
As approaches are constrained to those endorsed by a republican community, they achieve communication of insights, though at a cost of excluding some approaches and their insights.

4.3 Introducing Explicit Perspectival Diversity

The costs and benefits of employing different search strategies dominate much of the literature on exploring rugged landscapes.¹¹¹ However, once an inquirer seeks to evaluate social worlds beyond our neighborhood, our rationale for adopting the Neighborhood Constraint leads us to take these findings with more than a grain or two of salt. Ex hypothesi, the inquirer is making claims about the justice of social worlds about which we cannot be confident; the resulting model(s) is (are), in an important sense, creating the very world we are evaluating. If we are to search more widely and yet accept the reasoning behind the Neighborhood Constraint, then a theory of the ideal must explore ways to expand its ken within this constraint. Besides improving its predictive models, are there other ways it can do so?

Recall again the five elements of a perspective: (ES) a set of evaluative standards or principles of justice; (WF) an identification of the relevant features of social worlds; (MP) a mapping relation from the evaluative standards to the features of the social worlds, yielding an overall justice score; (SO) a similarity ordering of the underlying features that provides a meaningful structure to the domain {X} of worlds to be evaluated; and (DM) a distance metric (§II.1). Thus far I have been supposing that everyone shares all these features except, perhaps, the modeling element of the mapping function. Such thorough agreement on the elements of a perspective is, however, an extreme assumption. Analytic results indicate that if we relax the assumption of a thoroughly common perspective, and consider searches among individuals who possess different perspectives, results can be greatly improved.¹¹² Let us, then, introduce a modest degree of diversity among perspectives more formally into the analysis. Suppose that the investigators now all agree on every element of perspective Σ except the metric of distance (DM) between social worlds (§II.1.2).

To illustrate the significance of this sort of perspectival diversity consider the idea of a distance-contracting metric. A distance-contracting metric is any metric that increases the effective size of our current neighborhood relative to some other metric. Consider for instance the most minimal and straightforward way in which two distance metrics, d₁ and d₂, might differ from one another, namely if d₂ were to be a scalar transformation of d₁. In this case if d₂ = kd₁ where k ∈ (0,1), then d₂ would be a distance-contracting metric relative to d₁. Thus a perspective Σ₂—identical to Σ₁ except for its distance metric, d₂—will view moves between certain social worlds (say, from our current socioeconomic system to property-owning democracy) as moves within our neighborhood, while Σ₁, employing the d₁ metric, will see these moves as beyond our ken. The result of this sort of difference is likely to be debate about the real size and scope of our current neighborhood. An upshot of this debate sometimes will be the effective expansion of our neighborhood. Should those with distance-contracting metrics like d₂ convince Σ₁ adherents that Σ₂ is a more plausible perspective, we take a small but significant step toward mitigating the Neighborhood Constraint.¹¹³
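The scalar case can be made concrete in a few lines. The sketch below is only illustrative: it assumes that social worlds can be placed at numerical positions along the single dimension of similarity, that d₁ is the absolute difference between positions, and that a neighborhood is the set of worlds within some fixed radius of our current world. The positions, the radius, and the value of k are hypothetical numbers, not drawn from the text.

```python
def d1(x, y):
    """Baseline metric (assumed): absolute difference in position."""
    return abs(x - y)


def contract(metric, k):
    """Return the scalar transformation d2 = k * d1, with 0 < k < 1."""
    if not 0 < k < 1:
        raise ValueError("a distance-contracting scalar must lie in (0, 1)")
    return lambda x, y: k * metric(x, y)


def in_neighborhood(metric, here, there, radius):
    """A world is in the neighborhood if its distance is within the radius."""
    return metric(here, there) <= radius


# Hypothetical positions and neighborhood radius.
current_world = 0.0
property_owning_democracy = 3.0
radius = 2.0

d2 = contract(d1, k=0.5)

print(in_neighborhood(d1, current_world, property_owning_democracy, radius))  # False
print(in_neighborhood(d2, current_world, property_owning_democracy, radius))  # True
```

Holding the radius fixed, shrinking every distance by k has the same effect as enlarging the neighborhood by a factor of 1/k, which is the sense in which d₂ increases its effective size.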

The point here is subtle and important: just what is within our neighborhood is partially a matter of how far away—how dissimilar—we view certain social worlds. For example, two metrics d₁ and d₂ might agree on the similarity of the underlying structure of social worlds, such that they both arrange them a–b–c–d–e. On d₁, though, a and b are very close social worlds, both considerably distant from c, which is nearer to d and e, while on d₂ c is much closer to b than to d (recall figure 2-1). Suppose we are at world a; d₁ will not see knowledge of a and its neighbor b as very informative about c, while d₂ will; as a result d₂ may include c in the neighborhood of a while d₁ will not. If Alf can convince Betty, who employs d₁, that his d₂ is the superior metric, then he will have somewhat mitigated the Neighborhood Constraint by bringing new social worlds into Betty’s current neighborhood, and thus perhaps a better local optimum. This constitutes a minimal difference in perspectives: while the similarity ordering (SO) of the domain {X} is the same, the distance metrics (DM) are different.
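The a–b–c–d–e example can also be sketched numerically. The positions below are hypothetical; they are chosen only so that both metrics preserve the same ordering while disagreeing about how far c lies from b, and the neighborhood radius is likewise an assumption made for illustration.

```python
ORDER = ["a", "b", "c", "d", "e"]

# Hypothetical positions of the five worlds under each metric.
POSITIONS_D1 = {"a": 0.0, "b": 1.0, "c": 6.0, "d": 7.0, "e": 8.0}
POSITIONS_D2 = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 7.0, "e": 8.0}


def same_ordering(p, q):
    """True if both metrics arrange the worlds in the same sequence (SO)."""
    return sorted(p, key=p.get) == sorted(q, key=q.get)


def neighborhood(positions, here, radius):
    """Worlds whose distance from `here` does not exceed the radius."""
    return [w for w in ORDER
            if abs(positions[w] - positions[here]) <= radius]


radius = 2.5  # assumed size of the neighborhood

print(same_ordering(POSITIONS_D1, POSITIONS_D2))  # True
print(neighborhood(POSITIONS_D1, "a", radius))     # ['a', 'b']
print(neighborhood(POSITIONS_D2, "a", radius))     # ['a', 'b', 'c']
```

Betty, using d₁, sees only a and b as within reach; if she is persuaded to adopt Alf's d₂, world c enters her neighborhood although the similarity ordering has not changed.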

Once again, it is important to stress that the benefits of perspectival diversity are not merely an upshot of my formal representation of the problem. Consider again Mill’s case for what he called “socialism.” From Mill’s perspective Victorian capitalism fell far below the moral optimum; a form of society centered on worker cooperatives was far better, and perhaps even the ideal. Mill did not simply analyze this ideal, though. Instead he sought to show how a society that might appear very far from the one he inhabited could be achieved via the institutions already in place. Mill insisted that the evolution of new forms of partnerships and corporations that render capitalism more efficient would also allow competitive market processes within capitalism to test the viability of socialist experiments. By connecting the idea of worker cooperatives to a series of intermediate social worlds, he sought to bring socialism into the neighborhood of Victorian capitalism. Rather than a leap into the dark, Mill depicted socialism as a form of industrial organization within the current neighborhood.

5 THE LIMITS OF LIKE-MINDEDNESS

As we saw in chapter I, for a theory of ideal justice to orient the search for improvements in justice and, simultaneously, for the identification of the ideal to be critical, two conditions must be met: Social Realizations and Orientation. The first ensures that our ideal theory will help us make the choices between less-than-ideal social worlds that usually compose our option set. When the Social Realizations Condition is met our theory can provide guidance as to whether one social world secures more or less justice than another. This allows us to form judgments of comparative justice as well as sometimes being in the position to recommend reforms that increase justice. As I stressed, however, the Social Realizations Condition alone does not require reference to an ideal; Sen’s “climbing” model meets this condition, and it strenuously abjures any appeal to the best or optimal social realization of justice. The ideal is necessary to orient us not simply when we are concerned with ranking the options in terms of their justice, but when our choice confronts at least two dimensions: how just a social world is, and whether changing the features of the world moves it closer to the features of the ideal.

In this chapter I have explored a model of these types of searches—rugged landscapes, which have been developed in other contexts in different ways. Some model along these lines, I have argued, is implicit in the very idea of ideal theory as a distinctive and necessary approach to political theory. To be sure, the model I have developed is a simple one; for example, “directionality” has been assumed to be unidimensional (there is only one overall dimension of similarity). This, of course, is an “idealization” (i.e., simplification), but if we make the model more complex (say by assuming that we could move north, south, east, or west, rather than simply to the right or left),¹¹⁴ the problems for the ideal theorist become more, not less, difficult. In this simple model I have supposed that ideal theory identifies a perspective on justice that, in principle, generates a terrain of justice for a set of possible social worlds; the aim is to find the global optimum while also making improvements in justice in the less-than-perfectly just social worlds that confront us.

The critical claim of this chapter is that in this terrain the ideal theorist confronts the Neighborhood Constraint: we have far better information about the realization of justice in our neighborhood than in far-flung social worlds. I have tried to show that a variety of considerations lead to this conclusion: the correlation of the justice values of proximate locations in moderately rugged landscapes (which are the sort that ideal theory must be supposing); error inflation as models of complex social dynamics depart from observed social worlds (and ideal theory must be assuming moderate complexity); the fact that our models are calibrated to our social world; the mass of empirical evidence about the dynamics of our social world; and the differential costs of discovery about our neighborhood and far-flung worlds. Despite all this I have found that philosophers often simply deny the Neighborhood Constraint (given the status of Plato in the profession, perhaps this should not be surprising). Distant possible social worlds, they have insisted, may be quite simple to understand and model, and so we know them better than our neighboring worlds. I know of no systematic analysis that supports this conclusion: it is at best a mere, not terribly plausible conjecture. The question, I have stressed, is not whether a philosopher can “invent” a simple world of perfect justice that looks like a camping trip or a perfectly competitive market in private security firms, but rather the philosopher’s grounds for concluding that the realization of this social world, with the features ascribed to it and the assumed parameters, will behave in the simple way that is predicted. Admittedly, the theorist can fix the parameters to ensure this result, while acknowledging that this simple world is unrealistic and could not be implemented. 
But even if this is acceptable for “the ideal” it cannot be acceptable for all worlds outside of our neighborhood, for if the parameters are not ones that are plausible for us to meet in any of these further-off worlds, then ideal theory cannot recommend movement outside of our neighborhood. The ideal becomes mere dreaming or lamentation, as it no longer orients our efforts at reform (§I.1.4).

Because of the terrain of justice, which motivates the Orientation Condition, local optimization often points in a different direction than pursuit of the ideal. We then confront what I have called The Choice: should we turn our back on local optimization and move toward the ideal? Given the Neighborhood Constraint our judgments within our neighborhood have better warrant than judgments outside of it; if the ideal is outside our current neighborhood, then we are forgoing relatively clear gains in justice for an uncertain prospect that our realistic utopia lies in a different direction. Mill’s revolutionaries, certain of their own wisdom and judgment, were more than willing to commit society to the pursuit of their vision of the ideal; their hubris had terrible costs for many.

For the ideal theorist to make a reasonable Choice in favor of pursuit of the global optimum, it would seem that much better information is needed about the terrain of justice, at least mitigating the asymmetry of knowledge expressed by the Neighborhood Constraint. I have briefly examined two important proposals: actual social experiments and internal diversity of predictive models. In the context of pursuit of a given perspective’s view of optimal justice, both have shortcomings. I concluded with the possibility of expanding the neighborhood, perhaps bringing better optima into our neighborhood.

Although this last inquiry—into how the boundaries of the neighborhood might be expanded—was in one way modest, in another way it was the first step to a more radical solution. In expanding the neighborhood we varied the distance metric, which was an element of the perspective. So rather than searching under simply one common, normalized perspective on justice, we now have multiperspectival searching. In recent years powerful analytic treatments have demonstrated that under some conditions multiperspectival searching has tremendous advantages over single perspective searching in rugged landscapes. We now turn to these results, and their application to the search for ideal justice.

¹ Rawls, Lectures on the History of Political Philosophy, p. 226.

² Ibid., pp. 229ff.

³ Ibid., p. 224.

⁴ Ibid.

⁵ See Page, The Difference, pp. 30ff.; Muldoon, Diversity and the Social Contract.

⁶ See §I.2.1.

⁷ On specifying possible worlds in a theory of justice, see Wiens, “Against Ideal Guidance,” pp. 437ff.; Lawford-Smith, “Non-ideal Accessibility”; Goodwin and Taylor, The Politics of Utopia, pp. 210–14; Elster, Logic and Society.

⁸ They thus give rise to deontic judgments, partitioning acts into the permissible, required, and impermissible. On the contrast between deontic and optimizing evaluations, see Wiens, “Against Ideal Guidance,” pp. 437ff. Such deontic judgments are crucial to Estlund’s and others’ evaluations of ideal justice. See, for example, Estlund’s “Human Nature and the Limits (If Any) of Political Philosophy” and Lawford-Smith, “Non-ideal Accessibility.” It is important to stress that such deontic judgments may well be a part of the principles and rules that partially constitute a social world and its evaluation; the point is that the overall evaluation of the justice of the world, required by the Social Realizations Condition, employs a richer classification system.

⁹ Broome, Ethics Out of Economics, p. 164.

¹⁰ For an argument that our moral thinking is characterized by such cycles, see Temkin, Rethinking the Good.

¹¹ The Condorcet Paradox is the core case of this. If, however, individuals are evaluating the options {x, y, z} in terms of a single dimension of evaluation (say, on a left-right continuum), intransitive social preferences over the triplet cannot arise. See my On Philosophy, Politics, and Economics, pp. 154–64. See also D’Agostino, Incommensurability and Commensuration.

¹² Somewhat weaker conditions than full transitivity would suffice to avoid this outcome. See Sen, Collective Choice and Social Welfare, pp. 7–20.

¹³ Which is to say that the mapping function, MP (part ii), need not be linear with respect to the satisfaction of the evaluative standards. This is another reason to distinguish the mapping relation (MP) from the evaluative standards (ES); two theories could share the same evaluative standards and the same understanding of the relevant features of social worlds (WF), but one evaluates worlds in a strictly linear way in relation to their satisfaction of the principles while another is distinctly nonlinear, yielding very different perspectives on justice.

¹⁴ See here Hamlin and Stemplowska, “Theory, Ideal Theory and the Theory of Ideals,” pp. 53ff.

¹⁵ I have greatly benefitted here from Wiens, “Against Ideal Guidance.”

¹⁶ Gilabert, “Comparative Assessments of Justice, Political Feasibility, and Ideal Theory,” p. 45.

¹⁷ See Wiens, “Will the Real Principles of Justice Please Stand Up?”

¹⁸ The case for an optimization model of the ideal has been powerfully advanced by Wiens. See his “Against Ideal Guidance” and “Will the Real Principles of Justice Please Stand Up?” In this latter essay Wiens seeks to identify principles of justice with the widest possible application to, as I put it, the domain of worlds to be evaluated.

¹⁹ As I pointed out at note 13, it could be nonlinear, giving very high scores to “pretty just situations” and very low scores to “less than pretty just social worlds.”

²⁰ This would imply a one-to-one mapping of social worlds on to realizations. Should an ideal theory have such confidence, it can be included in the model.

²¹ This is, perhaps, one sense in which a theory can be “utopian” without being “unrealistic”—when examining a set of features the theory assumes that they all work in, if not the best possible way, something very near to it. This, for example, is a manifest feature of Bacon’s New Atlantis, where the ship arrives at what seems to almost be a “Land of Angells” (p. 12).

²² Thus dystopias. In contrast to utopias, which are typically worlds far distant from our own (for an exception, see Bellamy, Looking Backward), dystopias are usually worlds (too) close to our own where things work out about as badly as we could fear. Orwell’s 1984 is a prime example. In contrast to 1984, in Rand’s dystopic novel, Anthem, the hero’s self is not destroyed; this could be understood as not depicting the worst possible realization, or a difference about the internal feasibility of thoroughly soul-destroying realizations.

²³ See Brennan’s “Feasibility in Optimizing Ethics” for a model that seeks to weigh the probabilities of good and bad realizations.

²⁴ Similarity thus understood is a pairwise relation between overlapping pairs, a common idea. See Morreau, “It Simply Does Not Add Up,” p. 484. Morreau’s definition is somewhat different but also focuses on pairwise relations among pairs. See also Keynes, A Treatise on Probability, p. 36. For further specification of the properties of such an ordering, see appendix A.

²⁵ Weitzman, “On Diversity,” p. 365.

²⁶ Page, The Difference, pp. 48, 33.

²⁷ Weitzman, “On Diversity,” p. 365.

²⁸ Read as “the distance between i and j.” I assume in (4) that i, j, and k are different social worlds. See Weitzman, “On Diversity,” pp. 364–65. Weitzman develops a distance metric of similarity with attractive features.

²⁹ Condition (4) ensures that the distance metric is constrained by the similarity ordering.

³⁰ Simmons, “Ideal and Nonideal Theory,” p. 35.

³¹ See Brennan and Pettit, “The Feasibility Issue.”

³² Gilabert and Lawford-Smith, “Political Feasibility,” p. 812.

³³ See Wiens, “Political Ideals and the Feasibility Frontier,” p. 12.

³⁴ Ibid., p. 7.

³⁵ Ibid., p. 14.

³⁶ Ibid., p. 12.

³⁷ See Hamlin, “Feasibility Four Ways.”

³⁸ Rawls, The Law of Peoples, p. 138.

³⁹ Brennan, Eriksson, Goodin, and Southwood, Explaining Norms, pp. 107, 128–29.

⁴⁰ See here Gilabert, “Comparative Assessments of Justice, Political Feasibility, and Ideal Theory,” pp. 47ff.

⁴¹ Elster, Logic and Society, p. 57. For a discussion see Goodwin and Taylor, The Politics of Utopia, pp. 213–14.

⁴² As I pointed out in §I.1.3, ideal theorists often wrongly charge Sen’s “pairwise” comparison approach with being “incrementalist.” I do defend incrementalism in chapter IV, so this lack of transitivity of feasibility is entirely consistent with the view I shall defend.

⁴³ Räikkä, “The Feasibility Condition in Political Theory,” p. 30. Emphasis in original.

⁴⁴ Rawls, The Law of Peoples, p. 16.

⁴⁵ See, for example, Tomasi’s Free Market Fairness.

⁴⁶ Thanks to Fred D’Agostino for suggesting this case.

⁴⁷ The classical work exploring these problems is Kauffman, The Origins of Order, especially chap. 2.

⁴⁸ Kauffman, The Origins of Order, pp. 45ff.

⁴⁹ In this case the reliance on the “theorem of the second best” (§I.1.4) is appropriate—arranging a world so that it almost instantiates ideal social structures does not tend to yield an almost-ideally just social world. However, in other cases, such as low-dimensional optimization problems (§II.2.3), it is by no means a fallacy to suppose that coming close to ideal structures will approximate ideal justice. One of the great benefits of the model developed here is that we can distinguish when it makes sense to seek an approximation to the best and when it does not, going beyond rather vague invocations of the theorem of the second best. Note also that our model is derived from the analysis in chapter I of the inherent structure of ideal theories, rather than simply seeking to import analysis of economic efficiency into a theory of ideal justice.

⁵⁰ Kauffman, The Origins of Order, p. 47.

⁵¹ Ibid., p. 52. See also McKelvey, “Avoiding Complexity Catastrophe in Coevolutionary Pockets,” esp. pp. 301–2.

⁵² Scanlon, What We Owe to Each Other, p. 214. See also Estlund, “Utopophobia,” p. 121.

⁵³ See Gavrilets, “High-Dimensional Fitness Landscapes and Speciation.” The issues here are complex. If we consider landscapes in which data points are individuals and species are groups of points spread over an N-dimensional area, high-dimensional landscapes can display fitness ridges that provide paths from one optimum to another. I have greatly benefitted from discussions with Ryan Muldoon about these matters.

⁵⁴ Complexity is often defined as existing at the “edge” of chaos. For an especially clear and up-to-date analysis, see Page, Diversity and Complexity, esp. chap. 1; Waldrop’s popular treatment, Complexity, is a classic, yet still enlightening. For a good overview of complexity applied to different fields, see Auyang, Foundations of Complex-Systems Theories in Economics, Evolutionary Biology and Statistical Physics. For a philosophically informed analysis of chaos, see Peter Smith, Explaining Chaos; for a very accessible (and another classic) treatment, see Lorenz, The Essence of Chaos.

⁵⁵ There are complications that might be explored here: for example this might seem to imply that Σ is always correct about the current social world. We could develop more conditions (say, some required reflection or information condition), but this basic idea suffices for present purposes.

⁵⁶ Neither is it new. Popper insightfully analyzes the way that holism undermines social experimentation in The Poverty of Historicism, chaps. 20–25.

⁵⁷ Rawls identifies such a system in A Theory of Justice, pp. 241ff.; see also Rawls, Justice as Fairness, pp. 135ff. Rawls credits the economist J. E. Meade with the idea, citing a 1964 work. For a later treatment of Meade’s, see his The Just Economy. For a recent collection of essays aimed at philosophers, see O’Neill and Williamson, eds., Property-Owning Democracy. For a critical treatment, see Vallier, “A Moral and Economic Critique of the New Property-Owning Democrats.”

⁵⁸ Kauffman, The Origins of Order, p. 243. D’Agostino notes these features in Naturalizing Epistemology, pp. 118–19.

⁵⁹ Kauffman, The Origins of Order, p. 60.

⁶⁰ Ibid., pp. 62–63.

⁶¹ For an important exception see Satz, “Amartya Sen’s The Idea of Justice.”

⁶² Rawls, The Law of Peoples, p. 128.

⁶³ Rawls, Justice as Fairness, p. 69.

⁶⁴ Ibid., p. 70. Emphasis added.

⁶⁵ Marx, Capital, vol. 2, pp. 613ff.

⁶⁶ Bellamy, Looking Backward, p. 1.

⁶⁷ Cohen, Why Not Socialism? I have argued that Cohen’s small-scale camping trip model does not capture the egalitarianism of small-scale societies—it is not, I think, credible that it models a viable large-scale egalitarian order. See my “The Egalitarian Species,” pp. 18–19.

⁶⁸ E.g., David Friedman, The Machinery of Freedom.

⁶⁹ See Bicchieri, Norms in the Wild.

⁷⁰ This includes even relatively close worlds. Tenner concludes his overview of policy interventions by observing, “What is almost a constant, though, is that the real benefits usually are not the ones we expected, and the real perils are not the ones we feared.” Why Things Bite Back, p. 272.

⁷¹ In Sen’s climbing model (§I.1.3), every social world (except those at the top and the bottom of the ordering) is one rank more just than the social world below it, and one rank less just than the world above it, hence the perfect correlation of justice and rank.

⁷² Zhou, “Organizational Decision Making as Rule Following,” p. 262. See also D’Agostino, Naturalizing Epistemology, p. 53.

⁷³ Unless the perspective is a relatively “dumb” one, as in our height example. See §III.1.3.

⁷⁴ The analysis here draws on my “Social Complexity and Evolved Moral Principles.”

⁷⁵ Though, of course, we cannot assume that all moves in our neighborhood are feasible in all the various senses of that protean concept. However, we do have better grounds for concluding that intervention ϕ would produce outcome O, part of Gilabert and Lawford-Smith’s analysis of feasibility (§II.1.3).

⁷⁶ To be sure, there have been some “political theories” according to which what is important is not knowledge of what we are getting ourselves into, but following the will of the leader. Goodwin and Taylor judiciously consider whether such theories can be considered “utopian”—I shall simply set them aside. The Politics of Utopia, p. 18.

⁷⁷ But see §III.2.3.

⁷⁸ Rawls, Justice as Fairness, p. 2.

⁷⁹ Though it is, alas, a case where the ideal is not needed—Sen’s climbing model would suffice.

⁸⁰ Popper, The Open Society and Its Enemies, vol. 1, p. 160.

⁸¹ Elster, Logic and Society, p. 57.

⁸² Ibid., pp. 57–58.

⁸³ Wilde, The Soul of Man under Socialism, p. 40. See also Goodwin and Taylor, The Politics of Utopia, pp. 213ff.

⁸⁴ Or, as Bellamy puts it, “ever onward and upwards.” Looking Backward, p. 1.

⁸⁵ Sidgwick, The Methods of Ethics, p. 22.

⁸⁶ Popper, The Open Society and Its Enemies, vol. 1, p. 161. Compare Bacon. We are informed by the lawgiver of New Atlantis: “And recalling into his Memory, the happy and flourishing Estate, wherein this Land then was; So as it might bee a thousand wayes altered to the worse, but scarse any one way to the better; thought nothing wanted to his Noble and Heroicall Intentions, but onely (as farr as Humane foresight might reach) to give perpetuitie to that, which was in his time so happily established” (New Atlantis, p. 22).

⁸⁷ Simmons, “Ideal and Nonideal Theory,” p. 35.

⁸⁸ See, e.g., Rawls, Justice as Fairness, pp. 105, 115, 118n, 119, 194. Despite the repeated use of this phrase, Rawls probably does not have in mind anything quite so Platonic. “Political liberalism … does not try to fix public reason once and for all in terms of one form of favored conception of justice.” “The Idea of Public Reason Revisited,” p. 582. See further §IV.1.3.

⁸⁹ Consider how quaint the utopias of Plato, More, Bacon, or Bellamy strike us today. Of course some believe that we are at the end of history, and their theory of the ideal has glimpsed the owl of Minerva. Popper was quite right to see historicism as a complement to Platonism (The Open Society and Its Enemies, vol. 2).

⁹⁰ Mill, Chapters on Socialism, p. 737.

⁹¹ Mill, Chapters on Socialism, p. 737.

⁹² For Robert Owen’s own account, see his A New View of Society.

⁹³ Haworth, “Planning and Philosophy.”

⁹⁴ Kumar, “Utopian Thought and Communal Practice,” p. 18.

⁹⁵ Ibid., p. 19. A grave problem, however, is to distinguish endogenous causes of collapse (which suggest problems with how the social world was ordered) from exogenous ones, e.g., the environment in which the experiments took place. That the Ralahine experiment failed after its “proprietor, John Scott Vandeleur, gambled away his fortune in the clubs of Dublin and fled the country to escape his creditors” can hardly be said to show us much about the viability of Owenism. Haworth, “Planning and Philosophy,” p. 151.

⁹⁶ Haworth, “Planning and Philosophy,” p. 153; Kumar, “Utopian Thought and Communal Practice,” p. 18.

⁹⁷ Quoted in Haworth, “Planning and Philosophy,” p. 152.

⁹⁸ Without this prohibition the groups develop different perspectives, rather than exploring the same one. As we will see presently, and especially in the next chapter, different perspectives can be of great use in solving optimization problems, but their benefits and problems are different from models in which the teams share all the elements of a perspective, and simply explore different parts of the same optimization space.

⁹⁹ For an extended discussion of an example from management, see D’Agostino, “From the Organization to the Division of Cognitive Labor.”

¹⁰⁰ Although, of course, regression and other statistical techniques can allow us to draw some useful inferences from natural experiments.

¹⁰¹ William Owen, Diary of William Owen, p. 129.

¹⁰² Kumar, “Utopian Thought and Communal Practice,” p. 1.

¹⁰³ For example, before the North American Free Trade Agreement was launched extensive and varied modeling was used to predict effects; even what seemed like consensus conclusions of the models often turned out quite wrong on critical matters. See Shikher, “Predicting the Effects of NAFTA.”

¹⁰⁴ Page, The Difference, p. 208. Emphases in original.

¹⁰⁵ Surowiecki discusses the example of the search in 1968 for the lost United States submarine, the Scorpion, in which diverse predictions within a group as to its location were aggregated to arrive at a group prediction that was accurate to within 225 yards. The Wisdom of Crowds, pp. xx–xxi.

¹⁰⁶ D’Agostino, Naturalizing Epistemology, p. 138.

¹⁰⁷ Ibid.

¹⁰⁸ Page, The Difference, p. 286.

¹⁰⁹ Wagner, Zhao, Schneider, and Chen, “The Wisdom of Reluctant Crowds.” See also Sunstein, Infotopia, chap. 1.

¹¹⁰ D’Agostino, Naturalizing Epistemology, pp. 138–41.

¹¹¹ This subsection draws on work that I conducted with Keith Hankins. I thank him for permission to use it.

¹¹² See Hong and Page, “Groups of Diverse Problem Solvers Can Outperform Groups of High-Ability Problem Solvers”; Hong and Page, “Problem Solving by Heterogeneous Agents.”

¹¹³ On the other hand, even this very modest degree of diversity can lead to problems of communication, a worry that will occupy us in the next chapter. If we deeply disagree about how to measure similarity (or distance) between social worlds, a modification to some relevant feature of the world that I consider to be relatively minor might appear quite radical to you. For instance, I might ask you to imagine a world that is otherwise like ours, but in which people are slightly more equal, though at the cost of being slightly less free, and I might judge that world to be superior to our own. If you have a different conception of what counts as slightly less free, though, you might imagine an entirely different world—one which you, reasonably, might think is much less just than our own—and in this case, it is almost inevitable that we will find ourselves talking past one another.

¹¹⁴ See appendix A, point (iii).