Throughout most of this book, we have argued our thesis that people are basically rational1 to a reasonable degree—i.e. that they think, reason, and act in such a way as to achieve many basic personal goals. We have also shown that judgements of irrationality are typically made by authors who demand that people should have rationality2—i.e. think or reason according to some impersonal normative system, such as some version of formal logic. These authors expect subjects to be rational2 in abstract or unrealistic experiments, where they may not interpret the instructions in the way the experimenters intend. Against this, we have emphasised the importance of subjective relevance and the role of tacit processing in thinking and reasoning, and in interpreting verbal instructions. However, we have not claimed that there is no place for rationality2. It is a necessary goal itself at times, to advance ordinary and scientific knowledge, and ordinary people do possess it up to a point. They follow logical rules to some degree in their reasoning, and there is an explicit as well as implicit cognitive system.
In this chapter we will ask to what extent subjects can follow logical rules, and whether this is linked to explicit thinking of the kind that is susceptible to verbal instruction and measurable through verbal protocol analysis. Traditional approaches to reasoning research certainly presuppose the first of these concepts—the possession of deductive competence—and in many cases the second also. However, this is to confuse the study of reasoning processes with the study of reasoning experiments. We have already shown that much of the behaviour in the experiments cannot be attributed to the following of logical rules or to other explicit processes.
Although our broad objective is to address both reasoning and decision processes in one framework, we make no apology for focusing this chapter entirely on the fundamental case of logic and deductive reasoning—a complex issue in its own right. We do believe that the outcome of our analysis will have broader implications for thinking and decision making as a whole as we hope to show in Chapter 7. We start by considering the evidence for deductive competence, which is provided by experiments in the deductive reasoning literature, and then give critical consideration to the current theories of deductive competence in the literature.
Competence and Bias in Reasoning: The Evidence
Let us return to the Wason selection task, which has been discussed extensively in earlier chapters. Of all the problems in the reasoning literature, in our view this task provides the least evidence of deductive competence. Perhaps we should not be surprised by this, because the task involves meta-inference, hypothesis testing, etc. and is not a direct test of people’s ability to draw deductive inferences. We have provided detailed theoretical discussion of what we believe to be the causes of card choices on various versions of this task: subjects choose cards that appear relevant. Factors affecting relevance that we have discussed include linguistically based heuristics, probability judgements, epistemic utility, and concrete pay-offs or costs. We have suggested that realistic versions of the task that facilitate p and not-q choices alone, usually taken as the correct responses, do so through helpful pragmatic cues rather than by inducing logical rule-following. Above all, we have argued that the task should be viewed as a measure of thinking and decision making, and not as a problem that elicits deductive reasoning.
That some authors persist in viewing the selection task as a deductive reasoning problem has unfortunate consequences. For example, the study of Cheng, Holyoak, Nisbett, and Oliver (1986) addressed the question of whether training in logical principles and attendance at logic courses could facilitate general deductive reasoning ability. In our view, the choice of the selection task was not the right one for this study. It surprises us not at all that the results were negative, because few people get past a relevance judgement on this task. In terms of the Evans (1984, 1989, Chapter 3) heuristic-analytic theory, people’s card choices are entirely heuristic and the analytic processes do not get engaged.
Now the fact that people do not generally exhibit deductive competence on the Wason selection task does not mean that they do not possess it. In fact, one of the main theoretical challenges in considering this task is to explain why people do not apply the competence that is exhibited on other conditional reasoning problems. For example, in the studies reviewed by Evans et al. (1993a, Chapter 2) subjects are shown on average to endorse modus tollens more than 60% of the time with affirmative conditionals. In other words, given a conditional such as:
If there is an A on one side of a card, then there is a 3 on the other side of the card
and the information that the card does not have a 3, most subjects will correctly infer that the card cannot have an A. However, only around 10% will choose to select the number that is not the 3 on the selection task. Similarly, the experiments using the truth table task show that most subjects can correctly identify the falsifying case of such a conditional: a card that has an A on one side and does not have a 3 on the other. Subjects can follow modus tollens as a general inference rule, but they do not apply it to make their choices in the indicative selection task. Even if they are told that the conditional is about only the four cards, an implicit heuristic makes certain cards appear relevant, as we have already argued. These cards represent cases that would help them to investigate efficiently a realistic conditional about many more objects in the real world.
We have discussed in Chapter 5 what happens if the cards are turned over at the end of an indicative selection task. When the not-q card is turned over and a p revealed on the other side, subjects will state that this makes the conditional false. Here subjects are surely following a logical and semantic rule for determining when a conditional is false. This is an example of explicit thought, displaying deductive competence, in which the subjects have a good reason for what they say—they are rational2. In our example about ravens, remember that it is inefficient to investigate the conditional by checking things in the unlimited and heterogeneous set of non-black things to see if they are ravens. It is efficient for an if-heuristic to make ravens relevant. One then focuses on the homogeneous set of ravens and uses one’s implicit recognition ability to pick these out and observe their properties. However, if one does happen without extra trouble to come across, or be told about, a non-black raven, then one benefits from awareness that this falsifies or strongly disconfirms the conditional. Consciousness can here fulfil its role of taking account of the unexpected, which cannot be done by a fixed heuristic. Good implicit heuristics and explicit thought can work together effectively.
Recall as well that looking for the false consequent case can become efficient when the consequent has a negative in it, such as “not a raven”. The not-heuristic here makes ravens relevant, and so subjects again focus on a relatively small and homogeneous set. Actually it is wrong to say that subjects do not explicitly reason in the selection task, but rather we should say that this reasoning does not normally affect the choices made. In the studies of card inspection times described in Chapter 3 (Evans, 1995b and in press), subjects are shown to spend some time justifying decisions apparently already made, and an analysis of concurrent verbal protocols also indicates reasoning about the hidden values of cards focused on (see also Beattie & Baron, 1988). There is no necessary conflict between heuristics that determine choices, and lead to good search processes, and the later explicit justification of choices. In fact, when asked to justify their choices, subjects do explicitly refer to an attempt to establish the truth of a conditional of the form “if p then q” by picking the true antecedent card, and attempt to establish the falsity of one of the form “if p then not-q” by choosing the false consequent card (Wason & Evans, 1975). That is not inconsistent, as it is generally efficient to look for confirmation in the former case, and disconfirmation in the latter.
Subjects cannot be explicitly reasoning about each card in turn, to infer what is on the back or to work out expected epistemic values, if they do not attend to the cards they do not choose. This point has been made in criticism of the pragmatic reasoning schema theory of deontic selection tasks (Cheng & Holyoak, 1985) in a recent paper by Evans and Clibbens (1995). Recent work (Green & Larking, 1995; Love & Kessler, 1995; Platt & Griggs, 1993) does show that subjects will change their choices even in indicative tasks given certain verbal manipulations, which seem, at least in part, to draw attention to counter-examples and invoke analytical (explicit) rather than heuristic (implicit) thought. Thus it is possible that even on the selection task, explicit reasoning processes may sometimes determine choices, as opposed simply to providing justifications for them. However, exactly what is happening in these experiments will have to await future research.
With other conditional reasoning tasks, the evidence for deductive competence is much easier to locate. For example, we saw in Chapter 3 that the conditional truth table task is also strongly susceptible to matching bias. That is, mismatching cases are more likely to be classified as irrelevant. However, matching has nothing to do with how the relevant cases are sorted into true and false: these decisions are clearly influenced by logic. For example, the true-antecedent and false-consequent combination (TF)—when perceived as relevant—is nearly always classified (correctly) as falsifying the conditional. Interestingly, when explicit negative cases were used to release the matching bias effect (Evans, 1983; Evans et al., in press a), this led to an increase in logically correct responding that does not occur when matching bias is released by the same manipulation on the selection task (see Chapter 3). In the latter case, all mismatching cards—both logically right and logically wrong—are more often selected (Evans et al., in press a). This confirms our proposal that analytic reasoning occurs much more on the truth table task.
Conditional inference tasks test the following types of inference:
Modus ponens (MP): If p then q, p, therefore q
Denial of the antecedent (DA): If p then q, not-p, therefore not-q
Affirmation of the consequent (AC): If p then q, q, therefore p
Modus tollens (MT): If p then q, not-q, therefore not-p
Classically, MP and MT are described as valid inferences and DA and AC as “fallacies”. However, the latter are not fallacious if the conditional is read as the biconditional “p if and only if q”, which is sometimes the natural pragmatic reading in context. With abstract affirmative conditionals, the data summarised by Evans et al. (1993a, Table 2.4) show consistent near 100% MP rates, but with considerable variability on all the others. The median values observed were DA 48%, AC 42%, and MT 63%. Apart from modus ponens, these figures are not very informative about the competence/analytic reasoning issue. Clearly MT is more difficult than MP, but how do we know if subjects are reasoning or mostly guessing with the former? Do the frequent endorsements of DA and AC indicate poor reasoning (fallacies), biconditional readings, or again just guessing?
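The dependence of “correctness” on the reading of the conditional is easy to verify by brute force. The following minimal Python sketch is our own illustration, not anything from the reasoning literature; it assumes a material reading of “if p then q” (and of “p if and only if q” for the biconditional) and simply enumerates the four truth-value assignments:

    from itertools import product

    def valid(minor, conclusion, biconditional=False):
        # An argument form is valid if no assignment of truth values makes
        # both premises true and the conclusion false.
        for p, q in product([True, False], repeat=2):
            major = (p == q) if biconditional else (not p or q)  # "if p then q"
            if major and minor(p, q) and not conclusion(p, q):
                return False
        return True

    forms = {
        "MP": (lambda p, q: p,     lambda p, q: q),      # p, therefore q
        "DA": (lambda p, q: not p, lambda p, q: not q),  # not-p, therefore not-q
        "AC": (lambda p, q: q,     lambda p, q: p),      # q, therefore p
        "MT": (lambda p, q: not q, lambda p, q: not p),  # not-q, therefore not-p
    }
    for name, (minor, concl) in forms.items():
        print(name, valid(minor, concl), valid(minor, concl, biconditional=True))
    # MP and MT come out valid on both readings; DA and AC are invalid on the
    # plain conditional but valid on the biconditional.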
Curiously enough, it is the evidence of bias in the conditional reasoning task that shows us that subjects are reasoning. It was discovered some while ago (Evans, 1977a) that when negations are introduced into the conditional statements, as on the selection and truth table tasks, they have a massive influence on the frequency of inferences drawn. In this case it is not matching bias but a separate effect called “negative conclusion bias” (Evans, 1982) that has been claimed. The early evidence suggested that on all inferences except modus ponens, subjects were more likely to endorse conclusions that were negative rather than affirmative. The effect appeared to be a response bias, external to any attempt at reasoning, and was discussed in these terms (for example, by Pollard & Evans, 1980). However, recent experimental evidence has provided a rather different perspective.
Evans, Clibbens, and Rood (1995) reported three experiments investigating the effect of negations on conditional inference. These experiments included both conclusion evaluation and conclusion production tasks, and extended the investigation to three forms of conditional statement:
If (not) p then (not) q
(Not) p only if (not) q
(Not) q if (not) p
The results were consistent. Subjects did indeed endorse or produce negative conclusions in preference to affirmative ones, but only did so consistently on DA and MT inferences. There was no evidence of a bias on MP and little on AC. The two inferences affected are ones that Evans et al. call “denial inferences” because they lead from the denial of one component to the denial of the other. Compare the following two modus tollens arguments:
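6.1  If the letter is T then the number is 4
     The number is not 4
     Therefore, the letter is not T

6.2  If the letter is not T then the number is 4
     The number is not 4
     Therefore, the letter is T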
Subjects consistently make more MT inferences with problems like 6.1 than with ones like 6.2.¹ The critical variable is a negation in the part of the conditional denied in the conclusion of the argument. Hence, DA inferences are less frequently made when the consequent of the conditional is negative. Evans et al. (1995) discuss in some detail how this might come about and consider ways of extending both the mental logic and mental model theories to account for it. In either case it seems that some kind of double negation effect is involved. Difficult forms always require the denial of a negative in order to infer an affirmative—for example, it cannot be the case that there is not a T, therefore it must be the case that there is a T.
These findings lead us to two very important conclusions. First, the bias cannot be a response bias—otherwise it would show on affirmation as well as denial inferences. Next, the bias must be a consequence of an effort at reasoning because otherwise the double negation difficulty would never be encountered. Hence, the picture that emerges is one of subjects attempting to reason according to the instruction, but whose ability to do so is subject to some problem connected with double negation.
There is more than this to explain for a theory of conditional reasoning, however. For example, on the affirmative conditional, MT is not inhibited by the double negation effect and yet is still only made around 60% of the time. Recall from Chapter 1 that one should sometimes reject the conditional rather than perform MT in ordinary reasoning from uncertain beliefs. One should sometimes be even more doubtful about applying MT when one would have an affirmative rather than a negative conclusion after the application of double negation. It is more risky to draw a conclusion about what is a raven (i.e. a member of a homogeneous and relatively small set) than one about what is not a raven (i.e. a member of a heterogeneous and unbounded set). Recall as well from Chapter 1 that one almost never has grounds for rejecting the conditional rather than performing MP in ordinary reasoning from uncertain beliefs. The same points can be made about DA and AC under a pragmatic interpretation of the conditional as a biconditional. It may be possible to combine these points with the hypothesis of a caution effect in reasoning (Pollard & Evans, 1980) to get some insight into negative conclusion bias, or why double negations are not always removed, with the result made explicit in mental representations.
It will also be important for later discussions to appreciate that matching bias (see Chapter 3) is not a response bias either. We now know that matching bias only occurs under certain specified conditions. First, there is evidence that the bias may be specific to the linguistic form of the connective (Evans & Newstead, 1980), although recent evidence calls some of the earlier conclusions into question on this point (Evans, Legrenzi, & Girotto, submitted). Next, the effect disappears when thematic materials or scenarios are used, even when these do not facilitate correct choices (Evans, 1995b; Griggs & Cox, 1983; Reich & Ruth, 1982). Finally, the effect is critically dependent on the use of implicit rather than explicit negative cases (Evans, 1983; Evans et al., in press a). Before considering the implications of these findings for theories of competence, however, we examine some relevant evidence from other parts of the deductive reasoning literature.
Syllogistic reasoning and belief biases
Apart from the Wason selection task, the most intensively studied reasoning tasks in the psychological literature derive from the logic of syllogisms. This is a very restricted form of logic in which arguments consist of a major premise, a minor premise, and a conclusion. For a conclusion of the form A–C, the major premise links a middle term B to C and the minor premise links the middle term B to A. For example:
No good researchers are good administrators
Some teachers are good researchers
Therefore, some teachers are not good administrators
There are four figures of syllogisms, depending on the order of the terms B and C in the major premise and of A and B in the minor premise. There are also 64 moods, as the major premise, minor premise, and conclusion can each take one of four forms (4 × 4 × 4 = 64), conventionally labelled by the letters A, E, I, and O:
A: All X are Y
E: No X are Y
I: Some X are Y
O: Some X are not Y
Of the 256 logically distinct syllogisms (64 moods in each of the four figures), only 25 have been reckoned to be valid by logicians. Psychologically speaking, we can double these figures to include syllogisms in which the order of major and minor premises is reversed. Premise order is important, as the conclusions favoured are subject to a “figural bias” (Dickstein, 1978; Johnson-Laird & Bara, 1984). Many psychological experiments have been reported in which subjects have been asked to assess the validity of such syllogisms, or to generate conclusions from pairs of major and minor premises. This research, together with numerous psychological theories of syllogistic inference, is reviewed in detail by Evans et al. (1993a, Chapter 7) and we will summarise here only the main points of relevance to our argument.
The most common method involves presenting subjects with a choice of four conclusions of the type A, E, I, and O, plus an option “no conclusion follows”. This provides a chance rate of 20% for correct solution. Actual results show solution rates well above this chance rate: for example, Dickstein (1978), using a wide range of syllogisms, reports 52% correct responses. As with research on conditional inference, then, we have clear evidence of deductive competence on syllogistic reasoning problems. Because it is normal practice to exclude subjects with training in formal logic, we can also infer that such competence is exhibited with abstract and novel problems of a kind not previously encountered by the subjects of these experiments.
As with conditional reasoning, logical error rates are high and responses are subject to systematic biases. With abstract syllogisms, people are influenced by the mood of the premises beyond its effect on the logic of the problems—an effect variously interpreted as an “atmosphere effect” or as conversion errors. Similarly, the figure of the premises produces non-logical biases, such that the order of terms in the premises influences the preferred order of terms in the conclusion. As we have already seen in Chapter 5, when pragmatically rich content is introduced, subjects are strongly influenced by a “belief bias” in which believable conclusions are judged to have greater validity than unbelievable ones.
Excluding a certain amount of random error, syllogistic reasoning performance—like conditional reasoning performance—is subject to two systematic influences that we may term the logical and non-logical components. In all these experiments subjects are significantly influenced by the logic of the problem—our inescapable evidence for deductive competence—but also influenced by a variety of non-logical “biases”. The data of Evans et al. (1983) discussed in Chapter 5 and summarised in Fig. 5.1 are paradigmatic. When analysed by logic, 72% of subjects accepted valid conclusions compared with only 40% who accepted invalid conclusions. When analysed by the non-logical belief factor, however, 80% accepted believable conclusions and only 33% accepted unbelievable conclusions.
We have indicated that we are interested in this chapter not only in the question of whether subjects possess abstract deductive competence—which it seems they do—but also in whether such competence is linked to explicit, verbal reasoning. We have already seen some evidence for this linkage in our discussion of the selection task in the previous chapter. Research on belief biases provides further relevant evidence. An important question concerns the ability of subjects to respond to deductive reasoning instructions that tell them to disregard their prior beliefs and base their reasoning only on the premises given. Now in fact, some kind of instruction to this effect is included in all the standard belief bias literature including the study of Evans et al. (1983), so we might conclude that subjects’ ability to take account of such instructions is very limited. However, recent studies have been reported in which the standard type of instructions have been both weakened and strengthened.
Newstead et al. (1992, Experiment 5) employed augmented instructions that gave added emphasis to the concept of logical necessity. Standard instructions, given also to the control group, included the following:
You must assume that all the information that you are given is true; this is very important. If and only if you judge that the conclusion logically follows from the information given you should write “YES” in the space below the conclusion on that page.
Thus even for the control group, it is clear that deductive reasoning is required and prior belief irrelevant—without such instruction the claim that prior belief exerts a “bias” would be empty. However, augmented instruction groups additionally received this:
Please note that according to the rules of deductive reasoning, you can only endorse a conclusion if it definitely follows from the information given. A conclusion that is merely possible but not necessitated by the premises is not acceptable. Thus, if you judge that the information given is insufficient and you are not absolutely sure that the conclusion follows you must reject it and answer “NO”.
The emphasis on logical necessity is particularly relevant in the attempt to remove belief bias, because the effect principally reflects the acceptance of invalid but believable conclusions (see Fig. 5.1). The effect of these additional instructions in the Newstead et al. study was dramatic. Acceptance rates for invalid-believable arguments dropped from 50% under standard instructions to only 17% under augmented instructions, effectively removing the belief bias effect entirely. However, subsequent experiments reported by Evans, Allen, Newstead, and Pollard (1994) showed that belief bias could be maintained even in the presence of such augmented instructions. The conclusion of Evans et al. (1994) was that instructions reduce but do not eliminate the effects of belief.
Two recent studies of the influence of belief on reasoning have taken the opposite strategy of weakening the instructional requirements for deductive reasoning. Stevenson and Over (1995) were concerned with the suppression of modus ponens by use of auxiliary premises that served to weaken belief in the conditional premise (see Chapter 5). In Experiment 1 they used conventional deductive reasoning instructions in which subjects were told to assume that the premises were true and to indicate what conclusion followed. In Experiment 2, however, they were asked to imagine they were listening to a conversation and to indicate what they thought followed. The effect of this change on responses was very substantial. In Experiment 1, rates of modus ponens and modus tollens were 83% and 79% with an additional premise supporting the conditional, dropping to 40% and 40% with a premise undermining the conditional. Note that this is a substantial suppression effect despite the use of deductive reasoning instructions. However, in Experiment 2 the corresponding figures were 60% MP and 38% MT, dropping to 12% and 12%.
George (1995) also looked at suppression of valid conditional inferences based this time on a priori belief in the conditional premise. In his Experiment 3, one group was told to assume absolutely the truth of the premises and another to take into account the uncertainty in the premises. He concluded that belief-based reasoning was easier than premise-based reasoning because in the second group 96% of subjects conformed with instructions and took account of belief, whereas in the first group only 43% were able to comply with the instruction to assume the premises and ignore prior belief.
The evidence from these studies points to some conclusions about deductive competence. In discussing the belief bias effect in Chapter 5, we argued that it is rational1 to reason in real life from all relevant and well-justified beliefs, and not just from some arbitrary set of assumptions. However, it is clearly irrational2 not to restrict oneself to given assumptions in a logical reasoning task. All of the studies described show that the use of deductive reasoning instructions, however strongly worded, cannot totally suppress effects of belief on reasoning, thus supporting our argument that rational1 reasoning reflects habitual, tacit processes. However, these studies show that when presented with deductive reasoning instructions, subjects can to a significant degree evaluate the arguments in a logically sound manner. They also show clearly that the extent to which people actually apply a deductive reasoning strategy is highly open to influence by verbal instruction, suggesting that rational2 reasoning does indeed reflect the operation of explicit verbal processes, under some conscious control.
The Mechanism of Deduction: Rules or Models?
We have now established our case that there is a human facility for abstract deductive competence demonstrable in the experimental psychological literature on reasoning. Human beings have been able to build on this faculty to axiomatise logic, develop mathematics and science, and create advanced technologies. Before turning to possible theoretical accounts of deduction, let us summarise what we know about this facility. First, without special training in logic it is fragile rather than robust. The natural ability to reason in an explicit deductive manner is limited and much prone to errors and biases that cannot be removed by any amount of verbal instruction to reason logically. However, the deductive reasoning mode does appear to be under some degree of conscious control, because people can turn it on or off in response to verbal instructions.
Of the four major theoretical approaches in the reasoning literature (see Evans, 1991), only two address the question of abstract deductive competence. The heuristic approach in general is mainly concerned with the explanation of biases, and the heuristic-analytic theory of Evans (1984, 1989) specifically omits a description of the mechanism of analytic reasoning. The theory of pragmatic reasoning schemas (as in Cheng & Holyoak, 1985) is concerned with how people reason with realistic and familiar problem content, where they have the opportunity to retrieve and apply schemas learned previously. This kind of theory is important because it is quite possible that people reason differently with familiar than with abstract materials. However, the theory as formulated can provide no explanation of the general deductive competence for which we have provided evidence in the first part of this chapter.
This leaves us then with essentially the two choices discussed briefly in Chapter 1. One is the proposal that people reason by following abstract inference rules, in a restricted natural deduction version of “mental logic”. The alternative is the theory of reasoning by mental models, which is particularly associated with the work of Johnson-Laird and his colleagues. We consider the merits of each approach in turn for explaining the competence we have identified.
The basic idea of the rules approach is that there is an in-built logic in the mind comprised of a set of primary inference rules plus a reasoning program or strategy for their application to sets of assumptions. Because the rules are general purpose and abstract, reasoning must be preceded by a stage of encoding the actual problem content into an abstract form, and succeeded by a decoding of any conclusion drawn back into the problem domain. It might appear to be only in these encoding and decoding phases that the theory has the potential to account for the hugely influential effects of thematic content on reasoning. However, in the words of O’Brien (1993, p.131), “Mental logic theorists have never claimed exclusivity” and propose that an abstract logic may co-exist with pragmatic reasoning procedures (see also Rips, 1994).
There are currently two major forms of the mental logic, or inference rule, theory which we present here only in conceptual outline (for detailed exposition and review see Evans et al., 1993a, Chapter 3). First, there is the three-part theory of Braine and O’Brien (see Braine & O’Brien, 1991; Braine, Reiser, & Rumain, 1984; O’Brien, 1993). The three components of this theory are the set of natural inference schemas, the reasoning program that applies these schemas, and the pragmatic reasoning system. The first two parts should explain abstract competence. There are simple inference rules that can be applied by a direct reasoning program and that should be immediate and easy to draw: for example, modus ponens. Then there are compound inference rules that require indirect reasoning by use of suppositions and are more error-prone. For example, there is no simple rule for modus tollens—it must be effected by an indirect line of reasoning using reductio ad absurdum. Given “If p then q” and not-q, one has to make a supposition of p, derive q by modus ponens, and so get the inconsistency of q and not-q. From that, one infers that the supposition p cannot hold and derives not-p.
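Set out step by step, this indirect derivation of modus tollens runs as follows:

1. If p then q          (premise)
2. not-q                (premise)
3. p                    (supposition)
4. q                    (from 1 and 3 by modus ponens)
5. q and not-q          (from 2 and 4: a contradiction)
6. not-p                (from 3–5 by reductio ad absurdum: the supposition must be rejected)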
The major alternative is Rips’ (1983) ANDS model, recently expanded and refashioned as PSYCOP (Rips, 1994). Rips’ model is implemented as a working Prolog computer program and differs from that of Braine and O’Brien in being primarily aimed at proving theorems where the conclusion of the argument is known in advance. This is achieved by a combination of forward reasoning, in which rules are applied to derive conclusions from premises, and backward reasoning, in which sub-goals are generated, thus greatly reducing the search space required by forward reasoning, in a manner analogous to the early Logic Theorist of Newell & Simon (1972). When no conclusion is specified, only forward rules can be used, which have limited competence. In fact, there are close parallels between this system and that of Braine and O’Brien: for example, the simple inference schemas of the latter are essentially similar to Rips’ forward reasoning rules, and Rips’ backward rules likewise involve suppositional reasoning of the kind described by Braine and O’Brien as indirect.
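The contrast between the two modes can be made concrete with a toy propositional chainer. This is our own schematic sketch, far simpler than PSYCOP itself, and every name in it is invented for the illustration:

    def forward_closure(facts, conditionals):
        # Forward reasoning: apply modus ponens to the premises until
        # nothing new follows; conclusions are derived blindly.
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for p, q in conditionals:          # each pair encodes "if p then q"
                if p in facts and q not in facts:
                    facts.add(q)
                    changed = True
        return facts

    def backward_prove(goal, facts, conditionals, seen=frozenset()):
        # Backward reasoning: to prove q, set up the sub-goal p for some
        # conditional "if p then q"; the known conclusion guides the search.
        if goal in facts:
            return True
        if goal in seen:                       # guard against circular conditionals
            return False
        return any(q == goal and backward_prove(p, facts, conditionals, seen | {goal})
                   for p, q in conditionals)

    conds = [("a", "b"), ("b", "c")]
    print(forward_closure({"a"}, conds))       # {'a', 'b', 'c'}
    print(backward_prove("c", {"a"}, conds))   # True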
The mental logic or inference rule theory does account for basic deductive competence, but has serious limitations as a psychological theory of behaviour on reasoning tasks. First, as acknowledged by the inference rule theorists themselves, pragmatic influences can only be accounted for by mechanisms additional to and separate from the basic mental logic, such as invited inferences, conversational implicatures, and pragmatic reasoning schemas. Because almost all real world reasoning takes place in semantically rich contexts in which such pragmatic influences apply, it seems rather odd to us to identify an abstract mental logic as the primary reasoning mechanism whilst requiring a supplementary pragmatic theory to account for most of what actually happens! Also, we have already seen clear evidence both that the natural way of reasoning is primarily from uncertain beliefs, and that reasoning logically on the basis of a restricted set of assumptions requires a conscious and only partially successful effort by the subjects.
The second major problem for the inference rule theory concerns its account of error and bias. The strongest claim it seems that such theorists can make is to be able to specify the conditions under which errors are likely to occur: for example, when indirect reasoning procedures are required or when more steps of inference are involved (see Rips, 1989). Take the case of the Wason selection task. Inference theorists have argued that the poor solution rate on the abstract task is consistent with their theory because the task is difficult to solve by mental logical principles (Rips, 1994; O’Brien, 1993). For example, Rips argues that only forward rules such as modus ponens—leading to selection of the p card—can be applied, so that there is no means provided to support not-q choices, whereas O’Brien argues that not-q selections require a complex chain of indirect reasoning that few subjects are likely to achieve.
This is all very well, but the theory is telling us little about what subjects actually do. People endorse modus tollens less than modus ponens, but the only explanation for this in the theory is that the latter is a simple rule in mental logic, while the former is a compound one and so more difficult to execute. No account is proffered of why MT is not expressed in a simple rule. We indeed showed in Chapter 1 that, for inferences from uncertain beliefs, one should more often reject the conditional when given the premises for MT than when given those for MP. But the inference rule theory has no natural way of accounting for uncertainty in the premises—it is essentially based on the idea that the premises are assumptions or suppositions.
The theory also lacks the means for explaining such clear phenomena as matching bias and negative conclusion bias, which we have already argued are not response biases that can be tacked on to a competence system. In fact, none of the vast range of psychological factors that are now known to influence behaviour on the Wason selection task in both abstract and thematic versions (reviewed by Evans et al., 1993a, and here mainly in Chapter 4) can be accounted for by any part of the psychological theory of mental logic. In a way, of course, the mental rules theory could be viewed as the mirror-image of the heuristic-analytic theory, in that it is explicit about the logical component of performance with little to say about the non-logical component. Because the selection task provides the least evidence of deductive competence of any reasoning problem studied in this field, it is not surprising that the rule theory is at its weakest here, where the H-A theory is at its strongest. However, if the rule theory were to provide the missing analytic component, it would have to provide a mechanism for competence that was both inherently plausible and able to integrate at a psychological level with the heuristic/relevance account of the non-logical component of performance. We will return to this issue following a brief survey of the rival mental model theory approach.
The theory of reasoning by mental models, introduced by Johnson-Laird (1983), has been very successful, not least because of the energy of its advocates who have applied the theory in relevant experimental studies to all major fields of deductive reasoning research, including syllogistic reasoning, conditional and propositional reasoning, relational reasoning, and meta-deduction (see Johnson-Laird & Byrne, 1991). The theory has also been judged the most complete of the current reasoning theories in a different sense. It is the only theory to have been applied to all three major problem areas in the reasoning literature: those of deductive competence, bias, and content effects (Evans, 1991).
As we pointed out in Chapter 1, the theory of mental models itself postulates a limited kind of mental logic, with rules for manipulating mental models taking the place of natural deduction inference forms. But the inference rule theory, as this name for it suggests, is more heavily committed to logical rules than Johnson-Laird’s mental models theory, and hence one can describe the difference as one of (logical) rules vs. models. In Johnson-Laird’s theory, mental models consist of tokens and represent putative states of the world. The basic theory as originally applied to syllogisms and quantified reasoning (Johnson-Laird & Bara, 1984) involves three basic stages:
1. Given some premises, the reasoner forms a provisional mental model representing a possible state of the world in which these premises are true.
2. The reasoner then inspects this model in order to derive a provisional conclusion that is not trivial (e.g. not a repetition of a premise).
3. The reasoner searches for counter-examples—i.e. models in which the premises would be true and the conclusion false. If no such counter-example is found, then the conclusion is regarded as valid.
The third stage is crucial if the subject is genuinely to attempt deductive reasoning, and predictions for syllogistic reasoning are focused on this stage. For example, Johnson-Laird and Bara (1984) showed that the more alternative models the subjects need to consider, the less accurate is their reasoning—an effect that they attribute to limited working memory capacity. Similarly, the mental models explanation of the typical belief bias effect in syllogistic reasoning—acceptance of invalid but believable conclusions—is that subjects lack motivation to seek counter-examples when the initial conclusion favours belief. In moving the theory into the domain of propositional reasoning (Johnson-Laird & Byrne, 1991; Johnson-Laird, Byrne, & Schaeken, 1992), however, the role of the third stage has become less clear. In conditional reasoning, for example, it is typically assumed that the subject represents the sentence as one or more explicit models plus an implicit model. For example, “If p then q” might be represented as:
[p]    q
 …
Johnson-Laird and Byrne suggest that given the second premise p, subjects will immediately infer q (modus ponens), and given q they will infer p (affirmation of the consequent) unless they flesh out the representation to provide a counter-example, e.g.:
[p]    q
¬p    q
¬p   ¬q
Evans (1993b) has argued that only modus ponens should follow from the initial representation, in which the minor premise p is exhaustively represented (indicated by the square brackets). To infer p directly from q appears to violate the third general principle of mental model theory—namely the search for counter-examples. The point is that the presence of the implicit model means that the subjects know that other situations may be consistent with the conditional, even though they have not yet thought what those situations might be. Hence, unless the premise is exhaustively represented, it should be apparent that there could be a counter-example model to be found.
Johnson-Laird and Byrne (1993, p. 194) argue against what they term “impeccable rationality” and propose instead:
The … notion of deductive competence rests on a meta-principle: an inference is valid provided that there is no model of the premises in which its conclusion is false. Individuals who have no training in logic appear to have a tacit grasp of this meta-principle, but have no grasp of specific logical principles … They have no principles for valid thinking, i.e. for searching for models that refute conclusions.
What we understand Johnson-Laird and Byrne to be saying is that subjects will only draw inferences where no counter-example is present, but their competence in executing the original stage 3 of the model theory is weak. Thus subjects will not necessarily seek to flesh out models with implicit components, and will not suppress an inference simply because there could be a counter-example. It is certainly true that the characteristic error in both propositional and syllogistic reasoning is to execute “fallacies”, i.e. to draw more conclusions than are strictly warranted by the premises given—a finding consistent with weak understanding of logical necessity. We have already seen that instructions emphasising necessity can reduce endorsement of fallacious conclusions and associated belief bias (Evans et al., 1994; Newstead et al., 1992), suggesting that the search for counter-examples is to some extent under conscious control.
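The meta-principle itself is simple enough to state as a procedure. In the following sketch—our own illustration, in which truth-value assignments over atoms stand in for mental models and premises are encoded as Python functions—a conclusion is licensed only when no model of the premises falsifies it:

    from itertools import product

    def models(premises, atoms):
        # Every assignment of truth values to the atoms under which all premises hold.
        for values in product([True, False], repeat=len(atoms)):
            env = dict(zip(atoms, values))
            if all(premise(env) for premise in premises):
                yield env

    def follows(premises, conclusion, atoms):
        # Valid iff there is no model of the premises in which the conclusion is false.
        return all(conclusion(env) for env in models(premises, atoms))

    conditional = lambda e: (not e["p"]) or e["q"]    # "if p then q", read materially
    # Modus tollens survives the search for counter-examples...
    print(follows([conditional, lambda e: not e["q"]], lambda e: not e["p"], ["p", "q"]))  # True
    # ...but affirming the consequent does not: the fleshed-out model ¬p q refutes it.
    print(follows([conditional, lambda e: e["q"]], lambda e: e["p"], ["p", "q"]))          # False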
The mental model theory appears to us to be a more complete psychological theory of reasoning than the inference rules theory in that errors and biases can be accounted for in ways that are intrinsic to the reasoning process. We have already seen one example, in the case of belief bias, where the theory can naturally incorporate the idea that search for counter-examples is curtailed when the putative conclusion is believable. Now, we have noted that rule theorists are happy to concede the presence of “response biases” that are external to and additional to the process of reasoning (e.g. O’Brien, 1993). However, we have already argued that effects such as matching bias and negative conclusion bias in conditional reasoning are not response biases, but arise as part of the effort of reasoning (see Evans et al., 1995 and in press a, for detailed experimental evidence and discussion concerning these biases). For example, if “negative conclusion bias” was really a bias to endorse negative conclusions following a process of inference, the inference rule theory would not be affected. Since it has something to do with double negation, as Evans et al. (in press a) have shown, it is a major problem; the rule of double negation, from not-not-p to infer p, is included as a primary (direct, forward) rule of inference in both the major inference rule systems we have discussed, and should therefore be immediately executed.
The concrete reasoning specified in the model theory can incorporate such an effect (although it was not predicted by model theorists). For example, modus tollens is difficult with a negative antecedent conditional “If not p then q”. If this is represented as:
[¬p]    q
  …
then the minor premise (not-q) clearly eliminates the first model. In order to draw the correct conclusion p, however, the subject must appreciate that, because not-p is exhaustively represented, any other model must have a p. Due to double negation this is more difficult than when the antecedent is affirmative and [p] is the case eliminated. Here the conclusion not-p is a natural denial of a supposition.
A further attraction of the model theory for us is the ability to link with the notion of relevance. Mental model theorists are already talking about “focusing” effects in which subjects are assumed to concentrate on cases explicitly represented in mental models (e.g. Legrenzi et al., 1993). As we argued in Chapter 3, relevance is the prime cause of focusing. Hence the model theory would appear to provide the simplest solution to the missing analytic component of the heuristic-analytic theory, with heuristic relevance determining the content of the models from which inferences are then drawn in the manner described by Johnson-Laird and Byrne. Before opting for this easy solution to a very complex problem, however, we are aware of some further issues in the inference rules versus models debate that need to be aired.
Can we decide between rules and models?
The advocates of the inference rules and mental model theories themselves appear to believe that it is possible to decide who is right. For example, Johnson-Laird and Byrne (1991) are very confident that they have established the case for models, and present a number of apparent grounds for rejecting inference rules. Equally, theorists such as Rips (1994) believe that the case for their rules is unanswerable, and the model theory has been subjected to very strong attacks in the recent literature (see O’Brien, Braine, & Yang, 1994; Bonatti, 1994; and for a reply, Johnson-Laird, Byrne, & Schaeken, 1994). For a detailed review of the large range of evidence and argument that has been offered in the inference rules versus models debate see Evans et al. (1993a, Chapter 3).
For the neutral observer, trying to decide between inference rules and models on the basis of the arguments, the debate between the two can be frustrating. Both sides attribute predictions to the other approach that they then proceed—to nobody’s surprise—to refute. The debate is also conducted as if it were between two precise scientific theories, capable of being clearly confirmed or disconfirmed in experiments easy to interpret, yet unfortunately this is not the case. Consider the free parameters that each side permits itself in explaining the data.
Inference rule theorists can choose which rules to include and what mechanisms to propose for their application. They can attribute error to the use of complex rules, to indirect reasoning procedures, or to proofs with more steps of inference. They can allow for variance in the unspecified pragmatic processes that carry out the translation between the domain and the abstract code used for reasoning. As if this were not enough, they can also attribute variance to response biases and to pragmatic reasoning schemas, and may even suppose the use of mental models as well. It is very hard to see what would provide a means of strongly confirming or disconfirming the core of this theory.
Mental model theory has similar problems. First, there is no pragmatic theory of mental representation: the proposals concerning the contents of models are provided by argument and example rather than by full computational procedures, and are subject to arbitrary change and some inconsistency—e.g. on the subject of conditionals with negations in them (see Evans, Clibbens, & Rood, 1995). Explanations about pragmatic influences, say by thematic content on the Wason selection task, are attributed to the influence of knowledge upon model representation without specification of principles or mechanism. Next, there are no clear principles to explain when fleshing out of models will occur, and even the mechanism for deduction from a given representation is sometimes left unspecified. As an example, Evans (1993b) derived a new prediction of an affirmative premise bias from his interpretation of the mental model theory of conditionals, but there seems to be no way to establish whether this is predicted by the “official” theory.
If neither approach is fully and precisely defined in itself, then the issue of which is “correct” is going to be hard to decide on empirical grounds. The best one might achieve is to disconfirm a particular version or application of either theory. Like a hydra, however, the theories may grow new heads faster than one can cut them off. A further problem is that both approaches are forms of mental logic that are essentially similar at a deep level. Oaksford and Chater (1993, 1995), for example, have used this argument in an attempt to demonstrate that both theories are subject to computational intractability problems. Oaksford and Chater’s argument is that in real life we have to reason from many beliefs, so that any logic-based theory will become intractable whether based on syntactic (rules) or semantic (models) principles.
Actually we would argue that Oaksford and Chater’s analysis is not correct, because the focusing power of the heuristic (pragmatic, relevance) system is such as to constrain all reasoning to a small set of premises. Of course, we are also well aware that this puts the computational problem back a stage, leaving us with the “frame problem” that no one currently knows how to solve. There is, however, a profound implication of this analysis for reasoning research. It suggests that the common agenda of the model and rule theorists—the search for a mechanism of deduction—is relatively lightweight. The computationally heavy and rational1 part of the process lies in the highly selective retrieval and application of information from memory. Accounting for people’s rationality2, which is relatively modest in every way, should be much easier.
It is a challenge to separate rule and model theories by empirical means, but we see some grounds for preferring the model theory approach, at least for reasoning with novel problems. When reasoning with semantically rich materials, we think it likely that subjects will retrieve rules and heuristics that they have learned to apply in similar contexts. We have shown earlier in this chapter that deductive competence from assumptions is associated with the explicit cognitive system, but it is fragile, always subject to errors and biases, and easily abandoned in favour of reasoning from uncertain belief. Thus the idea of an innate mental natural deduction system as the core of intelligent processes, with pragmatic processes added on as an afterthought, seems implausible to us.
For those who regard mental model theory as an equally strong form of mental logic, the same arguments would apply. However, its main proponents do not regard it as such, as the above quote from Johnson-Laird and Byrne (1993) clearly illustrates. We agree with them that the meagre amount of deductive competence actually achieved by subjects in reasoning experiments requires nothing more than a grasp of their semantic principle. Really all it comes down to is this: in trying to decide what must follow from some premises, people try to see what is common to all the states of affairs they can think of in which the premises are true. We find this minimalist description of human logical reasoning far more plausible than a set of innate abstract inference rules. Notice as well that it appears to have a natural extension to the case of uncertain premises. These will not be true in all possible models people can think of, but they can take account of that and try to see what follows in most of the models in which the premises are true. There is thus a way to extend mental models theory to probabilistic reasoning, and this is a big advantage it has over the rules approach (Johnson-Laird, 1994a, b; Stevenson & Over, 1995).
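On this reading, the extension is a one-line change to the model-enumeration sketch given earlier: require the conclusion to hold not in all models of the premises but in a sufficiently high proportion of them. Weighting all models equally is, of course, our simplifying assumption:

    def mostly_follows(premises, conclusion, atoms, threshold=0.5):
        # Reuses models() from the earlier sketch: accept the conclusion when it
        # holds in more than the threshold proportion of the premises' models.
        ms = list(models(premises, atoms))
        return bool(ms) and sum(conclusion(e) for e in ms) / len(ms) > threshold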
We have other reasons for preferring the model type theory, some of which we have already indicated. We regard it as a much more psychological theory, capable of yielding—at least in principle—accounts of the influence of pragmatic factors and biases in ways that are intrinsic to the proposed process of reasoning, and not extrinsic as in the mental logic account. We are attracted also by the linkage with focusing and relevance, which we see as arising from the implicit heuristic processes. Perhaps most importantly, model theory can provide a framework for understanding decision making as well as deductive reasoning (see Chapter 7). We do, however, have important reservations about the current specifications of the theory of Johnson-Laird and Byrne, which we will elaborate in Chapter 7.
1. Strictly speaking, 6.2 is not a modus tollens argument: the conclusion of modus tollens proper would instead be “Therefore the letter is not not T”. From a logical point of view, 6.2 thus combines MT with double negation elimination. We follow the conventional terminology of the psychological rather than the philosophical literature on this point.