12

Lessons for (the) Living

Anomalistic psychology is not just a fascinating topic in its own right; it is also a relatively painless way of picking up important critical thinking skills that can be applied in all aspects of life, including those that may appear to be a million miles away from the paranormal.

Among the more obvious applications of such skills is the assessment of controversial scientific claims. Newspapers and other media constantly bombard us with such claims, and many people understandably feel unable to decide what to believe. One of the most important current areas of apparent controversy is whether global warming is really happening and, if it is, the degree to which it is caused by human activity. The very future of life on our planet may depend on addressing such issues correctly. The coverage of this topic provides a perfect example of how the media often do not simply report on genuine scientific controversies but sometimes play a major role in generating what we might call pseudo-controversies. When the vast majority of the scientific experts in an area speak with one voice but are opposed by a very small but vocal minority, we would probably be wise to listen to the majority and, in this context, adopt the precautionary principle with respect to looking after our planet.

On a more personal level, critical thinking skills are certainly required if we are to make wise choices when it comes to consumer behavior. Many of the cognitive biases, particularly those related to reasoning, that lead us to draw faulty conclusions when considering the paranormal also appear to skew our judgment when it comes to buying, selling, investing, and (especially) gambling. Models of classical economics assume that people make such decisions based on an entirely rational analysis of costs and benefits. Unfortunately, this central assumption is just plain wrong, as shown by numerous ingenious and compelling experimental investigations.

Medical decision-making is often literally a matter of life and death, and yet here too our judgments and behavior often depart radically from anything that could be described as rational. Where our own health and the health of our loved ones are concerned, the stakes are at their highest, and we find it hard to deal with the inevitable inherent uncertainties involved in such situations. Emotions run high, and this too will cloud our judgments. This has been demonstrated all too often during the COVID-19 pandemic. And yet it is precisely in such areas that we really do need to do our absolute best to make the correct choices regarding medical treatments and healthy lifestyles. The popularity of complementary and alternative medicines is undeniable proof of our human irrationality where health is concerned. Once again, the media is often at fault, making exaggerated claims regarding miracle cures, on the one hand, and health risks, on the other, and generating yet more pseudo-controversies.

On occasion, insights from anomalistic psychology are directly relevant to evaluating controversial claims in areas that appear to be completely unrelated to the topics covered in this book. For example, chapters 5 and 6 discussed the use of hypnotic regression to allegedly recover blocked memories of alien abduction and past lives, respectively. In both cases, the evidence very strongly suggests that such “recovered memories” are, in fact, false memories. However, the same method is used by some therapists in attempts to recover allegedly repressed memories of childhood sexual abuse, including extreme claims of ritualized satanic abuse. I suspect that most people would be rather skeptical regarding reports of alien abduction or past lives based on hypnotic regression (although clearly a substantial minority are not). But many of those people who would dismiss such reports may well be inclined to accept reports of recovered memories of childhood sexual abuse obtained in this way even in the complete absence of any independent evidence that such abuse ever took place.

Why might many people feel it is more reasonable to believe that “recovered memories” of childhood sexual abuse are likely to be true even though they would reject memories of aliens and past lives? For one thing, any reasonable person would know that childhood sexual abuse really does occur throughout society on a scale that is much higher than was once recognized—and it really can have devastating psychological consequences for the victims. The evidence that alien abduction and reincarnation really take place is thinner on the ground, to put it mildly.

A second factor is the widespread acceptance by both the general public and professionals of psychoanalytic notions such as repression despite the fact that most memory experts are very dubious regarding this very concept.1 In general, those suffering traumatic events are much more likely to remember than to forget them. The lesson from anomalistic psychology is that, in the absence of independent evidence, one should treat all reports of recovered memories with a great deal of caution.

Elsewhere I have discussed two examples of contexts in which some knowledge of the ideomotor effect, so familiar to students of anomalistic psychology, may well have helped to prevent needless tragedies.2 As readers will recall, the ideomotor effect is a phenomenon whereby one’s own beliefs and expectations can result in unconscious muscular movements. It is the explanation for several ostensibly paranormal phenomena as previously discussed, including table-tilting, the Ouija board, and dowsing.

The first example is that of the ADE-651, a device that was claimed to be able to detect not only explosives and weapons but also human bodies, contraband ivory, truffles, drugs, and even banknotes. It was apparently able to do so even if the target object was several miles away, underground, or underwater. Not surprisingly, this amazing piece of technology did not come cheap, costing up to $60,000 per unit, but if it provided an effective means to prevent the loss of hundreds of lives in terrorist attacks, it would clearly be a price worth paying. The only problem was that it didn’t. It was a piece of junk that cost a few dollars to produce and was no more effective than a chocolate teapot. What is more, the British company that produced the device, Advanced Technical Security & Communications (ATSC), was fully aware of this fact.

In the first decade of this century, these useless devices were sold to twenty countries in the Middle East, including Afghanistan and Iraq. The Iraqi government alone spent around $80 million on them. Each device consisted of a handheld unit onto which a swiveling antenna was mounted. It was claimed that the device was powered by static electricity alone and could be charged up by the operator holding the device while walking or shuffling their feet for a few moments prior to use. The inventor of the ADE-651 and founder of ATSC, James McCormick, claimed that the device worked on a similar principle to that of dowsing. That should have set alarm bells ringing very loudly given that, as discussed, dowsing has never been shown to be effective when tested under properly controlled conditions. James Randi publicly offered $1 million to anyone who could demonstrate that the device really worked. Tellingly, no one from ATSC ever responded to the challenge. It is clear that McCormick was fully aware that the devices did not work. It is reported that on one occasion, when challenged about the effectiveness of the device, McCormick replied that it did “exactly what it’s meant to . . . it makes money.”

In all probability, this cynical scam cost many hundreds of innocent lives. In January 2010, the BBC’s Newsnight program broadcast an exposé of the device, and in April 2013 McCormick was convicted of fraud and sentenced to ten years in prison.3 In 2018 his sentence was extended by two years following his refusal to meet a shortfall of $2.5 million in repayments to recompense organizations defrauded by him.

A second example of tragic consequences resulting from a lack of knowledge of the ideomotor effect is the discredited pseudoscience of facilitated communication (FC). FC, which is sometimes referred to as progressive kinesthetic feedback or supported typing, is a technique that attempts to allow people with severe communication difficulties to express their thoughts and feelings with the assistance of a facilitator. The facilitator steadies the disabled individual’s arm or hand enough for them to operate a keyboard or other device. Proponents of this technique believe that the communication problems of such individuals, perhaps resulting from extremely severe autism or cerebral palsy, are primarily due to motor difficulties, not impaired intellect. Thus, it is claimed, by reducing the impact of such motor impairments, the person can express their inner voice. Needless to say, such a positive message was music to the ears of the impaired individual’s loved ones. The parents of many severely impaired children were now convinced that their beloved offspring were no longer prisoners of silence. Sadly, it turned out to be too good to be true.

This technique first became popular in the mid-1970s thanks to the efforts of its inventor, Australian teacher Rosemary Crossley. Sociologist Douglas Biklen then introduced the technique into the United States, and from there it spread around the world. Although media coverage was initially overwhelmingly positive and uncritical, some were skeptical from the outset. The critics pointed to the fact that none of the disabled individuals now expressing themselves so eloquently had ever had any formal training in reading and writing. Furthermore, messages were often being produced when the individual was not even looking at the keyboard. The reader has probably guessed by now that the true source of the messages was not, in fact, the disabled individual at all. It was the facilitator, albeit without any conscious awareness of this fact on their part. They were, in effect, using the disabled individual as a kind of human Ouija board.

This unwelcome suspicion was confirmed by the results of dozens of properly controlled double-blind tests.4 These tests established that the correct answer to a question was only ever produced if the facilitator knew the answer. In situations where the disabled person was shown a picture of a test object, such as a ball, but the facilitator was led to believe that a different picture had been presented, say of a teddy, the answer produced was always the one that the facilitator thought was the correct answer, not the one that was actually correct.

Some may argue that even if facilitators, teachers, and loved ones were mistaken in believing that FC was a valid method of communication with the severely disabled, it was still worth doing as it brought joy into the lives of carers and did no actual harm. Sadly, even this is not true. There have been literally dozens of allegations of sexual and physical abuse based on messages generated using facilitated communication, allegations that, given what the controlled tests revealed about the true source of such messages, almost certainly originated with the facilitator rather than the disabled person. In other cases, disabled people were sexually abused by facilitators who believed that consent for such activity had been given via FC.

The Limitations of Science

As the examples above demonstrate, the scientific method often provides a powerful technique to help us to differentiate between true and false claims. However, one of the most profound lessons I have learned from my decades of working in psychology and parapsychology is the need to be aware of the weaknesses as well as the strengths of the scientific method. I became particularly aware of the former when reflecting on my experience of attempting to publish a set of failed replications of a controversial parapsychological study some years ago.

This series of events began in 2011 with the publication of a set of nine experiments in the prestigious Journal of Personality and Social Psychology (JPSP) by Professor Daryl Bem of Cornell University.5 Bem is a well-respected social psychologist who has made many important contributions to the field. What is unusual about him is that, in contrast to most psychologists, he is a strong believer in psi. On first reading, Bem’s series of studies, involving as they did over a thousand participants, appeared to present strong evidence that people could somehow sense future events before they happened; in other words, that precognition is real. What is more, Bem had published his results in a mainstream psychology journal, knowing full well that had they been published in a parapsychology journal, they would have been ignored by the world’s science reporters. As it was, they were reported and discussed around the world, much to the delight of most parapsychologists.

There was a theme running through the nine experiments, all but one of which reported statistically significant results supporting the existence of psi. Bem had adapted several standard psychological techniques by “time-reversing” them. In many experimental psychology studies, a manipulation at time T1 will have a predictable effect on performance at time T2. According to Bem, even if performance is measured before the manipulation takes place, it will still be affected by that manipulation. Rather than describe all nine of Bem’s experiments, I will illustrate the idea of time reversal by describing the technique employed in Experiment 9, the experiment with the largest effect size.

Experiment 9 investigated what Bem called the retroactive facilitation of recall. If you were to look at a list of words only once and then have your memory for those words tested, you would not be surprised if your performance was poorer than if you had been allowed to look at the words several times. After all, there is nothing controversial about the idea that rehearsal improves memory performance. What is rather more controversial is the notion that rehearsal will improve memory performance even if the rehearsal does not take place until after memory has been tested. What Bem did in this experiment was to present a set of forty-eight words, one at a time, on a computer screen to each participant. They were then allowed five minutes to write down every word that they could remember. The computer then randomly selected half of the original words and presented them again for extra processing (i.e., rehearsal). Results suggested that the words that were randomly selected for additional processing were remembered better than the other words—even though this additional rehearsal did not take place until after memory had been tested. This appears to be a claim worthy of Alice in Wonderland, and yet this is precisely the claim that appeared to be supported by the results of Bem’s Experiment 9.
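For readers who like to see the logic laid out step by step, here is a minimal sketch of such a design, written in Python. It is purely illustrative: the word list, the recall probability, and the scoring are invented for the example and are not taken from Bem’s actual software or analyses.

```python
import random

# Illustrative sketch only: hypothetical stimuli and scoring, not Bem's code.
WORDS = [f"word{i}" for i in range(48)]  # placeholder 48-word list

def run_simulated_participant(recall_probability=0.3):
    # 1. The participant studies all 48 words and recall is then tested.
    recalled = {w for w in WORDS if random.random() < recall_probability}
    # 2. Only AFTER the recall test does the computer randomly select half
    #    of the words for extra rehearsal.
    practiced = set(random.sample(WORDS, 24))
    unpracticed = set(WORDS) - practiced
    # 3. The psi claim is that the to-be-practiced words should already have
    #    been recalled better; under the null hypothesis any difference in
    #    recall rates between the two sets is pure chance.
    practiced_rate = len(recalled & practiced) / 24
    unpracticed_rate = len(recalled & unpracticed) / 24
    return practiced_rate - unpracticed_rate

# Averaged over many simulated participants, the difference hovers around zero
# when no retroactive effect exists.
differences = [run_simulated_participant() for _ in range(1000)]
print(sum(differences) / len(differences))
```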

As you might expect, the experimental techniques and statistical analyses employed in this series of studies were criticized by skeptics from the outset, but the general feeling was that, although the criticisms were valid, no single flaw was in itself a major concern.6 The ultimate test of whether the effects reported by Bem were real was whether they would replicate. After all, the mantra is often heard that “replication is the cornerstone of science.” To his great credit, Bem was indeed keen for other researchers to replicate his findings, even offering to make his software freely available to any researchers who wanted to carry out such replications. Was it possible that Bem had finally discovered the Holy Grail of parapsychology—a robust and replicable paranormal effect?

In collaboration with Stuart Ritchie and Richard Wiseman, I decided to take him up on this kind offer. Given that all three of us were skeptical about actually replicating Bem’s psi effects, the reader may be wondering why we chose to do this. The honest answer is, we had an ulterior motive. We thought this would be a relatively quick and easy way to get a paper published in a top psychology journal, the Journal of Personality and Social Psychology. After all, the world’s science media had widely covered the original controversial findings. If, as we expected, the effects failed to replicate, this would surely be worthy of publishing in the self-same journal, especially given Bem’s explicit encouragement for other researchers to attempt replications.

We decided to each carry out an independent replication of Bem’s Experiment 9, the study of the retroactive facilitation of recall that had produced the largest effect size of all in his experimental series. We consciously decided to replicate Bem’s methodology as closely as we could, a task made much easier by the fact that we were using the same software as him. There was a good reason for this. There is a tendency for parapsychologists to dismiss the results of failed replications if there was any deviation from the exact methodology used in the original study that produced results in favor of the psi hypothesis. Of course, such deviations are not seen as being a problem if the previous results are apparently successfully replicated!

Much to our complete lack of surprise, none of our three studies replicated the results of Bem’s Experiment 9. We wrote up our findings and submitted our paper to the JPSP. In response, we received a polite reply from the editor rejecting our paper without even sending it out for peer review. His reason? The JPSP simply did not publish replications. In light of the huge amount of publicity that Bem’s paper had received, we argued that it was important that failures to replicate these effects should be published, but the editor still refused to send our paper out for peer review. We submitted our paper to two more high-impact journals, Science Brevia and Psychological Science, and received exactly the same response. Note that we were not arguing that any of these journals should automatically publish our paper, simply that they should send it out for peer review in the standard way.

Our initial failure to get a journal to send our paper out for peer review caused a considerable amount of comment in the media, as it revealed in a very stark manner the problem of publication bias.7 If journals simply refuse to publish replications—especially failed replications—the research that they do publish will in no way be representative of the field as a whole. Instead, it will strongly overrepresent significant, positive findings of novel, often counterintuitive effects. Such effects may well attract considerable media attention, but, given their novelty, it is unclear at that early stage whether they would replicate at all.

We were pleased that when we submitted our paper to the British Journal of Psychology, it was finally sent out for peer review. One of the referees was very positive about our work and felt that the paper should be published pretty much as it was. The other referee was far less positive and most certainly did not feel the paper should be published as it was. On the basis of the comments made by the second referee, we suspected that we knew who it was. We thought it was probably a certain Professor Daryl Bem of Cornell University—the very man whose results we had failed to replicate. When we asked Bem if he was indeed the second referee, he confirmed that he was. We pointed out to the editor that this appeared to be a rather glaring conflict of interest and requested that a third referee be sought to decide the issue. Our request was turned down.

Almost on the verge of giving up hope of ever getting our paper published, we decided to submit it to the open access journal PLoS One. Much to our relief, the paper was sent out for peer review and, following minor revisions, was finally published.8 This was the first paper I ever published in an open access journal, and I was very glad I did. Because it was open access, anyone could download the paper free of charge—and it turned out that lots of people were interested in our results. At one point, the paper was receiving over a thousand views a day. To date, it has received over 50,000 views.

Our timing was fortuitous. Bem’s paper had appeared at a time when many researchers were starting to express concern regarding replicability issues within psychology. It had long been recognized that parapsychology’s biggest challenge was to find techniques that could be used to reliably demonstrate psi effects under controlled conditions, but the magnitude of this problem within mainstream psychology was only just beginning to be appreciated. Concerns were being expressed that even some of the effects that had been featured in standard psychology textbooks for decades may not actually be real. This replication crisis became the subject of a great deal of discussion and debate within the field.9

Even prior to the replication crisis, researchers accepted that some of the effects reported in the literature, and even in their own research output, must in fact be spurious. This follows from the reliance on the use of p-values in evaluating the results of experiments. Without going into details, these days the numerical results of an experiment are typically fed into a computer running appropriate software to produce a test statistic and its associated p-value. The p-value is the probability that you would get a value of the test statistic at least as extreme as the one observed if, in fact, the effect you are hypothesizing is not actually present. If the p-value is low, the researcher will reject the null hypothesis and conclude that the results are probably due to a real effect. The arbitrary threshold conventionally taken to indicate statistically significant results within psychology is p < .05. Taken at face value, this implies that, when no true effect is present, spuriously significant results will arise purely on the basis of chance about once in every twenty statistical tests.
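A quick simulation makes the point concrete. The sketch below, written in Python using the numpy and scipy libraries, with sample sizes chosen purely for illustration, runs thousands of “experiments” in which no real effect exists and counts how often a standard test nonetheless comes out significant at p < .05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 10_000
false_positives = 0

for _ in range(n_experiments):
    # Both groups are drawn from the same distribution: the null hypothesis
    # is true, so any "significant" difference is spurious by construction.
    group_a = rng.normal(loc=0, scale=1, size=30)
    group_b = rng.normal(loc=0, scale=1, size=30)
    if stats.ttest_ind(group_a, group_b).pvalue < 0.05:
        false_positives += 1

# Roughly 5 percent of tests are "significant" despite there being no effect.
print(false_positives / n_experiments)
```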

In fact, however, the situation is much worse than this, and false positive results are actually present in the research literature at a much higher rate than the 5 percent p-value implies. This is because of what are known as questionable research practices (QRPs).10 We are not referring here to blatant fraud involving the manipulation or even complete fabrication of data. Such fraud does indeed occur in all areas of science, but it is probably quite rare. Any researcher found guilty of fraud will face very serious consequences, including losing their job and their reputation. QRPs, in contrast, do not at first appear to be such heinous crimes. The term is used to describe the numerous opportunities that arise in collecting, selecting, and analyzing data where researchers have some degree of flexibility in the choices they make. This flexibility may not be at all apparent when reading the final published report.

Joseph Simmons, Leif Nelson, and Uri Simonsohn discuss a number of ways in which researchers may exploit such flexibility in order to obtain results that appear to be statistically significant at the magical p < .05 level. Such results make a study more likely not only to be deemed worth writing up for submission to a journal but also to be accepted for publication if it is submitted.11

One example of a QRP is known as optional stopping. This refers to the seemingly innocent practice of analyzing one’s data as one goes along. It is not surprising that researchers would engage in this practice without feeling that they were doing anything wrong. After all, if one is two-thirds of the way through collecting data in an experiment with two conditions, it is only natural that one would be curious to take a quick peek at how the pattern of results is shaping up, isn’t it? The problem is that if one discovers that one already has a statistically significant result in the desired direction, it is then very tempting to simply stop collecting any more data. After all, why waste time and resources when one already has the result one was hoping for? But what if the results appear to be going in the desired direction but are not quite statistically significant? Surely there is no harm in then collecting more data as originally intended? Surprisingly, as Simmons and colleagues demonstrate using computer simulations, such seemingly trivial departures from best practice can boost the false positive rate by about 50 percent.
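The inflation produced by optional stopping is easy to demonstrate for yourself. The sketch below is again purely illustrative: the starting sample, the batch size, and the stopping rule are arbitrary choices for the example rather than the parameters used by Simmons and colleagues.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def optional_stopping_study(start_n=20, batch=10, max_n=60):
    # Collect some data, test, and then keep adding participants and
    # re-testing; declare "success" the first time p drops below .05.
    # The null hypothesis is true throughout.
    group_a = list(rng.normal(0, 1, start_n))
    group_b = list(rng.normal(0, 1, start_n))
    while True:
        if stats.ttest_ind(group_a, group_b).pvalue < 0.05:
            return True
        if len(group_a) >= max_n:
            return False
        group_a.extend(rng.normal(0, 1, batch))
        group_b.extend(rng.normal(0, 1, batch))

trials = 5_000
rate = sum(optional_stopping_study() for _ in range(trials)) / trials
# The false positive rate comes out well above the nominal 5 percent.
print(rate)
```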

Simmons and colleagues describe a number of other QRPs in their paper, each of which, taken in isolation, appears to be only a slight departure from best practice. As they point out,

In the course of collecting and analyzing data, researchers have many decisions to make: Should more data be collected? Should some observations be excluded? Which conditions should be combined and which ones compared? Which control variables should be considered? Should specific measures be combined or transformed or both?

Given these various degrees of freedom, researchers would often analyze their data in many different ways in the hope that they could produce a statistically significant effect of the type desired. This practice was sometimes jokingly referred to as “torturing the data until it confesses.” In fairness, it is only recently that researchers have become fully aware of the dangers of such practices in terms of increasing the risk of producing false positive findings. Using computer simulations, Simmons and colleagues demonstrated that by combining a number of QRPs, one could quite quickly reach a point where it was more likely than not that such a spuriously significant result could be found.
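The following simulation combines just two such degrees of freedom, several outcome measures and a post hoc subgroup split. It is not Simmons and colleagues’ own simulation, and the numbers of measures and participants are invented for the example, but it shows how quickly the chance of finding at least one “significant” result grows when the data contain nothing but noise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def fishing_expedition(n_per_group=30, n_outcomes=5):
    # Null-true data on several outcome measures; a "finding" is reported if
    # ANY measure reaches p < .05 in the full sample or in either half of an
    # arbitrary post hoc split of the participants.
    half = n_per_group // 2
    for _ in range(n_outcomes):
        a = rng.normal(0, 1, n_per_group)
        b = rng.normal(0, 1, n_per_group)
        comparisons = [(a, b), (a[:half], b[:half]), (a[half:], b[half:])]
        for x, y in comparisons:
            if stats.ttest_ind(x, y).pvalue < 0.05:
                return True
    return False

trials = 2_000
# The proportion of pure-noise datasets yielding at least one "significant"
# result climbs far above 5 percent once multiple analyses are allowed.
print(sum(fishing_expedition() for _ in range(trials)) / trials)
```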

Demonstrating the dangers of QRPs using computer simulations was not enough for Simmons and colleagues. They really hammered their message home by reporting the results of two actual studies involving real data. In both cases, the methodology, results, and analyses were truthfully reported. How could it be, then, that the first of their studies reported results that appeared to support a very unlikely hypothesis? This hypothesis was that listening to a children’s song would make people feel older. The results of their second study went one step further, supporting an impossible hypothesis: that listening to a particular song could actually reduce the age of participants!

Although everything Simmons and colleagues reported in this section of their paper was truthful, their account was far from being the whole truth. For example, in their initial account they failed to mention a long list of additional variables that they collected data on, thus allowing them to perform analyses on many different subsets of variables until they eventually managed to produce the spuriously significant effects reported. The only variables referred to in their original account were, of course, the ones involved in their contrived analyses.

Fortunately, many researchers have taken the lessons of this replicability crisis to heart, taking steps to assess the magnitude of the problem and implementing methods to address it. For example, with respect to the former, Brian Nosek of the University of Virginia organized an ambitious attempt to assess the level of replicability of 100 studies published in three highly regarded journals in 2008. The results of this project, which involved some 270 collaborators, were published in 2015.12 Only 36.1 percent of the replication studies, carried out using exactly the same methodology as the original studies, replicated the published findings. Even when they did, the effects reported were often smaller than in the original studies.

In terms of actually addressing the problem, many researchers now argue in favor of preregistration of studies. Preregistration requires researchers to state in advance of data collection and analysis the exact methodology and analyses to be employed, thus drastically reducing the problem of undisclosed flexibility. Suggestions have also been made to encourage the publication of replication attempts, and many journals, including the Journal of Personality and Social Psychology, have explicitly changed their policies on this issue.13 In fact, the JPSP published the results of a series of seven online investigations of retroactive facilitation of recall by Jeff Galak and colleagues, involving a total of over 3,000 participants. Once again, the effect was not replicated.14

Of course, we may never know quite how Bem was able to produce that string of statistically significant results in the first place, but there is fairly strong circumstantial evidence that they are probably the result of QRPs. Bem’s previous advice on writing empirical journal articles certainly appears to support this possibility. Eric-Jan Wagenmakers and colleagues provide a couple of telling quotations that point in this direction, arguing that Bem appears to blur the important distinction between exploratory and confirmatory studies.15

In Bem’s own words:

The conventional view of the research process is that we first derive a set of hypotheses from a theory, design and conduct a study to test these hypotheses, analyze the data to see if they were confirmed or disconfirmed, and then chronicle this sequence of events in the journal article. . . . But this is not how our enterprise actually proceeds. Psychology is more exciting than that.16

Bem goes on to offer the following advice to senior researchers, who are often not directly involved in collecting data. Instead, the data are collected by more junior colleagues, and the senior researcher then goes on to write up and submit the report:

To compensate for this remoteness from our participants, let us at least become intimately familiar with the record of their behavior: the data. Examine them from every angle. Analyze the sexes separately. Make up new composite indexes. If a datum suggests a new hypothesis, try to find further evidence for it elsewhere in the data. If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. If there are participants you don’t like, or trials, observers, or interviewers who gave you anomalous results, place them aside temporarily and see if any coherent patterns emerge. Go on a fishing expedition for something—anything—interesting.

There is nothing inherently wrong with exploring data in this way. However, it is vitally important that such fishing expeditions are clearly presented as such. It is all too easy to write up a report in such a way that significant effects that were actually discovered in this way are presented as being based on hypotheses that were stated in advance of data collection. Wagenmakers and colleagues present a strong case that Bem is sometimes guilty of blurring this distinction between exploratory and confirmatory analyses.

There is also other suggestive evidence of methodological sloppiness on Bem’s part. For example, if the effect sizes obtained in Bem’s series are plotted against the number of participants in each study, a statistically significant negative correlation is found. In other words, the smaller the sample, the larger the effect size. It is of note that Bem explicitly stated that a minimum of 100 participants would be used in each study, and yet his Experiment 9, the one with the biggest effect size, involved only fifty participants. This is consistent with the notion that Bem may have been guilty of optional stopping, as discussed earlier.

We can be absolutely certain that Bem collected data on far more variables than were mentioned in his final report. We know this because we had access to the software he used to run his experiments. In addition to the variables that were reported, Bem also collected data on a range of other variables, such as how much the experimenter liked the participant, how anxious the participant was, how enthusiastic the participant appeared to be, and whether they engaged in biofeedback, meditation, and so on.

It was also apparent that Bem originally divided the words used in his Experiment 9 into common and uncommon words, presumably in the hope that different patterns of results might emerge as a consequence of familiarity, but no mention is made of this variable in the final report. As described by Simmons and colleagues, the collection of data on a range of variables allows the researcher to attempt multiple analyses until a “significant” effect is found.

It is now generally accepted within psychology that many of the effects that have been described in textbooks for decades may in fact be nothing more than unreplicable false positives. Furthermore, replication problems have also been highlighted in many other fields of science.17 Given this, is it fair to single out parapsychology for particular criticism? I would argue that it is. For one thing, the replicability issue appears to be worse in parapsychology than in any other field of science. In psychology, for example, there are literally hundreds of effects that are extremely robust and reliable, including, to name but a few, the effects of rehearsal on memory, the Stroop effect, and dozens of visual and auditory illusions.18 In contrast, there is not a single psi effect that is reliable enough to have a reasonable chance of being demonstrated in an undergraduate lab class.

Arguably of even greater importance is the fact that the implications of accepting the results of parapsychological studies that appear to support the existence of psi are much more far-reaching than accepting findings in other areas of science, regardless of whether the effects reported elsewhere were real or spurious.19 With the benefit of hindsight, I can be fairly certain that some of the effects reported in my publications in mainstream psychology journals over the years were probably spurious, the result of me innocently engaging in the types of QRPs described. At the time that we carried out those studies, analyzed the results, and wrote up our findings, the cumulative impact of numerous minor departures from best practice was not as well appreciated as it is today. Most experimental psychologists of my generation simply did not know any better. However, accepting as real any false positive effects that I may have reported does not require that anyone rejects our currently accepted scientific understanding of the universe.

Accepting the results of apparently positive significant findings supporting the existence of psi would, in the opinion of most mainstream scientists, require precisely such a rejection of our current scientific worldview. It is not absolutely impossible that advocates of the psi hypothesis are correct in their view that our current scientific models should indeed be rejected, or at least drastically revised, in light of their findings, but such a drastic step should not be taken lightly. In Carl Sagan’s words, “Extraordinary claims require extraordinary evidence.”20 The quality of evidence produced by parapsychologists to date simply does not reach this high standard.

All sciences are aimed at detecting signals (that is, true effects) against a background of noise. As already stated, two types of error are possible. Type I errors are those where a scientist wrongly concludes that a true effect has been found when, in fact, there isn’t one, perhaps as a consequence of QRPs. Type II errors refer to the situation where a scientist concludes that there is no true effect to be found when, in fact, there is, perhaps as a consequence of a poorly designed or underpowered study. Sophisticated techniques have been developed and refined to help scientists to try to avoid both types of error. However, it is in the nature of science that they cannot be eliminated completely. The question arises: What would a science look like if the data being collected contained only noise and no true signals at all? Is it possible it would look something like parapsychology?
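To make the two kinds of error concrete, the sketch below estimates both rates by simulation. The effect size and sample size are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def significance_rate(true_effect, n_per_group, trials=5_000):
    # Fraction of simulated studies reaching p < .05 when the true difference
    # between groups is `true_effect` (0 means the null hypothesis is true).
    hits = 0
    for _ in range(trials):
        a = rng.normal(0, 1, n_per_group)
        b = rng.normal(true_effect, 1, n_per_group)
        if stats.ttest_ind(a, b).pvalue < 0.05:
            hits += 1
    return hits / trials

# Type I error rate: no true effect, yet about 5 percent of studies come out
# "significant" anyway.
print(significance_rate(true_effect=0.0, n_per_group=25))
# Type II error rate: a modest true effect tested with a small sample is
# missed most of the time because the study is underpowered.
print(1 - significance_rate(true_effect=0.3, n_per_group=25))
```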

The astute reader may well have realized that the preceding discussion raises another question: How many of the effects reported in this book, including those reported by members of the APRU, may themselves be unreplicable? The honest answer is, I simply do not know. I can be fairly confident that many, perhaps most, of them are replicable, either because we have replicated them ourselves or because they have been replicated by other independent researchers. Some of them, such as the induction of false memories using the DRM technique, our poor ability to estimate probabilities, demonstrations of inattentional blindness and population stereotypes, and memory for Roman numerals on clocks and watches (again, to name but a few), are so robust that I routinely use them in public talks, knowing full well that I can rely on them working with my audience. But, in all honesty, I could not guarantee that absolutely all of the results I have discussed would replicate.

If the lessons of the replication crisis are taken to heart by both psychologists and parapsychologists, not to mention the wider scientific community, there is every reason to be hopeful that the rate of false positives in future studies will be reduced. However, it will not be eliminated. For this reason, one of the most important lessons to learn is that one should never take the results of a single study as being absolute proof that a claimed effect is real. Science is never about certainty. It is always a matter of basing one’s opinion on the best evidence available at the time but being willing to revise one’s opinion if good-quality new evidence contradicting the claim is subsequently produced.

Science is not an established body of facts; it is a method for attempting to approach the truth. On some issues, such as the idea that human activity is causing drastic damage to our climate, we can have a very high degree of confidence that the available evidence overwhelmingly supports the claim. When it comes to claims based on, say, astrology, homeopathy, and other forms of alternative medicine, we can be equally confident in rejecting them given the complete lack of good-quality evidence in their support. The evidence with respect to many claims, however, will be somewhere between these two extremes.

It is not uncommon for scientific hypotheses that were once rejected by the wider scientific community to subsequently be accepted in light of new evidence and vice versa. One of the greatest strengths of the scientific approach is this very willingness to be open to revision in light of new empirical evidence. Whereas faith and uncritical acceptance of authority are at the very heart of religious and (some) political belief systems, they are the absolute opposite of what is required in science. In science, skepticism is valued, and all assertions, no matter who makes them, may be questioned.

There is a famous quotation about democracy that is often attributed to Sir Winston Churchill: “Democracy is the worst form of government, except for all the others.” The pedant in me feels obliged to point out that Churchill was, as he acknowledged, quoting an unknown source when he uttered these words. However, in the same spirit, I think it would be fair to say that science is the worst way to investigate how the universe works, except for all the others. Because scientists are only human, the scientific method may not always be applied as well as it should be, and we should be aware of the problems that may arise as a result. But, for my money, the scientific approach is far superior to all others when it comes to the tricky task of assessing controversial claims about both the world around us and the workings of the human mind.

Rationality Isn’t Everything

Our ability to assess evidence and logical arguments is crucial not only when assessing controversial scientific claims, making consumer decisions, and considering health issues but also when making political judgments, career choices, and even, at times, decisions about personal relationships.21 Indeed, there is no important area of our lives where such skills should not be applied. But for all that, it is vital that we do not fall into the trap of thinking that rationality is the only important factor in such situations.

We are not robots, and life without emotion would simply not be worth living. It is the joy we feel when good things happen that drives our lives, along with the need to do what we can to minimize the pain that is also an inevitable part of the human condition. We can recognize and embrace these basic truths and set our priorities in life accordingly. Science and rationality cannot directly inform us what our values should be, but once we know what they are, evidence and logic can maximize our chances of living our lives in accordance with those values.

Returning to the main theme of this book, it is probably a mistake to see paranormal and religious beliefs as some kind of deviation from a rational norm. The truth is that the norm for humans is not one of rationality, as I hope this book has shown, and there are many beliefs that may provide psychological benefits even though they are fundamentally wrong.22 It is simply not the case that a more accurate perception of reality is always an inherently good thing. A huge amount of evidence shows that psychological health tends to be associated with what psychologists call unrealistic optimism (also known as the optimism bias).23 If you give a questionnaire to psychologically healthy people asking them what they think the chances are of really good things happening to them, such as winning a major prize, and of really bad things happening to them, like becoming terminally ill, they will tend to overestimate the chances of the good things happening and underestimate the chances of the bad things. Give the same questionnaires to people with clinical depression, and their estimates tend to be much more accurate. Life really is that bad. But the trick is, of course, to accept that reality on an intellectual level but knowingly agree to a spot of self-deception whereby you live your life as if it were not the case.

Accepting that we only have one life will inevitably influence the way we live that life. We can either refuse to face that basic fact and put our trust in various deities and the hope of some form of postmortem survival, along with all the other superficially reassuring beliefs that go along with such a worldview, or we can do whatever we can to make the one life we have worth living. That requires us to be able to make difficult decisions when required, using all of the evidence and critical thinking skills at our disposal. But it also requires that we are able to accept and embrace the fundamentally irrational and emotional sides of ourselves.