Humans have become cleverer. In the last few million years a lot of those upgrades were at the hardware level. Our brains grew bigger, literally tripling in size. Our default psychological approach switched from relying on our own abilities to relying on socially acquired information from others. Each of us became a cultural neuron wired into a collective brain. That collective brain was more brilliant than any one of us and in turn it made each neuron – each cultural brain – brighter.
Around 200,000 years ago that hardware-driven growth in intelligence slowed down and, if anything, has begun to reverse. Brain sizes became smaller in the last 10,000 years or so. But a smaller brain didn't stop us from becoming brighter because hardware isn't the only thing that matters. We continued on a path to becoming more and more clever thanks to our shared culture. Our software improved.
Indeed, improvements in our software were probably the reason why our brains shrank. We didn't need all that energetically expensive hardware because we were distributing our thinking to collective computation.
As our individual brains got smaller, our collective brains got larger and more sophisticated. And as the software rather than the hardware became cleverer, the best bits of software could be shared throughout the social network. We could all upgrade our ability to think by acquiring the latest knowledge and learning new ways of thinking – numbers, hypotheticals, more formal logic, and much more. Each of us solved a little sliver of the world's problems and through selective social learning, shared and recombined the solutions into a highly effective cultural package that could then be transmitted to our children. We deferred and distributed the innovation computation from individual brains to the collective so that even the least able among us could benefit from adaptive behaviors, advanced technologies, and life-improving health care in which we had played only a peripheral part or even no part in creating. This is how our collective brains made each of us cleverer even while our individual brains shrank – through software.
Much of the thinking that we now take for granted is neither universal nor the way we have thought since the beginning of our species, but instead comprises transmitted products of our culture that have now reached fixation – become ubiquitous – in many populations. Such products seem like human universals because everyone we know shares them. They are not. The ability to count indefinitely beyond fingers or body parts; to read, write, store, and learn ideas through text; the tendency to reason abstractly with syllogisms and enthymemes and approximations of formal logic – all were tools for thinking that were culturally created and then transmitted. All have made us brighter and more capable of performing the many tasks needed in the modern world. These processes have enabled the modern technologies we have created. As Friedrich Hayek put it, ‘it's culture which has made us intelligent, not intelligence which has made culture.’
Since the advent of IQ tests, we have been able to measure these changes. The Flynn effect reveals that IQ has been rising, and the rise is fastest in recently developing countries. This may have been partly due to increases in nutrition and decreases in pollution, parasites, pathogens, and other health insults that damage our brains. These nutritional and health improvements also involve cultural and institutional software improving the social and physical environment using everything from increased agricultural output, better nutritional knowledge, and less polluting vehicles to mosquito nets and medicine. But a lot has been driven by purely cultural software updates downloaded directly into our cognition. One of the main means by which this download takes place is school.
As we discussed, schools are not a human universal. In many hunter-gatherer societies there's not even a lot of active teaching going on. The kids are left to hang around and watch the adults hunt, cook, build, and deliberate. Pastoralists have more to learn and do a little bit more deliberate instruction. When there's more to learn, we might spend more time teaching. But as the 2020 pandemic taught many parents, teaching is a difficult and time-consuming activity that prevents you from doing other things. And that's why we have specialist teachers. An investment in teaching children pays off at a population level when there's a lot to learn. The more complex and complicated the cultural package, the greater the pay-off.
Today, people freely find their comparative advantage by performing well on general tests and then picking a major or career that suits their interests and skills. But prior to the Industrial Revolution, most people learned their skills through an apprenticeship. They were restricted to whom they knew or simply learned the family trade from a parent or relative. It was only the very upper echelons of society that had tutors offering a personalized education.
Alexander the Great could be greater than those around him because he was one of the few who had access to Aristotle. Aristotle was a student of Plato and Plato was a student of Socrates. We know less about the influences that created Socrates, but he was no doubt a product of the Sophists with whom he argued and other cultural groups in the ancient Greek intellectual environment of the first millennium BCE. Ancient Greek culture was itself a product of cultural transmission and competition with earlier groups in North Africa, the Middle East, and Asia. Remember, different groups have dominated at different times, and who was ‘civilized’, ‘barbarian’, or ‘savage’ has gone around and around.
When the Industrial Revolution was full steam ahead, our society needed skilled factory workers, which motivated the expansion of compulsory formal education. This Victorian model of the school was itself a factory; a factory for producing good factory workers. Many of us no longer work in factories, but we have inherited the basic school design through a process of path dependence. Remember, when a child is born, they must catch up on the last several thousand years of human history. We now use schools as an efficient way to help them do that, delivering a base-level cultural package from which they can acquire everything else they need in the modern world. Today, as the world becomes more complex, educational innovations try harder and harder to pack in more, earlier, and more easily.
And so we are hungry for educational improvements. But they're hard to prove and harder to implement. We have begun to teach some skills and topics earlier; subjects have changed; we've dropped some things and included others. Many curricula now include computing as a fourth pillar to the traditional three Rs: reading, writing, arithmetic, and now algorithms.
As a result of path dependence there are many inefficiencies in our current system, especially in the overall structure of schooling. We tweak at the margins because massive change is difficult. But this inability to educationally innovate may be causing a Great Stagnation in improving our intelligence.
The Flynn effect has slowed down in the developed world and, in some cases, begun to reverse. But drawing on our theory of everyone we can see that there are various changes, both large and small, that may continue to make us cleverer, putting our foot down and accelerating the Flynn effect.
In 1996, five years after gaining independence from the USSR, Estonia founded the Tiigrihüpe – Tiger Leap Foundation – to rethink and redevelop their education curriculum. With a virtually blank slate, they looked for best practices from around the world. Estonians are a people who value education, and they recognized the increasing role that technology would play in our lives. The goal was to leap ahead of other countries by creating the most technically savvy educational curriculum in the world, thereby creating the most technically savvy people in the world. The program to transform Estonia into E-stonia was launched by Toomas Hendrik Ilves, Jaak Aaviksoo, and Lennart Georg Meri with the recognition that the future performance of Estonia depended on the future performance of its people.
Ilves was Swedish-born and American-raised – his parents escaped Estonia after the Soviet occupation. He had a rich and diverse life experience which no doubt prepared his mind and made him an ideal magpie. He graduated as valedictorian from Leonia High School in New Jersey and then majored in psychology at Columbia, later completing a master's degree at the University of Pennsylvania. Ilves had a diverse career, working in education, research, the arts, and journalism in various places from Vancouver to Munich. In 1993, after the fall of communism, he served as Estonia's Ambassador to the United States before returning to Estonia in 1996 to become Minister of Foreign Affairs, and in 2006 became the country's third president since the fall of communism.
The other two figures were also magpies. Aaviksoo was Minister for Education, holding a PhD in physics. The president at the time, Meri, was a peripatetic filmmaker and writer who had been educated at nine different schools across Europe.
These magpies with prepared minds (COMPASS Secret 3) put together an amazing team that included people like computer scientist Linnar Viik. They invested in three key pillars: computers and the Internet as pathways to knowledge, basic teacher training, and native-language electronic courseware. Whether they knew it or not, they were creating a more effective cultural evolutionary environment and cleverer collective brain. They had filled the car with gas and put their foot on the accelerator.
In 1991 only half of Estonia had access to a telephone. By 2001 all schools were Internet-connected and all students had access to computers. This connected Estonia to the rest of the world's collective brain. The curriculum was changed, encouraging and teaching students how to learn through the Internet. Teachers were trained in technology and were encouraged to seek out the best ideas from around the world and from each other. The platform SchoolLife was launched to create a teacher collective brain where teachers could share ideas, resources, and course materials.
Tiigrihüpe recognized the importance of collective brain thinking, sociality, high transmission fidelity, and sharing. Even today, the model embodies the full COMPASS approach and has continued to improve, offering opportunities for radical revolutions not only in education but governance, health, and every other aspect of Estonian society, unrestricted by the constraints of path dependence, and built on the back of high-quality technology training for both teachers and students. When the first generation of Tiger Leap Kids entered university and then the public and private sector, Estonia was transformed forever.
Estonia continues investing in its people through a cutting-edge education system that has kept it the top-performing non-Asian country in math, science, and reading. In 2012 it was the first country to start teaching programming and algorithms in elementary school to six-year-olds. In 2013 it was the first country to implement a radical approach to math education, spearheaded by Conrad Wolfram, younger brother of prodigy mathematician and physicist Stephen Wolfram of Wolfram Alpha and Mathematica.
In a 2010 TED talk titled ‘Teaching Kids Real Math with Computers’, a frustrated younger Wolfram explained that ‘people confuse, in my view, the order of the invention of the tools with the order in which they should use them for teaching’. His point was that we teach mathematics in the same order it was invented and this way is not necessarily the most efficient or effective. In biology there is a concept called recapitulation theory which says that as organisms grow they go through their evolutionary history; ontogeny recapitulating phylogeny. By learning mathematics in the order it was invented, children go through a kind of cultural ontogeny recapitulating history. Start with the Greeks and go from there. But there's no reason that this should be the most efficient or effective way to teach mathematics. And indeed, as a result of this approach, we never get to twenty-first-century advancements in probability and statistics, which are probably more valuable than calculating angles in triangles or memorizing rules for figuring out the length of a hypotenuse. Trigonometry, if it needs to be covered at all, doesn't need to be taught before algebra. And algebra doesn't need to be taught before calculus.
The difficulty many students face in calculus, for example, is in the mechanics: remembering and algorithmically following the chain rule or quotient rule, and knowing when to use them.
You may have forgotten these and perhaps you never really understood why they worked. Both facts probably had zero impact on your life – even if you're an engineer. And that's because calculus – and indeed mathematics in general – is not about the mechanics, it's about the thinking. What does it mean to take a derivative? What does it mean to calculate an integral? When and why are these useful? You can learn this intuitively, developing an intuition for mathematical reasoning, without knowing anything about chain or quotient rules, which are techniques invented for a world without computers.
Math isn't about adding and subtracting or remembering rules for calculating partial derivatives. It is about logic and reasoning, only sometimes with numbers. And in the real world, the mechanics of mathematics are done on computers, not on paper. We can introduce concepts such as derivatives and integrals in elementary school alongside programming, leaving the computation to the computer, and introducing mechanics later.
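To make the idea of leaving the computation to the computer concrete, here is a minimal Python sketch. It is an illustration only, not something taken from the Estonian curriculum or from Wolfram's materials; the function names `derivative` and `integral` and the example curve are assumptions chosen for the example. The point is that a child can ask what a slope or an area is, and get an answer, without ever applying a chain or quotient rule.

```python
def derivative(f, x, h=1e-5):
    """Approximate the slope of f at x using a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def integral(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def square(x):
    # Hypothetical example curve: y = x squared.
    return x * x


slope_at_3 = derivative(square, 3.0)      # how fast x squared changes at x = 3
area_0_to_3 = integral(square, 0.0, 3.0)  # area under x squared from 0 to 3
print(round(slope_at_3, 3), round(area_0_to_3, 3))
```

The two results land very close to 6 and 9, the values the textbook rules would give, but the student reaches them by thinking about slopes and areas rather than by memorizing mechanics.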
Conrad's thinking was just the latest approach Estonia had adopted in its effort to become brighter. It's no wonder that Estonian students leaped forward in their PISA scores across all subjects to become the top-performing non-Asian country in the world!
Estonia's success is a lesson to us all. It goes beyond the use of technology or any specific content. The secret is a product of an innovation mindset (distilled in the COMPASS approach), high cultural and social value being placed on education, and cooperative commitment to their people and future. These in turn led to Estonia plugging its population into the world's collective brain and, like magpies, borrowing and recombining a broader educational, cultural, and institutional infrastructure.
Any country or even individuals can do this.
From Shanghai to Sydney to San Francisco there is much that could be and should be shared and cross-pollinated. Not just content and curricula but also attitudes toward education itself. As just one example, take cross-cultural differences in attitudes toward mathematics.
In the West you'll often hear people saying something along the lines of ‘I was never really good at math.’ Rarely will you hear ‘I was never really good at reading.’ Western attitudes toward numeracy and literacy betray hidden assumptions not present in, for example, much of Asia.
It's true that not everyone needs to interpret Tolstoy, but if you can't read then you have to trust others to interpret a world of information for you, just as an illiterate person did in the past. Similarly, it's true that not everyone needs to transform tensors, but if you can't do simple math then you need someone to interpret personal finances, probabilities, and health decisions for you.
Education systems in many WEIRD countries have let down many students. Students who show an early aptitude in math do fine, but those who do not are not taught that they are just as capable at learning mathematical skills as their peers. Many WEIRD cultures have had to compensate for a lack of individual skill by legislating simplicity at a societal level – simplify investments, mortgages, and the presentation of important statistics. Despite these societal crutches, people fail to understand concepts such as exponential growth, contributing to the failure to save at an earlier age for their retirement, susceptibility to poor credit card usage, inability to compare loan and investment decisions, and suboptimal decisions in everything from health insurance to home loans.
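A small worked example shows why exponential growth matters for saving early. This is a rough sketch under stated assumptions: the monthly amount, the 5% annual return, and the starting ages are hypothetical numbers picked for illustration, and `future_value` is a name invented here, not a function from any financial library.

```python
def future_value(monthly_saving, annual_return, years):
    """Value of equal end-of-month deposits with monthly compounding."""
    monthly_rate = annual_return / 12
    balance = 0.0
    for _ in range(years * 12):
        # Last month's balance earns a return, then the new deposit is added.
        balance = balance * (1 + monthly_rate) + monthly_saving
    return balance


# Hypothetical inputs: 200 per month at a 5% annual return, retiring at 65.
early = future_value(200, 0.05, 40)  # starts saving at 25
late = future_value(200, 0.05, 20)   # starts saving at 45
print(round(early), round(late), round(early / late, 1))
```

Under these assumptions the early saver deposits only twice as much money as the late saver but ends up with well over three times as much, because the early deposits have decades longer to compound.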
We know that this is a correctable problem because the asymmetry between numeracy and literacy is not a cultural universal.
In many cultures math is not seen as an inherent trait that only some are good at. It's seen more like reading – a skill that requires practice. And in these cultures children perform better at mathematics, leading to stereotypes such as ‘Asians are good at math.’ In reality, Asians are good at realizing that math is a skill that can be learned and developed with the right instruction and attitude. For me, this point was starkly made during the 2020 pandemic when schools were closed.
Drawing on the experience of my friend from engineering school, Clinton Freeman, who had been home-schooling his daughters for some time, we took the opportunity during the lockdown to bring together different curricula around the world and test how our children reacted. In math, it was astonishing the progress our kids made using the Singaporean and Shanghai curricula (which have a lot of cross-pollination with each other). Simple things like drawing on relationships across multiple areas of mathematics or multiple approaches to solving the same problem encouraged generalization; explicit logical reasoning and clear and precise explanations for why and how an approach worked and hypotheticals about what would happen if something changed encouraged thinking about proofs; and the early use of conventional mathematical language, such as learning that letters could be numbers and introducing simple pre-algebra, removed barriers for later learning. As a result of little changes such as these, China and Singapore are able to teach algebra in elementary school, a subject reserved for secondary school in the United Kingdom. It is perhaps no surprise that China and Singapore top the PISA tables. I was good at math in school but was astonished as I watched my then six-year-old competently solve for the unknown variable ‘x’ in an equation and very quickly advance to rearranging multivariable equations to solve for different values. All thanks to small differences in how they were taught.
The general point is that there are low-hanging intellectual arbitrage opportunities (COMPASS Secret 3) for becoming brighter if we're willing to go off the beaten path (COMPASS Secret 2) of decisions made in the past. It requires sufficient school funding, investment in retraining teachers, and incentives to do things differently, but also a realization that it's not about math education nor overly competitive pushy educational curricula but rather the flexibility of the human mind and recognizing what we could be capable of if we try to find out.
Our psychology, in other words, is highly hackable. We are not blank slates but our minds are highly flexible. Formal education is the primary means by which we transmit our cultural package to the next generation. The things we assume people are bad at are often the things we don't prioritize, don't teach, or don't do enough research in figuring out how to teach better. For example, we assume people are highly susceptible to logical fallacies – straw-man arguments, ad hominem attacks, appeals to authority, or confusing correlation with causation – but formal logic, reason, and fallacies are also things we rarely formally teach children at a young age in the English-speaking world. For a species so dependent on its software, logical fallacies need not be a permanent foible. Just as traditional societies were able to learn how to count, we can learn how to reason. And as the Flynn effect makes clear, we have always been capable of more than we currently do. Children from the 1940s learned far less than twenty-first-century children but would have done just as well if presented with a modern curriculum.
There are many degrees of freedom for change, but it requires a shift in attitude, a willingness to experiment, and political will, perhaps in a start-up city.
Consider even the format of the school day, which need not be 9 a.m. to 3 p.m. or similar partial days that force parents to leave work to pick up their children. This system may be a legacy of a time when many children would go home to help on a family farm and when many mothers stayed at home and could pick up their children in the middle of the working day. An alternative arrangement would be one that matched a typical adult working day, with lots of breaks, which reduced as children grew older. Sports, extracurricular activities, homework, and weekend work could be built into this time. Extracurricular activities would become curricular activities and homework would just be schoolwork. Expensive childcare could instead be reallocated toward reducing the cost of these extra hours. And as Finland demonstrates, paying teachers a salary comparable to other high-prestige careers and offering opportunities for development increases the prestige of the career, and attracts the best minds to take up one of the most noble and valuable of professions, charged with transmitting the light of knowledge to the next generation.
The structure of courses and assessments, too, is open to innovation. In 2017 the then president of the Royal Society – the world's oldest scientific society and Britain's top scientific society – Venki Ramakrishnan, condemned Britain's A levels, in which students take just three to four final subjects assessed by subject-based national exams, as ‘no longer fit for purpose’. A levels are an educational model that is too narrow. Asking students at age sixteen what they want to specialize in leaves huge holes in their knowledge of the world that are rarely filled.
From a collective brain perspective and from my own experience moving through multiple educational systems and now teaching students who have gone through every major curriculum in the world, studying only three to four subjects at an advanced level leaves students ignorant of so much, preventing them from drawing the valuable connections needed for intellectual arbitrage and other means of innovation in the modern world. A broader and more balanced curriculum is necessary to better connect the collective brain and solve the paradox of diversity created by over-specialization. This skill perhaps is even more valuable in the age of AI.
In addition to these macro-level changes, within schools there are many micro-level opportunities for innovation. We train specialist teachers because the gap between children and adults is so large that extensive training is required to bridge it. ‘Hey grown-ups, this is how kids learn.’
In stark contrast, in many traditional societies there is less adult–child instruction. Instead, children learn from slightly older children. Five-year-olds learn from six- and seven-year-olds, seven-year-olds learn from eight- and nine-year-olds, and so on. This smaller age gap offers a more gradual learning gradient, making it easier to understand what is being taught. This gradient has been lost under our current system. But that doesn't mean we need to go back to these other models, only that we can learn from them.
We typically teach children in age-group cohorts, which is assumed to represent similar skill levels and is therefore more efficient. But as any teacher knows, that assumption of similar skill levels is false. Skills vary dramatically between children and in different subjects. Inevitably, some struggle to keep up and others are held back from their full potential and this may differ from subject to subject. Modern technology allows us to create a more personalized education or even cohorts based on skill and maturity rather than age. Students then learn how to learn, an essential skill in a quickly changing world.
There are various other approaches to rethinking education. Famous among these is Elon Musk's Astranova school, formerly Ad Astra. The school is based on pillars such as first-principles learning and real-world relevance. The idea is captured by contrasting two ways to teach children about an engine. One approach is to start by teaching them about tools like a screwdriver and wrench and how they're used. Eventually students learn how these tools help take an engine apart. It's an approach that focuses on learning the constituent parts and building them up to the final product. But along the way, many students fail to realize the relevance of what they're learning, which affects their motivation. They ask, ‘What's the point of learning this?’
An alternative approach is to give students access to the tools and an engine and get them to take it apart and put it back together, learning the principles as they undertake the task – starting at the beginning and at the end and meeting in the middle with a practical and relevant focus throughout. Just as it was easier to derive the principles of hydraulics after we had steam engines, it's easier to learn anything by doing it and seeing its real-world relevance. This approach is one of many and may not even be the best approach, but such innovations are necessary to step off the beaten path and step on the accelerator of the Flynn effect.
Schools are trapped in suboptimal local equilibria and seemingly unprepared for the demands of our current world. Many parents recognize this and so to better prepare their progeny, they compensate for these inadequacies by supplementing what public systems, and even private systems, offer. Not all parents have the resources, skills, or time to do that. The increasing irrelevance of public education to the acquisition of everyday life skills further reinforces inequality and group differences, which is costly to us all.
Talent is equally distributed; opportunity is not. That's not strictly true.
Our genes vary sufficiently that in a perfectly equal world in which everyone had the same resources, the same parenting environment, the same educational opportunities, the same cultural input, and the same access to unpolluted air, nutritious food, and clean water, there would still be inequality of outcomes. But while there may be prodigies and genetic geniuses – John von Neumanns and Terry Taos – there are many more with unrealized potential. Indeed, those we call geniuses on the basis of their contributions may have simply been ‘bright enough’ but in the right place in their collective brain to be the nexus of ideas. They stood out not because of raw talent but because of greater personal opportunity in a world of more unequal opportunity than today.
Sometimes it feels like there were more geniuses in the past, and people wonder where all these great minds have gone. One possibility is that it's not so much that there were more geniuses in the past, but that there were just fewer people with the necessary education, access to books, networks of knowledge, and resources. This is an answer to today's missing geniuses. Newton was the peak of a molehill at a time when rates of literacy, let alone higher learning, were incredibly low and there was much low-hanging fruit to be discovered. It is statistically unlikely, given rates of social mobility at the time – at a time when so few had access to education – that Newton happened to be the person with the most potential in England.
Similarly, Einstein and von Neumann both sat on a slightly larger but still very small hill. Today, it is almost statistically certain that there are many more people with the same potential and genius as Newton, Einstein, and von Neumann working in tech, top universities, and top finance firms. Or, to put it another way, take the average Big Tech engineer, physics professor, or finance quantitative analyst back to the seventeenth century without any knowledge of today and put them in Newton's situation, and they will probably re-derive Newton's laws and perhaps do much more.
Geniuses as we label them are not just genetic geniuses but are a product of their cultural software written by their position in a collective brain. Given the increase in population size and advancements in education since Newton's time, there are far too many Newton-level geniuses for any to be particularly noticed or make the history books in quite the same way as when the competition was lower. But given rates of poverty and the world population, it is also almost certain that there are even more Newtons, Einsteins, and von Neumanns who work far more modest jobs than tech, academia, or finance, simply for lack of opportunity. Born in another place, even today, Einstein may have lived out his days as a quiet clerk. Many could-be Einsteins still do.
And so, talent may not be equally distributed, but opportunity definitely isn't. We are a long way from an equal world. The potential for talent is far more equally distributed than is the opportunity to nurture that talent and have it benefit us all. As we discussed in Chapters 7 and 9, so much human potential is lost to the vast inequalities entrenched in our systems by the fractures between us and the unearned intergenerational transmission of wealth.
Evidence for this can be seen in the social mobility data. Social mobility indices track the correlation between a child and their parents’ socioeconomic status. That is, to what degree your wealth, income, educational and other lifetime outcomes are determined by those of your parents. Obviously, some portion of that correlation can be attributed to genes, but the cross-national differences in this mobility are revealing.
It should be no surprise that the highest social mobility can be found in the Nordic countries, with Denmark and Norway consistently topping the list. As we learned earlier when comparing Norway to the United Kingdom, being a poorer Norwegian doesn't prevent you from accessing high-quality education, good food, and a safe and pollution-free neighborhood. Poorer and richer Norwegians mix more freely, allowing for the flow of ideas and culture, keeping a unified cultural group and reducing inequality. And as a result, genes become a better predictor of outcomes in Norway than in Britain and a better predictor in Britain than in the United States. Mobility is higher and therefore so is heritability.
This contradicts ideas that the rich stay rich due to better genes or greater talent. As we learned, genes are a stronger predictor of cognitive ability among the wealthy in the United States, although not the poor, but a better predictor across the population in places like Norway.
It may be counterintuitive, but a more equal society is one in which genes play a greater role in success.
I hope these cases and policies inspire change around the world and offer a playbook for how to get there. All of them are approaches derived from a theory of everyone to accelerate the Flynn effect at a population level to give everyone the best opportunities in life to maximize their potential – not just for their own sake, but for the sake of our collective future. The goal is to maximize the probability of our children being healthy, wealthy, attractive, successful, and happy humans.
Countries that invest more in education have greater intergenerational mobility. Even within a country like the United States, states that invest more in education have greater intergenerational mobility than those that invest less. The message is clear: more educational opportunities can help a society discover and nurture the next generation needed to take us to the next energy level.
Unfortunately, intergenerational mobility has been falling; wealth and other outcomes are becoming more entrenched. And as the energy ceiling falls, for the first time in a long time, by many metrics, children are leading worse lives than their parents.
An American born in the 1940s had a greater than 90% probability of being better off than their parents, a result of both economic growth and intergenerational mobility. A child born in the 1960s had around a 60% probability. The American Dream died in the 1980s with the oil crisis, when the probability of a child born in that decade earning more than their parents became a coin toss: fifty-fifty.
There remains tremendous variation in intergenerational mobility across the United States and across the world.
One commonly used intergenerational mobility metric is intergenerational elasticity of income (IGE), the percentage of a person's income that can be predicted based on how much their parents made. It's around 50% in both the United States and United Kingdom, which means 50% of a child's income can be explained by their parents’ income and 50% by other factors, such as education, luck, or hard work. In contrast, IGE is less than 20% in more equal Canada, Finland, Norway, and Denmark, which means less than 20% of a child's income is explained by their parents’ income and more than 80% is explained by other factors. Financially struggling parents are far less likely to have an impact on their children's future finances in Canada or Denmark than in the United States or United Kingdom.
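For readers who want to see how a figure like this is produced, here is a minimal sketch. It assumes the definition common in the economics literature, where IGE is estimated as the slope of a regression of the logarithm of children's income on the logarithm of their parents' income; on that reading, an IGE of 0.5 means a 10% difference in parents' income is associated with roughly a 5% difference in children's income. The incomes are synthetic, and the function names, distribution parameters, and sample size are all hypothetical choices made for illustration, not real data.

```python
import math
import random


def estimate_ige(parent_incomes, child_incomes):
    """Estimate IGE as the least-squares slope of log child income on log parent income."""
    x = [math.log(p) for p in parent_incomes]
    y = [math.log(c) for c in child_incomes]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance


def simulate_incomes(true_ige, n=20_000, seed=0):
    """Generate synthetic (parent, child) incomes with a known elasticity."""
    rng = random.Random(seed)
    parents, children = [], []
    for _ in range(n):
        log_parent = rng.gauss(10.5, 0.7)  # hypothetical log-income distribution
        noise = rng.gauss(0.0, 0.6)        # stands in for education, luck, hard work
        log_child = 10.5 + true_ige * (log_parent - 10.5) + noise
        parents.append(math.exp(log_parent))
        children.append(math.exp(log_child))
    return parents, children


# Two made-up societies: one where income persists strongly, one where it does not.
high_persistence = estimate_ige(*simulate_incomes(true_ige=0.5))
low_persistence = estimate_ige(*simulate_incomes(true_ige=0.2))
print(f"estimated IGE, less mobile society: {high_persistence:.2f}")
print(f"estimated IGE, more mobile society: {low_persistence:.2f}")
```

The estimator recovers the elasticity that was built into each simulated society, which is all the sketch is meant to show: the lower the slope, the less a child's income tracks that of their parents.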
Drawing on our theory of everyone, we can consider what it takes to give everyone the best opportunity. I'll frame these as policies and as individual choices, but recognize that there are many barriers to implementing such policies and that most people don't have the ability to make such choices. Poorer people aren't always making poor choices; they simply have poor choices to choose from. Nonetheless, knowing the general direction helps us to steer away from where we don't want to go and toward where we do.
At a population level, it's almost obvious, but simply investing in education has a large return. A cleverer country is a better country for everyone. As John Green, author of The Fault in Our Stars and host of the successful YouTube series CrashCourse, explained,
Public education does not exist for the benefit of students or the benefit of their parents. It exists for the benefit of the social order. We have discovered as a species that it is useful to have an educated population. You do not need to be a student or have a child who is a student to benefit from public education. Every second of every day of your life, you benefit from public education. So let me explain why I like to pay taxes for schools, even though I don't personally have a kid in school: It's because I don't like living in a country with a bunch of stupid people.
But it's not just financial investment in schools, it's how that money is used, the context and culture children find themselves in, the other values that are encouraged such as hard work and persistence, and the possible aspirational futures children can see through the lens of their evolved social learning psychology, sensitive to the success and pathways of potential models available to them.
Groups differ in their outcomes as measured by IQ, educational attainment, income, wealth, health, and lifespan, among other metrics. A constellation of cultural traits is required in order to succeed. If any are missing, you're less likely to get to the top: an excellent education, ambition, willingness to work hard, connections, resources, ability and willingness to take risks, and so on. And so given imperfect transmission and different cultural traits between individuals, families, and entire groups, it's not surprising to see group differences. That's an understatement. It would be astonishing if all groups had the same outcomes. But the size of the gap between groups is within our control.
The opportunities for tackling group differences upstream – childhood environments and opportunities – are much greater than where we tend to focus downstream – university places and jobs.
Many communities, even across the United Kingdom and the United States, suffer from ongoing brain hardware assaults from pollution, disease, insufficient nutrition, exposure to smoking, and/or toxins such as lead. Several reports, for example, reveal that many UK residents suffer from lead poisoning caused by lead leaking from old pipes into their water supply. I tested the water at our house and discovered unsafe levels – and our home is not in an area known for lead poisoning. Areas known to be even higher in lead poisoning, such as Glasgow, correlate with lower school performance and higher juvenile delinquency, among other associated traits.
Lead abatement alone may have large effects, if we measured and invested in it. Research is ongoing. A study published in 2022 in the Proceedings of the National Academy of Sciences suggested that leaded petrol – common in the twentieth century – reduced the IQ of half the American population.
Upstream group differences can also be found in the culture of families, how children are raised, the stability of parent relationships, and quality of schools. All are amenable to policy levers that lead to better lives.
By the end of high school, these differences can have remarkable outcomes. Estimates from the United States suggest that in 2016 no more than 2,200 Blacks and 4,900 Latinos scored above 700 on the math SAT. The maximum score was 800. In contrast, at least 48,000 Whites and 52,800 Asians scored over 700. Looking at even higher percentiles, say over 750, the number drops to 1,000 Blacks and 2,400 Latinos, compared to 16,000 Whites and 29,570 Asians across the entire United States. So although 51% of test takers are White, 21% are Latino, 14% are Black, and 14% are Asian, among those who score over 750, 60% are Asian, 33% are White, 5% are Latino, and 2% are Black.
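The percentages among the highest scorers follow directly from the counts quoted above. A short sketch of the arithmetic, using only the figures in the text (the variable and function names are mine):

```python
# Estimated number of test takers scoring over 750 on the math SAT, as quoted in the text.
counts_over_750 = {"Asian": 29_570, "White": 16_000, "Latino": 2_400, "Black": 1_000}


def percentage_shares(counts):
    """Convert raw counts into whole-number percentage shares of the total."""
    total = sum(counts.values())
    return {group: round(100 * n / total) for group, n in counts.items()}


share_over_750 = percentage_shares(counts_over_750)
for group, share in share_over_750.items():
    print(f"{group}: {share}%")
```

Comparing these shares with each group's share of all test takers (51%, 21%, 14%, and 14%) is what reveals the size of the gap at the top of the distribution.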
The same forces that cause these differences propagate to universities, colleges, and workplaces. As with many public health interventions, getting in early can have larger effects, but rather than deal with these difficult multifaceted challenges upstream, we often take an easier but more defeatist approach. This well-intentioned strategy of affirmative action, quotas, removing standardized testing, or many anti-racist policies is perhaps implicitly racist insofar as it creates low expectations for some groups on the assumption that the problem is unsolvable upstream.
Affirmative action and anti-racist policies attempt to tackle very real challenges. Indeed, they can be useful in, for example, offering role models that children can aspire to or simply generating awareness of the deep inequities. They are also based on correct assumptions that these are very real issues that exist at a cultural and structural level. But the forces that lead to these large differences in school performance are not fixed by downstream patches. And because the ultimate root cause of the group differences remains unaddressed, we are left with an industrious ecosystem of proximate further fixes, policies, approaches, and advocacy that creates skewed incentives and unintended, sometimes self-defeating consequences. All because we never tackled the upstream, ultimate root causes.
As an example, consider research from Duke University looking at the trajectory of Black and White students. Compared to White students, Black students are far more likely to switch from more difficult to easier majors. This difference is entirely predicted by upstream high-school preparation, before they arrived on campus. In turn, these decisions and reduction in Black representation in more difficult majors have downstream effects on employability and lifetime income that are not resolved by admission to Duke.
To put it simply, policies that admit less-prepared students do not solve a performance gap that needed to be solved before students arrived at university. The gap needed to be closed by helping high schools better prepare their students. In turn, differences in high-school performance needed to be solved by helping elementary schools prepare their students for high school. In turn, the gap in elementary schooling needed to be solved by helping mothers and fathers stay healthy and produce healthy offspring and provide a more nurturing environment.
The downstream effects of upstream differences can also be seen in the experience of migrants, where immigration policy fails to consider factors that affect the success of migrants. For example, in the United Kingdom, European migrants make a greater fiscal contribution than do non-European migrants, at least in the first generation. Data from France, Germany, and Canada show the same pattern – highly skilled, more culturally close migrants make the greatest economic contribution in terms of income and subsequent taxes. They are a larger net gain for the economy.
More highly educated migrants tend to have higher incomes and therefore contribute greater economic value, paying for themselves, so to speak – especially when those skills are needed and missing in the economy. Economics is not everything of course, but it is essential to supporting a social welfare state that takes care of those with disabilities, who have fallen on hard times, or are less well off for a variety of reasons. Social welfare states are supported by the tax base, so the more productive workforce supports the less productive members of a society. As a result, we need to carefully consider the welfare implications of people with different skill sets and how they contribute to the productivity of the tax base. If not, we need to consider how we will support them, or support them in becoming more productive.
But not everything is on the migrant side. Even with the same education and qualifications, migrants from culturally distant countries have lower incomes. This may be simply due to discrimination, but discrimination is not simple.
One compelling discrimination audit research design involves sending out identical CVs – same degree, university, work experience, and so on – changing only a name. For example, in a Western context, some employers might get a CV from say Abdul Mohammad while others might get an otherwise identical CV from Adam McKinsey. The Mohammads get fewer callbacks despite having the same education, same experience, and indeed otherwise identical CVs right down to formatting. Studies adjusting the names – stereotypical male versus female, minority versus Western, and so on – offer compelling evidence of discrimination on the basis of sex and ethnicity.
But evidence for discrimination isn't large enough to explain the entire income difference for culturally distant migrants or different ethnic groups. Moreover, discrimination isn't an ultimate explanation. We need to explain why some people discriminate more than others and why there is more discrimination for some groups than others even when they share many characteristics. Remember the income differences by immigrant country of origin that we discussed in Chapter 3? Consistent with these patterns, research reveals that South Asians are over-represented in executive leadership positions compared to East Asians, but also reveals that South Asians face more discrimination than East Asians. But, as this disaggregated data reveals, this too is a broad generalization.
The luminary South Asian CEOs of many American companies, particularly tech companies such as Microsoft, Alphabet, IBM, Twitter, and OnlyFans, are not South Asian broadly speaking; they are of Indian ancestry. But this too is too broad a generalization. These CEOs aren't just Indian, they are overwhelmingly upper-class Brahmins, who represent less than 5% of India. As you can see, the measurement of culture and its application is complicated without a theory of everyone. Broad-brush assumptions or statements about discrimination fail to understand the degree of discrimination, at whom it is aimed, and its ultimate causes. Simplistic accounts such as ‘people are racist’ do not explain why some people are more racist or more so toward some groups and even subgroups than others. A fuller explanation requires understanding what cues people are using to discriminate, and why.
Some of this may be due to perceived differences in the quality of education in different parts of the world. Migrants moving to Sweden from Switzerland may be more culturally close and have a better educational experience than those from more culturally distant Syria. Even when qualifications are nominally the same, Switzerland may have better schools and training than Syria. The effects of war and economic differences alone would create such educational differences. Both training and the cultural challenges of coordination and communication are known to result in differences in work performance.
Resolving challenges downstream by focusing on the tendency to discriminate, but not understanding what causes differences in discrimination, not only foments conflict along existing societal fractures but even risks rationalizing racism, sexism, or other forms of prejudice.
Let's revisit the topic of variations in SAT scores among different racial groups, a subject we touched on earlier. These variations can persist into university life, influencing academic performance. Universities aiming to boost the representation of minority groups might consider accepting lower scores from these groups compared to others. Consequently, race or gender could be seen as indicators of differing admission standards or academic performance. Now, imagine an employer reviewing identical resumes, but without access to the candidates’ grades – only their degrees (since grades are typically not included on resumes). If this employer's goal is to select the highest performing candidate, they might make assumptions about the academic performance of a minority candidate, based on the knowledge that universities sometimes accept lower SAT scores from under-represented groups. While it's important to acknowledge that these candidates may have faced significant challenges, these challenges were encountered earlier in their journey and could potentially impact their performance in the workplace. An employer focused on maximizing performance might be concerned about this. However, in a scenario where affirmative action is not in play, the same employer might view the minority candidate more favorably, recognizing that their journey to this point may have been more challenging. If admission standards are consistent across all groups, it can be assumed that all candidates possess equal skills. In fact, under-represented minorities may even be seen as having an edge, considering the obstacles they've likely overcome.
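The statistical step in this argument can be illustrated with a toy simulation. It draws one hypothetical pool of applicants from a single score distribution and admits from it under two different cutoffs. Every number here (the distribution, the cutoffs, the pool size) is invented for illustration and describes no real university or group.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical applicant pool: everyone's score comes from the same distribution.
scores = [rng.gauss(500, 100) for _ in range(50_000)]


def mean_admitted_score(pool, cutoff):
    """Average score among applicants at or above the admission cutoff."""
    admitted = [score for score in pool if score >= cutoff]
    return mean(admitted)


mean_with_higher_cutoff = mean_admitted_score(scores, cutoff=650)
mean_with_lower_cutoff = mean_admitted_score(scores, cutoff=600)
print(f"average admitted score, cutoff 650: {mean_with_higher_cutoff:.0f}")
print(f"average admitted score, cutoff 600: {mean_with_lower_cutoff:.0f}")
```

Even though both sets of admitted students come from the same pool, those admitted under the lower cutoff have a lower average score. An observer who knows only which cutoff applied, and not the individual's grades, can infer nothing more than that average. When the cutoff is the same for everyone, this signal disappears.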
By solving problems downstream rather than upstream, we risk creating rational racists and Bayesian bigots, who behave in a way that a computer just trying to optimize outcomes would also behave. And then, in order to reduce the effects of this discrimination, we need further fixes for the further problems that in turn further distort the goal of an efficient and fair system. We are a long way from an efficient system, but that doesn't mean we should embrace inefficiencies rather than try to reduce them.
Problems flow downstream and when we try to fix them downstream instead of upstream, at the source, we create distortions that undermine values such as meritocracy or free speech. The trouble is that some people use these legitimate criticisms of policies such as affirmative action as an excuse to do nothing. But it is not sufficient to say that a model is broken – we need to develop a better model. We need to do as much or more work to fix these problems at their upstream source.
Many popular anti-racist approaches, though well intentioned, undermine our commitments to free speech and meritocratic promotion, demonize sections of the population deemed to have privilege, and poison intergroup relations. But the alternative of doing nothing is also not an option.
When someone is injured, we must staunch the bleeding and also treat the symptoms. To truly heal our society we must not only invest in bandages and symptom relief but also remove the underlying causes of the injuries. A theory of everyone demands a systems-level ultimate policy approach.
Harvard's Opportunity Insights initiative, led by economist Raj Chetty, is an example of a research program that brings the latest and most powerful data analytic, AI, and machine-learning tools to bear on the largest datasets with the goal of developing scalable solutions that close these gaps upstream. Although progress has been made in many small projects, large-scale initiatives like those pioneered by Opportunity Insights are rarer. Opportunity Insights confirms the effects of wealth inequality, which may be reduced by policies such as land value taxes, as discussed in Chapter 9. It also confirms the important role of cultural transmission and knitting together our collective brain. For example, children with poorer parents who have an opportunity to grow up in neighborhoods where there is more interaction between the rich and poor have considerably better outcomes. Though not measured, presumably children of wealthier parents also have an opportunity to develop a better understanding of the structural and material challenges faced by other, less-privileged members of society.
Insights such as these have led to scalable initiatives like the Creating Moves to Opportunity (CMTO) project and the Charlotte Opportunity Initiative Collaboration.
Every year, the United States spends approximately $20 billion on rental assistance through the Housing Choice Voucher Program. CMTO helps redeploy these funds in ways that increase the educational and life outcome returns on the money. For example, Opportunity Insights helps low-income families identify neighborhoods with opportunities to better meet their needs and then coaches and helps them apply for houses that bring them to these neighborhoods. Children who are able to move to these neighborhoods are more likely to attend college, attend higher-quality colleges, and are less likely to become single parents. The effects are stronger the longer children are able to live in neighborhoods with more opportunity.
Alongside mobility approaches like CMTO are initiatives that use data-driven approaches to support local communities in improving themselves. Opportunity Insights identified Charlotte, North Carolina, as having the lowest rate of upward social mobility of the fifty largest metropolitan areas in the United States. The Charlotte Opportunity Initiative Collaboration works with local leaders to make targeted investments that increase the likelihood of children from low-income families rising up the social ladder. Support includes helping local leaders make direct investments in health and education, from ensuring adequate health care and increasing access to high-quality pre-schools through to supporting college applications. The collaboration also helps desegregate neighborhoods to increase cultural transmission, using similar approaches to CMTO and through re-evaluating housing policies.
Opportunity Insights is just one example of where the latest data science and big data are being used to tackle the challenges of inequality in educational, income, and overall life outcomes. The approach is a way to help us become brighter. It is more difficult than any preferential admission policy, but it tackles the hard problem of creating a fairer world rather than miring us deeper in the mess created by the legacy of past injustices. It is imperative that we tackle these challenges as we enter a world of AI and machine-learning systems. Our computational companions are quickly entering every aspect of our lives and, despite no evolved xenophobic tendencies, nonetheless instantiate the kind of rational racism and Bayesian bigotry discussed before, simply by picking up on patterns in our unfair world. But these AI and machine-learning systems also have the capacity to make us brighter by joining genes, culture, and individual experience as a fourth line of information.
Marvin Minsky was a god in the world of AI. No history of AI can be written without mentioning his contributions. The dominant paradigm for building artificial intelligence from the 1950s to the 1990s was the symbolic approach. Human intelligence was assumed to lie in our ability to reason with a rich body of information. Therefore, our ability to program logic and apply it to rich representations of knowledge, such as in silico semantic networks, led to wild optimism about the creation of human-like intelligent machines. In an interview with Life magazine in 1970, Minsky optimistically declared that ‘In from three to eight years we will have a machine with the general intelligence of an average human being.’ But he was wrong. In fact, the assumption of intelligence as logic applied to knowledge was wrong. The secret to human intelligence was not just in our logic and knowledge representations. At the very least, these were difficult to program directly.
An alternative approach, notably associated with Geoffrey Hinton, was connectionist. Rather than directly programming logic and knowledge, Hinton, who had a degree in experimental psychology, sought to represent the very brain itself in silico – an artificial neural network. Initially, success was limited. Minsky famously stood up after a talk by a young computer scientist, who had just presented a neural network approach to AI, and asked: ‘How can an intelligent young man like you waste your time with something like this? . . . This is an idea with no future’. What was missing from the connectionist approach was sufficiently large networks and sufficiently large training datasets.
In the early twenty-first century neural networks proved the value of the connectionist approach thanks to more powerful compute and vast troves of data that enabled the training of deeper and larger neural networks. Hinton was vindicated. The neural network approach has continued to offer surprising successes, from machines beating humans at board games such as chess or Go to writing essays and creating art from a description.
The neural network approach has continued to take inspiration from human analogues. The focus has been on neuroscience and brain architecture. AI researchers now recognize that this approach is improvable or even insufficient.
The theory of everyone reveals that human intelligence isn't just a function of substantial neural hardware but of socially acquired software. That is a paradigm shift. Human intelligence is not simply a result of our brains but of the sophisticated social learning strategies we use to acquire information from large, cooperative, collective networks of other humans.
Our intelligence is a product of ‘pre-training’ by millions of years of genetically evolved hardware and thousands of years of culturally evolved software, adjusted over a lifetime of experience. This body of research offers a new paradigm for AI. Within this, the focus isn't just on the neural hardware replicating a single human brain but on the way our many brains evolved for cultural learning as a collective brain, making each of our brains cleverer. We shaped our machines, our culture, and our technology, but they in turn shape us and our children. The addition of machine intelligences to our collective brain has the potential to supplement our cognition in ways that computers so far have only hinted at.
We have used energy to power machines that supplement our muscles. We have used energy to power machines that calculate for us and connect us. The next step is to use energy to power machines that truly supplement our minds.
Remember that our three main lines of information – genetic, cultural, and individual learning – are all reinforcement learning with different limits and lags. Ultimately, all reward what works and punish what doesn't over different timescales with different information sets. Machine learning is the missing line of information, parsing the world's data, with the capacity to do it in a personalized way. It is the combination of cultural and individual learning.
To paraphrase a common joke in the AI community, trial and error, small changes to your software, is bad coding practice. But do it fast enough and it's machine learning. Machine learning is individual learning on steroids.
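To make the joke concrete, here is a minimal sketch of learning by trial and error: make a small random change, keep it if the result improves, discard it otherwise, and repeat thousands of times. The target value and the function names are hypothetical, and most machine-learning systems in practice follow gradients rather than making purely random changes, so treat this as an illustration of the principle rather than of how such systems are actually trained.

```python
import random


def trial_and_error(loss, start, steps=5_000, step_size=0.1, seed=0):
    """Repeatedly try a small random change and keep it only if the loss improves."""
    rng = random.Random(seed)
    best, best_loss = start, loss(start)
    for _ in range(steps):
        candidate = best + rng.uniform(-step_size, step_size)  # small tweak
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:  # reward what works, discard what doesn't
            best, best_loss = candidate, candidate_loss
    return best


def squared_error(x):
    """Hypothetical objective: how far x is from an unknown 'right answer' of 3.0."""
    return (x - 3.0) ** 2


learned = trial_and_error(squared_error, start=0.0)
print(f"learned value: {learned:.3f}")
```

No single step involves any insight. The loop reaches a value close to the target simply because it can afford to make thousands of cheap mistakes.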
Machine-learning algorithms can parse our large cultural corpus to discover patterns that cultural evolution may miss. Moreover, they can help us make better decisions. If you want to know what can make you happier, wealthier, or more attractive, you are often forced to rely on data and evidence about the average person: what makes the average person happier, wealthier, or more attractive.
But none of us is an average person. The average person is, ironically, rare. Machine learning, however, by combining your data with the data of all people like you, can help you see what makes people like you happier, wealthier, and more attractive. It can also make you cleverer.
First, simply by providing powerful ways to crunch data and see new patterns. And second, by humans building on machine-powered discoveries.
In 1997 the AI Deep Blue beat then world chess champion Garry Kasparov. Soon after the victory of machine over man, the New York Times ran a story with experts weighing in that winning at chess might be achievable by a machine, but AI was unlikely to beat human players at Go because the game is subtler and its space of possible moves is orders of magnitude larger.
In 2017 AI AlphaGo defeated the best Go player in the world, Ke Jie. What's interesting is that since then human players have improved by learning from how AlphaGo plays, discovering new moves and playing styles. Just as the steam engine helped us discover the laws of thermodynamics, AI is teaching us about how we think and improving our cultural software.
Machines now write stories. They write profound poetry. They draw pictures from descriptions. They write functional computer code. In this era of astonishing successes with no current limit in sight, people endlessly speculate about what the future holds. Computers doing science and engineering, improving themselves and leading to unprecedented quick advancement, perhaps helping us crack fusion and enter the next level of energy abundance. Computers achieving Artificial General Intelligence, brighter than any human. Perhaps even digital people, able to do more than we ever could, embodied completely in software living on any planet that has energy and conditions to run computer servers.
More progress is needed to know the true limits of what machines can achieve and our role in all of this. The tides of progress can only be held back for so long. Even if one country tries to protect human jobs at the cost of a Great Stagnation, other countries will push forward toward a creative explosion, just as China is developing hundreds of nuclear reactors, speeding up its progress toward energy abundance.
We have a new fourth line of informational inheritance – artificial intelligence. We may not yet know the limits of what AI is capable of, but we do know that AI has the capacity to make each of us cleverer.
AI empowers the law of innovation and may help us crack the next level of energy. Biologist Carl Bergstrom described DALL-E 2, an AI which can create new images based purely on a description, as tapping into our collective unconsciousness. DALL-E 2, like text-generating AIs such as GPT (generative pretrained transformer), is perhaps the most sophisticated magpie we can conceive of creating, searching a latent space of possibilities and creating new art and writing never before seen. It is recombinatorial creativity in a very human sense. The same approach could be used in scientific discovery, which relies on similar recombination. We can expect to see advancements ranging from protein folding to gene editing to advancements in our theory of everyone.
AI also empowers the law of cooperation, making us more cooperative by helping coordinate behavior. Today, people will often use memes or pop-cultural references to communicate the right metaphors and emotions. Similarly, in the near future, commonly used AI will help us find common ground to coordinate and communicate. As people begin to interact more with these artificial agents, their influence will grow. In turn, if these agents offer similar advice on appropriate responses to a particular context or appropriate behaviors in general, they will effectively become mediators, culturally diffusing these norms. In turn, this will help people coordinate and communicate through a more shared culture.
But AI also has the capacity to further exacerbate inequalities and fractures created by our current social systems. It is an innovation in efficiency that lets fewer people do more with less, leading to lower scales of cooperation and entrenching a few with power over many. In some domains, nations such as China and Russia have an advantage. AI is dependent not just on powerful computers but on large troves of data. Large language models such as GPT benefit from large corpuses of digitized English text. In other domains such as facial recognition and medical diagnoses, China and Russia can advance more quickly thanks to weaker privacy protection laws. China isn't quite as worried about using medical data or footage from CCTVs. Indeed, China has become a large exporter of AI facial recognition technologies. The advancement of AI-enhanced surveillance drives upgrades to high-resolution CCTV cameras in China and the rise of desk-based ‘smart cops’ replacing police on the streets.
There will soon be AI workers in our economy working alongside humans. In some cases they will replace humans, in others they will cooperate and enhance human abilities. The ability to work with agents, such as the new skill of ‘prompt engineering’ – learning the best approach and phrasing to ask an AI to perform a task – is quickly becoming more valuable.
The creation of these agents also exacerbates our need for energy. Your phone, laptop, gaming rig, or even home server uses very little energy by comparison; training AI agents requires a great deal of computing power, and therefore a great deal of energy. These energy requirements grow every year because there are more agents trained for different tasks and because these agents train on larger datasets and become more sophisticated with more parameters. Of course, once trained, these agents can do more work per watt than the equivalent number of humans. Control over them enhances the power of the few who hold it. Every company in the world will be changed by AI as profoundly as they were changed by the Internet itself.
This may very well be the most important century in human history, but if our future is to be in the stars then we must resolve the cultural, institutional, and social challenges that threaten humanity. We must use the laws of life and the theory of everyone to pick the brighter future.