WILL I CATCH FLU NEXT WINTER?
Epidemics and how they spread
Stories of the Black Death and the Great Plague still carry a morbid fascination for us hundreds of years after they happened. Both were probably the result of the same thing: infection by a bacterium called Yersinia pestis, carried by rat fleas. The plague was so infectious that it took only a few cases to start this horrendous epidemic.
In fact the Black Death entered Europe when a Tartar army catapulted infected corpses into a Genoese trading post, a scene of almost Monty Pythonesque absurdity were it not for the grim consequences. About a quarter of the European population died.
Plague is still out there, but fortunately its effect on mankind has been massively reduced in the last century, thanks to the advances in medicine and hygiene. Another important factor in the control of this and other diseases has been the science of epidemiology, which seeks to understand how epidemics work, and how they can be managed. Epidemiologists are in demand today more than ever. AIDS, foot-and-mouth, BSE and biological warfare have all been major news stories in recent times, and numbers are at the heart of much of their analysis.
The infectivity of gossip
It isn’t only disease that spreads through communities. One of the most familiar and everyday epidemics is that of news and gossip. So, as an ideal introduction to the topic, let’s take a look into the world of spreading news. Imagine that you hear a bit of hot scandal. Knowing that it would be unkind to tell too many people, you pass the news on to just a couple of your closest friends. ‘Don’t tell anybody,’ you say. ‘Certainly not,’ they promise. However, they are naturally unable to keep it completely to themselves, so they, too, allow themselves to disclose the secret to a couple of confidants, under the strict agreement that they will keep it quiet. Those confidants do the same, and so it goes on, each new disclosure leading on to ‘just a couple’ of others. For the sake of argument, let’s suppose that the news broke at 8 a.m., and that each person’s disclosures took place within half an hour. How many people will know the news by 8 p.m. the same day?
8:00 a.m. | Only you know the news |
8:30 | You and two friends know (1 + 2) |
9:00 | You, your friends, and their friends know (1 + 2 + 4) |
9:30 | Eight more have now joined the ring … |
By 8 p.m., there have been 24 half-hour intervals, during which time the number of people joining the circle has been regularly doubling. The number of people who are in the know after 24 half-hours can be represented as the sum:
1 + 2 + 4 + 8 + … + 2^24
What does this sum add to? How many people are now in on ‘our little secret’? You probably suspect that it has run well into the thousands. In fact it is far worse than that. It turns out that, assuming every new disclosure was made to people who didn’t know it already, there are now 33,554,431 people who have heard – roughly half of the UK population.
This phenomenal rate of increase is known as exponential growth, and defies most people’s sense of how numbers work.
The final number reached by exponential growth is also highly sensitive to how many people hear the news from a single source, what we will call the spread factor. In our gossip example, each person was fairly discreet and told only two others, a spread factor of two. If they had instead told three other people (which is still quite discreet), then, by six o’clock that evening, 5.2 billion people would have heard – that’s practically the entire world population. And you told only three people!
Spreading the gossip to new people doesn’t necessarily mean everyone will get to hear about it, however. For it to grow, the spread factor has to be more than one disclosure per person. If each person who hears the news spreads it to exactly one other person, then by 8 p.m. only 24 more people have heard (25 including you). The rate of spread in this case is steady and unspectacular.
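The three cases above (spread factors of 2, 3 and 1) can be checked with a few lines of Python. This is only a sketch under the chapter's own simplifications: every disclosure happens in a half-hour round, and nobody is ever told twice. The function name `people_in_the_know` is made up for illustration.

```python
def people_in_the_know(spread_factor, rounds):
    """Total who know after `rounds` half-hour rounds, starting from one person.

    Assumes each newly informed person tells `spread_factor` people who
    have not heard the news before (the chapter's simplifying assumption).
    """
    total = 1        # you, at 8:00 a.m.
    newly_told = 1   # people who heard in the latest round
    for _ in range(rounds):
        newly_told *= spread_factor
        total += newly_told
    return total


# Spread factor 2, from 8 a.m. to 8 p.m. = 24 half-hour rounds
print(people_in_the_know(2, 24))  # 33554431
# Spread factor 3, from 8 a.m. to 6 p.m. = 20 half-hour rounds
print(people_in_the_know(3, 20))  # 5230176601
# Spread factor 1, 24 rounds: you plus 24 others
print(people_in_the_know(1, 24))  # 25
```

Changing the spread factor from 2 to 3 overtakes the first total by more than a hundredfold, even with four fewer rounds.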
If the news is sufficiently dull, or people are sufficiently good at protecting confidentiality, the spread factor will be less than 1 per person. In this case, the story actually dies off. Suppose that out of a group of people who know the secret, three-quarters tell one person, and the rest keep their lips sealed, a spread factor of three-quarters, or 75 per cent. Suppose, too, that 64 are in the room when the news is revealed. The spread goes like this:
8:00 a.m. | 64 people know |
8:30 | 75% of these 64 people pass it on (so 48 others hear about it) |
9:00 | Those 48 pass it on to 36 more |
9:30 | The 36 pass it on to 27 . . . and so on. |
The number in the gossip circle can be written as another series:
64 + 48 + 36 + 27 + …
This series can go on for ever. If you wait long enough, does this mean that the whole population will eventually get to hear? The answer is no. Only a limited number will ever get to hear it, and the spread of news will eventually stop altogether. In fact there is a formula for working out the sum of an infinite series like the one above, so long as the spread factor, S, is less than 1.
If the number who hear the news at the start is A and the spread factor is S, then for the infinite series above:
Total = A / (1 – S)
In that example, A was 64 and S was 0.75. Plug the numbers into the simple formula to get: 64 / (1 – 0.75) = 64 / 0.25 = 256. That figure of 256 is known as the asymptote. It will never actually be reached, but when the number who have heard the news gets close to this figure, spread will cease.
Using the same formula you might like to confirm that if 200 people are given the news and the spread factor is only one confidant per ten people (so S = 0.1), then only 200/0.9, or 222 people, will get to hear the gossip.
This says something interesting about news leaks. The formula shows that the number of people who eventually get to hear a news leak is far more dependent on the spread factor than on the number who are exposed to it in the first place. Downing Street, take note.
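A short Python sketch makes the comparison concrete. It assumes the formula A / (1 – S) described above; the function names and the extra figures (an initial audience of 128, a spread factor of 0.9) are illustrative choices, not numbers from the text.

```python
def eventual_audience(initial_hearers, spread_factor):
    """Limit of A + A*S + A*S**2 + ... for 0 <= S < 1, i.e. A / (1 - S)."""
    if not 0 <= spread_factor < 1:
        raise ValueError("the series only converges for a spread factor below 1")
    return initial_hearers / (1 - spread_factor)


def audience_after(initial_hearers, spread_factor, rounds):
    """Running total after a given number of rounds (a partial sum)."""
    total = 0.0
    newly_told = float(initial_hearers)
    for _ in range(rounds + 1):
        total += newly_told
        newly_told *= spread_factor
    return total


print(eventual_audience(64, 0.75))          # 256.0
print(audience_after(64, 0.75, 3))          # 64 + 48 + 36 + 27 = 175.0
print(round(eventual_audience(200, 0.1)))   # 222

# Doubling the initial audience only doubles the final total ...
print(eventual_audience(128, 0.75))         # 512.0
# ... whereas raising the spread factor from 0.75 to 0.9 gives about 640.
print(eventual_audience(64, 0.9))
```

The running total creeps towards 256 but never passes it, which is the asymptote behaviour described above.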
The numbers behind infections
No doubt you can immediately see the similarity between the spread of gossip and the spread of an infection. The number of people who first hear the gossip is analogous to the number of people who are the initial carriers of an infection. The rate at which gossip is passed on to others is analogous to the infection rate of the disease. And, as we have seen, if gossip or a disease is going to become an epidemic, the spread factor must be greater than 1. If that factor can be kept below 1 – that is, if every carrier can be guaranteed on average to transfer the disease to fewer than one other person during the whole of their infection – then the disease will die out. This makes ‘1’ probably the single most important number in the whole of epidemiology.
The spread factor of diseases is dependent on a number of different things. The nature of the virus or bug itself is fundamental, of course. Some germs are so powerful and can infiltrate the body in so many ways, for example through touch or breathing, that they are highly infectious and hard to protect against. Others, such as HIV, are not particularly easy to transmit from person to person but may still have a high spread factor because they survive for a long time and their carriers inadvertently behave in a way that gives the germ a helping hand in getting passed on, for example through the transfer of body fluid.
In order to work out the growth rate of the infection, all of these factors are taken into account by analysing the statistics of how rapidly the infections spread in human populations. In the box are some approximate figures quoted for the infectivity of four well-known diseases:
Spread factors and infectious periods
Disease | A typical infectious period | Spread factor |
HIV | 4 years | 3 |
Smallpox | 25 days | 4 |
Flu | 5 days | 4 |
Measles | 14 days | 17 |
In other words, at the start of an outbreak, a person with flu may be infectious for five days, during which time they will infect about four people. These figures are only rough averages, and depend on the specific virus, country and community in question. In developing countries the spread factors are usually higher.
The key point is that all of the spread factors are greater than 1, making all of the diseases a serious threat if they are left to their own devices. The rate for measles is particularly high, which is why it spreads like wildfire through classrooms of unimmunised children.
Why you need ‘e’ for natural growth
In the example of gossip-spreading earlier in the chapter, one assumption was that news spreads in regular clumps of half an hour. This is a gross simplification of what happens in reality. Infections don’t wait till the clock passes a certain time before they have their next surge. They spread continuously.
There is a special number, known as ‘e’, that is behind continuous growth, and to understand this number it might help to think of it in money terms, as the box explains.
Where ‘e’ comes from
Imagine you have £1, and you put it into a bank account that offers 100 per cent interest per year. If the bank pays you the 100 per cent interest at the end of the year, you end the year with £2.
If, instead, it pays 50 per cent every six months, then you have £1.50 after six months, and 50 per cent more than that at the end of the year, or £2.25.
What about four lots of 25 per cent at three-month intervals? This works out at even more, a final amount of £2.44.
As the periods between interest payments get shorter, you get closer and closer to continuous growth of your investment, but the sum of money at the end of the year tends towards a maximum figure. That maximum is about £2.72. The actual number begins 2.71828… and is known as Euler’s number, ‘e’. It is the number at the root of all natural population growth and a fundamental player in many other areas of maths, too. Expressed as the formula (1 + 1/n)^n, the larger the value of ‘n’ becomes, the nearer you get to e.
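The bank example is easy to try out. The following Python sketch assumes the same £1 deposit and 100 per cent annual rate, split into n equal payments; the function name `year_end_balance` is invented for illustration.

```python
import math


def year_end_balance(payments_per_year, deposit=1.0):
    """Balance after one year at 100% annual interest, paid in equal instalments."""
    rate_per_payment = 1.0 / payments_per_year
    return deposit * (1 + rate_per_payment) ** payments_per_year


# Yearly, six-monthly, quarterly, monthly, daily, and very frequent payments
for n in (1, 2, 4, 12, 365, 1_000_000):
    print(f"{n:>9} payments a year: £{year_end_balance(n):.5f}")

print(f"Euler's number e:          {math.e:.5f}")
```

The first three lines reproduce the £2, £2.25 and £2.44 from the box, and the later ones close in on 2.71828 without ever exceeding it.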
Like the bank that adds interest continuously, infectious diseases constantly spread themselves – or, at least, that is a pretty close approximation.
The spread factor, S, is the number of new cases created by an infected person, and the number of people infected at the start of the outbreak is I. If infection only happened as a sudden event at the end of one infectious period, the number of newly infected people at the end of the period would be:
I × S
However, in the same way that the fictitious bank doesn’t just add interest at the end, but adds interest on the interest, the new infections themselves start working on the population immediately. Not surprisingly, this leads to a formula that involves ‘e’. The number of infected carriers after one infectious period (that’s five days for flu or a month for smallpox) turns out to be:
I × e^(S – 1)
So if 10 people have flu at the start of the week and the spread factor S is 4, this would predict that by the end of the week there will be:
10 × e^(4 – 1) = 10 × e^3 ≈ 201 people infected
After T infectious periods, the formula for the number of people infected is:
I × e^((S – 1)T)
This is the fundamental formula of epidemics. If S is less than 1, then the expression e^((S – 1)T) gets smaller as T gets bigger – in other words, the infection dies out. If S = 1, then the number of infected carriers remains constant. And, if S is more than 1, the infection becomes an epidemic.
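The three regimes can be seen by evaluating the formula directly. This Python sketch assumes the number infected after T infectious periods is I multiplied by e raised to the power (S – 1)T, as described above; the function name and the sample values other than the flu example are illustrative.

```python
import math


def infected_after(initially_infected, spread_factor, periods):
    """Carriers after `periods` infectious periods: I * e**((S - 1) * T)."""
    return initially_infected * math.exp((spread_factor - 1) * periods)


# The flu example: 10 carriers, S = 4, one infectious period
print(round(infected_after(10, 4, 1)))   # 201

# S = 1: the number of carriers stays constant
print(infected_after(10, 1, 5))          # 10.0

# S below 1: the infection shrinks with every period
print(infected_after(10, 0.5, 5) < infected_after(10, 0.5, 1) < 10)  # True
```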
This simple model of the growth of infection (and gossip) is quite accurate in the early stages. However, as more and more people become infected, there are fewer and fewer people left who are susceptible to infection. This must itself reduce the spread factor. You can picture the analogy with gossip. After a while, it becomes harder and harder to find somebody who hasn’t already heard the news, so the number of people to whom it spreads from each carrier will reduce. Infectivity wears out after a while, and, if the infection numbers drop low enough, it is possible for the infection to die out before everyone has been exposed to it.
The Kermack-McKendrick model
In 1927 two scientists, Kermack and McKendrick, developed a mathematical model that has become the reference point for all other major epidemic models since. They noted that the total number of infected carriers would grow if the number of new infections over a certain period was larger than the number of people becoming non-infectious in that period, the latter being achieved in one of two ways – by recovering or by dying.
The huge uncertainty in forecasts of CJD
When the first few cases of variant CJD, the human form of BSE, were confirmed, there was a degree of hysteria in the press about a potential new plague. Scientists began to make estimates of how many victims the disease might claim. What was bizarre, however, was the huge range in these estimates. Claims were made that the illness might infect anywhere between 100 people and 500,000 people – which is a bit like saying, ‘I am confident that your income is between £50 and £500,000 a month’ (true, but you haven’t exactly narrowed things down).
The reason for this broad range comes down to the sensitivity of exponential growth to the rate of infection. In the early stages of a new illness, the spread factor is not known, and therefore it has to be estimated from the first few bits of data. Given the margin of error that is likely in the early stages, and the huge divergence of exponential graphs in their later stages, it is not so surprising that the final figure cannot be estimated with any accuracy for quite some time.
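A rough Python sketch shows how quickly such forecasts fan out. It assumes the simple exponential growth formula from earlier in the chapter, and the three spread factors, the 10 starting cases and the 10 infectious periods are hypothetical values chosen purely for illustration; they are not estimates for CJD or any other real disease.

```python
import math


def projected_cases(initial_cases, spread_factor, periods):
    """Simple exponential projection: I * e**((S - 1) * T)."""
    return initial_cases * math.exp((spread_factor - 1) * periods)


# Three hypothetical early estimates of the spread factor
for estimate in (1.2, 1.5, 2.0):
    cases = projected_cases(10, estimate, 10)
    print(f"S = {estimate}: about {cases:,.0f} cases after 10 periods")
```

Under these assumptions the lowest and highest estimates differ by a factor of nearly 3,000, even though the spread factors themselves differ by less than a factor of two.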
KM, as we will call them, divided the population into three categories:
• Susceptible (i.e. haven’t been exposed yet)
• Infected
• Recovered (i.e. now immune)
Their simplified model of the spread of infections took into account these three categories. The results proved remarkably good at replicating the patterns seen in real epidemics – a rapid growth in the number of people infected, followed by an equally rapid decline. The KM equations dealt with rates of change, and involved differential equations.
The mathematics of differential equations is certainly not trivial, and there is a severe risk that any further discussion on this topic will lead to a rapid glazing-over of eyes. So, instead, let’s skip the formulae and jump straight to some of the things that KM found when they had solved their equations.
Most interesting is that the KM equations can predict the proportion of the population that will never be touched by the infection.
The greater the initial infectivity of the illness, the smaller the proportion of ‘untouched’ there will be. Remember that an initial infection spread factor of more than 1.0 is critical for an epidemic to take hold. It turns out that, if the initial infection factor is 1.5, more than 50 per cent of the population will never be exposed to the infection. However, as the spread factor increases, the epidemic becomes more pervasive. By the time it reaches 3, only 5 per cent of the population remain unexposed.
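The rise and fall of an epidemic, and the fraction of people it never reaches, can be explored with a small simulation. The Python sketch below is a simplified susceptible/infected/recovered model stepped forward in small time slices, not Kermack and McKendrick's own working. It assumes time is measured in infectious periods, a population of 10,000 with 10 initial carriers, and perfect mixing; all of these values and the function name are illustrative. The exact percentages depend on such assumptions, so the output need not match the figures quoted above: this version leaves roughly 40 per cent untouched at a spread factor of 1.5 and about 6 per cent at 3, which reproduces the trend rather than the precise numbers.

```python
def simulate_epidemic(spread_factor, population=10_000, initially_infected=10,
                      periods=60, steps_per_period=100):
    """Step a simple susceptible/infected/recovered model forward in time.

    Time is measured in infectious periods, so carriers recover at a rate
    of 1 per period, and each carrier infects `spread_factor` others per
    period while everyone around them is still susceptible.
    Returns (infected_history, untouched_fraction).
    """
    dt = 1.0 / steps_per_period
    susceptible = float(population - initially_infected)
    infected = float(initially_infected)
    history = []
    for _ in range(periods * steps_per_period):
        # New infections fall as the susceptible share of the population shrinks
        new_infections = spread_factor * infected * (susceptible / population) * dt
        recoveries = infected * dt
        susceptible -= new_infections
        infected += new_infections - recoveries
        history.append(infected)
    return history, susceptible / population


for s in (1.5, 3.0):
    history, untouched = simulate_epidemic(s)
    print(f"S = {s}: peak of about {max(history):.0f} infected at once, "
          f"{untouched:.0%} never infected")
```

In both runs the number infected climbs, peaks and then falls away, and the epidemic ends with some people still susceptible.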
Foot-and-mouth disease is an interesting case of this. Because it is so highly infectious, with S over 100, once one animal on a farm has become infected the entire farm is regarded by the modellers as being infected, and is treated as if it were one huge infected animal. Since the infectivity is considerably reduced at a mile separation, the spread of the disease is then viewed as being from farm to farm, with a more manageable infectivity rate of S at around 1.5.
With most infections, a critical starting rate is needed for the epidemic to take hold. Most often this means that people need to be packed together sufficiently closely, as often happens in the poor areas of towns, or the number of contacts between people has to be sufficiently high, through sexual promiscuity, for example. The consequence of this is that, if people can be kept far enough apart for long enough, most diseases will die out of their own accord with very few infections. Which is why the cruel policy of locking people into their own homes during the Great Plague was actually pretty effective…
Computer and other infections
As if biological infection weren’t bad enough, mankind has imposed other viruses on itself of its own volition. Most notorious is the computer virus, which in many ways is a direct mimic of its biological cousin. Computer viruses are miniprograms written by programmers with a grudge or with too much time on their hands. The most powerful ones can wreak havoc on millions of computers by wiping out hard disks or clogging up email boxes.
Like living bugs, the computer versions can lie dormant before springing into action weeks or even years after they have entered your system.
There are, however, some important differences between computer and biological viruses. Biological viruses require physical contact of some kind, which means that the geographical location of susceptible people is important. You are much more likely to catch flu from your neighbour than from somebody in Bucharest. But, thanks to the Internet, geographical distance is no defence against computer viruses – they can travel from anywhere in the world in a split second. And, while biological viruses usually take hours or days to make their occupants ill, computer viruses can do equivalent damage almost instantaneously.
Finally, while mankind has a broad genetic diversity, making some people naturally immune to certain infections, computers increasingly lack this diversity (just think of how many people’s computers contain identical Microsoft or Netscape ‘genes’). So, if one computer is infected, there is a risk that most might become so.
The consequence of all this is that computer infections can spread far faster than any viruses that have been witnessed before. They can infect tens of millions in hours instead of years, which makes them a major global threat to organisations of all kinds. A number of viruses have already caused this level of devastating damage. In May 2000, a virus wrapped up in an email headed LOVE LETTER FOR YOU was estimated to have reached 50 million users in just a week. Anybody opening the attachment suffered serious damage to their computer files, and many people did just that. One estimate put the damage at $2.6 billion.
The growing threat of computer viruses explains why the computer world has created roles that are direct parallels to those in the medical community. Scientists have produced mathematical models to predict the rate of infection and the risk of exposure from different computer viruses; computer ‘doctors’ are there to treat and with luck revive the infected. And, most important of all, computer ‘health’ advisers are there to immunise or protect computers against infection; the golden rule is that prevention is always better than the cure.
The mathematics of infection applies even beyond the world of viruses. Very similar models are used to forecast the growth in the number of products in the market, for example. Marketing experts are forever looking at how they can increase the infectivity and penetration of their product, while minimising the infectivity of their rivals.
Religion, according to some, is also a virus. It tends to be passed on from parents to their children far more than between any other groups, and, while the infection can disappear in adulthood, it has a tendency to reappear in old age and other stages of life when immunity is low.
Even jokes and trivia spread with the same rules as viruses. Have you heard that St John’s Wood is the only London Underground station that contains no letters in common with the word ‘mackerel’? If you have, that’s the virus at work. If you haven’t, pass it on.