3

EVERYDAY ESTIMATION

ESTIMATION AND MONEY

SHOPPING BILLS AND SPREADSHEETS

One of the most familiar uses for back-of-envelope maths is in adding up bills. If you’re working to a budget and doing a big shop, it can be useful to mentally tot up how much you’ve put in the basket so far, so as to avoid getting a nasty surprise at the till. But there are also times when you will see a column of figures – in a bill, or in a spreadsheet1 – look at the total and think: ‘surely that can’t be right’. In both cases, speedy ballpark estimates are useful.

One simple way to speed up the estimate of a shopping bill is to ignore the pence and just add the pounds. The result will be a figure that is an under-estimate – what’s known as a lower bound. You can repeat the task, this time rounding odd pence up to the nearest pound, to give you the upper bound. The true figure will be somewhere in between the two, and a reasonable guess is to go for the middle figure.

Lower bound    £32

Upper bound    £40

Estimate       £36

Quicker still, add up the pounds, and then add 50p for each item on the bill, so if the pounds add up to £32 and there are 10 items:

£32 + 10 × 50p = £37.
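As a rough sketch of the two approaches above, here is how the bounds and the 50p-per-item short cut might be written in Python. The function name `estimate_bill` and the basket prices are invented for illustration; they are not taken from a real bill.

```python
import math

def estimate_bill(prices):
    """Return (lower, upper, midpoint, quick) estimates, in pounds, for a list of prices."""
    lower = sum(math.floor(p) for p in prices)   # ignore the pence
    upper = sum(math.ceil(p) for p in prices)    # round odd pence up to the next pound
    midpoint = (lower + upper) / 2               # the 'middle figure' guess
    quick = lower + 0.5 * len(prices)            # pounds plus 50p for each item
    return lower, upper, midpoint, quick

# Hypothetical basket of ten items (prices made up for this example).
basket = [3.49, 1.20, 5.75, 2.99, 0.85, 4.10, 6.60, 2.35, 1.99, 3.50]
lower, upper, midpoint, quick = estimate_bill(basket)
print(lower, upper, midpoint, quick)
```

When every price has some odd pence, as in this made-up basket, the midpoint and the 50p-per-item figure come out the same; they differ only when some items are priced in whole pounds.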

What this doesn’t allow for is the tendency for some products (books, for example) to have a disproportionate number of prices that end with 99p (e.g. £2.99 or £9.99) – or more commonly these days, 95p. This is a device used by retailers to trick us into thinking a product is cheaper than it is. The same items priced at just a penny higher, £3 and £10, feel much dearer, because we are more influenced by the leading digit than the later digits.

Rather than trying to allow for the peculiar distortions of supermarket prices, you can simplify things considerably just by rounding them to the nearest £1 or by using Zequals.

TEST YOURSELF

Bob has a spreadsheet that enables him to keep tabs on sales of widgets around the country. Here’s the column for widget sales in Newcastle:

SALES
£ 190.10
£ 120.46
£    8.22
£ 396.63
£ 130.50
£    41.55
TOTAL £ 697.36

Do you trust the total at the bottom?

Solutions

HOW CAN THAT SHOP STILL BE IN BUSINESS?

A few years ago, a new cooking utensil shop opened in my local high street in London. It was at the premium end of the street, where rents had been hiked up extortionately in recent years, and the gossip was that an annual rent of £25,000 was typical for a shop that size. It prompted me to wonder what their business model was.

If the rent is £25,000 per year, that’s £25,000 ÷ 50 = £500 per week, or about £100 per shopping day. So just to pay the rent, the shop needed to make a profit of £100 per day. If the shop had a 100% mark-up on products, then it needed £200 per day in sales just to cover the rent (in other words, about seven of their upmarket saucepans). That was before the business rates, maintenance and insurance. It meant that before the owners could pay themselves and any other staff, they were probably having to put over £400 through the tills each day, on average. Yet frequently when I went past, there was nobody in the shop. ‘How do they stay in business?’ I wondered.

My question was answered not long afterwards, when the shop closed.

SAVING, BORROWING AND PERCENTAGES

Percentages are ubiquitous when dealing with money. I covered discounts and VAT here, but percentages get more involved when you are dealing with loans and saving – and the numbers when working out what interest you will have to pay on a mortgage, or receive from an ISA, are probably going to dwarf those you encounter during shopping.

The maths, fortunately, is the same (5% interest on savings of £3,000 means you’ll earn £150 each year). The complication comes with compounding: if your loan or savings run beyond a year, then you start paying, or earning, interest on interest.

If you have savings of £10,000 in an account that is paying 10% interest (happy days!), you will have £11,000 after one year. But at the end of the second year you won’t have £12,000 because you are now earning interest on £11,000 rather than £10,000. After two years, your savings pot has therefore risen to £10,000 × 1.1 × 1.1 = £12,100. An extra £100. It doesn’t sound much, but this margin becomes more significant the longer you save for, and the higher the interest rate is. (And on the other side, debts that are incurring compound interest can rise alarmingly.)

When interest rates are small (2.5%, say), there’s a handy back-of-envelope rule for working out longer-term savings that’s close to the right figure. If you are saving for four years at 2.5%, then the interest you’ve earned after four years is very close to 4 × 2.5% = 10%. (For comparison, the correct figure is 10.38%.) The smaller the interest rate, the better this approximation becomes.

This rule of small numbers allows you to be quite cavalier in calculations where you aren’t required to give a precise answer, because it means you can just add and subtract percentages instead of multiplying and dividing them.

For example, if your savings go up by 3.3% in one year, 3.1% the next, and 2.7% the year after that, you’re not far off the right number if you say that the total growth over three years will be 3.3 + 3.1 + 2.7 = 9.1% (and using Zequals, you can simplify it further: 3% + 3% + 3% = 9%).
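To see how close the add-the-percentages short cut gets, the following sketch compares it with the exact compounded figure. The function names are my own, chosen for illustration.

```python
def compound_growth(rates_percent):
    """Exact total growth (in %) when each yearly rate is applied in turn."""
    factor = 1.0
    for rate in rates_percent:
        factor *= 1 + rate / 100
    return (factor - 1) * 100

def additive_growth(rates_percent):
    """Back-of-envelope version: just add the yearly percentages."""
    return sum(rates_percent)

four_years = [2.5] * 4        # four years at 2.5%
varied = [3.3, 3.1, 2.7]      # the three-year example from the text

for rates in (four_years, varied):
    print(round(additive_growth(rates), 2), round(compound_growth(rates), 2))
```

For the three-year example the exact figure works out at a little under 9.4%, against the quick answer of 9.1%.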

This is fine over short periods. But over longer periods there’s another handy rule of thumb: the Rule of 72.

DOUBLE YOUR MONEY – THE RULE OF 72

If your bank pays you compound interest at 4%, how long will it be before you have doubled your money?

This complex-sounding calculation can be answered with a deceptively simple rule. It’s known as the Rule of 72.

Whatever the growth rate (be that 1.2%, 4%, 10% or even 30%), the time it will take for the quantity in question to double can be found by dividing it into 72.

With an interest rate of 4%, your money in the bank will double in 72 ÷ 4 = 18 years.

How incredibly convenient (you might be thinking) that the number in this rule of thumb is 72, for 72 is a number that divides exactly by 2, 3, 4 and 6: numbers that will often be used as interest rates.

It turns out that, strictly speaking, this should be known as the rule of 69.3. That is the figure that emerges from doing the algebra behind exponential growth (described in more detail here). But try dividing anything into 69.3 and you’ll end up with a mess. Whoever first worked out this rule quickly spotted that by nudging it up to 72 instead, there was a chance people would be able to work out the numbers on the back of an envelope – or even in their heads. So the Rule of 72 it is.2

Knowing how long it will take for numbers to double is handy, but there may be times when you want to know a different target. What about trebling your money, or increasing it tenfold? It turns out there is a rule of thumb for any target of growth that you choose. In each case, it uses a convenient number that is quite close to the accurate one.

Here’s a table:

How many years before a quantity …    Convenient number to divide into    Example: how long it takes if growth rate = 4%
Doubles                               72                                  72 ÷ 4 = 18 years
Increases threefold                   120                                 120 ÷ 4 = 30 years
Increases tenfold                     240                                 240 ÷ 4 = 60 years
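If you want to check a rule of thumb like these against the exact answer, a sketch along the following lines would do it. It assumes interest is compounded once a year, and the function names are hypothetical.

```python
import math

def exact_years(multiple, rate_percent):
    """Years for a quantity to grow by `multiple` at a yearly compound rate."""
    return math.log(multiple) / math.log(1 + rate_percent / 100)

def rule_of_thumb_years(convenient_number, rate_percent):
    """The short cut: divide the growth rate into a convenient number."""
    return convenient_number / rate_percent

def exact_constant(multiple):
    """100 x ln(multiple): the number the algebra gives for very small rates."""
    return 100 * math.log(multiple)

print(rule_of_thumb_years(72, 4))        # 18.0
print(round(exact_years(2, 4), 1))       # exact doubling time at 4%
print(round(exact_constant(2), 1))       # the 'rule of 69.3'
```

Calling `exact_years(3, 4)` or `exact_years(10, 4)` gives the exact figures to set beside the threefold and tenfold rows of the table.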

CONVERTING CURRENCY

Any international trip or purchase is going to involve a conversion, and unlike units of measurement, the conversion rates between sterling, euros and US dollars are changing all the time. During the current century, £100 could have bought you anything between $120 and over $200, which is a massive variation.

With many currency conversions, you – or the person you are dealing with – are probably going to be concerned with calculating figures that are correct to the last penny, or cent, and you’ll almost certainly use a calculator. But imagine that you are at the airport and need some US dollars. The exchange rate is at US$1.40 to the pound, you ask the travel-exchange desk for $1,000, and the cashier charges you £793.40. Happy? Well, £793.40 is about £800, and a mental check tells you that 1,000 ÷ 800 = 1.25, which is a long way from the supposed $1.40 rate. So they’re either charging a huge commission, or the cashier has mistyped a number. Do you still want to go ahead with this transaction?

Fortunately for the British, the pound is more valuable than most of the rest of the world’s units of currency,3 meaning that £1 will usually buy you more than one dollar, euro, Swiss franc, and far more yuan or rupees. Therefore, converting sterling to other currencies generally means multiplying by a number between 1 and 2.

You probably have your own ideas of mental short cuts you might use to do a rough conversion, but depending on the exchange rate, you might round to the nearest convenient ratio. For example:

Exchange rate of other currency    Close to…    Estimation short cut
1.09                               1.1          Add 10%
1.35                               1.33         Add one-third
1.52                               1.5          Add a half
1.72                               1.75         Add three-quarters
1.81                               1.8          Double, then take off 10%
2.1                                2            Just double it!

This is fine if you are converting from (say) sterling to US dollars. But if you are doing the reverse, it will mean multiplying by a number less than one, or dividing, and most people find both of these harder to do mentally. A confident arithmetician might be happy dividing by (say) 1.4, but the back-of-envelope approach will be to divide by 2 and add about 50% to the answer. And that’s probably going to be good enough for when you need it. ‘So that hotel is going to cost us $500 for one night – let’s see, halve that = £250, plus £125… more than £350: ouch! – way beyond our budget.’
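The two mental checks in this section, the implied rate at the airport desk and the halve-then-add-half short cut for the hotel bill, can be sketched as follows. The function names are invented for illustration, and the 1.4 rate is the one assumed in the text.

```python
def implied_rate(foreign_amount, sterling_charged):
    """Exchange rate you are actually being given."""
    return foreign_amount / sterling_charged

def rough_sterling_from_dollars(dollars):
    """Halve, then add about 50%: a stand-in for dividing by roughly 1.4."""
    half = dollars / 2
    return half + half / 2

# Airport check: $1,000 for about £800 implies a rate well below 1.40.
print(implied_rate(1000, 800))            # 1.25

# Hotel check: rough answer beside the exact one at a rate of 1.40.
print(rough_sterling_from_dollars(500))   # 375.0
print(round(500 / 1.4, 2))
```

The short cut overshoots a little, because halving and adding a half is the same as dividing by 1.33 rather than 1.4, but it is on the right side for a budget check.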

ESTIMATING SIZE

ESTIMATING DISTANCES

The most straightforward way to estimate distances is by comparing the distance you are trying to figure out to one that you already know. If you know that the distance from Sydney to Melbourne is 500 miles, then the distance from Sydney to Canberra – a city that for historical, political reasons needed to be roughly in the middle of the two cities – is going to be about 250 miles. And if you don’t know the distance from Sydney to Melbourne, you might be able to estimate it using some other information you have – for example, that it takes about an hour to fly from one to the other. Since most planes fly at about 500 mph, in one hour the plane will cover 500 miles. There is, of course, plenty of ‘ish’ involved here.

The same principle applies to shorter distances, of course. If you know that your own height is 1.5 metres (say), then you can estimate the height of the room you are in by, for example, picturing whether you’d reach the ceiling if you stood on your own shoulders. (I’m a firm believer in picturing things like this.)

All of this is fairly routine and familiar. But there are three rather more quirky (and I think charming) ways of estimating distance and height.

1. The Dropped-Stone Method

When I was a child, we’d often take a trip to Beeston Castle in Cheshire. One of my favourite features there was the well, and we’d delight in dropping a pebble into the well and counting the seconds before we heard it hit the bottom. (With decades-worth of children doing the same thing since, I fear the well is no longer quite as deep as it was.)

You can estimate the depth of the well with a bit of Newtonian maths, which says that the distance travelled by an object falling under gravity is given by:

Distance = ½ × a × t²

where a = the acceleration due to gravity (about 10 metres per second per second) and t = the time taken for the stone to drop to the bottom of the well. If the pebble drops for three seconds, the depth of the well is therefore roughly:

½ × 10 × 3² = 45 metres.

There are two things being ignored here, and both mean that the calculated depth will be an over-estimate. The first is that, because of air resistance, the pebble will eventually stop accelerating, and will therefore take longer to hit the bottom than it would have done in a vacuum. The second is that when the pebble hits the bottom of the well, it takes time for the sound to travel back up to your ears. Both of these effects are quite small, however. So, for a decent estimate of the depth, square the time and multiply by 5.
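Here is a small sketch of the dropped-stone sum, with an optional correction for the time the sound takes to come back up. The speed of sound (taken here as roughly 340 metres per second) and the function names are my assumptions, not figures from the text, and air resistance is ignored in both versions.

```python
import math

def depth_simple(seconds, g=10.0):
    """Square the time and multiply by 5 (that is, half of g = 10)."""
    return 0.5 * g * seconds ** 2

def depth_with_sound_delay(seconds, g=10.0, speed_of_sound=340.0):
    """Depth d satisfying sqrt(2d/g) + d/speed_of_sound = seconds.

    Writing s = sqrt(d) turns this into a quadratic: a*s^2 + b*s - seconds = 0.
    """
    a = 1 / speed_of_sound
    b = math.sqrt(2 / g)
    s = (-b + math.sqrt(b * b + 4 * a * seconds)) / (2 * a)
    return s * s

print(depth_simple(3))                       # 45.0
print(round(depth_with_sound_delay(3), 1))
```

Under these assumptions, a three-second count gives a depth a few metres short of the simple 45-metre figure once the sound delay is allowed for.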

2. The Finger Method

Let’s say you are standing on the beach and can see a yacht on the horizon. You wonder how far away the yacht is.

Here’s one way to find out. Stretch out your arm in front of you, close one eye and hold up a finger so that it covers the yacht.

Now open that eye and close the other one. Your finger will appear to jump to the side, away from the yacht (a phenomenon known as ‘parallax’).

The distance to the yacht is roughly:

10 × the distance that the yacht jumped.

You will need to use your judgement to estimate by how many yacht-lengths the yacht has jumped to the side. If you estimate that the yacht jumped by about 15 times its own length, then the yacht is roughly:

10 × 15 yacht-lengths

= 150 yacht-lengths away from you.

There is of course one other thing you need to estimate: the length of the yacht itself. You’ll need to use your rudimentary knowledge of yachts to decide if this looks like a 5-, 10- or 20-metre yacht. If you reckon it’s a 10-metre vessel, then your estimate is that it is about 10 × 150 = 1,500 metres away.
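The finger method boils down to a single multiplication, sketched below. The function and parameter names are hypothetical, and the factor of 10 is simply the one quoted in the text.

```python
def finger_distance(jump_in_object_lengths, object_length_m, factor=10):
    """Distance in metres: factor x apparent jump (in object lengths) x object length."""
    return factor * jump_in_object_lengths * object_length_m

# The yacht example: a jump of 15 yacht-lengths, for a 10-metre yacht.
print(finger_distance(15, 10))   # 1500
```

Since the length of the yacht is itself a guess, trying 5 and 20 metres as well gives a feel for how wide the range of answers really is.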

As I say, it’s quirky. But I bet you now want to try it with that pylon you can see out of the window…

3. The Crisp-Packet Method

You are in the park and you are curious to know how tall a particularly fine poplar tree is. You can make a good estimate using an empty crisp packet.

Fold over the corner of the packet so that the top of the packet lines up with the side, and make a crease on the diagonal where the fold is. This diagonal is now at 45 degrees.

Hold the crisp packet next to your eye, and look along the diagonal as if it’s a telescope, keeping the bottom of the crisp packet horizontal.

Walk towards the tree until the diagonal of the crisp packet lines up with the top of the tree.

Now, taking large strides of about one metre, count how many strides it is to the base of the tree.

The height of the tree is approximately:

the number of strides + your height.

How does this work? By lining up the folded crisp packet with the top of the tree, you have formed an isosceles triangle: the distance from you to the tree is the same as the height of the tree above your eye-line.
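As a sketch of why the 45-degree fold is so convenient, the function below works for any sighting angle, and at 45 degrees the tangent is 1, so the height above eye level is just the distance walked. The names, the default stride and the example figures are assumptions for illustration.

```python
import math

def tree_height(strides, observer_height_m, stride_length_m=1.0, angle_degrees=45.0):
    """Estimated height: distance to the tree x tan(angle) + the observer's height."""
    distance = strides * stride_length_m
    return distance * math.tan(math.radians(angle_degrees)) + observer_height_m

# Hypothetical example: 20 one-metre strides, observer 1.7 metres tall.
print(round(tree_height(20, 1.7), 1))   # 21.7
```

Strictly it is the height of your eye rather than the top of your head that matters, but at this level of accuracy the difference is lost in the 'ish'.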

CIRCLES AND PI

There are two formulae related to circles that every child is required to learn.

For a circle of radius R:

Circumference = 2πR.

Area = πR².

But what value is π? It depends who you ask.

A mathematician will tell you that pi is the ratio of the circumference of a circle to its diameter, a transcendental number that begins 3.14159… and continues for ever.

An engineer (so the joke goes) will tell you ‘π is about 3, but let’s call it 10 just to be on the safe side’.

Whichever of these views you sympathise with, when it comes to most real-world problem solving, knowing that π ≈ 3 is good enough.4

But when on earth might you need to use it at all?

In the 2004 Olympics in Athens, British athlete Kelly Holmes won gold in the 800 metres. Five days later, she had made it to the final of the 1500 metres, and she was aiming to become the first British athlete to win gold medals in both distances.

Kelly Holmes was a tactical runner, and was prepared to run at a pace that was comfortable for her, even if it meant she spent some of the race at the back of the field. As the final lap began, she was positioned in eighth place. Now she had to get to the front. The problem was that, in overtaking, she would need to be in the second lane and run outside the athletes in front of her. On the straights this would make no difference, but both ends of the track are semicircles, and Holmes therefore had to run round a circle with larger circumference than her competitors. In other words, to win gold, she had to run further than 1,500 metres.

But how much further?

At first glance it would appear that we don’t have enough information. How long is an Olympic race track? What’s the radius of the bends? How wide are the bends? But it turns out that only one of these items of data is important.

Take a look at the sketch of a track. Let’s call the length of the straights L, and the radius of the inside lane at the end R. Remembering that the circumference of a circle is 2πR, the length of a lap is twice the length of the straights plus the circumference of the circle, or 2L + 2πR. But Kelly Holmes had to run around a circle whose radius was larger – by the width of one lane.

How wide is a lane on an athletics track? Just picture it in your mind. Thirty centimetres? No, much wider than that. A couple of metres (the equivalent of an athlete lying across it)? No, less than that. A metre sounds about right.5 So let’s say the radius of Kelly’s circle was R + 1.

We can now work out the length of Kelly’s lap:

2π(R + 1) + 2L

= 2πR + 2π + 2L.

Subtract from it the length of the inside lap and Kelly’s ‘extra’ distance is:

= 2πR + 2π + 2L – 2πR – 2L.

2πR and 2L cancel out to leave us with 2π…, which is 2 × 3.14… let’s call it 6 metres (since everything is an approximation here).

Six metres – that’s a lot. It’s the difference between a gold medal and being an also-ran.
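To convince yourself that the straights and the bend radius really do cancel out, you can try the sum with several different tracks. The track dimensions below are invented for illustration, and the function names are my own.

```python
import math

def lap_length(straight_m, bend_radius_m):
    """Two straights plus two semicircular bends (one full circle)."""
    return 2 * straight_m + 2 * math.pi * bend_radius_m

def extra_per_lap(straight_m, bend_radius_m, lane_width_m=1.0):
    """Extra distance from running one lane further out for a whole lap."""
    outer = lap_length(straight_m, bend_radius_m + lane_width_m)
    inner = lap_length(straight_m, bend_radius_m)
    return outer - inner

# Hypothetical tracks: (length of straight, radius of bend) in metres.
for straight, radius in [(80, 38), (100, 30), (50, 60)]:
    print(round(extra_per_lap(straight, radius), 2))   # 6.28 each time
```

Whatever the shape of the track, the answer depends only on the lane width: 2π metres for every metre further out.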

What’s interesting is that Kelly Holmes must have built this into her tactical calculations for the race: she knew she’d have to run further, but felt it was a price worth paying for enabling her to run at her own pace. And it worked: she won the race with a couple of metres to spare.

And that is how Kelly Holmes became Dame Kelly Holmes.

AREAS AND SQUARE ROOTS

We are often presented with numbers that are in square units, particularly area. A description of an apartment might say that it is ‘1,200 square feet’, while a forest fire might be said to be covering ‘100 square miles’. Picturing square anything is difficult – we tend to find it easier to think in lengths. A hundred square miles is the equivalent of a square with sides that are 10 miles long, and to find the length of the side of that square, we need to be able to work out the square root of the area.

Here’s a real example. In the winter of 2013–14, the South West of England experienced one of its wettest months ever. As a result, an area of low-lying land known as the Somerset Levels was flooded, and the flood waters remained for several weeks. At its peak in January 2014, it was reported that 69 square kilometres had been flooded.

Let’s picture how big this area of flooding would be if it were a square.

If the area of a square is 69 km² then the length of each side is √69, which is a number between 8 and 9 (and nearer to 8). So, the area that was flooded was roughly the same as a square that was 8 km × 8 km, or 5 miles × 5 miles. Now that is something that I can just about picture.

The Somerset Floods story was an example of where it could be handy to be able to work out the square root of a number. Working this out exactly can be messy, but there’s a neat method for making a good estimate.

Suppose you want to work out the square root of 170,423.

Starting from the right-hand side of the number (the units column) and working leftwards, break the number up into pairs of digits, like this:

17  04  23.

Start with the first pair of digits (17) and estimate the square root of that number. Since the square root of 16 is 4, the square root of 17 is going to be 4-and-a-bit. If we’re using Zequals, we just call it 4, but if we want a little more accuracy we can call it 4.1.

Now count how many other pairs of digits there are, and for each pair, multiply the square root of the first number by 10. In this case, there are two other pairs, so we multiply 4.1 × 10 × 10 = 410. So the square root of 170,423 is about 410.

Try another: the square root of 4,138,947.

Split the number into pairs, starting at the right:

4  13  89  47

(notice this time that the opening ‘pair’ is just a single digit: 4).

The square root is therefore, roughly:

2 × 10 × 10 × 10 = 2,000.
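The pairing method can be sketched in a few lines of Python. To mimic the '4-and-a-bit' step, this version rounds the square root of the leading pair to one decimal place; that choice, and the function names, are my own assumptions rather than part of the method as described.

```python
import math

def split_into_pairs(n):
    """Split the whole-number part of n into pairs of digits, working from the right."""
    digits = str(int(n))
    if len(digits) % 2:            # odd number of digits: the leading 'pair' is one digit
        digits = "0" + digits
    return [digits[i:i + 2] for i in range(0, len(digits), 2)]

def rough_sqrt(n):
    """Square root of the leading pair, times 10 for each remaining pair."""
    pairs = split_into_pairs(n)
    lead_root = round(math.sqrt(int(pairs[0])), 1)
    return lead_root * 10 ** (len(pairs) - 1)

for number in (170423, 4138947):
    print(split_into_pairs(number), round(rough_sqrt(number)), round(math.sqrt(number)))
```

Both worked examples land within a couple of per cent of the exact square root.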

TEST YOURSELF

Can you estimate in your head the square roots of the following numbers? If you end up within 5% of the exact answer, give yourself a point. Within 1%, give yourself a hefty pat on the back.

(a)  26

(b)  6,872

(c)  473.86 (hint – ignore the digits after the decimal point!)

(d)  The floor area of a flat is advertised as being ‘910 square feet’. How big would that be if it were a single square room?

(e)  According to Wikipedia, the Caspian Sea is 371,000 km². If it were a square with the same area, would it fit inside the borders of France?

Solutions

WHO WANTS TO BE A MILLIONAIRE? (PART 2)

The year was 2008,6 and this was an episode of Who Wants to be a Millionaire? featuring couples. One couple – let’s call them the Smiths – had made it to £64,000.

The question that would take them to £125,000 was this: ‘Which ocean has an area of 4.7 million square miles?’

(a) Arctic

(b) Atlantic

(c) Indian

(d) Pacific

The Smiths didn’t know the answer, so decided to use their final lifeline, which was ‘Ask the audience’.

About half the audience voted for the Pacific, but the couple had hoped for a more definitive vote – 80% or more – so they decided to play safe and take the money.

The reason I know about this story is that my friend John Haigh, a lecturer at Sussex University and keen back-of-enveloper and Zequaliser, told me the next day that this question had cropped up, and that he had worked out the answer in his head. As his starting point, he made an estimate of the size of the ocean with which he was most familiar: the Atlantic. Can you figure out which was the correct answer? (See here)

METRIC AND IMPERIAL CONVERSIONS

WHO NEEDS IMPERIAL?

Like it or not, we will all continue to need to convert from metric to imperial and vice versa for some time yet. Why? There are two reasons:

(1) The USA: The world’s biggest economy and most influential culture still stubbornly talks and works in feet, yards, pounds and gallons. This rubs off on the rest of the world because there will be references to imperial units not only in engineering technical specifications but also, for example, in popular songs, movies and American cookery books.

(2) Old habits die hard: Commonwealth countries that shifted from imperial to metric still have a legacy of imperial measurements in their language and their thinking. For example, in New Zealand, which went fully metric in 1976, the number of kilometres that a car has travelled is still referred to as its ‘mileage’. But that’s nothing compared with the UK, which is divided down the middle as a nation on which system it uses. Even individuals have split personalities when it comes to measurement. I have met numerous adults who, for example, know their height in feet but their weight in kilos. And this is not just an age thing. I’ve done surveys of hundreds of 15-year-olds across the UK, and the results are consistent:

This widespread use of imperial units is despite the fact teenagers never encounter these units in their school exams, and despite the fact that almost none of them know that there are 14 pounds in a stone! So, in the UK, whichever units of measurement you use, you are going to encounter people who prefer the other units.

THE MARS ORBITER FIASCO

In December 1998, NASA launched a space probe called the Mars Climate Orbiter. Its mission: to study the Martian atmosphere. Several months later, as the orbiter approached the planet, it fired the thrusters that were designed to put it into a stable orbit. But to the horror of the NASA team who were monitoring progress, the rocket thrusters were much too strong and the probe hurtled into the planet and was destroyed.

A NASA review board later discovered that the software designed by the Jet Propulsion Laboratory at NASA had used the metric system in its calculations, but the engineers at Lockheed Martin Astronautics who built the spacecraft had based their calculations on traditional inches and feet (in the report, this was referred to as the ‘English system’, as if this were somehow not the USA’s fault). Instead of applying pound-force, the rockets applied Newton-force, about four times bigger. The cost of this simple error was $125 million of lost space probe.

MILES AND KILOMETRES

We took a huge step forward when we switched from imperial units to metric. Calculations in feet, pounds and gallons were simplified overnight when everything could be worked out in decimal.

In the UK, metrication really kicked in during the early 1970s, when we joined the EEC, in which metric units were already standard. The school curriculum went metric at the same time, which means that maths education of anyone under the age of 50 focused exclusively on metric units. With one, glaring exception: the mile.

Road signs are exclusively in miles, and, by default, every car speedometer is in miles per hour. We therefore have the curious situation where the vast majority of teenagers will quote long distances in miles, but short distances in metres; and they will quote faster speeds in miles per hour, but slower, more human speeds in metres per second. Confusing? Well no, not really. So long as they aren’t having to switch between the two, most people are comfortable enough working in whichever imperial or metric unit they are used to in the context they are working in.

The problems start when there is a need to convert from metric to imperial. I asked a large group of 15-year-olds to estimate the distance from London to New York. Their answers varied considerably, but most of them were in the not-completely-outrageous range between 1,000 and 10,000 miles (the correct figure is around 3,500 miles as the crow – or at least the Boeing 787 Dreamliner – flies).

The problems arose when they were asked to then quote that number in kilometres. There was relatively low awareness of what the relationship is between a mile and a kilometre, other than that they are both ‘quite long’. In many cases, the mileage figure was just multiplied by 10 to get a figure in ‘kilometres’. Perhaps this mistake arises because the shorthand for both metres and miles is ‘m’. There’s a vague awareness that an ‘m’ is related to a ‘km’, so why not multiply by 1,000… no wait, that sounds too high… why not multiply by 10 instead?

The actual ratio of kilometres to miles is 1.6. (More precisely, it is 1.609…)

How To Remember Your Conversions

These three mnemonics appeared on the back of Kellogg’s cornflakes packets in the 1970s, and those of a certain generation have never forgotten them:

A litre of water’s

a pint and three-quarters

Two and a quarter pounds of jam

weigh about a kilogram

A metre measures three foot three,

it’s longer than a yard, you see.

BACK-OF-ENVELOPE CONVERSIONS

If it’s too much of a faff to use the accurate ratios above, you can use a Zequals-style approach to give you conversions that will suffice in most situations. And conveniently, all of the common rough-and-ready conversions only require doubling or halving.

Conversion                                 Accurate conversion     Rough conversion     Example
Litres to pints                            × 7/4 (or 1.75)         Double               10 litres ~ 20 pints
Litres to (UK) gallons                     × 7/32                  Quarter              20 litres ~ 5 gallons
Kilometres to miles                        × 5/8                   Halve                200 km ~ 100 miles
Metres per second to miles per hour        × 2¼                    Double               10 m/s ~ 20 mph
Centimetres to inches                      × 2/5                   Halve                6 cm ~ 3 inches
Metres to yards                            × 13/12 (add 1/12)      Equal                70 metres ~ 70 yards
Kilograms to pounds                        × 2¼                    Double               10 kg ~ 20 pounds
Celsius to Fahrenheit                      × 9/5 and add 32        Double and add 30    20 °C ~ 70 °F
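It is worth knowing how rough the rough conversions are. The sketch below sets each doubling-or-halving short cut against the 'accurate' fraction from the table and reports the percentage error; the dictionary and function names are invented for illustration.

```python
from fractions import Fraction

# name: (accurate multiplier from the table, rough multiplier)
CONVERSIONS = {
    "litres to pints": (Fraction(7, 4), 2),
    "litres to UK gallons": (Fraction(7, 32), Fraction(1, 4)),
    "kilometres to miles": (Fraction(5, 8), Fraction(1, 2)),
    "metres per second to miles per hour": (Fraction(9, 4), 2),
    "centimetres to inches": (Fraction(2, 5), Fraction(1, 2)),
    "metres to yards": (Fraction(13, 12), 1),
    "kilograms to pounds": (Fraction(9, 4), 2),
}

def rough_error_percent(name):
    """How far the rough multiplier is from the accurate one, as a percentage."""
    accurate, rough = CONVERSIONS[name]
    return float((rough - accurate) / accurate * 100)

def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

def rough_celsius_to_fahrenheit(celsius):
    return 2 * celsius + 30

for name in CONVERSIONS:
    print(name, round(rough_error_percent(name), 1))
print(celsius_to_fahrenheit(20), rough_celsius_to_fahrenheit(20))   # 68.0 70
```

The errors run from about 8% to 25%, which is fine for a quick feel for a quantity but a reminder to reach for the accurate ratio when the answer matters.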

There are of course other more obscure imperial measurements that you might encounter, such as acres (land), furlongs (in horse racing) and fluid ounces (cooking), but these rarely crop up in everyday encounters and you’re unlikely to have to deal with converting them on the hoof.

TEST YOURSELF

Do rough conversions of the following in your head:

(a)  70 miles in kilometres.

(b)  40 kilograms in pounds.

(c)  150 metres in yards.

(d)  100 kilometres in miles.

(e)  25 °C in Fahrenheit.

(f)  10 stones in kilograms (one stone is 14 pounds).

Solutions

A QUIRKY METHOD FOR MILE–KILOMETRE CONVERSION

About 800 years ago, a mathematician called Leonardo of Pisa (who was nicknamed Fibonacci) wrote about a curious sequence of numbers. Starting with 0 and 1, the sequence goes as follows:

0 1 1 2 3 5 8 13 21 34 55

Each number in the sequence is obtained by adding the previous two terms. So, after 55, the next term will be 34 + 55 = 89.

Now here is the remarkable thing. From the number 3 onwards, if you take any two consecutive terms in the Fibonacci sequence, their ratio is very close to 1.6. For example, 13 ÷ 8 = 1.625, and 34 ÷ 21 = 1.619. This isn’t just a fluke; it turns out that as you go further along the sequence, the ratio of successive terms in the Fibonacci sequence gets closer and closer to a number known as the ‘Golden Ratio’, which is roughly 1.618.

The coincidence is that the Golden Ratio is very close to 1.609, which is the number of kilometres in a mile. So if you want to convert 13 miles to kilometres, then, just by glancing at the Fibonacci sequence, you can estimate that the answer is going to be about 21 km, and you’ll be correct to within 1%.

It works in reverse, too. Travelling around Europe, you spot that your destination is 34 kilometres. ‘That’s 21 miles,’ you can state, with remarkable accuracy.
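The Fibonacci trick can be sketched as a simple look-up: step one term forward for miles to kilometres, one term back for the reverse. The function names are hypothetical, and this version only handles distances that are themselves Fibonacci numbers from 3 upwards.

```python
def fibonacci_up_to(limit):
    """Fibonacci terms starting 0, 1, ... that do not exceed limit."""
    seq = [0, 1]
    while seq[-1] + seq[-2] <= limit:
        seq.append(seq[-1] + seq[-2])
    return seq

def miles_to_km(miles):
    """Next Fibonacci term after `miles` (miles must be a Fibonacci number >= 3)."""
    seq = fibonacci_up_to(2 * max(miles, 3))
    if miles < 3 or miles not in seq:
        raise ValueError("expected a Fibonacci number of 3 or more")
    return seq[seq.index(miles) + 1]

def km_to_miles(km):
    """Previous Fibonacci term before `km` (km must be a Fibonacci number >= 5)."""
    seq = fibonacci_up_to(km)
    if km < 5 or km not in seq:
        raise ValueError("expected a Fibonacci number of 5 or more")
    return seq[seq.index(km) - 1]

print(miles_to_km(13), round(13 * 1.609, 1))   # 21 beside the calculator figure
print(km_to_miles(34), round(34 / 1.609, 1))   # 21 beside the calculator figure
```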

ESTIMATION AND STATISTICS

AVERAGES AND UNCERTAINTY

The word ‘average’ is used in everyday speak to mean ‘typical’ or ‘somebody in the middle’. In many situations it’s fine to use this general word, but it’s worth being reminded that there are three different averages that are commonly used.

The mean is the most commonly used average. It’s found by adding up all the values or measurements, and dividing by the number of items you are measuring. The mean is what’s used when referring to average adult height, batting averages in cricket and also people’s income.

The median is the middle value, if you were to line up all the data from smallest to largest.

The mode is the data value that crops up the most often. For example, the ‘modal’ shoe size for an adult woman in the UK is 6.

We’ve seen earlier that most statistics have an element of uncertainty, so that statistics that you are presented with might be an over- or under-statement of the true figure.

The cause of this ‘error’ will be one of two things: either the method you use to measure the statistic isn’t reliable (a weighing scale that gives different readings each time, for example), or the thing that you are measuring tends to vary (for example, if you are looking to find the height of a typical person).

Either way, the ‘true’ figure is going to be somewhere on a spread of possible values. Most often this spread (more formally known as a distribution) will look something like this:

This shape is known as the Normal distribution (so called for the banal reason that it’s not abnormal), though it’s often called a bell curve (because it is shaped like a bell). Points in the central, higher region of the curve represent values or readings that arise most frequently, while the lower regions left and right are the more extreme and less frequent values. The heights of children in a class, the time it takes for daffodils to bloom and many other everyday phenomena follow this sort of pattern. It’s handy that this spread is symmetrical, because it means the average (mean) value is right in the middle. In distributions like this, it’s just as likely that a statistic or measurement will be higher than the quoted figure as lower, so it’s fine to call the highest point the ‘average’.

However, not all statistics follow this pattern. For example, if you were monitoring the amount of time that people spend in a toilet cubicle at a rock concert,7 the distribution would look something like this:

Most people might spend two or three minutes, but a few take 10 minutes or more, and one or two will exceed 20 minutes. This distribution is known as a ‘log normal’ (cue lavatorial jokes). Looking at this graph, you’d say that the typical time spent is two to three minutes, but the ‘mean’ (the total time spent divided by the number of people) is going to be higher than that because of the few extremes.

Adult income, similarly, has a skewed distribution, with many people clustered in the range £20,000–£30,000, but with a long tail to the right that includes a few on multi-million salaries. So, while the middle (median) income is around £25,000, the ‘mean’ income will be to the right of the peak. The majority of people earn less than the average (mean) income – which can make the choice of which average to quote very political.
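To see how a few extreme values drag the mean away from the 'typical' figure, here is a small Python sketch. The ten cubicle times are invented purely for illustration (they are not real survey data), and the variable names are my own.

```python
from statistics import mean, median

# Hypothetical times in minutes, made up for illustration only:
# most visits are short, but two are very long.
times = [2, 2, 3, 3, 3, 3, 4, 4, 10, 26]

typical = median(times)   # the middle value when the times are sorted
average = mean(times)     # total time divided by number of people

# How many people were quicker than the mean?
below_mean = sum(t < average for t in times)

print(f"median = {typical} minutes, mean = {average} minutes")
print(f"{below_mean} out of {len(times)} people were below the mean")
```

With a long tail to the right, most of the sample sits below the mean, which is the same effect that the income figures show.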

There are two other distributions that are also quite common. If you take a random sample of tweets on Twitter, the number of ‘likes’ for individual tweets will look like this:

The most common number of likes is zero, followed by 1, then 2, each becoming less likely. This is known as an exponential distribution.

Finally, if you were to pick a more economically developed country and gather together everyone between the age of 20 and 50, the distribution of their ages will typically look like this:

Yes, there will be bumps from year to year, and there might be a slight trend going up or down, but broadly speaking it’s a flat line. If you pick somebody at random from this group, the chance of the person you choose being 23 will be about the same as the chance of them being 33.

You’ll get a similar pattern with, for example, the time you expect to wait for a London Underground train if you turn up at a station at random. Trains might be spaced every couple of minutes, and you are as likely to arrive on the platform at the beginning of the two-minute gap (just after a train left) as you are to arrive at the end of the gap (as a train is pulling in). And, of course, you could arrive anywhere in between.

Knowing which of these distributions a particular statistic belongs to can be helpful in making estimates.

WORKING OUT PROBABILITY

‘Probability’ is the formal way of saying ‘the chance of something happening’. Probabilities range from absolutely certain (i.e. it will happen 100% of the time, such as the sun rising tomorrow) to impossible (0% of the time), and can be anything in between. The chance of flipping a head on a fair coin, for example, is 50%, while the chance of picking a Heart from a regular pack of cards is 25%, since a pack is divided equally into four suits.

Although percentages are the most common way of expressing everyday probabilities, there are other ways of saying the same thing. The 25% chance of picking a Heart can equally be written as a fraction (1/4), as a decimal (0.25) or as ‘1 in 4’.

But some probabilities can’t just be stated by simple inspection. Sometimes, to find (or estimate) a probability, you need to take a different approach. For example:

The more vague the probability that you are using in your calculations, the less reliable your estimates based on it will be. So, while you can work out precisely the odds on you getting a ‘flush’ of five cards in poker, you can only estimate very approximately the chance that your team will win a pub quiz.

BACK-OF-ENVELOPE SURVEYS

Many of the statistics that are fed to us are the result of surveys. If we’re told that 7% of the public plan to vote for the Green Party at the next election, or that 65% of children spend over three hours a day at home in front of screens, those figures aren’t based on a census of the whole population, but on a sample of perhaps 1,000 people, chosen carefully to be ‘representative’ of the population by age, gender, social background and so on. Thanks to the large sample, the pollsters and market researchers can give us a figure that they know is quite reliable.

But surveys don’t have to be restricted to the professionals. You can do your own back-of-envelope versions. The experts would pull their hair out if you conducted a sample based on, say, 10 people, but even a tiny sample can begin to give you a sense of the bigger picture.

Several years ago, I was at a committee meeting where concern was expressed at the falling membership of a national body that I was working with. There was a proposal to hire some consultants to help draft a survey that could be sent out to all of the several thousand members, canvassing their views.

I had an alternative suggestion. We needed information quickly. Why didn’t the 10 of us around the table each take a couple of copies of a short questionnaire, and the next time we met a potential member over the coming week or two, ask them to fill it in? I reckoned that even a handful of responses would give us a huge pointer to what the big issues were. I was voted down, but I decided to do my guerrilla research anyway. Out of five people that I talked to, two said that the reason they were not members was that they had no time to enjoy the benefits, and three said they had once been members but now met their needs using free alternatives that were available via social media.

With that tiny sample, it would be bogus to claim with confidence that ‘40% of the target group no longer have time to enjoy the benefits of membership’ or that ‘60% now get their resources elsewhere’. But, in truth, we didn’t need to know the results to the nearest 10%. If the result of the mini survey had been 90% and 30%, we would still have come to the same conclusion, which was that new threats had emerged, from time pressure and from social media, and that these had to be addressed. (That national body never did get round to its big survey – but it did make some positive changes, including engaging in social media.)

Back-of-envelope surveys are nothing special. We all do them, all the time. We ask a few friends which plumber they’d recommend, or what the going rate is for the tooth fairy when a child loses a tooth. A survey of three people will produce a result with a huge margin of error. And yet, in the end, we can still glean valuable insights, and make better decisions, if our survey tells us that 67% of the public (OK, in truth it’s two out of three of our friends) say that in their house the tooth fairy pays £1.

Of course, you should do big, statistically rigorous and representative surveys when you can, and when accuracy is important. But when you don’t have the time or the money to do it, don’t rule out the value of un-rigorous, biased, back-of-the-envelope alternatives – as long as you remember not to set too much store by the numbers in the results.

HOW LONG WILL WE BE IN THIS QUEUE?

It’s October half-term. I’m at Legoland theme park. Again. The kids are desperate to go on the ‘Pirate Falls’ ride. My heart sinks when I see the sign saying the wait is currently one hour. That will be one hour shuffling slowly along the queue, with the prospect of a five-minute boat ride and a soaking at the far end.

Luckily, the queue gives good views of people setting off on the ride, so this is a chance to check out if Legoland’s prediction of one hour is right. If it’s really going to take that long, we’ll go and find something else to do.

We decide to do a survey. We watch the boats setting off at the start of the ride, and over a period of five minutes we count how many people have gone past. Some boats have four people in them (Hooray, that will deplete the queue!), a couple have none (Boo! What a waste of a good boat).

Over five minutes we count 36 people, so we work out that the average throughput is about seven people per minute. Then we estimate the length of the queue – about 150 people.

One hundred and fifty people at seven people per minute – 150 divided by 7: that’s about 20. Clearly Legoland’s prediction of one hour is massively wrong; it will be more like 20 minutes. This cheers me up no end, though it turns out that there’s a factor I hadn’t allowed for: Q-Bots, which allow premium customers to go straight to the front of the queue through a separate entrance. It turns out that Q-Bots slow the queue down by about 25%, so it’s nearer 25 minutes before we get to the front of the queue.

Still, back-of-envelope thinking ensured we were better informed than those who trusted the waiting-time sign, and it also helped occupy us for a few minutes. I was happy. Until we got to the end of the ride, when we left the boat looking like drowned rats.
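The queue calculation can be written as a few lines of Python. This is only a sketch: the function name and its parameters are my own, and treating the Q-Bot effect as a simple 25% increase in waiting time is an assumption about what 'slowing the queue down by 25%' means.

```python
def wait_minutes(queue_length, people_counted, minutes_observed, slowdown=0.0):
    """Estimate the wait from a short count of people leaving the queue.

    slowdown is the assumed fractional increase in waiting time caused by
    queue-jumpers (0.25 means the wait is 25% longer).
    """
    throughput = people_counted / minutes_observed   # people per minute
    return queue_length / throughput * (1 + slowdown)

# Figures from the Pirate Falls queue: 36 people in 5 minutes, about 150 queuing.
basic = wait_minutes(150, 36, 5)
with_qbots = wait_minutes(150, 36, 5, slowdown=0.25)

print(f"Without queue-jumpers: about {basic:.0f} minutes")
print(f"With queue-jumpers:    about {with_qbots:.0f} minutes")
```

The code uses the unrounded throughput of 7.2 people per minute rather than 'about seven', so its answers differ by a minute or so from the mental arithmetic. Either way, both are well short of the hour on the sign.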

THE CHANCE OF MULTIPLE EVENTS

You can work out the probability of two or more independent events happening by multiplying their probabilities together.

It’s often convenient to work out the probability of more than one event happening by using fractions – indeed, this is perhaps the most important application of the methods that you learned at school for adding and multiplying fractions. For example, the chance of getting two sixes when rolling two dice is worked out as:

1/6 × 1/6 = 1/36

That’s a lot easier than working out 16.7% × 16.7%.

The events are independent if one of the events is in no way influenced by the other: rolling a dice and flipping a coin are independent events, but living in Wales and having the surname Jones are not. One in 20 people in the UK live in Wales, and around 1 in 100 in the UK have the surname Jones – but the chance that the next winner of the UK Lottery is a person who lives in Wales and has the surname Jones is not 1/20 × 1/100 (= 1 in 2,000). The proportion of Welsh residents called Jones is around 1 in 17,8 so the chance of a Lottery winner being Welsh and called Jones is going to be roughly:

1/20 × 1/17 = 1/340

Knowing whether or not two events are independent is partly common sense and partly experience, but for back-of-envelope purposes, as a starting point you can generally treat two events that don’t have an obvious connection as being independent.

For example, suppose you’re running a bit late, and need to catch a bus to the station where you want to catch a train. Suppose about four-fifths (80%) of trains leave your local station on time, and your hunch tells you that the chance you’ll have to wait more than five minutes for a bus (and hence be late for the train) is about one in two (50%). Now, the bus being late and the train being late aren’t entirely independent events: bad weather would affect both of them, for example. But that connection probably isn’t significant. The chance that you’ll have to wait more than five minutes and that the train will be on time (in other words, the chance that you miss it) is therefore going to be about 4/5 × 1/2 which is 2/5, or 40%.
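If you want to check this sort of fraction arithmetic, Python's built-in `fractions` module keeps everything exact. The sketch below reworks the dice, the Welsh Joneses and the bus-and-train examples using the figures quoted above. The variable names are my own, and the bus and train are treated as independent, which is only an approximation.

```python
from fractions import Fraction

# Two dice: independent events, so multiply.
p_six = Fraction(1, 6)
p_two_sixes = p_six * p_six

# Wales and Jones: NOT independent, so use the proportion of
# Welsh residents called Jones rather than the UK-wide proportion.
p_wales = Fraction(1, 20)
p_jones_in_uk = Fraction(1, 100)
p_jones_if_welsh = Fraction(1, 17)

naive = p_wales * p_jones_in_uk       # wrongly assumes independence
better = p_wales * p_jones_if_welsh   # allows for the connection

# Bus and train: assumed independent for back-of-envelope purposes.
p_train_on_time = Fraction(4, 5)
p_long_bus_wait = Fraction(1, 2)
p_miss_train = p_train_on_time * p_long_bus_wait

print(p_two_sixes)    # 1/36
print(naive, better)  # 1/2000 1/340
print(p_miss_train)   # 2/5
```

Allowing for the connection between Wales and Jones makes the event about six times more likely than the naive multiplication suggests.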

SPOTTING TRENDS

Much of our modern life is underpinned by statistics. They dominate our news, and we use them to form opinions, make judgements and, most importantly, make decisions. It’s a statistician’s job to wade through data and to spot important patterns and connections. And in an era where ‘big data’ is used to help advertisers, political parties and other institutions to understand our behaviour in frightening detail so that they can understand and perhaps influence us, it should come as little surprise that the best statisticians can earn huge salaries.

The maths involved in statistics can get quite advanced. If you’ve got a set of data (such as the fictitious points in the graph below) and are looking for the straight line that most tightly passes through it (the so-called ‘best fit’), there are sophisticated mathematical techniques you can use to find it.9 But in many cases, the human eye is good enough. A bit like in a spot-the-ball competition, I’ve used my judgement, and a hunch, to draw a straight line through the points below. It suggests there’s a slight trend upwards. Your line might be different, but it’s unlikely to be that different.

Often, if you’re looking to forecast what’s going to happen in the near future, a straight-line extrapolation from the past is a good starting point. Here, for example, is a chart showing the proportion of all grocery spending that was done online from 2013 (2.1%) to 2017 (4.8%). Care to guess what happened in 2018?10

The online market was clearly growing each year, though not by a fixed amount. The annual growth was just 0.4 percentage points in 2014, and for the next three years it was in the range 0.6 to 0.8 points. A guess of 0.7 points of growth seems sensible for 2017–18. Of course it might be as high as 1 point or as low as 0.3, or it might do something cataclysmic. We can’t be sure, but like steering an oil tanker, statistics that are heading steadily in one direction can take a lot of effort to shift to a different path. So a prediction of 5.5% (ish) for 2018 is a relatively safe bet – and as it happens, 5.5% is exactly what the online share was that year. But I’ve been a bit lucky here – informed guesses aren’t always as accurate as this.
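A cruder version of the same forecast uses nothing but the two end points quoted above. Note that this sketch is a swapped-in shortcut rather than the reasoning just described: it takes the average growth over all four years instead of the growth in the most recent ones, and the variable names are my own.

```python
# Online share of grocery spending (%), from the figures quoted in the text.
share_2013 = 2.1
share_2017 = 4.8

# Average growth in percentage points per year over the four years.
years = 2017 - 2013
average_growth = (share_2017 - share_2013) / years

# Straight-line extrapolation one year ahead.
forecast_2018 = share_2017 + average_growth

print(f"Average growth: {average_growth:.2f} points per year")
print(f"Forecast for 2018: {forecast_2018:.1f}%")
```

The early, slower year pulls the average growth slightly below 0.7 points, but to one decimal place the forecast comes out the same.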

The further into the future you are trying to forecast, the more risky it is to extrapolate a trend from the data you have. And be careful: while the grocery purchasing habits of the USA are clearly revealing a changing pattern of consumer behaviour, it’s possible for an ‘upward trend’ to happen purely at random. If I repeatedly toss 10 coins and get four heads the first time, five heads the second time and six heads the third, it might look like an upward trend, but the stats say that the most likely outcome next time is going to be five.
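The claim about the coins can be checked directly. The sketch below works out the chance of each possible number of heads when 10 fair coins are tossed, using the standard binomial count of combinations. The variable names are my own.

```python
from math import comb

tosses = 10

# Chance of each number of heads: ways of choosing which coins land heads,
# divided by the 2**10 equally likely outcomes.
probabilities = {
    heads: comb(tosses, heads) / 2**tosses
    for heads in range(tosses + 1)
}

most_likely = max(probabilities, key=probabilities.get)

print(f"Most likely number of heads: {most_likely}")
print(f"Chance of exactly {most_likely} heads: {probabilities[most_likely]:.1%}")
```

Whatever the previous tosses produced, these probabilities are the same every time, so a run of four, five and six heads tells you nothing about the next result.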

Finally, buried within a long-term trend, there might well be short periods where the data moves in the opposite direction. Some climate-change deniers liked to cherry-pick the period 2001–13 to ‘prove’ that global warming had stopped. Over that time-frame, the average global temperatures moved up and down with no obvious trend. Zoom out to look at the data over a century, however, and the evidence of an upward trend is compelling. That in itself is not proof, of course. But most scientists prefer to look at the long-term statistics, not short-term blips.

PREMIER LEAGUE GOALS: CERTAINTY WITHIN UNCERTAINTY

Here’s a prediction. Next season, 1,000 goals will be scored in the Premier League.

OK, it will probably be a handful more than that, but that figure is probably right to within 5%, which is astonishingly accurate by back-of-envelope standards.

How come this prediction is such a conveniently round number? Mainly it’s a coincidence, but our confidence in it comes from history. If you look at all the seasons since 1995/6 (when the league settled at 20 teams), the highest number of goals scored was 1,066 in 2011/12, and the lowest was 931 in 2006/7. In six of the nine seasons between 2009 and 2018, the total number of goals was in the range 1,052 to 1,066.

In total there are 380 Premier League matches played in a season, with an average of about 2.6 goals per match. This means that were the league for some reason to change the number of teams up or down, we could still make a decent estimate of how many goals there would be. Increase the league from 20 to 22 teams, and we’d now have 462 matches. That’s roughly a 20% increase in the number of games, so we’d expect about 1,200 goals.

If we increase a league to 24 teams, there are 552 matches; that’s almost a 50% increase on the Premier League, so we might expect, say 1,450 goals. As it happens, the second division, known as The Championship, does have 24 teams. And sure enough, the average number of goals per season is roughly 1,450.
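The match counts above follow from every team playing every other team twice, home and away, which gives n × (n − 1) matches for n teams. Here is a sketch of the whole estimate. The function names are my own, and it assumes that the figure of about 2.6 goals per match would carry over to a league of a different size.

```python
GOALS_PER_MATCH = 2.6  # approximate average quoted in the text

def matches_in_season(teams):
    """Each team hosts every other team once: n * (n - 1) matches."""
    return teams * (teams - 1)

def estimated_goals(teams, goals_per_match=GOALS_PER_MATCH):
    """Back-of-envelope season total: matches times average goals per match."""
    return matches_in_season(teams) * goals_per_match

for teams in (20, 22, 24):
    print(teams, matches_in_season(teams), round(estimated_goals(teams)))
```

For 20, 22 and 24 teams this gives 380, 462 and 552 matches, and goal totals within a whisker of the rounded estimates above.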

It might seem remarkable that a game that is popular for its drama and unpredictability can be so surprisingly predictable when you look at the big picture – but that’s not unusual in statistics.