What do all these numbers mean? ‘“Worrying” Jobless Rise Needs Urgent Action – Labour’ was the BBC headline. It explained the problem in its own words: ‘The number of people out of work rose by 38,000 to 2.49 million in the three months to June, official figures show.’
There are dozens of different ways to quantify the jobs market, and I’m not going to summarise them all here. The claimant count and the labour force survey are commonly used, and the number of hours worked is informative too: you can fight among yourselves over which is best, and get distracted by party politics to your hearts’ content. But in claiming that this figure for the number of people out of work has risen, the BBC is simply wrong.
Here’s why. The ‘Labour Market’ figures come through the Office for National Statistics, and it has published the latest numbers in a PDF document. In the top table, fourth row, you will find the figures the BBC is citing. Unemployment aged sixteen and above is at 2,494,000, and has risen by 38,000 over the past quarter (and by 32,000 over the past year). But you will also see some other figures, after the symbol ‘±’, in a column marked ‘sampling variability of change’.
Those figures are called ‘95 per cent confidence intervals’, and these are among the most useful inventions of modern life.
We can’t do a full census of everyone in the population every time we want some data, because that would be too expensive and time-consuming for monthly data collection. Instead, we take what we hope is a representative sample.
This can fail in two interesting ways. Firstly, you’ll be familiar with the idea that a sample can be systematically unrepresentative: if you want to know about the health of the population as a whole, but you survey people in a GP waiting room, then you’re an idiot.
But a sample can also be unrepresentative simply by chance, through something called sampling error. This is not caused by idiocy. Imagine a large bubblegum-vending machine, containing thousands of blue and yellow bubblegum balls. You know that exactly 40 per cent of those balls are yellow. When you take a sample of a hundred balls, you might get forty yellow ones, but in fact, as you intuitively know already, sometimes you will get thirty-two, sometimes forty-eight, or thirty-seven, or forty-three, or whatever. This is sampling error.
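The bubblegum machine is easy to simulate. What follows is a minimal sketch in Python, assuming a machine of 10,000 balls (the text says only ‘thousands’) and a hypothetical helper called `count_yellow`; the exact counts you see depend on the random seed.

```python
import random

# Assumed machine size: the text only says "thousands" of balls.
MACHINE = ["yellow"] * 4_000 + ["blue"] * 6_000  # exactly 40% yellow


def count_yellow(rng, sample_size=100):
    """Draw sample_size balls without replacement and count the yellow ones."""
    return rng.sample(MACHINE, sample_size).count("yellow")


rng = random.Random(2011)  # arbitrary seed, so the run is repeatable
counts = [count_yellow(rng) for _ in range(10)]
print(counts)  # ten samples of a hundred; the counts scatter around forty
```

Nothing about the machine changes between draws, yet the count moves from sample to sample. That movement is the sampling error.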
Now, normally, you’re at the other end of the telescope. You take your sample of a hundred balls, but you don’t know the true proportion of yellow balls in the machine – you’re trying to estimate that – so you calculate a 95 per cent confidence interval around whatever proportion of yellow you get in your sample, using a formula (if forty of your hundred balls are yellow, the margin is 1.96 × the square root of ((0.6 × 0.4) ÷ 100)).
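The formula in that paragraph can be written out as a few lines of Python. This is a sketch of the standard normal-approximation interval for a proportion, which is what the formula in the text describes; the function name `confidence_interval_95` is my own label, not anything official.

```python
import math


def confidence_interval_95(yellow, sample_size):
    """Normal-approximation 95% interval for a sample proportion."""
    p = yellow / sample_size
    # 1.96 x the square root of (p x (1 - p) / n), as in the text
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p - margin, p + margin


low, high = confidence_interval_95(40, 100)
print(f"{low:.3f} to {high:.3f}")  # 0.304 to 0.496
```

So forty yellow balls out of a hundred gives an interval running from roughly 30 to 50 per cent yellow.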
What does this mean? Strictly (it still makes my head hurt), it means that if you repeatedly took samples of a hundred, then on 95 per cent of those attempts, the true proportion in the bubblegum jar would lie somewhere between the upper and lower limits of the 95 per cent confidence intervals of your samples. That’s all we can say.
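If that definition makes your head hurt too, you can check it by brute force. The sketch below assumes a true proportion of 40 per cent yellow and, for simplicity, draws each ball independently (a fair approximation when the machine holds thousands of balls); the function names are hypothetical.

```python
import math
import random

TRUE_PROPORTION = 0.4  # known to the simulation, unknown to the sampler


def interval_covers_truth(rng, sample_size=100):
    """Take one sample, build its 95% interval, report whether it contains the truth."""
    yellow = sum(rng.random() < TRUE_PROPORTION for _ in range(sample_size))
    p = yellow / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p - margin <= TRUE_PROPORTION <= p + margin


def coverage(trials, seed=0):
    """Fraction of repeated samples whose interval contains the true proportion."""
    rng = random.Random(seed)
    hits = sum(interval_covers_truth(rng) for _ in range(trials))
    return hits / trials


print(coverage(5_000))  # should land close to 0.95
```

Each individual interval either contains the true value or it doesn’t. The ‘95 per cent’ describes how often the procedure succeeds over many repeats.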
So, if we look at these employment figures, you can see that the changes reported are clearly not statistically significant: the estimated change over the past quarter is 38,000, but the 95 per cent confidence interval is ±87,000, running from –49,000 to 125,000. That wide range clearly includes zero, which means it’s perfectly likely that there’s been no change at all. The annual change is 32,000, but again, that’s ±111,000.
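The arithmetic is simple enough to check for yourself. This sketch uses only the estimates and ± figures quoted above; the helper names are mine.

```python
def interval_for_change(estimate, margin):
    """Turn 'estimate ± margin' into a (low, high) pair."""
    return estimate - margin, estimate + margin


def includes_zero(interval):
    """True if 'no change at all' sits inside the interval."""
    low, high = interval
    return low <= 0 <= high


quarterly = interval_for_change(38_000, 87_000)
annual = interval_for_change(32_000, 111_000)

for label, interval in (("quarterly", quarterly), ("annual", annual)):
    print(label, interval, "includes zero:", includes_zero(interval))
```

Both intervals straddle zero: the quarterly one runs from –49,000 to 125,000, and the annual one from –79,000 to 143,000.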
I don’t know what’s happening to the economy – it’s probably not great. But these specific numbers are being over-interpreted, and an equally important problem arises from that, one which frankly matters more in the long run for meaningful political engagement.
We are barraged, every day, with a vast quantity of numerical data, presented with absolute certainty and fetishistic precision. In reality, many of these numbers amount to nothing more than statistical noise, the gentle static fuzz of random variation and sampling error, making figures drift up and down, following no pattern at all, like the changing roll of a dice. This, I confidently predict, will never change.