Chapter 9

Running the numbers

How many services do you, as an entire organisation or central government, offer to users or to citizens and businesses? Take a guess. Fifty? A couple of hundred? Several thousand? You probably don’t know. That’s forgivable. No one else in your organisation knows either.

As the digital team is starting to spread its wings and build up a reputation for delivery far beyond its four walls, thoughts will turn to the bigger, thornier service design challenges it can get its teeth into. It is time to look at fixing some of the bigger transactions on the books, the brownfield services your organisation or government has long laboured over, where there are unimpressed users and savings to be made. To make a start on that, you need to know where to prioritise.

Finding out which of your organisation’s services are the most used can be a surprisingly difficult question to answer. For a government, even if you take out the multitude of services which are offered at a local or state level, there are still many separate public bodies to explore. Systematically working out which services are the most broken from a user’s perspective can be even harder, short of relying on anecdotes and horror stories. Gathering basic data on services should be one of the team’s first deep forays into the numbers behind digital transformation.

Exploring transactions

In the months leading up to writing your digital strategy you may sense that nobody is quite clear on exactly how big a challenge transforming your organisation will be, at least not in cold, hard numbers. The GDS committed itself to making all new or redesigned government services handling more than 100,000 transactions a year ‘digital by default’. That sounds like a lofty ambition, but how difficult is it exactly? How many services does central government offer that are that big? Nobody knew. This was mostly because nobody in a central position had bothered to ask. Delivery was not something the centre had concerned itself with before; if departments wanted to collect the data, that was their lookout. Creating a full service catalogue – essentially, a very big list – was the first step towards putting in place the data that enables a digital team to plan for a successful digital transformation at scale.

The GDS’s version of this catalogue was called the Transactions Explorer. In its earliest versions, the Explorer was not a particularly sophisticated thing. It started life as a spreadsheet with three columns: the name of a service, the department that ran it, and how many transactions it handled each year. The information was gathered via requests from each individual department, sent out once per quarter. The collated version was then published for the world to see. Over time, the Explorer would add detail for other indicators, like cost per transaction, and evolve into a published, real-time performance dashboard for government services.
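To make the shape of that data concrete, here is a minimal sketch of what such a catalogue might look like if you were building one today. It is an illustration only, not the GDS’s actual tooling; the field names and the two example rows are invented.

    # A minimal sketch of a service catalogue: one record per service,
    # collated from departmental returns and published as a single file.
    # Field names and example rows are invented for illustration.
    import csv
    from dataclasses import dataclass

    @dataclass
    class ServiceRecord:
        name: str                    # what the service is called
        department: str              # who runs it
        transactions_per_year: int   # volume reported for the latest period

    catalogue = [
        ServiceRecord("Example service A", "Department X", 1_200_000),
        ServiceRecord("Example service B", "Department Y", 45_000),
    ]

    # Publish the collated list so anyone can inspect it.
    with open("transactions-explorer.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["service", "department", "transactions_per_year"])
        for record in catalogue:
            writer.writerow([record.name, record.department,
                             record.transactions_per_year])

The point is not the technology, which could just as easily be a spreadsheet, but the discipline of collecting the same three facts about every service in one place.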

While much of this data will be familiar to operational staff, many people operating at the policy or strategy level of an organisation – those making decisions that directly affect users and the frontline staff who actually see them – are rarely confronted with this basic service information. This is one of the effects of splitting organisations into separate silos rather than mixing them into multidisciplinary teams. For staff operating at a policy level, frontline delivery metrics are someone else’s problem to worry about.

Numbers without behaviour distortion

Admitting there are imperfections in any quantitative measurement is generally a wise position to adopt in government. Government data tends to be treated with hushed respect, as if it were unimpeachably accurate. The standards that most democratic governments set themselves in terms of data quality are rightly high, and nation states generate more reliable numbers than most other sources. That does not mean they offer perfection. Even now, at the beginning of almost every official data-gathering exercise is a fallible human being with a spreadsheet to fill in.

Taking a qualified view of your organisation’s numbers leads to another important change in the team’s attitude towards data: trust the trend line more than the exact number. While the numbers are probably not too bad, there’s no guarantee you can rely on them for making fair comparisons. However many guidelines, definitions and demands you set, for as long as data is gathered by humans in bureaucracies, different departments will report on the same numbers in different ways.

One thing you should avoid doing with your service catalogue, therefore, is picking winners and losers, or creating league tables comparing different services. The data won’t be reliable enough, nor the services similar enough, to do this fairly. Only the most blatantly broken or brilliant services will stand out from their peers, and you probably know those already. Rather than forcing comparisons between different things, which might give a false picture, pick the most likely point of consistency – reports that come from the same department about the same service – and pay attention to those. The trend line from a reliable source of data offers an indicator of relative progress or decline. The more dependable data sources are brought together, the better position a digital team will be in to spot issues.
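As a sketch of what that looks like in practice, the useful question is not ‘how does this service compare with that one?’ but ‘which way is this service moving, quarter on quarter?’. The quarterly figures below are invented, and any real analysis would live in whatever tools your analysts already use.

    # Trend-watching for a single service from a single, consistent source.
    # The quarterly cost-per-transaction figures are invented placeholders.
    quarterly_cost_per_transaction = {
        "Q1": 7.90,
        "Q2": 7.40,
        "Q3": 7.10,
        "Q4": 6.80,
    }

    quarters = list(quarterly_cost_per_transaction)
    for previous, current in zip(quarters, quarters[1:]):
        change = (quarterly_cost_per_transaction[current]
                  - quarterly_cost_per_transaction[previous])
        direction = "down" if change < 0 else "up"
        print(f"{current}: {direction} £{abs(change):.2f} on {previous}")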

The UK government’s transaction data contained a couple of insights that prove common in most large organisations. No matter how many services an institution is running, as a general rule the vast majority of transactions take place in a relatively small number of them – the top 10% in terms of volume typically account for 90% of all the transactions taking place across the whole of government. The rest make up the ‘long tail’. Quite a lot of these will be very small indeed. The UK’s environment department receives around 10 applications a year for burials at sea, for example. For the digital team, prioritising how best to achieve the strategic ambitions for digitisation suddenly becomes more straightforward. Fix the top 10% of services, and you can deliver the vast majority of benefit to users.
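If you want to check whether your own catalogue follows the same pattern, the arithmetic is a one-screen job. A rough sketch, using invented volumes:

    # How much of the total volume do the top 10% of services account for?
    # The volumes below are invented; substitute your own catalogue.
    volumes = [1_000_000_000, 40_000_000, 12_000_000, 900_000, 250_000,
               60_000, 8_000, 1_200, 300, 10]

    volumes.sort(reverse=True)
    top_n = max(1, len(volumes) // 10)   # the top 10% of services by volume
    top_share = sum(volumes[:top_n]) / sum(volumes)
    print(f"Top {top_n} of {len(volumes)} services handle "
          f"{top_share:.0%} of all transactions")

However the real numbers shake out, the output tells you where the long tail begins, and therefore where most of the user benefit is to be found.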

Another less obvious conclusion from the GDS’s performance dashboards was the realisation of just how terrible the organisation had become at naming services. When you look at services one by one, it is easy to forget how cryptic their function must be to somebody unfamiliar with them. When confronted with a list of 700 or more services all described as word jumbles, you begin to realise how often services are named for the benefit of government or business, to the confusion of users. The names are also a pretty good indicator of how much the service as a whole was designed with users in mind. The GDS’s head of service design, Louise Downe, provided a good rule of thumb: ‘Good services are verbs, bad services are nouns.’50

By far the most important insight from digging into the data, however, was the power of transparency.

Make things open: it makes things better

There was a lot of nervousness about publishing the Transactions Explorer. Instinctively, when most public officials think of publication they consider the risks this entails, rather than the benefits. As a government’s default position is to retain information, and the personal rewards for officials deviating from that default are scant, their bias is to see only problems in openness. From an individual perspective, that’s fair enough. From a whole-of-government view, it is harder to defend. The public paid for that data. Unless it imperils national security, there’s a good case for them having access to it.

Governments around the world have spent a good deal of energy in recent years developing performance league tables looking at various arms of the state – schools, surgeries, and so on. This work is generally badged as ‘deliverology’: the process of establishing a small team focused on performance, gathering performance data to set targets and trajectories, and having routines to drive and ensure a focus on performance.51 Much of it is sensible, basic project management, rarely a bad idea for governments to stick with. The performance tables are also supposed to support the idea of greater choice in public services (something which in practice doesn’t always happen very much). In either case, the psychology behind them is clear: there is a strong incentive for the low performers to avoid the embarrassment of a lowly position, and for the high performers to aspire to the top spots.

Publishing the Transactions Explorer in the open effectively turned this psychological trick back on the officials who had been happily deploying it on their public sector colleagues for years. If a department failed to submit their data to the Transactions Explorer, they wouldn’t be left off the public list. Instead, in their place would be the blank space their numbers should have occupied. In time, as more of those blank spaces were filled, the remaining voids began to look more shameful for the departments and agencies unable or unwilling to divulge the data.

As an open data set, the Transactions Explorer wasn’t especially interesting. It wasn’t big enough to be suited to data mining, nor was it controversial enough to interest journalists. The value of transparency in this case was to change the incentives acting on civil servants. Failing to publish data made you look bad professionally, the reverse of what was typically the case.

Measuring performance

The Transactions Explorer began as a simple measure of transaction volume. This information was necessary to work out where it made sense to prioritise the digital team’s efforts to deliver the biggest impact. Simply going after the most widely used services wasn’t a particularly nuanced strategy, however. The largest transaction handled by the UK government was Stamp Duty Reserve Tax: payments made on share purchases. It handled over a billion transactions a year, but was an automated process and not public-facing – not a sensible candidate for transformation. So as well as needing to refine how it would choose projects to work on, the GDS also needed to give teams building digital services all over government a clear answer about what measures mattered.

Measuring organisational performance is a sprawling, many-sided debate. There are as many perspectives on the ‘right’ things to measure as there are ‘right’ ways to measure them. Some businesses measure hundreds of different variables in their quest for profitability. Most governments tend to be similarly thorough, with the added complication of managing multiple desired outcomes at the same time, where the operational measures often fail to match up with lofty political goals.

In the UK, to keep things simple, we selected four performance metrics: digital take-up, completion rate, cost per transaction and user satisfaction. We could have picked more. Four was a manageable number, and effectively covered the bases for the GDS’s primary strategic aims: getting more people to use online government services, building services that worked first time, saving money and meeting user needs.
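None of these four needs sophisticated machinery; they fall out of counts most service teams already hold. A sketch follows, with the formulas as plausible approximations rather than official definitions, and all figures invented.

    # The headline indicators from raw counts. Formulas are plausible
    # approximations, not official definitions; all figures are invented.
    def digital_take_up(digital_transactions, total_transactions):
        # share of all transactions completed through the digital channel
        return digital_transactions / total_transactions

    def completion_rate(completed, started):
        # share of started digital transactions that were finished
        return completed / started

    def cost_per_transaction(total_cost, total_transactions):
        # fully loaded cost of the service divided by its volume
        return total_cost / total_transactions

    print(f"take-up:    {digital_take_up(720_000, 1_000_000):.0%}")
    print(f"completion: {completion_rate(650_000, 720_000):.0%}")
    print(f"cost/txn:   £{cost_per_transaction(3_500_000, 1_000_000):.2f}")

User satisfaction is deliberately missing from the sketch; as the rest of this chapter explains, it resisted being reduced to a single trustworthy number.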

As soon as you set performance indicators and determine a baseline for how things look before you’ve tried to improve the picture, you will be strongly encouraged to set a target number: a goal that you will strive to hit by a certain point in time. Be very careful about this.

Targets are a controversial topic in government circles. For some, they are a simple and cheap way of pointing a complicated entity in one direction. To others, they are blunt tools, responsible for creating perverse incentives and questionable outcomes. The truth is probably somewhere between the two. Targets have undoubtedly helped drive improved performance in some specific areas. They tend to be especially good in fields where direct comparisons are relatively straightforward and there is little scope for human beings to game the system by focusing on meeting the target rather than the intent lying behind it. But where the scope for variation and gaming is high, problems arise.

Let’s take digital take-up as an example. The GDS could have set itself a target for 80% online take-up for all of the UK government’s digital services. Approximately four-fifths of the UK population was online in 2012 – 80% sounds like a reasonable if ambitious target. Dig a little deeper though, and things begin to unravel. For some services, such as registering to vote, the simplicity of the transaction and nature of the people likely to use it means that aiming for a target nearer 95% might be more reasonable. Applying for certain forms of benefit is, on the other hand, a far more involved process with a very different set of users. In that case, reaching 70% digital take-up represents a significant achievement. As a specific target, 80% manages to be wrong in both directions. So would any other number. If you were to avoid setting individual service targets and instead take 80% digital take-up as an aggregate aim across all government services, there would be a strong incentive for the digital team and departments to focus their efforts on services that offered the simplest processes and most digitally confident users just to make the numbers add up.
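The aggregate arithmetic behind that incentive is easy to sketch. In the invented example below, concentrating on the biggest, simplest services carries the portfolio comfortably past 80% even though the hardest service barely moves.

    # Volume-weighted digital take-up across a small, invented portfolio.
    # Each entry: (annual transactions, digital take-up)
    services = {
        "simple, high-volume service":  (7_000_000, 0.95),
        "routine renewal service":      (2_000_000, 0.85),
        "complex benefit application":  (1_000_000, 0.40),
    }

    total_volume = sum(volume for volume, _ in services.values())
    aggregate = sum(volume * take_up
                    for volume, take_up in services.values()) / total_volume
    print(f"aggregate digital take-up: {aggregate:.0%}")
    # Prints 88%: the headline target is met while the hardest service,
    # and its users, are left far behind.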

That kind of gaming doesn’t reflect a particularly cynical or nefarious view of government officials – it is just what rational actors would do. Not all officials would agree that focusing on the easiest services was a fair strategy, which would lead to internal arguments, which in turn would lead to delays. Exactly the same problems can be imagined from gaming cost per transaction targets, completion rates or user satisfaction. None of this benefits users.

If targets are too tempting to ignore altogether, they should be set on a service-by-service basis, and relative to a baseline: to cut the cost of issuing a fishing licence by a third, or increase completion rates of self-assessment tax forms by 10%, for example.
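A sketch of what that looks like in numbers, with invented baselines standing in for the real ones:

    # Targets expressed relative to each service's own baseline,
    # rather than one absolute number for everything. Baselines are invented.
    baselines = {
        "fishing licence: cost per transaction (£)": 9.00,
        "self-assessment: completion rate":          0.60,
    }

    targets = {
        # cut the cost of issuing a fishing licence by a third
        "fishing licence: cost per transaction (£)":
            baselines["fishing licence: cost per transaction (£)"] * (1 - 1/3),
        # increase completion of self-assessment forms by 10%
        "self-assessment: completion rate":
            baselines["self-assessment: completion rate"] * 1.10,
    }

    for name, target in targets.items():
        print(f"{name}: baseline {baselines[name]:.2f} -> target {target:.2f}")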

While avoiding targets was the right decision, we didn’t get all of our metrics right. User satisfaction proved perennially difficult to draw conclusions from, no matter how it was measured. The problem was finding numbers that gave a reliable answer to the question that actually mattered: was the service meeting user needs? For businesses, this is a little more straightforward; if a user is not satisfied, they can look elsewhere. The problem for governments everywhere is that their digital service can meet user needs very successfully while still leaving the user dissatisfied. It is a rare person who concludes the process of paying the government their taxes by leaving a thank you message for the smoothness of the experience.

In government, measuring user satisfaction picks up false signals: about how happy people are about paying tax, even about how happy they are with the government’s political performance in general. These are not things that any digital service team can do anything about. In the end, the most reliable way to measure user satisfaction was in the research lab, watching real people use the service. This was difficult to scale, but always worth the effort.

The GDS’s choice of four performance metrics acted as useful pointers for stories to celebrate or worries to address. They weren’t designed to provide the people managing the services day to day with all the detailed insight needed to make incremental improvements to services; more detailed web analytics packages delivered that. What they offered was an indication of relative progress, and a measure of momentum.

Money

While putting an accurate figure on user satisfaction can prove almost impossible, one metric cannot be ignored entirely. Making a compelling argument that shows digital transformation can save money or generate revenue is crucial in persuading your organisation to take a step into the unknown. For the UK government between 2010 and 2015, austerity was the biggest game in town. It is no exaggeration to say that the economic conditions and the resultant squeeze on public finances were the single biggest factor emboldening the digital agenda in government. Without that pressure, making the political case for institutional reform would have been a much bigger challenge; good times make defending the bureaucratic status quo a much more straightforward task.

In the Digital Efficiency report published at the same time as the Government Digital Strategy, the GDS made an economic case for digital transformation. The report showed that taking a digital by default approach to government services could save the government £1.8 billion over the course of the parliament, and eventually reduce the government’s costs by almost £2 billion a year.52

Constructing economic arguments for digital services is more of an art than a science. There is now plenty of circumstantial evidence to draw on from other institutions, and lots of examples detailing the relative costs of using phone, post, face-to-face and digital channels to carry out a particular service. Even with this, constructing a case for an analogue government or business to adopt digital is not straightforward. If an organisation has never tried something like this before, there are few direct precedents or data points for them to build an argument upon.

There were three main benefits to building an economic argument for digital transformation. First, it proved that the digital team took the financial case for digitisation seriously enough to conduct a detailed piece of analysis, which meant that analytically inclined officials were more minded to trust the team’s intentions than they would otherwise have been. Second, the report gave an indication of the savings that were possible, setting an appropriate level of expectation. While the exact figures wouldn’t be perfect, it set an order of magnitude for the potential prize. Digital wasn’t pocket change to government – even in the zeros-filled world of government accounting, 10-figure amounts are worthy of notice. Digital transformation could therefore be positioned as a substantial side-dish within the government’s overall savings menu, but not as a main course.

The third, and somewhat accidental, benefit to publishing an economic case for digitisation was that it allowed the GDS to avoid setting itself a hard target for making savings. The £1.8 billion figure for savings by 2015 set an expectation. Nonetheless, it was not a formal target. This meant that the digital team didn’t have to organise its behaviour and priorities around financial targets in the way a unit solely dedicated to saving money would. This provided a small but crucial difference in outlook. The digital institution could keep a focus on meeting user needs at the same time as saving government money. If these priorities had been reversed – saving money before meeting needs – it is unlikely the users would get much of a look in.

Summary


52 https://www.GOV.UK/government/publications/digital-efficiency-report. The GDS ultimately saved the government £4.1 billion between 2011 and 2015, through a combination of digital and IT savings.