Eleven

How to Outguess Fake Numbers

Mark Nigrini grew up in Cape Town, South Africa, charmed by the magic of numbers. He came to the US for a PhD in accounting. By April 1989 he was a grad student in search of a dissertation topic. One day at the University of Cincinnati he ran across a brief reference to something called Benford’s law. “I went to the library that night and got Benford’s paper,” Nigrini recalled. Reading it changed his professional life.

Frank Benford had been a physicist working for General Electric in Schenectady, New York, in the 1920s. At that time, scientific calculations meant looking up numbers in a book of logarithms. Benford noticed that the front pages of his logarithm book were worn from long use, while the back pages looked almost new. It’s this idle observation, rather than anything General Electric paid him to do, that has preserved Benford’s name for posterity.

The numbers that Benford had cause to look up tended to begin with low digits, and low digits were at the front of the book. Benford found, for instance, that about 30 percent of the numbers encountered in science and engineering began with the digit 1. In contrast, only about 5 percent of the numbers started with 9. This left the back part of the book relatively pristine.

Benford mentioned this fact to GE chemist Irving Langmuir (later a Nobel laureate). Langmuir encouraged him to publish a paper on it. Methodical if nothing else, Benford pursued this obscure finding over the next decade. It was not, he found, unique to scientific numbers. He tried tallying the first digits of baseball statistics and found the same distribution. He recorded every number mentioned in an issue of Reader’s Digest. Ditto. Tennis scores, stock quotes, lengths of rivers, atomic weights, electric bills in the Solomon Islands, and numbers mentioned on the front page of the New York Times produced the same pattern. It was like a conspiracy theory. Everything was connected.

Benford finally published his results in a 1938 issue of the Proceedings of the American Philosophical Society. There he derived a precise formula for the proportion of numbers beginning with each digit. The proportions are:

First digit	Proportion
1	30.1%
2	17.6%
3	12.5%
4	9.7%
5	7.9%
6	6.7%
7	5.8%
8	5.1%
9	4.6%

You may be wondering why 0 isn’t included. Benford’s observation deals with the first nonzero digit. So 7,129,600 and 0.000072002 each have a first digit of 7.

Benford’s formula also predicts the proportions of second digits, third digits, and so on. In these cases 0 is a possibility. However, the preponderance of low digits is much less pronounced after the first. For that reason, Benford’s observation is sometimes called the first-digit phenomenon.
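
The formula itself is not spelled out here. In its standard form, the share of numbers whose first digit is d equals log10(1 + 1/d), and the share for a later digit comes from adding up the shares of every possible run of digits before it. The sketch below recomputes the table above and the second-digit shares on that basis; the function names are my own, chosen for illustration.

```python
import math

def first_digit_share(d):
    """Expected share of numbers whose first nonzero digit is d (1-9)."""
    return math.log10(1 + 1 / d)

def second_digit_share(d):
    """Expected share of numbers whose second digit is d (0-9).

    Adds the shares of the two-digit openings 1d, 2d, ..., 9d.
    """
    return sum(math.log10(1 + 1 / (10 * first + d)) for first in range(1, 10))

print("First digits")
for d in range(1, 10):
    print(d, f"{first_digit_share(d):.1%}")

print("Second digits")
for d in range(10):
    print(d, f"{second_digit_share(d):.1%}")
```

The second-digit shares run from about 12 percent for 0 down to about 8.5 percent for 9, which is why the effect is so much harder to see after the first digit.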

Benford used another name for his paper’s title, “The Law of Anomalous Numbers.” Now it’s almost always known as Benford’s law. That’s unfair, as it turns out. The same phenomenon had been discovered and published over half a century earlier by a scientist much better known than Benford, astronomer Simon Newcomb. Newcomb’s paper, in an 1881 issue of the American Journal of Mathematics, opens with the ostensibly well-known fact, “That the ten digits do not occur with equal frequency must be evident to anyone making much use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones.”

I suppose this is further proof of how hard it is to be original, and how even original ideas don’t always get noticed. For some reason Newcomb’s article was quickly forgotten, while Benford’s got traction. One possible explanation is that Benford’s article piggybacked on the fame of an important physics paper by Hans Bethe that appeared after it in the Proceedings.

It’s now known that Benford’s law applies to all sorts of data that even the indefatigable Benford didn’t think to test. It’s also known that Benford’s law doesn’t apply to many common types of numbers: phone numbers, ages, weights, Social Security numbers, IQs, winning lottery numbers, and zip codes.

Those with mathematical intuition may find this self-evident. To everyone else, it can appear a cosmic mystery. Why does Benford’s law apply to street numbers (fairly well) but not to zip codes? How does the New York Times “know” to mention six times more numbers starting with 1 than 9?

Benford’s law applies to some numbers that express quantities or measurements, such as city populations or credit card charges. For a quick, intuitive explanation, imagine you put $1,000 in an investment account that doubles in value every ten years. The first digit of the account balance will remain 1 for the first ten years of growth. The value will increase to $1,100, $1,200, $1,300, and so on, up to $1,900, and finally hitting $2,000 at the end of the first decade.

It will take another ten years to double again. During that time the value will climb from $2,000 to $3,000 to $4,000. That means the account balance spends as much time with a 2 or 3 as the lead digit as it did with 1.

In the third decade, the account value will go from $4,000 to $8,000, spanning the leading digits 4, 5, 6, and 7. Then in the fourth decade, the value increases to $16,000, blasting through leading digits 8 and 9 and spending the rest of the decade in the 1s again.

The investment value would spend more time with a leading digit of 1 than with 2, more time with 2 than with 3, and so on. Should you choose a random moment in time to check the account’s value, the chance of each possible leading digit would be precisely that of Benford’s distribution.
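
The doubling account is easy to check by simulation. The sketch below makes assumptions that are mine rather than the text's: the balance grows smoothly (continuous compounding), it is looked at once a month, and the window is 1,000 years so that it covers many trips through the digits.

```python
import math
from collections import Counter

def leading_digit_shares(years=1000, samples_per_year=12, doubling_years=10):
    """Share of sampled moments at which the balance starts with each digit."""
    counts = Counter()
    total = years * samples_per_year
    for i in range(total):
        t = i / samples_per_year
        # Work with log10 of the balance to avoid huge numbers:
        # $1,000 (10**3) doubling every `doubling_years` years.
        log_balance = 3 + (t / doubling_years) * math.log10(2)
        mantissa = 10 ** (log_balance % 1)  # at least 1, below 10
        counts[min(9, int(mantissa))] += 1
    return {d: counts[d] / total for d in range(1, 10)}

shares = leading_digit_shares()
for d, share in shares.items():
    print(d, f"simulated {share:.3f}", f"Benford {math.log10(1 + 1 / d):.3f}")
```

Because the window does not end exactly on a power of ten, the simulated shares land close to the Benford values rather than exactly on them.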

The world is full of things that grow exponentially, from bacteria colonies to social networks. They don’t usually grow as steadily as in my example, but when natural growth scatters values over several orders of magnitude, they approximate Benford’s distribution. Have a chimpanzee repeatedly throw a dart at the financial pages, and the stock prices it hits will follow Benford’s law quite well.

Not every set of measurements fits the Benford distribution. One example is the weight of adult American men. Obviously, 1 is the most common leading digit, to a far more lopsided degree than the 30 percent predicted by Benford’s law. A leading digit of 6 is much less common than in the Benford distribution: not many men weigh 60 to 69 or 600 to 699 pounds.

Nor does Benford’s law apply to assigned numbers like phone or Social Security numbers. There the assigners use all or almost all of the possibilities. Those beginning with 1 are about as common as those beginning with any other digit.

Benford’s law is a reminder that numbers are an artificial way of talking about the quantities we find in the world around us. As Benford himself wrote, his law “is really the theory of phenomena and events, and the numbers but play the poor part of lifeless symbols for living things.”

“I thought, if there are indeed predictable patterns to numbers, then maybe auditors can use this to tell whether data was authentic or made up,” said Mark Nigrini.

Accountants and tax agencies would love to have a formula for determining which numbers are honest and which aren’t. Nigrini quickly resolved to do his dissertation on using Benford’s law to detect financial fraud.

He found that little had been written on this subject since Benford’s paper. The only one to see practical value in Benford’s law was economist Hal Varian (now chief economist for Google). In 1972 Varian proposed using the law as a baloney detector. Public policy decisions are based on elaborate projections of costs and benefits. The numbers in these projections ought to fit the Benford distribution, Varian argued. Otherwise, it could indicate that the forecaster was pulling numbers out of the air or tweaking the figures for political ends.

Varian had not followed up his idea, nor had anyone else. This stoked Nigrini’s enthusiasm, though not that of his advisor. “They prefer you to be the eightieth person to write about a topic,” Nigrini explained. He went ahead with his dissertation anyway. Not until he had written two-thirds of the research was he able to get it approved. He finished it four months later.

The idea that struck Varian and Nigrini lends itself to a picture. When you have a lot of numbers, you can make a bar chart (histogram) showing how many times each digit occurs as the first digit. Just count how many of the numbers start with the digit 1, how many start with 2, 3, and so on. For honest data that follows Benford’s law, the chart will look like this:

This smooth curve is Benford’s law in visual form.

Varian and Nigrini’s brainstorm was that people who make up numbers won’t know about Benford’s law. An embezzler or tax cheat will have no reason to think that any digit should be more common than any other. Therefore, a set of made-up numbers might be expected to show an even distribution of leading digits, without the curve.

That was the back-of-the-envelope concept, anyway. Randomness experiments (which were not widely known) had already shown that fabricated numbers almost never use all digits equally. Alphonse Chapanis made bar charts of his results, and they didn’t look anything like a flat distribution.

Another issue is that honest financial data often fits the Benford curve to a T—and then sometimes it doesn’t. It can be tough to tell beforehand which case you’re dealing with. One example would be sales data from the 99 cents store. The amounts would include a lot of 9s. As Nigrini points out, this tells you that prices are made-up numbers, invented by humans as part of a marketing strategy. But if you’re managing a 99 cents store, that’s your reality, and it doesn’t indicate fraud. There are many other situations where the nature of a business might produce an un-Benford-like distribution of first digits, for perfectly innocent reasons.

Nigrini’s basic idea was right, though: Invented numbers are different from honest ones. He began to haunt the Cincinnati courthouse, looking for criminal cases involving numbers.

One of the early fraud cases he studied was from Arizona. Wayne James Nelson, a forty-three-year-old manager in the office of the Arizona state treasurer, launched a short embezzling career with a check for $1,927.48 from the State of Arizona to a fictitious vendor. Over the next few days he made twenty-two more fake checks, for a total of almost $1.9 million.

When caught, Nelson claimed that he had written the checks in a noble effort to demonstrate vulnerabilities in Arizona’s accounts payable system. He had “neglected” to enlighten anyone in the treasurer’s office about these vulnerabilities, and the funds had been directed to Nelson’s own accounts.

At a glance, you can see some patterns in Nelson’s check amounts.

$1,927.48	$96,879.27
$27,902.31	$91,806.47
$86,241.90	$84,991.67
$72,117.46	$90,831.83
$81,321.75	$93,766.67
$97,473.96	$88,338.72
$93,249.11	$94,639.49
$89,658.17	$83,709.28
$87,776.89	$96,412.21
$92,105.83	$88,432.86
$79,949.16	$71,552.16
$87,602.93

Nelson “was the anti-Benford,” said Nigrini. All but the first two check amounts start with high digits 7, 8, and 9. Nelson kept the amounts under $100,000, probably because six-figure sums would have attracted unwelcome attention.

Here’s a histogram of the leading digits of Nelson’s check amounts.
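
The tally behind such a histogram takes only a few lines. This is a minimal sketch using the 23 check amounts listed above; the helper name `first_digit` is my own.

```python
from collections import Counter

# The 23 check amounts listed above, kept as text so no digit is lost.
CHECKS = """
1927.48 27902.31 86241.90 72117.46 81321.75 97473.96 93249.11 89658.17
87776.89 92105.83 79949.16 87602.93 96879.27 91806.47 84991.67 90831.83
93766.67 88338.72 94639.49 83709.28 96412.21 88432.86 71552.16
""".split()

def first_digit(amount):
    """Return the first nonzero digit of a number given as text."""
    return next(int(ch) for ch in amount if ch in "123456789")

counts = Counter(first_digit(a) for a in CHECKS)

# Crude text histogram: one mark per check.
for d in range(1, 10):
    print(d, "#" * counts[d])
```

The digits 8 and 9 lead nine checks each and 7 leads three more, leaving one check apiece for 1 and 2 and none for 3 through 6.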

Dodgy numbers are usually mixed in with legitimate ones. An auditor would not just be looking at the fake check amounts (how would he know which were fake?). He would look at all of Nelson’s check amounts, or all of his department’s amounts. Even so, Nelson’s lopsided preference for 8s and 9s in the fake amounts would augment the 8s and 9s in the aggregate amounts. This might be detectable.

Nigrini found that Nelson’s check amounts showed other idiosyncrasies typical of invented numbers. Suppose we tally the very last (rightmost) digits of the check amounts. These represent pennies, and surely Nelson had no financial interest in that. There is a pattern nonetheless. Nelson favored amounts ending in 6 and 7. He didn’t use 4 at all.

This looks much like the charts that Chapanis made. Just like Chapanis’s volunteers, Nelson unconsciously repeated himself. In twenty-three check amounts, he managed to repeat 87, 88, 93, and 96 as the first two digits. He likewise repeated the cents figures 16, 67, and 83.
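
These quirks can be confirmed directly from the check amounts. The sketch below tallies the final digit of each amount and looks for repeated openings and repeated cents figures; the variable and helper names are illustrative.

```python
from collections import Counter

# Nelson's 23 check amounts, kept as text so trailing zeros survive.
CHECKS = """
1927.48 27902.31 86241.90 72117.46 81321.75 97473.96 93249.11 89658.17
87776.89 92105.83 79949.16 87602.93 96879.27 91806.47 84991.67 90831.83
93766.67 88338.72 94639.49 83709.28 96412.21 88432.86 71552.16
""".split()

last_digits = Counter(a[-1] for a in CHECKS)   # the pennies digit
lead_pairs = Counter(a[:2] for a in CHECKS)    # first two digits
cents = Counter(a[-2:] for a in CHECKS)        # two digits after the point

def repeated(counter):
    """Values that occur more than once, in sorted order."""
    return sorted(key for key, n in counter.items() if n > 1)

print("last digits:", sorted(last_digits.items()))
print("repeated first two digits:", repeated(lead_pairs))
print("repeated cents:", repeated(cents))
```

The last digits 6 and 7 each turn up five times in 23 checks, while 4 never appears.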

The IRS sells tax form data, with identifying information stripped out, to researchers. Nigrini bought a package of 100,000 returns for tax years 1985 and 1988 and began analyzing them on the university’s VAX minicomputer. He wanted to see whether he could tell which entries had the most cheating.

Many entries on a tax form are calculated totals, differences, or products of other entries. It would make no sense to manipulate them, as IRS computers check the math. Other entries are backed by third-party documentation, such as W-2s for wages or 1099-INTs for interest income. That provided a useful comparison. Nigrini found that reports of interest income fit Benford’s law to high precision. Interest paid did not fit the curve, however. At the time, mortgage lenders did not report interest amounts to the IRS. Consumer credit interest was deductible (and was not backed by documentation, either). This meant that taxpayers were tempted to exaggerate their interest paid and hope they didn’t get audited. Nigrini’s analysis suggested that many were doing just that.

At the time of his run for president, Bill Clinton released his tax returns from 1977 onward. Nigrini was able to cull 380 income numbers and 511 deduction numbers, all honor-system entries, from the Clinton returns. He found nothing suspicious except a preponderance of round numbers—a common finding in tax returns. There was, for instance, a used men’s suit donated to charity and valued at $100. The suit’s value is obviously an estimate. One of the ways we indicate estimation is through round numbers. Claiming a hundred dollars is more honest than inventing an implausibly precise amount like $107.03.

One of Nigrini’s first believers was Robert Burton, chief financial investigator with the Brooklyn district attorney’s office. In 1995 Burton used Nigrini’s software to analyze checks at seven companies suspected of criminal ties. Burton found evidence of invented numbers and, upon further investigation, charged bookkeepers and payroll clerks with fraud. This resulted in a favorable write-up in the Wall Street Journal. Benford’s law was called “a tool worthy of Sherlock Holmes.” Burton was quoted: “Bingo, that means fraud.”

The Wall Street Journal piece helped publicize Benford’s law, while also contributing to the myth that it was some kind of magic lie detector. Since then, use of Nigrini’s techniques has greatly expanded in law and tax enforcement and in the private sector. Today’s routine analysis of consumer data makes it easy to flag suspicious numbers for further scrutiny. Yet digit analysis remains a new field, incompletely tested. It is important to understand what it can do, and what it can’t.

“I get very upset, regularly, when I read about people using Benford’s law badly,” Nigrini told me. Doubtless some hear about Benford’s law, skim the Wikipedia article, and take it to mean that any numbers whose first digits don’t fit the curve are fraudulent. That is definitely not a proper conclusion. There are so many reasons why legitimate first digits may fail to fit the Benford distribution that first-digit tests are rarely of much use. Nigrini finds a test of the first two digits much more useful. This produces a column chart with 90 bars, one for each pair of leading digits from 10 to 99. When there is enough data (thousands of numbers), Benford-compliant data produces a smooth curve.
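
For readers who want to try the first-two-digits test, here is a minimal sketch. It assumes the standard Benford share of log10(1 + 1/n) for a number opening with the two digits n, where n runs from 10 to 99. The function names are mine, and this is not Nigrini's software.

```python
import math
from collections import Counter

def expected_two_digit_shares():
    """Benford share for each possible pair of leading digits, 10 through 99."""
    return {n: math.log10(1 + 1 / n) for n in range(10, 100)}

def first_two_digits(amount):
    """First two significant digits of a nonzero number, as an int from 10 to 99."""
    text = f"{abs(amount):.15e}"  # scientific notation, e.g. '7.129600000000000e+06'
    return int(text[0] + text[2])

def observed_two_digit_shares(amounts):
    """Observed share of each leading pair; zero amounts are skipped."""
    counts = Counter(first_two_digits(a) for a in amounts if a)
    total = sum(counts.values())
    return {n: counts[n] / total for n in range(10, 100)}

expected = expected_two_digit_shares()
print(len(expected), "bars; tallest is 10 at", f"{expected[10]:.2%}")
print(first_two_digits(7129600), first_two_digits(0.000072002))
```

Plotting `observed_two_digit_shares(...)` against `expected` gives the column chart described above. With only a few hundred numbers the bars will be ragged even for honest data.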

Another useful test charts the last two digits of big numbers. This isn’t even properly a “Benford’s law” test. What you’re looking for is Chapanis-style idiosyncrasies of invented numbers. Note that the last-two-digit test works even for data that ought not to obey Benford’s law.

In the hands of a professional, digit analysis entails many distinct tests and calculations of the tests’ statistical significance. The ultimate standard of comparison should be the history of that particular set of data. This quarter’s expense invoices should be compared to previous quarters’. Nigrini calls this principle My Law. The name refers to the generic file names that some software proposes for new files (My File, My Worksheet, etc.). The My Law approach avoids the most common error of half-cocked numerology, which is to assume that all numerical datasets fit Benford’s law closely. They don’t. Nor are Chapanis’s features of invented numbers 100 percent foolproof. These paradigms may or may not apply in a given instance, for inscrutable reasons. It’s easier and more relevant to adopt past digit distributions as the baseline.

After all, every fraud has to start at some point in time. Should Stan in accounting start embezzling next Tuesday, that will shift his digit patterns, and will do so regardless of how “random” the original numbers were or how closely they fit the Benford curve.
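
The My Law idea amounts to comparing this period's digit profile with the same source's own past. The sketch below is one simple way to do that, not Nigrini's own statistic. The invoice amounts are hypothetical, and the gap measure (the sum of absolute differences in first-digit shares) is an assumption chosen for brevity.

```python
from collections import Counter

def first_digit_shares(amounts):
    """Share of positive amounts starting with each digit, 1 through 9."""
    counts = Counter(int(f"{a:.15e}"[0]) for a in amounts if a > 0)
    total = sum(counts.values())
    return [counts[d] / total for d in range(1, 10)]

def digit_gap(baseline, current):
    """Sum of absolute differences between two first-digit profiles (0 = identical)."""
    pairs = zip(first_digit_shares(baseline), first_digit_shares(current))
    return sum(abs(b - c) for b, c in pairs)

# Hypothetical data: last quarter's invoices, then the same mix of
# amounts plus 40 padded invoices that all start with a 9.
last_quarter = [round(100 * 1.013 ** k, 2) for k in range(500)]
this_quarter = last_quarter + [9100.0 + 7 * k for k in range(40)]

print(f"gap against own history: {digit_gap(last_quarter, this_quarter):.3f}")
```

A real audit would also ask whether a gap of that size is statistically significant, and would run the same comparison on first-two-digit and last-two-digit profiles.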

As an illustration of the My Law approach, Nigrini cites a 2011 experiment devised by seventeen-year-old student Kha Bui for his math class in Koblenz, Germany. The class was divided into five groups of four students each. Some of the groups were given newspapers and instructed to make a list of 500 numbers that they found in the news. The other groups were told to invent 500 numbers. The point was to see whether it was possible to distinguish the news numbers from the invented ones by digit patterns alone.

To make the challenge as difficult as possible, the fakers were told to invent numbers such as would be found in a newspaper (as opposed to random numbers). This made the task more like a real-world fraud, where the faker is playing chameleon.

None of the five sets of numbers, real or faked, had an especially good fit to the Benford curve. But anyone could see that they fell into two groups. One group had “big spikes”—a few first-two-digit pairs that occurred far more commonly than expected. The other group had smaller spikes and conformed better to the Benford curve. As we’ve seen, repeated digit pairs can betray the unconscious repetition of a faker. You might think that the “smaller spikes” group was the honest newspaper numbers. You’d be wrong.

Remember, the fake numbers were invented by teams of four. Because the digits that people unconsciously favor vary from person to person, each faker’s quirks were diluted by a factor of four. This would have made it much harder to detect fraud.

The real tip-off was this. The newspapers had many mentions of the then-current year (2011) and recent years. The digit charts therefore had a spike for 20 as the first two digits. The fakers invented some recent year numbers, too, but not nearly enough.

Someone using either Benford or Chapanis as the standard would have guessed that the smaller-spike sets were the honest ones. A wiser approach would have been to first examine the digit patterns of other newspapers. This would have revealed the profusion of recent-year mentions and led to the correct identification.

When the digits of important numbers don’t fit the expected distribution, a good forensic investigator will be able to find out why. There are nonetheless a couple of easy, do-it-yourself tests that anyone can use to get a quick read on whether numbers appear honest. In the following pages I will show some ways to identify the likelihood of invented or manipulated numbers. These tests are intended mainly to distinguish real data from that which is 100 percent fake, made up by a single individual. You won’t always have that stark a contrast. Nevertheless, there have been plenty of cases where a lone bad guy presented his victims with numbers that were completely bogus. Used as a preliminary screen, these tests are quick and independent of all the other due diligence you’re likely to do.

Every Sunday, the owner of a fast-food restaurant began the week by making up the dollar sales for the previous week. Every number was fake! She needed something to report on her taxes.

The restaurant’s bookkeeper happened to be one of Nigrini’s students. Nigrini took a look at the invented numbers. “It wasn’t the first digits that caught her,” he explained. A fast-food outlet with steady business might have, say, about $5,000 of business each weekday without too much variation. The first digits didn’t follow the Benford distribution, nor would they be expected to. It was the last two digits that betrayed the fabrication. None of the numbers ended in 00. That’s a common tip-off, as fakers often think a round number doesn’t look random enough. Also, about 6.5 percent of the numbers ended in 40 (you’d expect it to account for just 1 percent). Using 40 for the last two digits was an unconscious tic of this particular business owner.

Someday that fast-food place will be sold, and the buyers will scrutinize the books. Perhaps the owner will invent a new, inflated set of numbers to show. Will the buyers suspect that the numbers were pulled out of the air?

Daily sales figures for a small business are the sum of a great many register totals. The last two digits of these sums tend to be random, with each digit pair from 00 to 99 occurring about 1 percent of the time.

You don’t always have cents figures. Some reports are rounded to even dollars, and others may be truncated to thousands of dollars. In these cases you can use the two rightmost reported digits.

To run a last-two-digit test, tally how many times each possible pair of last digits occurs in the reported numbers. There are 100 possible pairs, so make a histogram chart with 100 bars.
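
The tally can be sketched as follows. The figures are kept as text, exactly as reported, so that a trailing zero in an amount like 5,213.40 is not lost; the sample figures are invented for illustration.

```python
from collections import Counter

def last_two_digits(reported):
    """Two rightmost reported digits of a figure given as text, e.g. '5,213.40' -> '40'."""
    digits = [ch for ch in reported if ch.isdigit()]
    return "".join(digits[-2:])

def pair_counts(figures):
    """Count each of the 100 possible endings, '00' through '99'."""
    counts = Counter(last_two_digits(f) for f in figures)
    return {f"{n:02d}": counts[f"{n:02d}"] for n in range(100)}

# Hypothetical reported figures, some with cents and one rounded to dollars.
sample = ["5,213.40", "4,987.40", "5,102", "$4,875.13"]
tally = pair_counts(sample)
print({pair: n for pair, n in tally.items() if n})
```

Figures with fewer than two digits are left out of the tally.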

The chart will give you some idea of what honest numbers look like. It records 500 random numbers (generated by an Excel spreadsheet). Five hundred is a reasonable amount of data for a small business—about seventeen months of daily sales or ten years of weekly figures. Even with 500 numbers, the chart is noisy, with a great deal of variation. In this case, there’s a pair (68) that doesn’t occur at all in the data, and three pairs (10, 53, 74) that occur twice as much as the expected 1 percent. This is the normal variation you should expect for random data.

Now let’s look at fabricated data.

The chart on the next page shows the last two digits of 500 human-invented numbers. Even at a glance you can see there’s a lot more variation. Two pairs (93 and 94) occur more than 4 percent of the time, something very unlikely with honest numbers. Twelve pairs don’t occur at all, and that’s also highly improbable.
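
How unlikely is a spike like that? If every ending from 00 to 99 is equally likely, the count for any one pair in 500 figures follows a binomial distribution, and the odds can be worked out exactly. This sketch assumes honest endings are independent and uniform; the helper name is mine.

```python
from math import comb

N, P = 500, 0.01  # 500 figures, each ending equally likely (1 chance in 100)

def prob_at_least(k, n=N, p=P):
    """Chance that one particular pair turns up k or more times in n honest figures."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# "More than 4 percent" of 500 figures means 21 or more occurrences.
spike = prob_at_least(21)
# Average number of the 100 pairs that never show up in 500 honest figures.
expected_missing = 100 * (1 - P) ** N

print(f"chance a given pair occurs 21+ times: {spike:.1e}")
print(f"pairs expected to be missing entirely: {expected_missing:.2f}")
```

On average fewer than one pair goes missing in 500 honest figures, so a single absent pair is ordinary and twelve absent pairs are not.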

Ask the following three questions. A “yes” answer to any of the three should raise the suspicion level.

(a) Is there a pair (or pairs) that is unaccountably more common than the others?

(b) Are doubled digits (especially 00 and 55) consistently less common than average?

(c) Are descending pairs of digits (like 10, 21, or 32) more common than expected?

In the example here, the answer to (a) is a resounding yes. The data also avoids doubled digits (b). You would expect that 10 percent of all numbers would end in a pair of doubled digits. Here there are only 20 occurrences out of 500, or 4 percent. The pairs 00, 55, and 77 do not occur at all.

There are 44 descending pairs out of 500. That’s almost exactly the expected 9 percent (as there are nine descending pairs out of 100 possibilities). By criterion (c) the data is unsuspicious.

This data fails two of the three tests. Were this a small business’s sales, it would be wise to ask for more figures, or more detailed figures—and to see how the seller reacts to that request.
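
The three checks can be bundled into one quick screen. The sketch below takes a list of two-digit endings and reports the figures discussed above: the most common pair, the share of doubled digits (about 10 percent in honest data), and the share of descending pairs (about 9 percent). The function name and the sample list are illustrative only, and deciding how big a departure counts as suspicious is left to the reader.

```python
from collections import Counter

DOUBLED = {f"{d}{d}" for d in range(10)}             # 00, 11, ..., 99
DESCENDING = {f"{d}{d - 1}" for d in range(1, 10)}   # 10, 21, ..., 98

def screen(pairs):
    """Summarize a list of two-digit endings given as strings '00' to '99'."""
    counts = Counter(pairs)
    total = len(pairs)
    top_pair, top_count = counts.most_common(1)[0]
    return {
        "top_pair": top_pair,
        "top_share": top_count / total,  # honest data: near 1%
        "doubled_share": sum(counts[p] for p in DOUBLED) / total,
        "descending_share": sum(counts[p] for p in DESCENDING) / total,
        "missing_pairs": 100 - len(counts),  # endings that never occur
    }

# Tiny made-up sample, far too small for a real test.
report = screen(["93"] * 5 + ["21"] * 2 + ["00", "47", "58"])
for name, value in report.items():
    print(name, value)
```

For the fabricated data described above, such a screen would show a doubled share of 4 percent (20 of 500) and a descending share of 8.8 percent (44 of 500).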

You don’t have to worry about counting all those digits. In practice, it’s just a matter of cut and paste. Request the data as an Excel file or something compatible and copy it into a Benford’s law test template. Examples can be found on the Web for free, including one by Nigrini (NigriniCycle.xlsx). After you paste in your numbers, follow the instructions to fill down some columns with preexisting formulas. Tabs then give already-formatted charts for the last two digits and other common tests. They also give mathematical measures of statistical significance, which are of course a lot more reliable than eyeballing the data.

Recap: How to Outguess Fake Numbers

• When the digits of recent data depart from the company’s customary distributions, it can be a tip-off to fraud.

• Embezzlers and fraud artists making up numbers unconsciously overuse descending pairs of digits (like 10, 21, and 32).

• Fakers underuse doubled digits (like 00 or 55), thinking they don’t look “random” enough.