Simulation is better than reality!
We have already done a number of simulations to check our theoretical deductions. Simulations are also very useful when we are not able to compute the answer to the problem posed, and even more useful in those situations where it is very hard to cast the problem into a clean mathematical form suitable for solution at all.
The development and spread of computers has greatly increased the number of simulations done, but the idea is not new. Buffon's (1707–1783) needle estimate of π is an old example. In the late '20s the Bell Telephone Laboratories ran simulations of the behavior of proposed central office switching machines, using a combination of girls with desk calculators plus some simple mechanical gear and random number sources. Fermi did some simulations back in the late '30s to solve a number of problems in particle physics. Modern computers have popularized the idea and made many simulations relatively painless. This chapter merely indicates some of the possibilities of simulation as related to probability problems.
It is not possible to give a precise definition of what a simulation is. Simulations on computers grade off into simulations in laboratories, and on into classical laboratory experiments, which in their turn were often regarded as simulations of what would happen in the field. Thus we have to leave the concept undefined, but what we mean is fairly clear—it is the limits of the idea of a simulation that apparently cannot be defined sharply. In a sense machine simulations (since they are usually cheaper than laboratory experiments and can often do what no real experiment can do) have brought us back to the Medieval scholastic approach of deciding by mere thinking what will happen in a given situation, without looking at reality. (Galileo, the founder of modern science, advocated looking at reality and opposed this conceptual approach, though he was guilty of using it himself many times.)
10.2 Simulations for Checking Purposes
The main use we have made of simulations has been to check results we computed. Such simulations are valuable, but have their disadvantages. For example, suppose the simulation and the theory disagree; what to do? The difference may be due to:
(1) an error (bug) in the coding,
(2) an error in the plan of the program so that it does not represent the problem,
(3) a misinterpretation, for example you may have computed the median value and be comparing it with the expected value,
(4) the theoretical result may be wrong,
(5) you may be merely facing a sampling fluctuation.
As you try to find the cause of the differences between the theoretical value and the simulation you may need to repeat the simulation a number of times, hence you must be able to use exactly the same run of random numbers again and again—something that you need to think about before starting the simulation!
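One simple way to make exactly the same run of random numbers available again and again is to seed the generator explicitly. Here is a minimal sketch in Python; the seed value, trial count, and the event being counted are illustrative choices, not anything from the text:

```python
import random

def run_simulation(seed, n_trials):
    """Run one simulation pass with a fixed, repeatable random sequence."""
    rng = random.Random(seed)      # private generator, unaffected by other code
    successes = 0
    for _ in range(n_trials):
        if rng.random() < 0.5:     # stand-in for whatever event is being simulated
            successes += 1
    return successes / n_trials

# The same seed reproduces exactly the same run, so a disputed result
# can be re-examined as often as needed while hunting for the cause.
print(run_simulation(seed=12345, n_trials=10_000))
print(run_simulation(seed=12345, n_trials=10_000))  # identical output
```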
Sampling fluctuations, for reasonably sized samples, will not bother you much since you can often increase the size of the sample until it is very unlikely to be the cause of the difference. But persistent small differences between the theory and the sample are troublesome, and should be pursued until you are satisfied as to the reason (it may be an approximation in the theory).
10.3 When You Cannot Compute the Result
Often you cannot analytically compute the results you want from the model, either because the mathematics at your command is inadequate, or because you simply cannot formulate the problem in a sufficiently mathematical form to handle it. Then, of course, you have only simulations and experiments to guide you. When experiment and simulation differ, that can be annoying, but it can also open the way to finding new things that the experimentalist never thought about. The difference should not be swept under the rug and explained away as a sampling fluctuation; it is an opportunity to learn new things.
It has been said, “The only good simulation is a dead simulation,” and in the author’s experience one result obtained at great labor (in those days) suggested in a few minutes the right way to think about the problem, and hence how to derive the desired equations for the general case. Thus a simulation may show you how to solve a problem analytically.
Example 10.3–1 Random Triangles in a Circle
What is the probable area of a random triangle inside a unit circle?
While with great effort this can be solved in closed form to get the result 35/(48π) = 0.2321009…, it is easy to simulate and get an approximate answer. To get a random point in the circle you pick a pair of random numbers u, v in the range 0 to 1 and transform them to the range −1 to +1 by the simple transformation

    x = 2u − 1,    y = 2v − 1
You then calculate the sum of the squares, x² + y², to find if the point is inside the unit circle. About 1 − π/4 of the time you will be outside and have to try a new point. Thus to get the three vertices of a random triangle you will need about 4 trial points per triangle. You can now compute the area of the triangle with vertices (x₁, y₁), (x₂, y₂), (x₃, y₃) by the standard formula

    (1/2)[x₁(y₂ − y₃) + x₂(y₃ − y₁) + x₃(y₁ − y₂)]
and take the absolute value of this result to get a positive area. You are not going to get the exact answer by this simulation, but it is likely that you can get a good enough result for most engineering purposes. It is certainly easier and faster than the theoretical computation, but for a theoretician less interesting.
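A simulation along exactly these lines is easy to write. Here is a minimal Python sketch; the sample size and helper names are choices made for illustration:

```python
import random

def random_point_in_unit_circle(rng):
    """Rejection sampling: draw points in the square until one lands in the circle."""
    while True:
        x = 2.0 * rng.random() - 1.0   # transform [0,1) to (-1,1)
        y = 2.0 * rng.random() - 1.0
        if x * x + y * y <= 1.0:
            return x, y

def triangle_area(p1, p2, p3):
    """Standard signed-area formula, made positive with abs()."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

rng = random.Random(1)
n = 100_000
total = sum(triangle_area(random_point_in_unit_circle(rng),
                          random_point_in_unit_circle(rng),
                          random_point_in_unit_circle(rng))
            for _ in range(n))
print(total / n)   # should come out near 35/(48*pi) = 0.2321...
```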
Exercises 10.3
10.3–1 Simulate finding the area of the triangle in Example 10.3–1.
10.3–2 Simulate the area of a random triangle in a square.
10.3–3 Compute and simulate the probability of a unit circle falling at random completely in a square of side 2 if the center is randomly chosen in the square.
10.3–4 If two points fall in the unit square and determine a line, find the probable distance of a third random point from the line. [Hint: use the normal form of the line.]
Most simulations are devoted to comparing alternate designs so that a good design can be found. Hence it is often the differences between simulations more than the actual numbers that matter.
For purposes of discussion we shall assume that we are concerned with the possible “blocking of calls” in a proposed central office. Suppose we have 100 recorded cases of when a phone call was placed and how long it lasted. To see how things would go with the proposed central office we could simply use these 100 cases and see, trying one call after the other, what would happen (with, of course, an assumed initial state of the central office). From the simulation we could assert: had the central office been in that state at the start and received that previously recorded sequence of 100 calls, then what we simulated is what would have been seen (assuming, of course, that the simulation is accurate). By trying the same sequence of calls on the alternate designs we could evaluate which design would have done better (in terms of the measure of success we have decided on) for that particular sequence of calls.
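As a concrete illustration of replaying a recorded call sequence through a proposed design, here is a minimal Python sketch in which a “design” is simply a number of trunks, and a call is blocked when every trunk is busy at its arrival. The data format and the loss model are assumptions made here for illustration, not the actual central office designs under discussion:

```python
def blocked_calls(calls, n_trunks):
    """Replay (arrival_time, duration) records through a design with n_trunks lines.

    A call is blocked (and lost) if every trunk is still busy when it arrives.
    Returns the number of blocked calls for this particular recorded sequence.
    """
    busy_until = [0.0] * n_trunks        # assumed initial state: all trunks free
    blocked = 0
    for arrival, duration in sorted(calls):
        free = [i for i, t in enumerate(busy_until) if t <= arrival]
        if free:
            busy_until[free[0]] = arrival + duration
        else:
            blocked += 1
    return blocked

# Compare two candidate designs on the SAME recorded sequence of calls.
recorded = [(0.0, 3.0), (0.5, 2.0), (1.0, 4.0), (1.2, 1.0), (2.0, 2.5)]
print(blocked_calls(recorded, n_trunks=2), blocked_calls(recorded, n_trunks=3))
```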
In the beginning of the design, when a reasonable fraction of proposed designs will be better than the current one, this simple procedure will tend to approach a good design. But when you are near an optimum, most variations of the design will be poorer than the current one, and due to the chance fluctuations of the simulation (the actual sample you used) you would, on a single step of the optimization, often choose an inferior design over a better one. Hence as you near an optimal design you need to increase the number of calls simulated before deciding which design is preferable.
One way to increase the number of trial calls is simply to run through the list several times; since the initial state of the central office will almost surely be different at the start of each pass through the 100 calls, you will get new information about the behavior of the design.
This exact repetition of the same sequence of calls leaves one with an uneasy feeling that some peculiar feature of the sequence might just happen to favor one possible design over the others. Hence you think that maybe you should select the next call at random from the list of 100 recorded calls.
This raises the question of sampling with or without replacement, especially if you intend to run through, say, 4 or 5 hundred calls. We showed in Example 3.6–3 that without replacement a sequence of 100 random choices would get every recorded call, but that with replacement you could expect to miss about 1/e of the original recorded calls. Of course on the second or still later repetition of 100 calls you would get some of them. Still, using sampling without replacement each recorded call would enter equally in the final simulation, while with replacement there would be a fair amount of unequal representation in the number of times various apparently equally likely original calls were represented in the simulation. Of course you can say that due to the random sampling the result is unbiased, but unequal representation of the original data in the simulation leaves one feeling dubious about the wisdom of sampling with replacement.
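That 1/e claim is itself easy to check by direct simulation. A small Python sketch; the sizes and repetition counts are merely illustrative:

```python
import random

def fraction_missed(n_items, n_draws, n_repeats=1000, seed=2):
    """Estimate the fraction of recorded calls never chosen when sampling with replacement."""
    rng = random.Random(seed)
    missed = 0
    for _ in range(n_repeats):
        seen = {rng.randrange(n_items) for _ in range(n_draws)}
        missed += n_items - len(seen)
    return missed / (n_items * n_repeats)

print(fraction_missed(100, 100))   # about 1/e   = 0.368
print(fraction_missed(100, 200))   # about 1/e^2 = 0.135
```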
It should be noted that obviously the same sequence of random choices should be used in each comparative trial since this will reduce the variability in the comparisons (without improving the reliability of any one simulation).
Exercises 10.4
10.4–1 Show that 1/e² ≈ 0.135 is a good estimate for the fraction of items missed when taking 200 samples from 100 items, sampling with replacement.
10.4–2 Similarly, show that for 300 samples you will miss about 5% of the items.
10.4–3 Show how to randomly sample without replacement a list of n items.
It may be that from other sources you have reason to believe that the interarrival times of the calls are closely modeled by an appropriate exponential distribution. If this assumption is correct then you can use the recorded data only for the lengths of calls, and use random times from the appropriate exponential distribution to simulate the times at which the calls occur.
If the assumption of the appropriate exponential distribution is indeed valid then, since you have put in more information, you will get better results. But if it is not an accurate representation then you have done yourself harm. It is the same situation as in the use of nonparametric or parametric statistics; nonparametric statistics gives poorer bounds than parametric statistics, but if the assumed model is seriously wrong then the parametric result is almost surely worse! Similarly, when you model the time of occurrence and the duration of the calls, then insofar as you get them correct the simulation will be better, and insofar as you get them wrong you will get worse answers, even though the statistical estimates will indicate better results!
There are, however, other aspects that need to be discussed. In all the above there is clearly the assumption that the future will be like the past. Of course if you have knowledge of trends these can be incorporated into the simulation—at the risk of being worse off if your trend assumptions are bad. There is always the assumption that the sample you originally have is typical and not peculiar, so there is always the problem of the original data being misleading.
Another aspect, greatly neglected by the experts, is the question of the believability of the simulation by those who have the power of decision. In an inventory simulation, once I realized that the vice presidents who would make the decision were quite intelligent but wary of highfalutin mathematics (perhaps justly so!), I used only the simplest modeling, and I could then state, “If you had used the proposed inventory rules at the time the data was gathered, then you would have seen the results we exhibited (with the quibble that some few events were missed and some few would not have occurred).” This simulation carried enough conviction to produce further work on their part.
Therefore, before you start any simulation, thought should be given to this point—for whom is the simulation being done? Who has the power of decision? If they are not willing to act on your simulation then for whom is it being done? Yourself? (The latter may well be true!) The crude simulation mentioned above produced action, especially as we could (and did) turn over the crude method to their people to be run on their machines. We did not remain in the computing loop beyond teaching them how to run the simulation and what it meant, and then standing beside them, as it were, whenever they asked for help. Thus the use of elaborate simulation models to meet ideal conditions, rather than practical needs, is often foolish. On the other hand, if the deciders lack confidence in their own judgments then they are apt to want every known effect included, whether or not it is significant. How you go about a simulation depends somewhat on your understanding of the minds of the consumers.
It is remarkable how many simple simulations can be done with merely paper and pencil, and at most a hand calculator. Such simulations often reveal “what is going on” so that you can then approach the problem with the proper analytical tools and get solutions which are “understandable by the human mind” rather than get tables of numbers and pictures of curves. Indeed, after every simulation the results should be closely inspected in the hopes of gaining insight and hence long term progress.
It should be noted that there are often situations in which it is difficult to describe the system under study in standard mathematical notation. In many of the above kinds of simulations all one needs are the logical relationships between the parts, and when and where things happen to the system. This is a great advantage in ill-explored areas where the backlog of mathematical formulations and results has not yet been built up.
We have already illustrated in the text a number of thought (gedanken) simulations. For example, in the gold and silver coin problem, Example 1.9–1 and following, we imagined several different simulations, noting which cases were eliminated, how many trials there were, and how many were successes. From this we understood why the analytical results were reasonable. Had we not had the analytical result the corresponding thinking might well have led us to a good estimate of the result, and likely indicated how to get the analytical answer.
This method of thought experiments is very useful when not much is known or understood; you merely imagine doing the simulation, and then by mentally watching it you often see the underlying features of the problem. It may, at times, be worth thinking of the detailed programming of the proposed simulation on some computer even if you have no intention of doing it—the act of programming, with its demands on explicitly describing all the details, often clarifies in your mind the muddled situation you started with.
At present the name Monte Carlo (which comes from the name of the famous gambling place) is used to refer to any simulation that uses random numbers. Originally it was used for deterministic problems for which there was a corresponding random problem leading to the same set of equations. Thus solving the random problem gave an estimate of the solution of the corresponding deterministic problem. Buffon’s needle is an example of replacing the computation of a perfectly definite number, π, with the simulation of a random situation that has the same solution.
If you are going to do a large Monte Carlo problem then it is wise to look at the topic of “swindles,” the deliberate introduction of negative correlations to reduce the variability. For example, in the Buffon needle problem if we replace the needle by a cross of two needles, then on any one toss if one bar of the cross does not intersect a line then it is highly likely that the other bar will produce an intersection. In this fashion we see the reduction of the variance and hence the reduction in the number of trials (for a given reliability) used to estimate the number.
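Here is a minimal sketch of the crossed-needle idea in Python. The needle length, line spacing, and toss counts are illustrative, and this is just one way to realize the swindle, not necessarily the form used in practice:

```python
import math, random

def buffon_pi(n_tosses, use_cross, seed=3):
    """Estimate pi by Buffon's needle (needle length 1, line spacing 1).

    With use_cross=True each toss also scores a second arm perpendicular to
    the first; the two crossing indicators are negatively correlated, which
    lowers the variance of the estimate (the 'swindle').
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n_tosses):
        y = 0.5 * rng.random()             # distance from needle center to nearest line
        theta = math.pi * rng.random()     # needle angle
        total += (y <= 0.5 * abs(math.sin(theta)))
        count += 1
        if use_cross:
            total += (y <= 0.5 * abs(math.cos(theta)))   # perpendicular arm
            count += 1
    p_hat = total / count                  # estimated crossing probability, true value 2/pi
    return 2.0 / p_hat

print(buffon_pi(100_000, use_cross=False))
print(buffon_pi(100_000, use_cross=True))  # typically closer to pi for the same number of tosses
```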
10.8 Some Simple Distributions
One usually has a random number generator that has been supplied, either on the hand calculator you own, on the personal computer you have, or on the large central computing facility. The random number generator usually produces 1/4 of all the possible numbers that can be represented in the format used, and these are uniformly distributed in the range 0 to 1 [H, p. 138]. The generator goes through the whole set and then exactly repeats the sequence. Thus one may regard it as doing a random sampling without replacement. If you were to remove the lower digits of the numbers from the generator then you would have each number in the sequence repeated exactly the same number of times by the time you got around the whole loop. If you want truly random numbers, meaning among other things sampling with replacement, then the random number generator does not deliver what is wanted.
After years of effort on the problem there is still no completely satisfactory random number generator, and unless you are willing to devote a very large amount of effort you will probably settle for the generator supplied. Because there are still a number of very bad generators in use, you are advised, however, to make a few tests of your own, chosen to be appropriate for the planned use, before relying on it.
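As one possibility, here is a sketch of two crude checks, equidistribution and serial correlation, in Python; the particular tests, bin count, and sample size are merely illustrative:

```python
import random

def quick_rng_checks(rand, n=100_000, bins=10):
    """Two crude sanity checks on a generator rand() returning values in [0, 1)."""
    xs = [rand() for _ in range(n)]

    # 1. Equidistribution: each of `bins` equal cells should get about n/bins samples.
    counts = [0] * bins
    for x in xs:
        counts[min(int(x * bins), bins - 1)] += 1
    chi_sq = sum((c - n / bins) ** 2 / (n / bins) for c in counts)

    # 2. Serial correlation between successive values (should be near 0).
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return chi_sq, num / den

print(quick_rng_checks(random.random))
```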
We often need random samples from distributions other than the flat distribution which the random number generator supplies. To get numbers from another distribution f(y), we start with the uniform distribution between 0 and 1 and equate the corresponding points on their respective cumulative distributions, that is, we set

    (cumulative distribution of the uniform at x) = (cumulative distribution of f at y)

Thus we have the equation

    x = F(y)

where F(y) is the integral of f(t) taken from −∞ up to y. If we can analytically invert the function F(y) we have the corresponding transformation

    y = F⁻¹(x)

to make on each random number x from the uniform generator.
Example 10.8–1 The Exponential Distribution
Suppose we want random numbers from the distribution f(y) = exp(−y). As above we get

    x = F(y) = 1 − exp(−y)

Solving for y we get

    y = −ln(1 − x)

but since x is uniform from 0 to 1 we can replace 1 − x by x to get

    y = −ln x

as the transformation.
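A quick numerical check of this transformation, as a Python sketch; the sample size is arbitrary:

```python
import math, random

def exponential_sample(rng):
    """Inverse-transform method: a uniform x becomes an exponential y = -ln x."""
    x = 1.0 - rng.random()        # in (0, 1], so the logarithm is always defined
    return -math.log(x)

rng = random.Random(4)
ys = [exponential_sample(rng) for _ in range(100_000)]
print(sum(ys) / len(ys))          # the distribution f(y) = exp(-y) has mean 1
```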
You cannot analytically invert the cumulative distribution of the normal distribution, so we have to resort to some other approach.
Example 10.8–2 The Normal Distribution
To get random numbers from a normal distribution you can add 12 numbers from the flat generator and subtract 6 (see Example 5.6–2). The mean of the random number generator is 1/2, so the subtraction of 6 from the sum gives a mean of exactly 0. Since the variance of the original flat distribution is 1/12, the sum of 12 independent numbers gives a distribution with variance exactly 1.
The above could be expensive in machine time, and if you are going to spend a large amount of machine time generating numbers from a normal distribution in your simulation then you need to look further into the known methods of getting them more cheaply—however, if you try to develop your own, it may well take more machine time to test out your ideas than you will probably save!
In Table 9.4–1 we saw that even the sum of 3 random numbers from the flat random number generator closely approximated the normal distribution; the sum of 12 naturally does much better. But of course all your numbers will be within ±6σ. Table 9.2–2 shows that the tails of the normal distribution are negligible at 4σ, and they are infinitesimal at 6σ, since only about 2 × 10⁻⁹ of the probability lies outside that range; we can safely ignore the tails except in simulations which depend heavily on their effects.
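A sketch of the sum-of-12 method in Python; the sample size and the printed checks are merely a rough verification:

```python
import random, statistics

def approx_normal(rng):
    """Approximate standard normal: sum of 12 uniforms minus 6 (mean 0, variance 1)."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(5)
zs = [approx_normal(rng) for _ in range(100_000)]
print(statistics.mean(zs), statistics.stdev(zs))   # should be close to 0 and 1
print(max(abs(z) for z in zs))                     # never exceeds 6, unlike a true normal
```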
Example 10.8–3 The Reciprocal Distribution
To get mantissas for simulating floating point numbers from the reciprocal distribution we rely on the observation that successive products of numbers from a flat distribution (provided we shift off any leading digits that are 0) rapidly approach the reciprocal distribution. That is, we use the simple formula, where the xᵢ are the random numbers from the uniform distribution and the Yᵢ are those we want,

    Yᵢ₊₁ = xᵢ₊₁ Yᵢ

with shifting to remove any leading zeros in the products. Tests show that this is remarkably effective [H, p. 144].
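A sketch of the successive-product scheme in Python, with a crude check against the known reciprocal-distribution probability that the leading digit is 1; the check and the constants are illustrative choices:

```python
import math, random

def reciprocal_mantissas(n, rng):
    """Successive products of uniform numbers, renormalized into [0.1, 1) by
    shifting off leading zero digits; the values rapidly approach the reciprocal distribution."""
    y = 1.0 - rng.random()            # uniform in (0, 1], avoids an exact zero
    out = []
    for _ in range(n):
        y *= 1.0 - rng.random()       # next product Y_{i+1} = x_{i+1} * Y_i
        while y < 0.1:                # "shift off" leading zeros (base 10)
            y *= 10.0
        out.append(y)
    return out

rng = random.Random(6)
ys = reciprocal_mantissas(100_000, rng)
frac_leading_1 = sum(1 for y in ys if y < 0.2) / len(ys)
print(frac_leading_1, math.log10(2))  # leading digit 1 should occur about log10(2) = 0.301 of the time
```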
This is not a book on statistics; in such books you can often find methods of generating random numbers from all kinds of distributions, so we will not go on developing them here. It is sufficient to say that most distributions have fairly simple methods of generating random numbers from them.
10.9 Notes on Programming Many Simulations
If you plan to do many different simulations in the course of your career, it is wise to think about taking the time, as you do the individual ones, to build the tools for the class of simulations you expect to meet in the future. Of course you want to get the current simulation done, and to build a general purpose subroutine to do various parts seems to be a waste of time (now), but a good workman builds the tools to do his trade. No small set of subroutines can be given since individuals tend to work in various areas and therefore have different needs. It is foolish to try to build the whole library before you ever start; it is equally foolish never to build any parts at all. Each person should think through their needs. There are, of course, packages that are designed for such purposes, but again the effort to get one, then domesticate it on your particular machine, is apt to be much more than most people initially think it will be. The library programs will, generally speaking, be better than those you build for yourself, but building them yourself means that you will understand their peculiarities, and not be misled by some odd feature you never thought about when doing the simulation—thus potentially vitiating the whole effort with no warning!
Simulations, now that computers are widely available, are very practical, especially in the field of probability, and their value is apt to be overlooked by theoreticians as being beneath their dignity. But amateurs are apt to overestimate their value and reliability.
Crude simulations done early in the problem can give valuable guidance towards how to solve the problem; accurate simulations are often very expensive (but running a personal computer all night does not use a lot of electricity nor produce much wear and tear on the computer).
Even after an analytic result is obtained, the numerical simulation may shed light on why the result is the way it is—in probability problems we often have a poor intuition so that the details of the simulation may illuminate many a dark corner of our experience.