$$\operatorname{Var}(W) = \operatorname{Var}(g(X)) + a^2 \operatorname{Var}(f(X)) + 2a \operatorname{Cov}(f(X), g(X))$$

Simple calculus shows that the preceding is minimized when

$$a = -\frac{\operatorname{Cov}(f(X), g(X))}{\operatorname{Var}(f(X))}$$

and, for this value of $a$,

$$\operatorname{Var}(W) = \operatorname{Var}(g(X)) - \frac{[\operatorname{Cov}(f(X), g(X))]^2}{\operatorname{Var}(f(X))}$$

Because $\operatorname{Var}(f(X))$ and $\operatorname{Cov}(f(X), g(X))$ are usually unknown, the simulated data should be used to estimate these quantities.

Dividing the preceding equation by $\operatorname{Var}(g(X))$ shows that

$$\frac{\operatorname{Var}(W)}{\operatorname{Var}(g(X))} = 1 - \operatorname{Corr}^2(f(X), g(X))$$

where $\operatorname{Corr}(X, Y)$ is the correlation between $X$ and $Y$. Consequently, the use of a control variate will greatly reduce the variance of the simulation estimator whenever $f(X)$ and $g(X)$ are strongly correlated.
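The following minimal Python sketch (an illustration of mine, not from the text) shows the procedure: it estimates $E[g(U)]$ for $g(u) = e^u$ with $U$ uniform on $(0,1)$, using $f(U) = U$, whose mean $1/2$ is known, as the control variate; the coefficient $a$ is estimated from the simulated data as discussed above. The choice of $g$ and $f$ here is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

u = rng.random(n)
g = np.exp(u)          # g(U): quantity whose mean we want; E[e^U] = e - 1
f = u                  # f(U): control variate with known mean E[U] = 1/2

# Estimate a = -Cov(f, g) / Var(f) from the simulated data.
a_hat = -np.cov(f, g)[0, 1] / np.var(f, ddof=1)

raw_estimate = g.mean()
cv_estimate = (g + a_hat * (f - 0.5)).mean()

print("raw estimate:        ", raw_estimate)
print("control-variate est.:", cv_estimate)
print("sample correlation:  ", np.corrcoef(f, g)[0, 1])
```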

Example 11.20

Consider a continuous-time Markov chain that, upon entering state $i$, spends an exponential time with rate $v_i$ in that state before making a transition into some other state, with the transition being into state $j$ with probability $P_{i,j}$, $i \geq 0$, $j \neq i$. Suppose that costs are incurred at rate $C(i) \geq 0$ per unit time whenever the chain is in state $i$, $i \geq 0$. With $X(t)$ equal to the state at time $t$, and $\alpha$ being a constant such that $0 < \alpha < 1$, the quantity

$$W = \int_0^\infty e^{-\alpha t}\, C(X(t))\, dt$$

represents the total discounted cost. For a given initial state, suppose we want to use simulation to estimate $E[W]$. Whereas at first it might seem that we cannot obtain an unbiased estimator without simulating the continuous-time Markov chain for an infinite amount of time (which is clearly impossible), we can make use of the results of Example 5.1, which gives the equivalent expression for $E[W]$:

$$E[W] = E\left[\int_0^T C(X(t))\, dt\right]$$

where $T$ is an exponential random variable with rate $\alpha$ that is independent of the continuous-time Markov chain. Therefore, we can first generate the value of $T$, then generate the states of the continuous-time Markov chain up to time $T$, to obtain the unbiased estimator $\int_0^T C(X(t))\, dt$. Because all the cost rates are nonnegative, this estimator is strongly positively correlated with $T$, which will thus make an effective control variate. ■
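As a concrete sketch (not from the text), the Python code below simulates this estimator for a small, arbitrarily chosen two-state chain, using $T$ as the control variate with known mean $E[T] = 1/\alpha$; the rates, transition probabilities, cost rates, and discount rate are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative two-state continuous-time Markov chain (assumed parameters).
v = np.array([1.0, 2.0])            # exponential holding-time rates v_i
P = np.array([[0.0, 1.0],           # transition probabilities P_{i,j}
              [1.0, 0.0]])
C = np.array([3.0, 1.0])            # cost rates C(i)
alpha = 0.5                         # discount rate
x0 = 0                              # initial state

def discounted_cost_run():
    """One run: returns (integral of C(X(t)) dt on [0, T], T)."""
    T = rng.exponential(1.0 / alpha)     # exponential horizon with rate alpha
    t, state, cost = 0.0, x0, 0.0
    while True:
        hold = rng.exponential(1.0 / v[state])
        if t + hold >= T:
            cost += C[state] * (T - t)
            return cost, T
        cost += C[state] * hold
        t += hold
        state = rng.choice(len(v), p=P[state])

runs = np.array([discounted_cost_run() for _ in range(10_000)])
cost, T = runs[:, 0], runs[:, 1]

a_hat = -np.cov(T, cost)[0, 1] / np.var(T, ddof=1)
print("raw estimate of E[W]:            ", cost.mean())
print("control-variate estimate of E[W]:", (cost + a_hat * (T - 1.0 / alpha)).mean())
```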

Example 11.21

A Queueing System

Let $D_{n+1}$ denote the delay in queue of the $(n+1)$st customer in a queueing system in which the interarrival times are independent and identically distributed (i.i.d.) with distribution $F$ having mean $\mu_F$ and are independent of the service times, which are i.i.d. with distribution $G$ having mean $\mu_G$. If $X_i$ is the interarrival time between arrivals $i$ and $i+1$, and if $S_i$ is the service time of customer $i$, $i \geq 1$, we may write

$$D_{n+1} = g(X_1, \ldots, X_n, S_1, \ldots, S_n)$$

To take into account the possibility that the simulated variables $X_i, S_i$ may by chance be quite different from what might be expected, we can let

$$f(X_1, \ldots, X_n, S_1, \ldots, S_n) = \sum_{i=1}^n (S_i - X_i)$$

As $E[f(\mathbf{X}, \mathbf{S})] = n(\mu_G - \mu_F)$, we could use

$$g(\mathbf{X}, \mathbf{S}) + a\,[\,f(\mathbf{X}, \mathbf{S}) - n(\mu_G - \mu_F)\,]$$

as an estimator of $E[D_{n+1}]$. Since $D_{n+1}$ and $f$ are both increasing functions of $S_i$, $-X_i$, $i = 1, \ldots, n$, it follows from Theorem 11.1 that $f(\mathbf{X}, \mathbf{S})$ and $D_{n+1}$ are positively correlated, and so the simulated estimate of $a$ should turn out to be negative.
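For illustration only, the following Python sketch applies this control variate under assumed exponential interarrival and service distributions, computing the delays with the standard Lindley recursion $D_{k+1} = \max(D_k + S_k - X_k, 0)$; the means, the value of $n$, and the number of runs are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 20                    # estimate E[D_{n+1}]
mu_F, mu_G = 1.0, 0.8     # assumed means of interarrival (F) and service (G) times
n_runs = 10_000

def one_run():
    """Simulate X_1..X_n, S_1..S_n and return (D_{n+1}, sum(S_i - X_i))."""
    X = rng.exponential(mu_F, n)
    S = rng.exponential(mu_G, n)
    d = 0.0
    for k in range(n):                      # Lindley's recursion
        d = max(d + S[k] - X[k], 0.0)
    return d, (S - X).sum()

runs = np.array([one_run() for _ in range(n_runs)])
delay, control = runs[:, 0], runs[:, 1]

a_hat = -np.cov(control, delay)[0, 1] / np.var(control, ddof=1)   # should be negative
estimate = (delay + a_hat * (control - n * (mu_G - mu_F))).mean()
print("estimated a:", a_hat)
print("control-variate estimate of E[D_{n+1}]:", estimate)
```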

If we wanted to estimate the expected sum of the delays in queue of the first $N(T)$ arrivals, then we could use $\sum_{i=1}^{N(T)} S_i$ as our control variable. Indeed, as the arrival process is usually assumed independent of the service times, it follows that

$$E\left[\sum_{i=1}^{N(T)} S_i\right] = E[S]\, E[N(T)]$$

where $E[N(T)]$ can either be computed by the method suggested in Section 7.8 or estimated from the simulation as in Example 11.18. This control variable could also be used if the arrival process were a nonhomogeneous Poisson process with rate $\lambda(t)$; in this case,

$$E[N(T)] = \int_0^T \lambda(t)\, dt$$

11.6.4 Importance Sampling

Let $\mathbf{X} = (X_1, \ldots, X_n)$ denote a vector of random variables having a joint density function (or joint mass function in the discrete case) $f(\mathbf{x}) = f(x_1, \ldots, x_n)$, and suppose that we are interested in estimating

$$\theta = E[h(\mathbf{X})] = \int h(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x}$$

where the preceding is an $n$-dimensional integral. (If the $X_i$ are discrete, then interpret the integral as an $n$-fold summation.)

Suppose that a direct simulation of the random vector $\mathbf{X}$, so as to compute values of $h(\mathbf{X})$, is inefficient, possibly because (a) it is difficult to simulate a random vector having density function $f(\mathbf{x})$, or (b) the variance of $h(\mathbf{X})$ is large, or (c) a combination of (a) and (b).

Another way in which we can use simulation to estimate $\theta$ is to note that if $g(\mathbf{x})$ is another probability density such that $f(\mathbf{x}) = 0$ whenever $g(\mathbf{x}) = 0$, then we can express $\theta$ as

$$\theta = \int \frac{h(\mathbf{x})\, f(\mathbf{x})}{g(\mathbf{x})}\, g(\mathbf{x})\, d\mathbf{x} = E_g\!\left[\frac{h(\mathbf{X})\, f(\mathbf{X})}{g(\mathbf{X})}\right] \tag{11.14}$$

where we have written $E_g$ to emphasize that the random vector $\mathbf{X}$ has joint density $g(\mathbf{x})$.

It follows from Equation (11.14) that $\theta$ can be estimated by successively generating values of a random vector $\mathbf{X}$ having density function $g(\mathbf{x})$ and then using as the estimator the average of the values of $h(\mathbf{X}) f(\mathbf{X})/g(\mathbf{X})$. If a density function $g(\mathbf{x})$ can be chosen so that the random variable $h(\mathbf{X}) f(\mathbf{X})/g(\mathbf{X})$ has a small variance, then this approach—referred to as importance sampling—can result in an efficient estimator of $\theta$.
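As a simple illustration (my own example, not from the text), the sketch below estimates $\theta = P\{Z > 3\}$ for a standard normal $Z$, so $h(x)$ equals 1 if $x > 3$ and 0 otherwise. Sampling from $g$, a normal density with mean 3 and variance 1, places most of the simulated points where $h$ is nonzero, and each sample is weighted by the likelihood ratio $f(x)/g(x)$.

```python
import math
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Target: theta = P{Z > 3} for Z ~ N(0, 1); exact value via the complementary error function.
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))

# Importance sampling: draw X from g = N(3, 1) and weight by the likelihood ratio
# f(x)/g(x) = exp(-x^2/2) / exp(-(x-3)^2/2) = exp(-3x + 9/2).
x = rng.normal(loc=3.0, scale=1.0, size=n)
ratio = np.exp(-3.0 * x + 4.5)
estimates = (x > 3.0) * ratio            # h(X) f(X) / g(X)

print("importance sampling estimate:", estimates.mean())
print("standard error:              ", estimates.std(ddof=1) / math.sqrt(n))
print("exact value:                 ", exact)
```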

Let us now try to obtain a feel for why importance sampling can be useful. To begin, note that $f(\mathbf{X})$ and $g(\mathbf{X})$ represent the respective likelihoods of obtaining the vector $\mathbf{X}$ when $\mathbf{X}$ is a random vector with respective densities $f$ and $g$. Hence, if $\mathbf{X}$ is distributed according to $g$, then it will usually be the case that $f(\mathbf{X})$ will be small in relation to $g(\mathbf{X})$, and thus when $\mathbf{X}$ is simulated according to $g$ the likelihood ratio $f(\mathbf{X})/g(\mathbf{X})$ will usually be small in comparison to 1. However, it is easy to check that its mean is 1:

$$E_g\!\left[\frac{f(\mathbf{X})}{g(\mathbf{X})}\right] = \int \frac{f(\mathbf{x})}{g(\mathbf{x})}\, g(\mathbf{x})\, d\mathbf{x} = \int f(\mathbf{x})\, d\mathbf{x} = 1$$

Thus we see that even though $f(\mathbf{X})/g(\mathbf{X})$ is usually smaller than 1, its mean is equal to 1, implying that it is occasionally large and so will tend to have a large variance. So how can $h(\mathbf{X}) f(\mathbf{X})/g(\mathbf{X})$ have a small variance? The answer is that we can sometimes arrange to choose a density $g$ such that those values of $\mathbf{x}$ for which $f(\mathbf{x})/g(\mathbf{x})$ is large are precisely the values for which $h(\mathbf{x})$ is exceedingly small, and thus the ratio $h(\mathbf{X}) f(\mathbf{X})/g(\mathbf{X})$ is always small. Since this will require that $h(\mathbf{x})$ sometimes be small, importance sampling seems to work best when estimating a small probability; for in this case the function $h(\mathbf{x})$ is equal to 1 when $\mathbf{x}$ lies in some set and is equal to 0 otherwise.

We will now consider how to select an appropriate density $g$. We will find that the so-called tilted densities are useful. Let $M(t) = E_f[e^{tX}] = \int e^{tx} f(x)\, dx$ be the moment generating function corresponding to a one-dimensional density $f$.

Definition 11.2

A density function

$$f_t(x) = \frac{e^{tx} f(x)}{M(t)}$$

is called a tilted density of $f$, $-\infty < t < \infty$.

A random variable with density $f_t$ tends to be larger than one with density $f$ when $t > 0$ and tends to be smaller when $t < 0$.

In certain cases the tilted distributions $f_t$ have the same parametric form as does $f$.

Example 11.22

If $f$ is the exponential density with rate $\lambda$ then

$$f_t(x) = C\, e^{tx}\, \lambda e^{-\lambda x} = \lambda C\, e^{-(\lambda - t)x}$$

where $C = 1/M(t)$ does not depend on $x$. Therefore, for $t < \lambda$, $f_t$ is an exponential density with rate $\lambda - t$.

If $f$ is a Bernoulli probability mass function with parameter $p$, then

$$f(x) = p^x (1-p)^{1-x}, \qquad x = 0, 1$$

Hence, $M(t) = E_f[e^{tX}] = p e^t + 1 - p$, and so

$$f_t(x) = \frac{1}{M(t)}\, (p e^t)^x (1-p)^{1-x} = \left(\frac{p e^t}{p e^t + 1 - p}\right)^{\!x} \left(\frac{1-p}{p e^t + 1 - p}\right)^{\!1-x} \tag{11.15}$$

That is, $f_t$ is the probability mass function of a Bernoulli random variable with parameter

$$p_t = \frac{p e^t}{p e^t + 1 - p}$$

We leave it as an exercise to show that if $f$ is a normal density with parameters $\mu$ and $\sigma^2$ then $f_t$ is a normal density with mean $\mu + \sigma^2 t$ and variance $\sigma^2$. ■
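As a quick numerical check of the exponential case (an illustration of mine, not part of the text), we can sample from the original exponential density $f$, reweight by $e^{tx}/M(t)$, and confirm that the weighted sample mean matches the mean $1/(\lambda - t)$ of an exponential with rate $\lambda - t$; the values of $\lambda$ and $t$ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
lam, t = 2.0, 0.5                  # assumed rate and tilt parameter, with t < lam
n = 200_000

x = rng.exponential(1.0 / lam, n)  # samples from f, exponential with rate lam
M_t = lam / (lam - t)              # moment generating function of f at t
weights = np.exp(t * x) / M_t      # f_t(x) / f(x)

# The weighted sample mean estimates the mean under the tilted density f_t,
# which by the example should equal 1 / (lam - t).
print("estimated mean under f_t:", np.average(x, weights=weights))
print("1 / (lam - t)           :", 1.0 / (lam - t))
```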

In certain situations the quantity of interest is the sum of the independent random variables $X_1, \ldots, X_n$. In this case the joint density $f$ is the product of one-dimensional densities. That is,

$$f(x_1, \ldots, x_n) = f_1(x_1) \cdots f_n(x_n)$$

where $f_i$ is the density function of $X_i$. In this situation it is often useful to generate the $X_i$ according to their tilted densities, with a common choice of $t$ employed.

Example 11.23

Let $X_1, \ldots, X_n$ be independent random variables having respective probability density (or mass) functions $f_i$, for $i = 1, \ldots, n$. Suppose we are interested in approximating the probability that their sum is at least as large as $a$, where $a$ is much larger than the mean of the sum. That is, we are interested in

$$\theta = P\{S \geq a\}$$

where $S = \sum_{i=1}^n X_i$, and where $a > \sum_{i=1}^n E[X_i]$. Letting $I\{S \geq a\}$ equal 1 if $S \geq a$ and letting it be 0 otherwise, we have that

$$\theta = E_{\mathbf{f}}\big[\,I\{S \geq a\}\,\big]$$

where $\mathbf{f} = (f_1, \ldots, f_n)$. Suppose now that we simulate $X_i$ according to the tilted mass function $f_{i,t}$, $i = 1, \ldots, n$, with the value of $t$, $t > 0$, left to be determined. The importance sampling estimator of $\theta$ would then be

$$\hat{\theta} = I\{S \geq a\} \prod_{i=1}^n \frac{f_i(X_i)}{f_{i,t}(X_i)}$$

Now,

$$\frac{f_i(X_i)}{f_{i,t}(X_i)} = M_i(t)\, e^{-t X_i}$$

and so

$$\hat{\theta} = I\{S \geq a\}\, M(t)\, e^{-tS}$$

where $M(t) = \prod_{i=1}^n M_i(t)$ is the moment generating function of $S$. Since $t > 0$ and $I\{S \geq a\}$ is equal to 0 when $S < a$, it follows that

$$I\{S \geq a\}\, e^{-tS} \leq e^{-ta}$$

and so

$$\hat{\theta} \leq M(t)\, e^{-ta}$$

To make the bound on the estimator as small as possible we thus choose $t$, $t > 0$, to minimize $M(t) e^{-ta}$. In doing so, we will obtain an estimator whose value on each iteration is between 0 and $\min_t M(t) e^{-ta}$. It can be shown that the minimizing $t$, call it $t^*$, is such that

$$E_{t^*}[S] = E_{t^*}\!\left[\sum_{i=1}^n X_i\right] = a$$

where, in the preceding, we mean that the expected value is to be taken under the assumption that the distribution of $X_i$ is $f_{i,t^*}$ for $i = 1, \ldots, n$.

For instance, suppose that $X_1, \ldots, X_n$ are independent Bernoulli random variables having respective parameters $p_i$, for $i = 1, \ldots, n$. Then, if we generate the $X_i$ according to their tilted mass functions $p_{i,t}$, $i = 1, \ldots, n$, the importance sampling estimator of $\theta = P\{S \geq a\}$ is

$$\hat{\theta} = I\{S \geq a\}\, e^{-tS} \prod_{i=1}^n \left(p_i e^t + 1 - p_i\right)$$

Since $p_{i,t}$ is the mass function of a Bernoulli random variable with parameter $p_i e^t/(p_i e^t + 1 - p_i)$, it follows that

$$E_t\!\left[\sum_{i=1}^n X_i\right] = \sum_{i=1}^n \frac{p_i e^t}{p_i e^t + 1 - p_i}$$

The value of $t$ that makes the preceding equal to $a$ can be numerically approximated and then utilized in the simulation.

As an illustration, suppose that $n = 20$, $p_i = 0.4$, and $a = 16$. Then

$$E_t[S] = 20\, \frac{0.4\, e^t}{0.4\, e^t + 0.6}$$

Setting this equal to 16 yields, after a little algebra,

$$e^t = 6$$

Thus, if we generate the Bernoullis using the parameter

$$\frac{0.4\, e^t}{0.4\, e^t + 0.6} = 0.8$$

then because

$$M(t) = (0.4\, e^t + 0.6)^{20} \qquad \text{and} \qquad e^{-tS} = (1/6)^S$$

we see that the importance sampling estimator is

$$\hat{\theta} = I\{S \geq 16\}\, (1/6)^S\, 3^{20}$$

It follows from the preceding that

$$\hat{\theta} \leq (1/6)^{16}\, 3^{20} = 81/2^{16} = 0.001236$$

That is, on each iteration the value of the estimator is between 0 and 0.001236. Since, in this case, $\theta$ is the probability that a binomial random variable with parameters 20, 0.4 is at least 16, it can be explicitly computed with the result $\theta = 0.000317$. Hence, the raw simulation estimator $I$, which on each iteration takes the value 0 if the sum of the Bernoullis with parameter 0.4 is less than 16 and takes the value 1 otherwise, will have variance

$$\operatorname{Var}(I) = \theta(1 - \theta) = 3.169 \times 10^{-4}$$

On the other hand, it follows from the fact that $0 \leq \hat{\theta} \leq 0.001236$ that (see Exercise 33)

$$\operatorname{Var}(\hat{\theta}) \leq 2.9131 \times 10^{-7}$$

Example 11.24

Consider a single-server queue in which the times between successive customer arrivals have density function $f$ and the service times have density $g$. Let $D_n$ denote the amount of time that the $n$th arrival spends waiting in queue and suppose we are interested in estimating $\alpha = P\{D_n \geq a\}$ when $a$ is much larger than $E[D_n]$. Rather than generating the successive interarrival and service times according to $f$ and $g$, respectively, they should be generated according to the densities $f_{-t}$ and $g_t$, where $t$ is a positive number to be determined. Note that using these distributions as opposed to $f$ and $g$ will result in smaller interarrival times (since $-t < 0$) and larger service times. Hence, there will be a greater chance that $D_n > a$ than if we had simulated using the densities $f$ and $g$. The importance sampling estimator of $\alpha$ would then be

$$\hat{\alpha} = I\{D_n > a\}\, e^{t(S_n - Y_n)}\, \big[M_f(-t)\, M_g(t)\big]^n$$

where $S_n$ is the sum of the first $n$ interarrival times, $Y_n$ is the sum of the first $n$ service times, and $M_f$ and $M_g$ are the moment generating functions of the densities $f$ and $g$, respectively. The value of $t$ used should be determined by experimenting with a variety of different choices. ■
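To make this concrete, here is a sketch of mine assuming exponential interarrival times (rate $\lambda$) and exponential service times (rate $\mu$), so that $M_f(-t) = \lambda/(\lambda + t)$ and $M_g(t) = \mu/(\mu - t)$ for $0 < t < \mu$, and the tilted densities are again exponential with rates $\lambda + t$ and $\mu - t$; the parameter values and the choice of $t$ are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(6)
lam, mu = 1.0, 2.0          # assumed arrival rate (f) and service rate (g)
n, a = 25, 5.0              # estimate alpha = P{D_n >= a}
t = 0.5                     # tilt parameter, 0 < t < mu; tune by experimentation
n_runs = 20_000

# Moment generating functions of the exponential densities.
Mf_minus_t = lam / (lam + t)      # M_f(-t)
Mg_t = mu / (mu - t)              # M_g(t)

def one_run():
    """Simulate under the tilted densities and return the weighted indicator."""
    X = rng.exponential(1.0 / (lam + t), n)   # f_{-t}: exponential, rate lam + t
    S = rng.exponential(1.0 / (mu - t), n)    # g_t:   exponential, rate mu - t
    d = 0.0
    for k in range(n - 1):                    # Lindley: D_{k+1} = max(D_k + S_k - X_k, 0)
        d = max(d + S[k] - X[k], 0.0)
    weight = np.exp(t * (X.sum() - S.sum())) * (Mf_minus_t * Mg_t) ** n
    return (d >= a) * weight

estimates = np.array([one_run() for _ in range(n_runs)])
print("importance sampling estimate of P{D_n >= a}:", estimates.mean())
```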

11.7 Determining the Number of Runs

Suppose that we are going to use simulation to generate $r$ independent and identically distributed random variables $Y^{(1)}, \ldots, Y^{(r)}$ having mean $\mu$ and variance $\sigma^2$. We are then going to use

$$\bar{Y}_r = \frac{Y^{(1)} + \cdots + Y^{(r)}}{r}$$

as an estimate of $\mu$. The precision of this estimate can be measured by its variance

$$\operatorname{Var}(\bar{Y}_r) = E[(\bar{Y}_r - \mu)^2] = \sigma^2/r$$

Hence, we would want to choose $r$, the number of necessary runs, large enough so that $\sigma^2/r$ is acceptably small. However, the difficulty is that $\sigma^2$ is not known in advance. To get around this, you should initially simulate $k$ runs (where $k \geq 30$) and then use the simulated values $Y^{(1)}, \ldots, Y^{(k)}$ to estimate $\sigma^2$ by the sample variance

$$\sum_{i=1}^k \big(Y^{(i)} - \bar{Y}_k\big)^2 \big/ (k - 1)$$

Based on this estimate of $\sigma^2$ the value of $r$ that attains the desired level of precision can now be determined and an additional $r - k$ runs can be generated.
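A minimal sketch of this two-stage procedure, with an assumed target standard error and an arbitrary example run in which $Y$ is the maximum of ten random numbers:

```python
import math
import numpy as np

rng = np.random.default_rng(7)

def one_run():
    """One simulation run; here Y is the maximum of ten uniforms (illustrative only)."""
    return rng.random(10).max()

target_se = 0.001                       # desired standard error of the final estimate
k = 100                                 # pilot runs (at least about 30)

pilot = np.array([one_run() for _ in range(k)])
s2 = pilot.var(ddof=1)                  # sample-variance estimate of sigma^2

# Choose r so that sigma^2 / r is at most target_se^2, then do the extra runs.
r = max(k, math.ceil(s2 / target_se ** 2))
extra = np.array([one_run() for _ in range(r - k)])

all_runs = np.concatenate([pilot, extra])
print("total runs r:", r)
print("estimate of mu:", all_runs.mean())
print("estimated standard error:", all_runs.std(ddof=1) / math.sqrt(r))
```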

11.8 Generating from the Stationary Distribution of a Markov Chain

11.8.1 Coupling from the Past

Consider an irreducible Markov chain with states $1, \ldots, m$ and transition probabilities $P_{i,j}$, and suppose we want to generate the value of a random variable whose distribution is that of the stationary distribution of this Markov chain. Whereas we could approximately generate such a random variable by arbitrarily choosing an initial state, simulating the resulting Markov chain for a large fixed number of time periods, and then choosing the final state as the value of the random variable, we will now present a procedure that generates a random variable whose distribution is exactly that of the stationary distribution.

If, in theory, we generated the Markov chain starting at time $-\infty$ in any arbitrary state, then the state at time $0$ would have the stationary distribution. So imagine that we do this, and suppose that a different person is to generate the next state at each of these times. Thus, if $X(-n)$, the state at time $-n$, is $i$, then person $-n$ would generate a random variable that is equal to $j$ with probability $P_{i,j}$, $j = 1, \ldots, m$, and the value generated would be the state at time $-(n-1)$. Now suppose that person $-1$ wants to do his random variable generation early. Because he does not know what the state at time $-1$ will be, he generates a sequence of random variables $N_{-1}(i)$, $i = 1, \ldots, m$, where $N_{-1}(i)$, the next state if $X(-1) = i$, is equal to $j$ with probability $P_{i,j}$, $j = 1, \ldots, m$. If it results that $X(-1) = i$, then person $-1$ would report that the state at time $0$ is

$$S_{-1}(i) = N_{-1}(i), \qquad i = 1, \ldots, m$$

(That is, $S_{-1}(i)$ is the simulated state at time $0$ when the simulated state at time $-1$ is $i$.)

Now suppose that person $-2$, hearing that person $-1$ is doing his simulation early, decides to do the same thing. She generates a sequence of random variables $N_{-2}(i)$, $i = 1, \ldots, m$, where $N_{-2}(i)$ is equal to $j$ with probability $P_{i,j}$, $j = 1, \ldots, m$. Consequently, if it is reported to her that $X(-2) = i$, then she will report that $X(-1) = N_{-2}(i)$. Combining this with the early generation of person $-1$ shows that if $X(-2) = i$, then the simulated state at time $0$ is

$$S_{-2}(i) = S_{-1}(N_{-2}(i)), \qquad i = 1, \ldots, m$$

Continuing in the preceding manner, suppose that person $-3$ generates a sequence of random variables $N_{-3}(i)$, $i = 1, \ldots, m$, where $N_{-3}(i)$ is to be the generated value of the next state when $X(-3) = i$. Consequently, if $X(-3) = i$ then the simulated state at time $0$ would be

$$S_{-3}(i) = S_{-2}(N_{-3}(i)), \qquad i = 1, \ldots, m$$

Now suppose we continue the preceding, and so obtain the simulated functions

$$S_{-1}(i),\; S_{-2}(i),\; S_{-3}(i), \ldots, \qquad i = 1, \ldots, m$$

Going backward in time in this manner, we will at some time, say $-r$, have a simulated function $S_{-r}(i)$ that is a constant function. That is, for some state $j$, $S_{-r}(i)$ will equal $j$ for all states $i = 1, \ldots, m$. But this means that no matter what the simulated values from time $-\infty$ to $-r$, we can be certain that the simulated value at time $0$ is $j$. Consequently, $j$ can be taken as the value of a generated random variable whose distribution is exactly that of the stationary distribution of the Markov chain.
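Here is a minimal Python sketch of this coupling-from-the-past procedure for a small chain with an assumed transition matrix (states are labeled 0, 1, 2 in the code): each step back in time generates a fresh function $N_{-n}$ and composes it with the map already built, stopping once the composition is a constant function.

```python
import numpy as np

rng = np.random.default_rng(8)

# Assumed 3-state transition matrix (row i gives the probabilities P_{i,.}).
P = np.array([[0.2, 0.5, 0.3],
              [0.6, 0.2, 0.2],
              [0.3, 0.3, 0.4]])
m = P.shape[0]

def coupling_from_the_past():
    """Return one draw whose distribution is exactly the stationary distribution."""
    S = np.arange(m)               # S(i): simulated state at time 0 given state i at time -n
    while len(set(S)) > 1:         # stop when S is a constant function
        # Go one step further back: generate N(i) ~ P_{i,.} for every state i, then compose.
        N = np.array([rng.choice(m, p=P[i]) for i in range(m)])
        S = S[N]                   # new map: i -> S(N(i))
    return S[0]

draws = np.array([coupling_from_the_past() for _ in range(20_000)])
print("empirical frequencies:  ", np.bincount(draws, minlength=m) / len(draws))

# For comparison, the stationary distribution solved directly from pi P = pi.
A = np.vstack([P.T - np.eye(m), np.ones(m)])
b = np.append(np.zeros(m), 1.0)
print("stationary distribution:", np.linalg.lstsq(A, b, rcond=None)[0])
```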

Example 11.25

Consider a Markov chain with states 1, 2, 3 and suppose that simulation yielded the values

$$N_{-1}(i) = \begin{cases} 3, & \text{if } i = 1 \\ 2, & \text{if } i = 2 \\ 2, & \text{if } i = 3 \end{cases}$$

and

$$N_{-2}(i) = \begin{cases} 1, & \text{if } i = 1 \\ 3, & \text{if } i = 2 \\ 1, & \text{if } i = 3 \end{cases}$$

Then

$$S_{-2}(i) = \begin{cases} 3, & \text{if } i = 1 \\ 2, & \text{if } i = 2 \\ 3, & \text{if } i = 3 \end{cases}$$

If

$$N_{-3}(i) = \begin{cases} 3, & \text{if } i = 1 \\ 1, & \text{if } i = 2 \\ 1, & \text{if } i = 3 \end{cases}$$

then

$$S_{-3}(i) = \begin{cases} 3, & \text{if } i = 1 \\ 3, & \text{if } i = 2 \\ 3, & \text{if } i = 3 \end{cases}$$

Therefore, no matter what the state is at time $-3$, the state at time $0$ will be $3$. ■

Remark

The procedure developed in this section for generating a random variable whose distribution is the stationary distribution of the Markov chain is called coupling from the past.

11.8.2 Another Approach

Consider a Markov chain whose state space is the nonnegative integers. Suppose the chain has stationary probabilities, and denote them by $\pi_i$, $i \geq 0$. We now present another way of simulating a random variable whose distribution is given by the $\pi_i$, $i \geq 0$, which can be utilized if the chain satisfies the following property. Namely, that for some state, which we will call state 0, and some positive number $\alpha$,

$$P_{i,0} \geq \alpha > 0$$

for all states $i$. That is, whatever the current state, the probability that the next state will be 0 is at least some positive value $\alpha$.

To simulate a random variable distributed according to the stationary probabilities, start by simulating the Markov chain in the obvious manner. Namely, whenever the chain is in state $i$, generate a random variable that is equal to $j$ with probability $P_{i,j}$, $j \geq 0$, and then set the next state equal to the generated value of this random variable. In addition, however, whenever a transition into state 0 occurs a coin, whose probability of coming up heads depends on the state from which the transition occurred, is flipped. Specifically, if the transition into state 0 was from state $i$, then the coin flipped has probability $\alpha/P_{i,0}$ of coming up heads. Call such a coin an $i$-coin, $i \geq 0$. If the coin comes up heads then we say that an event has occurred. Consequently, each transition of the Markov chain results in an event with probability $\alpha$, implying that events occur at rate $\alpha$. Now say that an event is an $i$-event if it resulted from a transition out of state $i$; that is, an event is an $i$-event if it resulted from the flip of an $i$-coin. Because $\pi_i$ is the proportion of transitions that are out of state $i$, and each such transition will result in an $i$-event with probability $\alpha$, it follows that the rate at which $i$-events occur is $\alpha \pi_i$. Therefore, the proportion of all events that are $i$-events is $\alpha \pi_i/\alpha = \pi_i$, $i \geq 0$.

Now, suppose that $X_0 = 0$. Fix $i$, and let $I_j$ equal 1 if the $j$th event that occurs is an $i$-event, and let $I_j$ equal 0 otherwise. Because an event always leaves the chain in state 0, it follows that $I_j$, $j \geq 1$, are independent and identically distributed random variables. Because the proportion of the $I_j$ that are equal to 1 is $\pi_i$, we see that

$$\pi_i = \lim_{n \to \infty} \frac{I_1 + \cdots + I_n}{n} = E[I_1] = P(I_1 = 1)$$

where the second equality follows from the strong law of large numbers. Hence, if we let

$$T = \min\{n > 0 : \text{an event occurs at time } n\}$$

denote the time of the first event, then it follows from the preceding that

$$\pi_i = P(I_1 = 1) = P(X_{T-1} = i)$$

As the preceding is true for all states $i$, it follows that $X_{T-1}$, the state of the Markov chain at time $T-1$, has the stationary distribution.
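A small Python sketch of this second procedure, again with an assumed transition matrix in which every row reaches state 0 with probability at least $\alpha$: whenever the chain moves into state 0 from state $i$, a coin with heads probability $\alpha/P_{i,0}$ is flipped, and the state occupied just before the first heads is returned.

```python
import numpy as np

rng = np.random.default_rng(9)

# Assumed transition matrix; every entry of column 0 is at least alpha.
P = np.array([[0.3, 0.4, 0.3],
              [0.2, 0.5, 0.3],
              [0.5, 0.1, 0.4]])
alpha = P[:, 0].min()          # alpha = 0.2 for this matrix
m = P.shape[0]

def stationary_draw():
    """Run the chain from state 0 until the first 'event'; return the prior state."""
    state = 0
    while True:
        prev = state
        state = rng.choice(m, p=P[prev])
        # A transition into state 0 from state `prev` triggers a coin flip
        # with heads probability alpha / P_{prev,0}; heads means an event occurred.
        if state == 0 and rng.random() < alpha / P[prev, 0]:
            return prev        # X_{T-1}: the state from which the event transition was made

draws = np.array([stationary_draw() for _ in range(20_000)])
print("empirical frequencies:  ", np.bincount(draws, minlength=m) / len(draws))

# Stationary distribution computed directly, for comparison.
A = np.vstack([P.T - np.eye(m), np.ones(m)])
b = np.append(np.zeros(m), 1.0)
print("stationary distribution:", np.linalg.lstsq(A, b, rcond=None)[0])
```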

Exercises

*1. Suppose it is relatively easy to simulate from the distributions $F_i$, $i = 1, 2, \ldots, n$. If $n$ is small, how can we simulate from

$$F(x) = \sum_{i=1}^n P_i F_i(x), \qquad P_i \geq 0, \quad \sum_i P_i = 1?$$


Give a method for simulating from

$$F(x) = \begin{cases} \dfrac{1 - e^{-2x} + 2x}{3}, & 0 < x < 1 \\[2mm] \dfrac{3 - e^{-2x}}{3}, & 1 < x < \infty \end{cases}$$

2. Give a method for simulating a negative binomial random variable.

*3. Give a method for simulating a hypergeometric random variable.

4. Suppose we want to simulate a point located at random in a circle of radius $r$ centered at the origin. That is, we want to simulate $X, Y$ having joint density

$$f(x, y) = \frac{1}{\pi r^2}, \qquad x^2 + y^2 \leq r^2$$

(a) Let $R = \sqrt{X^2 + Y^2}$, $\theta = \tan^{-1}(Y/X)$ denote the polar coordinates. Compute the joint density of $R, \theta$ and use this to give a simulation method. Another method for simulating $X, Y$ is as follows:

Step 1: Generate independent random numbers $U_1, U_2$ and set $Z_1 = 2rU_1 - r$, $Z_2 = 2rU_2 - r$. Then $Z_1, Z_2$ is uniform in the square whose sides are of length $2r$ and which encloses the circle of radius $r$ (see Figure 11.6).

Step 2: If $(Z_1, Z_2)$ lies in the circle of radius $r$—that is, if $Z_1^2 + Z_2^2 \leq r^2$—set $(X, Y) = (Z_1, Z_2)$. Otherwise return to step 1.

(b) Prove that this method works, and compute the distribution of the number of random numbers it requires.

5. Suppose it is relatively easy to simulate from $F_i$ for each $i = 1, \ldots, n$. How can we simulate from

(a) $F(x) = \prod_{i=1}^n F_i(x)$?

(b) $F(x) = 1 - \prod_{i=1}^n (1 - F_i(x))$?

(c) Give two methods for simulating from the distribution $F(x) = x^n$, $0 < x < 1$.

*6. In Example 11.4 we simulated the absolute value of a standard normal by using the Von Neumann rejection procedure on exponential random variables with rate 1. This raises the question of whether we could obtain a more efficient algorithm by using a different exponential density—that is, we could use the density $g(x) = \lambda e^{-\lambda x}$. Show that the mean number of iterations needed in the rejection scheme is minimized when $\lambda = 1$.

7. Give an algorithm for simulating a random variable having density function

$$f(x) = 30(x^2 - 2x^3 + x^4), \qquad 0 < x < 1$$

8. Consider the technique of simulating a gamma $(n, \lambda)$ random variable by using the rejection method with $g$ being an exponential density with rate $\lambda/n$.

(a) Show that the average number of iterations of the algorithm needed to generate a gamma is $n^n e^{1-n}/(n-1)!$.

(b) Use Stirling's approximation to show that for large $n$ the answer to part (a) is approximately equal to $e[(n-1)/(2\pi)]^{1/2}$.

(c) Show that the procedure is equivalent to the following:

Step 1: Generate $Y_1$ and $Y_2$, independent exponentials with rate 1.

Step 2: If $Y_1 < (n-1)[Y_2 - \log(Y_2) - 1]$, return to step 1.

Step 3: Set $X = nY_2/\lambda$.

(d) Explain how to obtain an independent exponential along with a gamma from the preceding algorithm.

9. Set up the alias method for simulating from a binomial random variable with parameters $n = 6$, $p = 0.4$.

10. Explain how we can number the $\mathbf{Q}^{(k)}$ in the alias method so that $k$ is one of the two points that $\mathbf{Q}^{(k)}$ gives weight.
Hint: Rather than giving the initial $\mathbf{Q}$ the name $\mathbf{Q}^{(1)}$, what else could we call it?

11. Complete the details of Example 11.10.

12. Let $X_1, \ldots, X_k$ be independent with

$$P\{X_i = j\} = \frac{1}{n}, \qquad j = 1, \ldots, n, \quad i = 1, \ldots, k$$


If $D$ is the number of distinct values among $X_1, \ldots, X_k$ show that

$$E[D] = n\left[1 - \left(\frac{n-1}{n}\right)^{\!k}\right] \approx k - \frac{k^2}{2n} \qquad \text{when } \frac{k^2}{n} \text{ is small}$$

13. The Discrete Rejection Method: Suppose we want to simulate $X$ having probability mass function $P\{X = i\} = P_i$, $i = 1, \ldots, n$, and suppose we can easily simulate from the probability mass function $Q_i$, $\sum_i Q_i = 1$, $Q_i \geq 0$. Let $C$ be such that $P_i \leq C Q_i$, $i = 1, \ldots, n$. Show that the following algorithm generates the desired random variable:

Step 1: Generate $Y$ having mass function $Q$ and $U$ an independent random number.

Step 2: If $U \leq P_Y/CQ_Y$, set $X = Y$. Otherwise return to step 1.

14. The Discrete Hazard Rate Method: Let $X$ denote a nonnegative integer valued random variable. The function $\lambda(n) = P\{X = n \mid X \geq n\}$, $n \geq 0$, is called the discrete hazard rate function.

(a) Show that $P\{X = n\} = \lambda(n) \prod_{i=0}^{n-1} (1 - \lambda(i))$.

(b) Show that we can simulate $X$ by generating random numbers $U_1, U_2, \ldots$, stopping at

$$X = \min\{n : U_n \leq \lambda(n)\}$$

(c) Apply this method to simulating a geometric random variable. Explain, intuitively, why it works.

(d) Suppose that $\lambda(n) \leq p < 1$ for all $n$. Consider the following algorithm for simulating $X$ and explain why it works: Simulate $X_i, U_i$, $i \geq 1$, where $X_i$ is geometric with mean $1/p$ and $U_i$ is a random number. Set $S_k = X_1 + \cdots + X_k$ and let

$$X = \min\{S_k : U_k \leq \lambda(S_k)/p\}$$

15. Suppose you have just simulated a normal random variable $X$ with mean $\mu$ and variance $\sigma^2$. Give an easy way to generate a second normal variable with the same mean and variance that is negatively correlated with $X$.

*16. Suppose $n$ balls having weights $w_1, w_2, \ldots, w_n$ are in an urn. These balls are sequentially removed in the following manner: At each selection, a given ball in the urn is chosen with a probability equal to its weight divided by the sum of the weights of the other balls that are still in the urn. Let $I_1, I_2, \ldots, I_n$ denote the order in which the balls are removed—thus $I_1, \ldots, I_n$ is a random permutation with weights.

(a) Give a method for simulating $I_1, \ldots, I_n$.

(b) Let $X_i$ be independent exponentials with rates $w_i$, $i = 1, \ldots, n$. Explain how the $X_i$ can be utilized to simulate $I_1, \ldots, I_n$.

17. Order Statistics: Let $X_1, \ldots, X_n$ be i.i.d. from a continuous distribution $F$, and let $X_{(i)}$ denote the $i$th smallest of $X_1, \ldots, X_n$, $i = 1, \ldots, n$. Suppose we want to simulate $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$. One approach is to simulate $n$ values from $F$, and then order these values. However, this ordering, or sorting, can be time consuming when $n$ is large.

(a) Suppose that $\lambda(t)$, the hazard rate function of $F$, is bounded. Show how the hazard rate method can be applied to generate the $n$ variables in such a manner that no sorting is necessary.

Suppose now that $F^{-1}$ is easily computed.

(b) Argue that $X_{(1)}, \ldots, X_{(n)}$ can be generated by simulating $U_{(1)} < U_{(2)} < \cdots < U_{(n)}$—the ordered values of $n$ independent random numbers—and then setting $X_{(i)} = F^{-1}(U_{(i)})$. Explain why this means that $X_{(i)}$ can be generated from $F^{-1}(\beta_i)$ where $\beta_i$ is beta with parameters $i, n+1-i$.

(c) Argue that $U_{(1)}, \ldots, U_{(n)}$ can be generated, without any need for sorting, by simulating i.i.d. exponentials $Y_1, \ldots, Y_{n+1}$ and then setting

$$U_{(i)} = \frac{Y_1 + \cdots + Y_i}{Y_1 + \cdots + Y_{n+1}}, \qquad i = 1, \ldots, n$$


Hint: Given the time of the $(n+1)$st event of a Poisson process, what can be said about the set of times of the first $n$ events?

(d) Show that if $U_{(n)} = y$ then $U_{(1)}, \ldots, U_{(n-1)}$ has the same joint distribution as the order statistics of a set of $n-1$ uniform $(0, y)$ random variables.

(e) Use part (d) to show that $U_{(1)}, \ldots, U_{(n)}$ can be generated as follows:

Step 1: Generate random numbers $U_1, \ldots, U_n$.

Step 2: Set

$$U_{(n)} = U_1^{1/n}, \qquad U_{(n-1)} = U_{(n)} (U_2)^{1/(n-1)}, \qquad U_{(j-1)} = U_{(j)} (U_{n-j+2})^{1/(j-1)}, \quad j = 2, \ldots, n-1$$

18. Let $X_1, \ldots, X_n$ be independent exponential random variables each having rate 1. Set

$$W_1 = X_1/n, \qquad W_i = W_{i-1} + \frac{X_i}{n - i + 1}, \quad i = 2, \ldots, n$$


Explain why $W_1, \ldots, W_n$ has the same joint distribution as the order statistics of a sample of $n$ exponentials each having rate 1.

19. Suppose we want to simulate a large number $n$ of independent exponentials with rate 1—call them $X_1, X_2, \ldots, X_n$. If we were to employ the inverse transform technique we would require one logarithmic computation for each exponential generated. One way to avoid this is to first simulate $S_n$, a gamma random variable with parameters $(n, 1)$ (say, by the method of Section 11.3.3). Now interpret $S_n$ as the time of the $n$th event of a Poisson process with rate 1 and use the result that given $S_n$ the set of the first $n-1$ event times is distributed as the set of $n-1$ independent uniform $(0, S_n)$ random variables. Based on this, explain why the following algorithm simulates $n$ independent exponentials:

Step 1: Generate $S_n$, a gamma random variable with parameters $(n, 1)$.

Step 2: Generate $n-1$ random numbers $U_1, U_2, \ldots, U_{n-1}$.

Step 3: Order the $U_i$, $i = 1, \ldots, n-1$, to obtain $U_{(1)} < U_{(2)} < \cdots < U_{(n-1)}$.

Step 4: Let $U_{(0)} = 0$, $U_{(n)} = 1$, and set $X_i = S_n (U_{(i)} - U_{(i-1)})$, $i = 1, \ldots, n$.

When the ordering (step 3) is performed according to the algorithm described in Section 11.5, the preceding is an efficient method for simulating $n$ exponentials when all $n$ are simultaneously required. If memory space is limited, however, and the exponentials can be employed sequentially, discarding each exponential from memory once it has been used, then the preceding may not be appropriate.

20. Consider the following procedure for randomly choosing a subset of size $k$ from the numbers $1, 2, \ldots, n$: Fix $p$ and generate the first $n$ time units of a renewal process whose interarrival distribution is geometric with mean $1/p$—that is, $P\{\text{interarrival time} = k\} = p(1-p)^{k-1}$, $k = 1, 2, \ldots$. Suppose events occur at times $i_1 < i_2 < \cdots < i_m \leq n$. If $m = k$, stop; $i_1, \ldots, i_m$ is the desired set. If $m > k$, then randomly choose (by some method) a subset of size $k$ from $i_1, \ldots, i_m$ and then stop. If $m < k$, take $i_1, \ldots, i_m$ as part of the subset of size $k$ and then select (by some method) a random subset of size $k - m$ from the set $\{1, 2, \ldots, n\} - \{i_1, \ldots, i_m\}$. Explain why this algorithm works. As $E[N(n)] = np$, a reasonable choice of $p$ is to take $p \approx k/n$. (This approach is due to Dieter.)

21. Consider the following algorithm for generating a random permutation of the elements $1, 2, \ldots, n$. In this algorithm, $P(i)$ can be interpreted as the element in position $i$.

Step 1: Set $k = 1$.

Step 2: Set $P(1) = 1$.

Step 3: If $k = n$, stop. Otherwise, let $k = k + 1$.

Step 4: Generate a random number $U$, and let

$$P(k) = P([kU] + 1), \qquad P([kU] + 1) = k.$$
Go to step 3.

(a) Explain in words what the algorithm is doing.

(b) Show that at iteration $k$—that is, when the value of $P(k)$ is initially set—$P(1), P(2), \ldots, P(k)$ is a random permutation of $1, 2, \ldots, k$.


Hint: Use induction and argue that

$$P_k\{i_1, i_2, \ldots, i_{j-1}, k, i_j, \ldots, i_{k-2}, i\} = P_{k-1}\{i_1, i_2, \ldots, i_{j-1}, i, i_j, \ldots, i_{k-2}\}\, \frac{1}{k} = \frac{1}{k!} \qquad \text{by the induction hypothesis}$$

The preceding algorithm can be used even if $n$ is not initially known.

22. Verify that if we use the hazard rate approach to simulate the event times of a nonhomogeneous Poisson process whose intensity function $\lambda(t)$ is such that $\lambda(t) \leq \lambda$, then we end up with the approach given in method 1 of Section 11.5.

*23. For a nonhomogeneous Poisson process with intensity function $\lambda(t)$, $t \geq 0$, where $\int_0^\infty \lambda(t)\, dt = \infty$, let $X_1, X_2, \ldots$ denote the sequence of times at which events occur.

(a) Show that $\int_0^{X_1} \lambda(t)\, dt$ is exponential with rate 1.

(b) Show that $\int_{X_{i-1}}^{X_i} \lambda(t)\, dt$, $i \geq 1$, are independent exponentials with rate 1, where $X_0 = 0$.

In words, independent of the past, the additional amount of hazard that must be experienced until an event occurs is exponential with rate 1.

24. Give an efficient method for simulating a nonhomogeneous Poisson process with intensity function

$$\lambda(t) = b + \frac{1}{t + a}, \qquad t \geq 0$$

25. Let $(X, Y)$ be uniformly distributed in a circle of radius $r$ about the origin. That is, their joint density is given by

$$f(x, y) = \frac{1}{\pi r^2}, \qquad 0 \leq x^2 + y^2 \leq r^2$$

Let $R = \sqrt{X^2 + Y^2}$ and $\theta = \arctan(Y/X)$ denote their polar coordinates. Show that $R$ and $\theta$ are independent with $\theta$ being uniform on $(0, 2\pi)$ and $P\{R < a\} = a^2/r^2$, $0 < a < r$.

26. Let $R$ denote a region in the two-dimensional plane. Show that for a two-dimensional Poisson process, given that there are $n$ points located in $R$, the points are independently and uniformly distributed in $R$—that is, their density is $f(x, y) = c$, $(x, y) \in R$, where $c$ is the inverse of the area of $R$.

27. Let $X_1, \ldots, X_n$ be independent random variables with $E[X_i] = \theta$, $\operatorname{Var}(X_i) = \sigma_i^2$, $i = 1, \ldots, n$, and consider estimates of $\theta$ of the form $\sum_{i=1}^n \lambda_i X_i$ where $\sum_{i=1}^n \lambda_i = 1$. Show that $\operatorname{Var}\left(\sum_{i=1}^n \lambda_i X_i\right)$ is minimized when

$$\lambda_i = \frac{1/\sigma_i^2}{\sum_{j=1}^n 1/\sigma_j^2}, \qquad i = 1, \ldots, n.$$


Possible Hint: If you cannot do this for general $n$, try it first when $n = 2$.
The following two problems are concerned with the estimation of $\int_0^1 g(x)\, dx = E[g(U)]$ where $U$ is uniform $(0, 1)$.

28. The Hit–Miss Method: Suppose $g$ is bounded in $[0, 1]$—for instance, suppose $0 \leq g(x) \leq b$ for $x \in [0, 1]$. Let $U_1, U_2$ be independent random numbers and set $X = U_1$, $Y = bU_2$—so the point $(X, Y)$ is uniformly distributed in a rectangle of length 1 and height $b$. Now set

$$I = \begin{cases} 1, & \text{if } Y < g(X) \\ 0, & \text{otherwise} \end{cases}$$


That is, accept $(X, Y)$ if it falls in the shaded area of Figure 11.7.

(a) Show that $E[bI] = \int_0^1 g(x)\, dx$.

(b) Show that $\operatorname{Var}(bI) \geq \operatorname{Var}(g(U))$, and so hit–miss has larger variance than simply computing $g$ of a random number.

29. Stratified Sampling: Let $U_1, \ldots, U_n$ be independent random numbers and set $\bar{U}_i = (U_i + i - 1)/n$, $i = 1, \ldots, n$. Hence, $\bar{U}_i$, $i \geq 1$, is uniform on $((i-1)/n, i/n)$. The quantity $\sum_{i=1}^n g(\bar{U}_i)/n$ is called the stratified sampling estimator of $\int_0^1 g(x)\, dx$.

(a) Show that $E\left[\sum_{i=1}^n g(\bar{U}_i)/n\right] = \int_0^1 g(x)\, dx$.

(b) Show that $\operatorname{Var}\left[\sum_{i=1}^n g(\bar{U}_i)/n\right] \leq \operatorname{Var}\left[\sum_{i=1}^n g(U_i)/n\right]$.


Hint: Let $U$ be uniform $(0, 1)$ and define $N$ by $N = i$ if $(i-1)/n < U < i/n$, $i = 1, \ldots, n$. Now use the conditional variance formula to obtain

$$\operatorname{Var}(g(U)) = E[\operatorname{Var}(g(U) \mid N)] + \operatorname{Var}(E[g(U) \mid N]) \geq E[\operatorname{Var}(g(U) \mid N)] = \sum_{i=1}^n \frac{\operatorname{Var}(g(U) \mid N = i)}{n} = \sum_{i=1}^n \frac{\operatorname{Var}[g(\bar{U}_i)]}{n}$$

30. If $f$ is the density function of a normal random variable with mean $\mu$ and variance $\sigma^2$, show that the tilted density $f_t$ is the density of a normal random variable with mean $\mu + \sigma^2 t$ and variance $\sigma^2$.

31. Consider a queueing system in which each service time, independent of the past, has mean $\mu$. Let $W_n$ and $D_n$ denote, respectively, the amounts of time customer $n$ spends in the system and in queue. Hence, $D_n = W_n - S_n$ where $S_n$ is the service time of customer $n$. Therefore,

$$E[D_n] = E[W_n] - \mu$$

If we use simulation to estimate $E[D_n]$, should we

(a) use the simulated data to determine $D_n$, which is then used as an estimate of $E[D_n]$; or

(b) use the simulated data to determine $W_n$ and then use this quantity minus $\mu$ as an estimate of $E[D_n]$?

Repeat for when we want to estimate $E[W_n]$.

*32. Show that if $X$ and $Y$ have the same distribution then

$$\operatorname{Var}((X + Y)/2) \leq \operatorname{Var}(X)$$


Hence, conclude that the use of antithetic variables can never increase variance (though it need not be as efficient as generating an independent set of random numbers).

33. If $0 \leq X \leq a$, show that

(a) $E[X^2] \leq a E[X]$,

(b) $\operatorname{Var}(X) \leq E[X](a - E[X])$,

(c) $\operatorname{Var}(X) \leq a^2/4$.

34. Suppose in Example 11.19 that no new customers are allowed in the system after time $t_0$. Give an efficient simulation estimator of the expected additional time after $t_0$ until the system becomes empty.

35. Suppose we are able to simulate independent random variables $X$ and $Y$. If we simulate $2k$ independent random variables $X_1, \ldots, X_k$ and $Y_1, \ldots, Y_k$, where the $X_i$ have the same distribution as does $X$, and the $Y_j$ have the same distribution as does $Y$, how would you use them to estimate $P(X < Y)$?

36. If $U_1, U_2, U_3$ are independent uniform $(0, 1)$ random variables, find $P\left\{\prod_{i=1}^3 U_i > 0.1\right\}$.

Hint: Relate the desired probability to one about a Poisson process.

Figure 11.6

Figure 11.7