
Substituting the identities $E[X_n] = 1$, $E[X_n^2] = 2$ in the preceding shows that

$$E[S_n] = n + n^2/2$$

and the induction proof is complete.

(c) If we let $C_j$ denote the number of hats chosen by person $j$, $j = 1, \ldots, n$, then

$$\sum_{j=1}^{n} C_j = S_n$$

Taking expectations, and using the fact that each $C_j$ has the same mean, yields the result

$$E[C_j] = E[S_n]/n = 1 + n/2$$

Hence, the expected number of false selections by person $j$ is

$$E[C_j - 1] = n/2.$$

Example 3.16

Analyzing the Quick-Sort Algorithm

Suppose we are given a set of $n$ distinct values—$x_1, \ldots, x_n$—and we desire to put these values in increasing order or, as it is commonly called, to sort them. An efficient procedure for accomplishing this is the quick-sort algorithm, which is defined recursively as follows: When $n = 2$ the algorithm compares the two values and puts them in the appropriate order. When $n > 2$ it starts by choosing at random one of the $n$ values—say, $x_i$—and then compares each of the other $n - 1$ values with $x_i$, noting which are smaller and which are larger than $x_i$. Letting $S_i$ denote the set of elements smaller than $x_i$, and $\bar{S}_i$ the set of elements greater than $x_i$, the algorithm now sorts the set $S_i$ and the set $\bar{S}_i$. The final ordering, therefore, consists of the ordered set of the elements in $S_i$, then $x_i$, and then the ordered set of the elements in $\bar{S}_i$. For instance, suppose that the set of elements is 10, 5, 8, 2, 1, 4, 7. We start by choosing one of these values at random (that is, each of the 7 values has probability $1/7$ of being chosen). Suppose, for instance, that the value 4 is chosen. We then compare 4 with each of the other six values to obtain

$$\{2, 1\},\ 4,\ \{10, 5, 8, 7\}$$

We now sort the set {2, 1} to obtain

$$1,\ 2,\ 4,\ \{10, 5, 8, 7\}$$

Next we choose a value at random from $\{10, 5, 8, 7\}$—say 7 is chosen—and compare each of the other three values with 7 to obtain

$$1,\ 2,\ 4,\ 5,\ 7,\ \{10, 8\}$$

Finally, we sort $\{10, 8\}$ to end up with

$$1,\ 2,\ 4,\ 5,\ 7,\ 8,\ 10$$
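The recursive description above translates directly into code. Below is a minimal Python sketch (the function and variable names are my own, not from the text) that sorts a list this way while counting the comparisons made:

```python
import random

def quick_sort(values):
    """Randomized quick-sort; returns (sorted list, number of comparisons)."""
    if len(values) <= 1:
        return values, 0
    pivot = random.choice(values)                 # the randomly chosen x_i
    smaller = [x for x in values if x < pivot]    # the set S_i
    larger = [x for x in values if x > pivot]     # the set S-bar_i
    comparisons = len(values) - 1                 # each other value vs. the pivot
    sorted_smaller, c1 = quick_sort(smaller)
    sorted_larger, c2 = quick_sort(larger)
    return sorted_smaller + [pivot] + sorted_larger, comparisons + c1 + c2

print(quick_sort([10, 5, 8, 2, 1, 4, 7]))
```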

One measure of the effectiveness of this algorithm is the expected number of comparisons that it makes. Let us denote by $M_n$ the expected number of comparisons needed by the quick-sort algorithm to sort a set of $n$ distinct values. To obtain a recursion for $M_n$ we condition on the rank of the initial value selected to obtain

$$M_n = \sum_{j=1}^{n} E[\text{number of comparisons} \mid \text{value selected is } j\text{th smallest}]\,\frac{1}{n}$$

Now, if the initial value selected is the $j$th smallest, then the set of values smaller than it is of size $j - 1$, and the set of values greater than it is of size $n - j$. Hence, as $n - 1$ comparisons with the initial value chosen must be made, we see that

$$M_n = \sum_{j=1}^{n}\left(n - 1 + M_{j-1} + M_{n-j}\right)\frac{1}{n} = n - 1 + \frac{2}{n}\sum_{k=1}^{n-1} M_k \qquad (\text{since } M_0 = 0)$$

or, equivalently,

$$nM_n = n(n-1) + 2\sum_{k=1}^{n-1} M_k$$

To solve the preceding, note that upon replacing $n$ by $n + 1$ we obtain

$$(n+1)M_{n+1} = (n+1)n + 2\sum_{k=1}^{n} M_k$$

Hence, upon subtraction,

$$(n+1)M_{n+1} - nM_n = 2n + 2M_n$$

or

$$(n+1)M_{n+1} = (n+2)M_n + 2n$$

Therefore,

$$\frac{M_{n+1}}{n+2} = \frac{2n}{(n+1)(n+2)} + \frac{M_n}{n+1}$$

Iterating this gives

$$\frac{M_{n+1}}{n+2} = \frac{2n}{(n+1)(n+2)} + \frac{2(n-1)}{n(n+1)} + \frac{M_{n-1}}{n} = \cdots = 2\sum_{k=0}^{n-1} \frac{n-k}{(n+1-k)(n+2-k)} \qquad (\text{since } M_1 = 0)$$

Hence,

$$M_{n+1} = 2(n+2)\sum_{k=0}^{n-1} \frac{n-k}{(n+1-k)(n+2-k)} = 2(n+2)\sum_{i=1}^{n} \frac{i}{(i+1)(i+2)}, \qquad n \geq 1$$

Using the identity $i/[(i+1)(i+2)] = 2/(i+2) - 1/(i+1)$, we can approximate $M_{n+1}$ for large $n$ as follows:

$$\begin{aligned}
M_{n+1} &= 2(n+2)\left[\sum_{i=1}^{n} \frac{2}{i+2} - \sum_{i=1}^{n} \frac{1}{i+1}\right] \\
&\approx 2(n+2)\left[\int_{3}^{n+2} \frac{2}{x}\,dx - \int_{2}^{n+1} \frac{1}{x}\,dx\right] \\
&= 2(n+2)\left[2\log(n+2) - \log(n+1) + \log 2 - 2\log 3\right] \\
&= 2(n+2)\left[\log(n+2) + \log\frac{n+2}{n+1} + \log 2 - 2\log 3\right] \\
&\approx 2(n+2)\log(n+2)
\end{aligned}$$
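As a numerical sanity check, one can iterate the recursion $nM_n = n(n-1) + 2\sum_{k=1}^{n-1} M_k$ directly and compare $M_n$ with the approximation $2(n+1)\log(n+1)$ (the form the result takes with $n+1$ replaced by $n$). A small Python sketch, with names of my own choosing:

```python
import math

def expected_comparisons(n_max):
    """M_n from the recursion M_n = n - 1 + (2/n) sum_{k=1}^{n-1} M_k, with M_0 = M_1 = 0."""
    M = [0.0] * (n_max + 1)
    partial_sum = 0.0                    # accumulates M_1 + ... + M_{n-1}
    for n in range(2, n_max + 1):
        partial_sum += M[n - 1]
        M[n] = n - 1 + 2.0 * partial_sum / n
    return M

M = expected_comparisons(10_000)
for n in (10, 100, 1_000, 10_000):
    # exact M_n vs. the asymptotic approximation 2(n+1)log(n+1)
    print(n, round(M[n], 1), round(2 * (n + 1) * math.log(n + 1), 1))
```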

Although we usually employ the conditional expectation identity to simplify the computation of an unconditional expectation, our next example shows how it can sometimes be used to obtain a conditional expectation.

Example 3.17

In the match problem of Example 2.31 involving $n$ individuals, $n > 1$, find the conditional expected number of matches given that the first person did not have a match.

Solution: Let $X$ denote the number of matches, and let $X_1$ equal 1 if the first person has a match and 0 otherwise. Then,

$$E[X] = E[X \mid X_1 = 0]P\{X_1 = 0\} + E[X \mid X_1 = 1]P\{X_1 = 1\} = E[X \mid X_1 = 0]\frac{n-1}{n} + E[X \mid X_1 = 1]\frac{1}{n}$$

But, from Example 2.31,

$$E[X] = 1$$

Moreover, given that the first person has a match, the expected number of matches is equal to 1 plus the expected number of matches when $n - 1$ people select among their own $n - 1$ hats, showing that

$$E[X \mid X_1 = 1] = 2$$

Therefore, we obtain the result

$$E[X \mid X_1 = 0] = \frac{n-2}{n-1}$$
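A quick Monte Carlo check of this result, under the usual formulation of the match problem in which a match means that person $i$ selects hat $i$ (the function name here is hypothetical):

```python
import random

def estimate_conditional_matches(n, trials=200_000):
    """Estimate E[X | X_1 = 0]: average matches over trials where person 1 has no match."""
    total_matches, conditioning_count = 0, 0
    for _ in range(trials):
        hats = list(range(n))
        random.shuffle(hats)
        if hats[0] == 0:                 # person 1 matched; excluded by the conditioning
            continue
        total_matches += sum(1 for i, h in enumerate(hats) if i == h)
        conditioning_count += 1
    return total_matches / conditioning_count

n = 5
print(estimate_conditional_matches(n), (n - 2) / (n - 1))   # both near 0.75
```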

3.4.1 Computing Variances by Conditioning

Conditional expectations can also be used to compute the variance of a random variable. Specifically, we can use

$$\operatorname{Var}(X) = E[X^2] - (E[X])^2$$

and then use conditioning to obtain both $E[X]$ and $E[X^2]$. We illustrate this technique by determining the variance of a geometric random variable.

Another way to use conditioning to obtain the variance of a random variable is to apply the conditional variance formula. The conditional variance of $X$ given that $Y = y$ is defined by

$$\operatorname{Var}(X \mid Y = y) = E\left[\left(X - E[X \mid Y = y]\right)^2 \,\middle|\, Y = y\right]$$

That is, the conditional variance is defined in exactly the same manner as the ordinary variance with the exception that all probabilities are determined conditional on the event that Y=yimage. Expanding the right side of the preceding and taking expectation term by term yields

$$\operatorname{Var}(X \mid Y = y) = E[X^2 \mid Y = y] - \left(E[X \mid Y = y]\right)^2$$

Letting $\operatorname{Var}(X \mid Y)$ denote that function of $Y$ whose value when $Y = y$ is $\operatorname{Var}(X \mid Y = y)$, we have the following result.

Proposition 3.1

The Conditional Variance Formula

$$\operatorname{Var}(X) = E[\operatorname{Var}(X \mid Y)] + \operatorname{Var}(E[X \mid Y]) \tag{3.7}$$
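Equation (3.7) can be verified numerically on any concrete model. The sketch below uses a toy model of my own choosing—$Y$ a fair die roll and, given $Y = y$, $X$ binomial with parameters $(y, 1/2)$—so that both sides of the formula can be compared:

```python
import random

# For this model: Var(X | Y) = Y/4, so E[Var(X | Y)] = E[Y]/4 = 3.5/4;
# E[X | Y] = Y/2, so Var(E[X | Y]) = Var(Y)/4 = (35/12)/4.
trials = 500_000
samples = []
for _ in range(trials):
    y = random.randint(1, 6)                           # Y: a fair die roll
    x = sum(random.random() < 0.5 for _ in range(y))   # X | Y = y ~ Binomial(y, 1/2)
    samples.append(x)
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(var, 3.5 / 4 + (35 / 12) / 4)                    # both near 1.604
```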

Example 3.20

The Variance in the Matching Rounds Problem

Consider the matching rounds problem of Example 3.14, and let $V_n = \operatorname{Var}(R_n)$ denote the variance of the number of rounds needed when there are initially $n$ people. Using the conditional variance formula, we will show that

$$V_n = n, \qquad n \geq 2$$

The proof of the preceding is by induction on $n$. To begin, note that when $n = 2$ the number of rounds needed is geometric with parameter $p = 1/2$, and so

$$V_2 = \frac{1-p}{p^2} = 2$$

So assume the induction hypothesis that

$$V_j = j, \qquad 2 \leq j < n$$

and now consider the case when there are $n$ individuals. If $X$ is the number of matches in the first round then, conditional on $X$, the number of rounds $R_n$ is distributed as 1 plus the number of rounds needed when there are initially $n - X$ individuals. Consequently,

$$E[R_n \mid X] = 1 + E[R_{n-X}] = 1 + n - X \qquad (\text{by Example 3.14})$$

Also, with $V_0 = 0$,

$$\operatorname{Var}(R_n \mid X) = \operatorname{Var}(R_{n-X}) = V_{n-X}$$

Hence, by the conditional variance formula

$$\begin{aligned}
V_n &= E[\operatorname{Var}(R_n \mid X)] + \operatorname{Var}(E[R_n \mid X]) \\
&= E[V_{n-X}] + \operatorname{Var}(X) \\
&= \sum_{j=0}^{n} V_{n-j}P(X = j) + \operatorname{Var}(X) \\
&= V_n P(X = 0) + \sum_{j=1}^{n} V_{n-j}P(X = j) + \operatorname{Var}(X)
\end{aligned}$$

Because $P(X = n - 1) = 0$, it follows from the preceding and the induction hypothesis that

$$\begin{aligned}
V_n &= V_n P(X = 0) + \sum_{j=1}^{n}(n - j)P(X = j) + \operatorname{Var}(X) \\
&= V_n P(X = 0) + n(1 - P(X = 0)) - E[X] + \operatorname{Var}(X)
\end{aligned}$$

Because it is easily shown (see Example 2.31 and Exercise 72 of Chapter 2) that $E[X] = \operatorname{Var}(X) = 1$, the preceding gives

$$V_n = V_n P(X = 0) + n(1 - P(X = 0))$$

which, since $P(X = 0) < 1$, yields $V_n = n$, thus proving the result. ■
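A simulation sketch supports the result. It assumes the matching rounds mechanism of Example 3.14: in each round the people still present mix their hats at random, and whoever draws their own hat departs. All names and parameter values here are my own:

```python
import random

def rounds_needed(n):
    """One realization of R_n for the matching rounds problem."""
    people = list(range(n))
    r = 0
    while people:
        r += 1
        hats = people[:]
        random.shuffle(hats)
        # those who drew their own hat leave; the rest go another round
        people = [p for p, h in zip(people, hats) if p != h]
    return r

n, trials = 8, 100_000
samples = [rounds_needed(n) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
print(mean, var)    # E[R_n] = n and Var(R_n) = n, so both should be near 8
```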

3.5 Computing Probabilities by Conditioning

Not only can we obtain expectations by first conditioning on an appropriate random variable, but we may also use this approach to compute probabilities. To see this, let $E$ denote an arbitrary event and define the indicator random variable $X$ by

$$X = \begin{cases} 1, & \text{if } E \text{ occurs} \\ 0, & \text{if } E \text{ does not occur} \end{cases}$$

It follows from the definition of $X$ that

$$E[X] = P(E), \qquad E[X \mid Y = y] = P(E \mid Y = y), \quad \text{for any random variable } Y$$

Therefore, from Equations (3.2a) and (3.2b) we obtain

$$P(E) = \begin{cases} \sum_{y} P(E \mid Y = y)P(Y = y), & \text{if } Y \text{ is discrete} \\ \int_{-\infty}^{\infty} P(E \mid Y = y)f_Y(y)\,dy, & \text{if } Y \text{ is continuous} \end{cases}$$

Example 3.21

Suppose that $X$ and $Y$ are independent continuous random variables having densities $f_X$ and $f_Y$, respectively. Compute $P\{X < Y\}$.

Solution: Conditioning on the value of $Y$ yields

$$\begin{aligned}
P\{X < Y\} &= \int_{-\infty}^{\infty} P\{X < Y \mid Y = y\} f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} P\{X < y \mid Y = y\} f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} P\{X < y\} f_Y(y)\,dy \qquad (\text{by independence}) \\
&= \int_{-\infty}^{\infty} F_X(y) f_Y(y)\,dy
\end{aligned}$$

where

$$F_X(y) = \int_{-\infty}^{y} f_X(x)\,dx$$
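For a concrete check, take $X$ and $Y$ exponential with rates $a$ and $b$; then $F_X(y) = 1 - e^{-ay}$ and the integral evaluates to $a/(a+b)$. A simulation sketch (the rates are chosen arbitrarily):

```python
import random

a, b, trials = 2.0, 3.0, 500_000
# count how often an Exp(a) variate falls below an independent Exp(b) variate
hits = sum(random.expovariate(a) < random.expovariate(b) for _ in range(trials))
print(hits / trials, a / (a + b))   # both near 0.4
```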

Example 3.23

Suppose that the number of people who visit a yoga studio each day is a Poisson random variable with mean $\lambda$. Suppose further that each person who visits is, independently, female with probability $p$ or male with probability $1 - p$. Find the joint probability that exactly $n$ women and $m$ men visit the studio today.

Solution: Let $N_1$ denote the number of women and $N_2$ the number of men who visit the studio today. Also, let $N = N_1 + N_2$ be the total number of people who visit. Conditioning on $N$ gives

$$P\{N_1 = n, N_2 = m\} = \sum_{i=0}^{\infty} P\{N_1 = n, N_2 = m \mid N = i\}P\{N = i\}$$

Because $P\{N_1 = n, N_2 = m \mid N = i\} = 0$ when $i \neq n + m$, the preceding equation yields

$$P\{N_1 = n, N_2 = m\} = P\{N_1 = n, N_2 = m \mid N = n + m\}\,e^{-\lambda}\frac{\lambda^{n+m}}{(n+m)!}$$

Given that $n + m$ people visit, it follows, because each of these $n + m$ is independently a woman with probability $p$, that the conditional probability that $n$ of them are women (and $m$ are men) is just the binomial probability of $n$ successes in $n + m$ trials. Therefore,

$$\begin{aligned}
P\{N_1 = n, N_2 = m\} &= \binom{n+m}{n} p^n (1-p)^m e^{-\lambda}\frac{\lambda^{n+m}}{(n+m)!} \\
&= \frac{(n+m)!}{n!\,m!} p^n (1-p)^m e^{-\lambda p} e^{-\lambda(1-p)} \frac{\lambda^n \lambda^m}{(n+m)!} \\
&= e^{-\lambda p}\frac{(\lambda p)^n}{n!}\, e^{-\lambda(1-p)}\frac{(\lambda(1-p))^m}{m!}
\end{aligned}$$

Because the preceding joint probability mass function factors into two products, one of which depends only on $n$ and the other only on $m$, it follows that $N_1$ and $N_2$ are independent. Moreover, because

$$P\{N_1 = n\} = \sum_{m=0}^{\infty} P\{N_1 = n, N_2 = m\} = e^{-\lambda p}\frac{(\lambda p)^n}{n!}\sum_{m=0}^{\infty} e^{-\lambda(1-p)}\frac{(\lambda(1-p))^m}{m!} = e^{-\lambda p}\frac{(\lambda p)^n}{n!}$$

and, similarly,

$$P\{N_2 = m\} = e^{-\lambda(1-p)}\frac{(\lambda(1-p))^m}{m!}$$

we can conclude that $N_1$ and $N_2$ are independent Poisson random variables with respective means $\lambda p$ and $\lambda(1-p)$. Therefore, this example establishes the important result that when each of a Poisson number of events is independently classified either as being type 1 with probability $p$ or type 2 with probability $1 - p$, then the numbers of type 1 and type 2 events are independent Poisson random variables. ■
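A simulation sketch of this thinning result (the parameter values and names are my own): generate a Poisson number of visitors, classify each independently, and check that the female count behaves like a Poisson with mean $\lambda p$ (mean equal to variance) and is essentially uncorrelated with the male count:

```python
import random

def poisson(lam):
    """Poisson(lam) variate: count exponential arrivals in [0, 1]."""
    count, t = 0, random.expovariate(lam)
    while t < 1.0:
        count += 1
        t += random.expovariate(lam)
    return count

lam, p, trials = 10.0, 0.3, 100_000
n1s, n2s = [], []
for _ in range(trials):
    n = poisson(lam)
    n1 = sum(random.random() < p for _ in range(n))   # independent classification
    n1s.append(n1)
    n2s.append(n - n1)
m1 = sum(n1s) / trials
m2 = sum(n2s) / trials
v1 = sum((x - m1) ** 2 for x in n1s) / trials
cov = sum((x - m1) * (y - m2) for x, y in zip(n1s, n2s)) / trials
print(m1, v1)    # both near lam * p = 3
print(m2, cov)   # mean near lam * (1 - p) = 7; covariance near 0
```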

The result of Example 3.23 generalizes to the case where each of a Poisson distributed number of events, $N$, with mean $\lambda$ is independently classified as being one of $k$ types, with the probability that it is type $i$ being $p_i$, $i = 1, \ldots, k$, $\sum_{i=1}^{k} p_i = 1$. If $N_i$ is the number that are classified as type $i$, then $N_1, \ldots, N_k$ are independent Poisson random variables with respective means $\lambda p_1, \ldots, \lambda p_k$. This follows, since for $n = \sum_{i=1}^{k} n_i$,

$$\begin{aligned}
P(N_1 = n_1, \ldots, N_k = n_k) &= P(N_1 = n_1, \ldots, N_k = n_k \mid N = n)P(N = n) \\
&= \frac{n!}{n_1! \cdots n_k!} p_1^{n_1} \cdots p_k^{n_k}\, e^{-\lambda}\lambda^n / n! \\
&= \prod_{i=1}^{k} e^{-\lambda p_i}(\lambda p_i)^{n_i}/n_i!
\end{aligned}$$

where the second equality used that, given a total of $n$ events, the numbers of each type have a multinomial distribution with parameters $(n, p_1, \ldots, p_k)$.

Example 3.24

The Distribution of the Sum of Independent Bernoulli Random Variables

Let $X_1, \ldots, X_n$ be independent Bernoulli random variables, with $X_i$ having parameter $p_i$, $i = 1, \ldots, n$. That is, $P\{X_i = 1\} = p_i$, $P\{X_i = 0\} = q_i = 1 - p_i$. Suppose we want to compute the probability mass function of their sum, $X_1 + \cdots + X_n$. To do so, we will recursively obtain the probability mass function of $X_1 + \cdots + X_k$, first for $k = 1$, then $k = 2$, and on up to $k = n$. To begin, let

$$P_k(j) = P\{X_1 + \cdots + X_k = j\}$$

and note that

$$P_k(k) = \prod_{i=1}^{k} p_i, \qquad P_k(0) = \prod_{i=1}^{k} q_i$$

For $0 < j < k$, conditioning on $X_k$ yields the recursion

$$\begin{aligned}
P_k(j) &= P\{X_1 + \cdots + X_k = j \mid X_k = 1\}p_k + P\{X_1 + \cdots + X_k = j \mid X_k = 0\}q_k \\
&= P\{X_1 + \cdots + X_{k-1} = j - 1 \mid X_k = 1\}p_k + P\{X_1 + \cdots + X_{k-1} = j \mid X_k = 0\}q_k \\
&= P\{X_1 + \cdots + X_{k-1} = j - 1\}p_k + P\{X_1 + \cdots + X_{k-1} = j\}q_k \\
&= p_k P_{k-1}(j-1) + q_k P_{k-1}(j)
\end{aligned}$$

Starting with $P_1(1) = p_1$, $P_1(0) = q_1$, the preceding equations can be recursively solved to obtain the functions $P_2(j), P_3(j)$, up to $P_n(j)$. ■
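The recursion is straightforward to implement; the following Python sketch (the function name is my own) builds the mass function one Bernoulli variable at a time, starting from $P_0(0) = 1$:

```python
def bernoulli_sum_pmf(p):
    """pmf of X_1 + ... + X_n via P_k(j) = p_k P_{k-1}(j-1) + q_k P_{k-1}(j)."""
    pmf = [1.0]                           # P_0(0) = 1
    for pk in p:
        qk = 1.0 - pk
        new_pmf = [0.0] * (len(pmf) + 1)
        for j, prob in enumerate(pmf):
            new_pmf[j] += qk * prob       # X_k = 0 leaves the sum at j
            new_pmf[j + 1] += pk * prob   # X_k = 1 shifts the sum up by one
        pmf = new_pmf
    return pmf

print(bernoulli_sum_pmf([0.2, 0.5, 0.9]))   # pmf of the sum on {0, 1, 2, 3}
```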

Example 3.25

The Best Prize Problem

Suppose that we are to be presented with $n$ distinct prizes in sequence. After being presented with a prize we must immediately decide whether to accept it or reject it and consider the next prize. The only information we are given when deciding whether to accept a prize is the relative rank of that prize compared to ones already seen. That is, for instance, when the fifth prize is presented we learn how it compares with the first four prizes already seen. Suppose that once a prize is rejected it is lost, and that our objective is to maximize the probability of obtaining the best prize. Assuming that all $n!$ orderings of the prizes are equally likely, how well can we do?

Solution: Rather surprisingly, we can do quite well. To see this, fix a value $k$, $0 \leq k < n$, and consider the strategy that rejects the first $k$ prizes and then accepts the first one that is better than all of those first $k$. Let $P_k(\text{best})$ denote the probability that the best prize is selected when this strategy is employed. To compute this probability, condition on $X$, the position of the best prize. This gives

$$P_k(\text{best}) = \sum_{i=1}^{n} P_k(\text{best} \mid X = i)P(X = i) = \frac{1}{n}\sum_{i=1}^{n} P_k(\text{best} \mid X = i)$$

Now, if the overall best prize is among the first $k$, then no prize is ever selected under the strategy considered. On the other hand, if the best prize is in position $i$, where $i > k$, then the best prize will be selected if the best of the first $k$ prizes is also the best of the first $i - 1$ prizes (for then none of the prizes in positions $k+1, k+2, \ldots, i-1$ would be selected). Hence, we see that

$$\begin{aligned}
P_k(\text{best} \mid X = i) &= 0, && \text{if } i \leq k \\
P_k(\text{best} \mid X = i) &= P\{\text{best of first } i - 1 \text{ is among the first } k\} = k/(i-1), && \text{if } i > k
\end{aligned}$$

From the preceding, we obtain

$$P_k(\text{best}) = \frac{k}{n}\sum_{i=k+1}^{n} \frac{1}{i-1} \approx \frac{k}{n}\int_{k}^{n-1} \frac{1}{x}\,dx = \frac{k}{n}\log\left(\frac{n-1}{k}\right) \approx \frac{k}{n}\log\left(\frac{n}{k}\right)$$

Now, if we consider the function

$$g(x) = \frac{x}{n}\log\left(\frac{n}{x}\right)$$

then

$$g'(x) = \frac{1}{n}\log\left(\frac{n}{x}\right) - \frac{1}{n}$$

and so

$$g'(x) = 0 \iff \log\left(\frac{n}{x}\right) = 1 \iff x = n/e$$
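so the maximizing value is $x = n/e$. As a numerical illustration (a sketch with an arbitrarily chosen $n$; the function names are mine), one can evaluate $P_k(\text{best})$ exactly from the sum above and confirm that the best $k$ is near $n/e$ and that the resulting success probability is near $1/e$:

```python
import math

def p_best(n, k):
    """Exact P_k(best) = (k/n) * sum_{i=k+1}^{n} 1/(i-1); the k = 0 strategy
    accepts the first prize, so P_0(best) = 1/n."""
    if k == 0:
        return 1.0 / n
    return (k / n) * sum(1.0 / (i - 1) for i in range(k + 1, n + 1))

n = 100
best_k = max(range(n), key=lambda k: p_best(n, k))
print(best_k, n / math.e)                 # best k near n/e, about 36.8
print(p_best(n, best_k), 1 / math.e)      # success probability near 1/e, about 0.368
```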