
Random Variables

Abstract

Random variables are quantities whose value is determined by the outcome of an experiment. This chapter introduces two types of random variables, discrete and continuous, and studies a variety of random variables of each type. The important idea of the expected value of a random variable is introduced.

Keywords

Discrete Random Variables; Continuous Random Variables; Binomial Random Variable; Poisson Random Variable; Geometric Random Variable; Uniform Random Variable; Exponential Random Variable; Expected Value; Variance; Joint Distributions

2.1 Random Variables

It frequently occurs that in performing an experiment we are mainly interested in some functions of the outcome as opposed to the outcome itself. For instance, in tossing dice we are often interested in the sum of the two dice and are not really concerned about the actual outcome. That is, we may be interested in knowing that the sum is seven and not be concerned over whether the actual outcome was (1, 6) or (2, 5) or (3, 4) or (4, 3) or (5, 2) or (6, 1). These quantities of interest, or more formally, these real-valued functions defined on the sample space, are known as random variables.

Since the value of a random variable is determined by the outcome of the experiment, we may assign probabilities to the possible values of the random variable.

Example 2.2

For a second example, suppose that our experiment consists of tossing two fair coins. Letting $Y$ denote the number of heads appearing, then $Y$ is a random variable taking on one of the values 0, 1, 2 with respective probabilities

$$P\{Y=0\}=P\{(T,T)\}=\tfrac{1}{4},\qquad P\{Y=1\}=P\{(T,H),(H,T)\}=\tfrac{2}{4},\qquad P\{Y=2\}=P\{(H,H)\}=\tfrac{1}{4}$$

Of course, $P\{Y=0\}+P\{Y=1\}+P\{Y=2\}=1$. ■

Example 2.3

Suppose that we toss a coin having a probability $p$ of coming up heads, until the first head appears. Letting $N$ denote the number of flips required, and assuming that the outcomes of successive flips are independent, $N$ is a random variable taking on one of the values $1, 2, 3, \ldots$ with respective probabilities

$$\begin{aligned}
P\{N=1\}&=P\{H\}=p,\\
P\{N=2\}&=P\{(T,H)\}=(1-p)p,\\
P\{N=3\}&=P\{(T,T,H)\}=(1-p)^2p,\\
&\;\;\vdots\\
P\{N=n\}&=P\{(\underbrace{T,T,\ldots,T}_{n-1},H)\}=(1-p)^{n-1}p,\quad n\geq 1
\end{aligned}$$

As a check, note that

$$P\left\{\bigcup_{n=1}^{\infty}\{N=n\}\right\}=\sum_{n=1}^{\infty}P\{N=n\}=p\sum_{n=1}^{\infty}(1-p)^{n-1}=\frac{p}{1-(1-p)}=1$$ ■
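A minimal Python sketch, with an arbitrarily chosen $p$, can verify these probabilities by simulation:

```python
import random

def flips_until_first_head(p):
    """Flip a coin with P(head) = p until the first head; return the flip count."""
    n = 1
    while random.random() >= p:   # the flip is a tail with probability 1 - p
        n += 1
    return n

p = 0.3                           # arbitrary choice for illustration
trials = 100_000
counts = {}
for _ in range(trials):
    n = flips_until_first_head(p)
    counts[n] = counts.get(n, 0) + 1

for n in range(1, 6):
    print(f"P{{N={n}}}: simulated {counts.get(n, 0) / trials:.4f}, "
          f"exact {(1 - p) ** (n - 1) * p:.4f}")
```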

Example 2.4

Suppose that our experiment consists of seeing how long a battery can operate before wearing down. Suppose also that we are not primarily interested in the actual lifetime of the battery but are concerned only about whether or not the battery lasts at least two years. In this case, we may define the random variable $I$ by

$$I=\begin{cases}1,&\text{if the lifetime of the battery is two or more years}\\0,&\text{otherwise}\end{cases}$$

If $E$ denotes the event that the battery lasts two or more years, then the random variable $I$ is known as the indicator random variable for event $E$. (Note that $I$ equals 1 or 0 depending on whether or not $E$ occurs.) ■

Example 2.5

Suppose that independent trials, each of which results in any of $m$ possible outcomes with respective probabilities $p_1,\ldots,p_m$, $\sum_{i=1}^{m}p_i=1$, are continually performed. Let $X$ denote the number of trials needed until each outcome has occurred at least once.

Rather than directly considering $P\{X=n\}$ we will first determine $P\{X>n\}$, the probability that at least one of the outcomes has not yet occurred after $n$ trials. Letting $A_i$ denote the event that outcome $i$ has not yet occurred after the first $n$ trials, $i=1,\ldots,m$, then

$$P\{X>n\}=P\left(\bigcup_{i=1}^{m}A_i\right)=\sum_{i=1}^{m}P(A_i)-\sum_{i<j}P(A_iA_j)+\sum_{i<j<k}P(A_iA_jA_k)-\cdots+(-1)^{m+1}P(A_1\cdots A_m)$$

Now, $P(A_i)$ is the probability that each of the first $n$ trials results in a non-$i$ outcome, and so by independence

$$P(A_i)=(1-p_i)^n$$

Similarly, $P(A_iA_j)$ is the probability that the first $n$ trials all result in a non-$i$ and non-$j$ outcome, and so

$$P(A_iA_j)=(1-p_i-p_j)^n$$

As all of the other probabilities are similar, we see that

$$P\{X>n\}=\sum_{i=1}^{m}(1-p_i)^n-\sum_{i<j}(1-p_i-p_j)^n+\sum_{i<j<k}(1-p_i-p_j-p_k)^n-\cdots$$

Since $P\{X=n\}=P\{X>n-1\}-P\{X>n\}$, we see, upon using the algebraic identity $(1-a)^{n-1}-(1-a)^n=a(1-a)^{n-1}$, that

$$P\{X=n\}=\sum_{i=1}^{m}p_i(1-p_i)^{n-1}-\sum_{i<j}(p_i+p_j)(1-p_i-p_j)^{n-1}+\sum_{i<j<k}(p_i+p_j+p_k)(1-p_i-p_j-p_k)^{n-1}-\cdots$$ ■
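This formula is straightforward to evaluate numerically. The following minimal Python sketch, with arbitrarily chosen outcome probabilities, computes $P\{X>n\}$ by inclusion-exclusion and checks it against simulation:

```python
import itertools
import random

def p_x_greater_n(probs, n):
    """Inclusion-exclusion: P{X > n} = sum over nonempty subsets S of
    (-1)^(|S|+1) * (1 - sum of p_i, i in S)^n."""
    total = 0.0
    for size in range(1, len(probs) + 1):
        for subset in itertools.combinations(probs, size):
            total += (-1) ** (size + 1) * (1 - sum(subset)) ** n
    return total

def simulate_x(probs):
    """Draw trials until every outcome has occurred at least once."""
    seen, count = set(), 0
    while len(seen) < len(probs):
        seen.add(random.choices(range(len(probs)), weights=probs)[0])
        count += 1
    return count

probs = [0.5, 0.3, 0.2]   # arbitrary example probabilities summing to 1
n = 6
exact = p_x_greater_n(probs, n)
sim = sum(simulate_x(probs) > n for _ in range(100_000)) / 100_000
print(f"P{{X > {n}}}: inclusion-exclusion {exact:.4f}, simulated {sim:.4f}")
```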

In all of the preceding examples, the random variables of interest took on either a finite or a countable number of possible values.* Such random variables are called discrete. However, there also exist random variables that take on a continuum of possible values. These are known as continuous random variables. One example is the random variable denoting the lifetime of a car, when the car’s lifetime is assumed to take on any value in some interval $(a,b)$.

The cumulative distribution function (cdf) (or more simply the distribution function) $F(\cdot)$ of the random variable $X$ is defined for any real number $b$, $-\infty<b<\infty$, by

$$F(b)=P\{X\leq b\}$$

In words, $F(b)$ denotes the probability that the random variable $X$ takes on a value that is less than or equal to $b$. Some properties of the cdf $F$ are

(i) $F(b)$ is a nondecreasing function of $b$,

(ii) $\lim_{b\to\infty}F(b)=F(\infty)=1$,

(iii) $\lim_{b\to-\infty}F(b)=F(-\infty)=0$.

Property (i) follows since for $a<b$ the event $\{X\leq a\}$ is contained in the event $\{X\leq b\}$, and so it must have a smaller probability. Properties (ii) and (iii) follow since $X$ must take on some finite value.

All probability questions about $X$ can be answered in terms of the cdf $F(\cdot)$. For example,

$$P\{a<X\leq b\}=F(b)-F(a)\quad\text{for all }a<b$$

This follows since we may calculate $P\{a<X\leq b\}$ by first computing the probability that $X\leq b$ (that is, $F(b)$) and then subtracting from this the probability that $X\leq a$ (that is, $F(a)$).

If we desire the probability that $X$ is strictly smaller than $b$, we may calculate this probability by

$$P\{X<b\}=\lim_{h\to 0^+}P\{X\leq b-h\}=\lim_{h\to 0^+}F(b-h)$$

where $\lim_{h\to 0^+}$ means that we are taking the limit as $h$ decreases to 0. Note that $P\{X<b\}$ does not necessarily equal $F(b)$ since $F(b)$ also includes the probability that $X$ equals $b$.

2.2 Discrete Random Variables

As was previously mentioned, a random variable that can take on at most a countable number of possible values is said to be discrete. For a discrete random variable $X$, we define the probability mass function $p(a)$ of $X$ by

$$p(a)=P\{X=a\}$$

The probability mass function $p(a)$ is positive for at most a countable number of values of $a$. That is, if $X$ must assume one of the values $x_1,x_2,\ldots$ then

$$p(x_i)>0,\quad i=1,2,\ldots\qquad\qquad p(x)=0,\quad\text{all other values of }x$$

Since $X$ must take on one of the values $x_i$, we have

$$\sum_{i=1}^{\infty}p(x_i)=1$$

The cumulative distribution function $F$ can be expressed in terms of $p(a)$ by

$$F(a)=\sum_{\text{all }x_i\leq a}p(x_i)$$

For instance, suppose $X$ has a probability mass function given by

$$p(1)=\tfrac{1}{2},\qquad p(2)=\tfrac{1}{3},\qquad p(3)=\tfrac{1}{6}$$

then, the cumulative distribution function $F$ of $X$ is given by

$$F(a)=\begin{cases}0,&a<1\\ \tfrac{1}{2},&1\leq a<2\\ \tfrac{5}{6},&2\leq a<3\\ 1,&3\leq a\end{cases}$$

This is graphically presented in Figure 2.1.

Figure 2.1 Graph of $F(x)$.
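The following short Python sketch (an illustration, using the pmf above) computes this step-function cdf from the probability mass function:

```python
from fractions import Fraction

# pmf from the example above
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def cdf(a):
    """F(a) = sum of p(x_i) over all x_i <= a."""
    return sum(p for x, p in pmf.items() if x <= a)

for a in [0.5, 1, 1.5, 2, 2.5, 3, 4]:
    print(f"F({a}) = {cdf(a)}")   # 0, 1/2, 1/2, 5/6, 5/6, 1, 1
```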

Discrete random variables are often classified according to their probability mass functions. We now consider some of these random variables.

2.2.1 The Bernoulli Random Variable

Suppose that a trial, or an experiment, whose outcome can be classified as either a “success” or as a “failure” is performed. If we let $X$ equal 1 if the outcome is a success and 0 if it is a failure, then the probability mass function of $X$ is given by

$$p(0)=P\{X=0\}=1-p,\qquad p(1)=P\{X=1\}=p \tag{2.2}$$

where $p$, $0\leq p\leq 1$, is the probability that the trial is a “success.”

A random variable $X$ is said to be a Bernoulli random variable if its probability mass function is given by Equation (2.2) for some $p\in(0,1)$.

2.2.2 The Binomial Random Variable

Suppose that $n$ independent trials, each of which results in a “success” with probability $p$ and in a “failure” with probability $1-p$, are to be performed. If $X$ represents the number of successes that occur in the $n$ trials, then $X$ is said to be a binomial random variable with parameters $(n,p)$.

The probability mass function of a binomial random variable having parameters $(n,p)$ is given by

$$p(i)=\binom{n}{i}p^i(1-p)^{n-i},\quad i=0,1,\ldots,n \tag{2.3}$$

where

$$\binom{n}{i}=\frac{n!}{(n-i)!\,i!}$$

equals the number of different groups of $i$ objects that can be chosen from a set of $n$ objects. The validity of Equation (2.3) may be verified by first noting that the probability of any particular sequence of the $n$ outcomes containing $i$ successes and $n-i$ failures is, by the assumed independence of trials, $p^i(1-p)^{n-i}$. Equation (2.3) then follows since there are $\binom{n}{i}$ different sequences of the $n$ outcomes leading to $i$ successes and $n-i$ failures. For instance, if $n=3$, $i=2$, then there are $\binom{3}{2}=3$ ways in which the three trials can result in two successes, namely, any one of the three outcomes $(s,s,f)$, $(s,f,s)$, $(f,s,s)$, where the outcome $(s,s,f)$ means that the first two trials are successes and the third a failure. Since each of the three outcomes $(s,s,f)$, $(s,f,s)$, $(f,s,s)$ has a probability $p^2(1-p)$ of occurring, the desired probability is thus $\binom{3}{2}p^2(1-p)$.

Note that, by the binomial theorem, the probabilities sum to one, that is,

$$\sum_{i=0}^{n}p(i)=\sum_{i=0}^{n}\binom{n}{i}p^i(1-p)^{n-i}=\bigl(p+(1-p)\bigr)^n=1$$
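As a sketch, Equation (2.3) translates directly into Python via math.comb; the parameters $n$ and $p$ below are arbitrary:

```python
from math import comb

def binomial_pmf(i, n, p):
    """P{X = i} for a binomial (n, p) random variable, as in Equation (2.3)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 10, 0.25   # arbitrary parameters for illustration
probs = [binomial_pmf(i, n, p) for i in range(n + 1)]
print(sum(probs))  # 1.0 (up to rounding), as the binomial theorem guarantees
```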

Example 2.6

Four fair coins are flipped. If the outcomes are assumed independent, what is the probability that two heads and two tails are obtained?
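Solution: Letting $X$ equal the number of heads (“successes”) that appear, $X$ is a binomial random variable with parameters $(n=4,\,p=\tfrac{1}{2})$. Hence, by Equation (2.3),

$$P\{X=2\}=\binom{4}{2}\left(\tfrac{1}{2}\right)^2\left(\tfrac{1}{2}\right)^2=\tfrac{3}{8}$$ ■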

Example 2.7

It is known that any item produced by a certain machine will be defective with probability 0.1, independently of any other item. What is the probability that in a sample of three items, at most one will be defective?
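Solution: If $X$ is the number of defective items in the sample, then $X$ is a binomial random variable with parameters $(3, 0.1)$. Hence, the desired probability is

$$P\{X=0\}+P\{X=1\}=\binom{3}{0}(0.1)^0(0.9)^3+\binom{3}{1}(0.1)^1(0.9)^2=0.972$$ ■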

Remark on Terminology

If $X$ is a binomial random variable with parameters $(n,p)$, then we say that $X$ has a binomial distribution with parameters $(n,p)$.

2.2.3 The Geometric Random Variable

Suppose that independent trials, each having probability $p$ of being a success, are performed until a success occurs. If we let $X$ be the number of trials required until the first success, then $X$ is said to be a geometric random variable with parameter $p$. Its probability mass function is given by

$$p(n)=P\{X=n\}=(1-p)^{n-1}p,\quad n=1,2,\ldots \tag{2.4}$$

Equation (2.4) follows since, in order for $X$ to equal $n$, it is necessary and sufficient that the first $n-1$ trials be failures and the $n$th trial a success; the probabilities then multiply because the outcomes of the successive trials are assumed to be independent.

To check that $p(n)$ is a probability mass function, we note that

$$\sum_{n=1}^{\infty}p(n)=p\sum_{n=1}^{\infty}(1-p)^{n-1}=1$$

2.2.4 The Poisson Random Variable

A random variable $X$, taking on one of the values $0,1,2,\ldots$, is said to be a Poisson random variable with parameter $\lambda$ if, for some $\lambda>0$,

$$p(i)=P\{X=i\}=\frac{e^{-\lambda}\lambda^i}{i!},\quad i=0,1,\ldots \tag{2.5}$$

Equation (2.5) defines a probability mass function since

$$\sum_{i=0}^{\infty}p(i)=e^{-\lambda}\sum_{i=0}^{\infty}\frac{\lambda^i}{i!}=e^{-\lambda}e^{\lambda}=1$$

The Poisson random variable has a wide range of applications in a diverse number of areas, as will be seen in Chapter 5.

An important property of the Poisson random variable is that it may be used to approximate a binomial random variable when the binomial parameter $n$ is large and $p$ is small. To see this, suppose that $X$ is a binomial random variable with parameters $(n,p)$, and let $\lambda=np$. Then

$$P\{X=i\}=\frac{n!}{(n-i)!\,i!}p^i(1-p)^{n-i}=\frac{n!}{(n-i)!\,i!}\left(\frac{\lambda}{n}\right)^i\left(1-\frac{\lambda}{n}\right)^{n-i}=\frac{n(n-1)\cdots(n-i+1)}{n^i}\,\frac{\lambda^i}{i!}\,\frac{(1-\lambda/n)^n}{(1-\lambda/n)^i}$$

Now, for $n$ large and $p$ small,

$$\left(1-\frac{\lambda}{n}\right)^n\approx e^{-\lambda},\qquad \frac{n(n-1)\cdots(n-i+1)}{n^i}\approx 1,\qquad \left(1-\frac{\lambda}{n}\right)^i\approx 1$$

Hence, for $n$ large and $p$ small,

$$P\{X=i\}\approx e^{-\lambda}\frac{\lambda^i}{i!}$$
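A brief numeric sketch, with arbitrarily chosen $n$ and $p$, shows how close the approximation is:

```python
from math import comb, exp, factorial

n, p = 100, 0.03          # n large, p small (arbitrary illustration)
lam = n * p

for i in range(5):
    binom = comb(n, i) * p**i * (1 - p) ** (n - i)
    poisson = exp(-lam) * lam**i / factorial(i)
    print(f"i={i}: binomial {binom:.5f}, Poisson {poisson:.5f}")
```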

Example 2.10

Suppose that the number of typographical errors on a single page of this book has a Poisson distribution with parameter $\lambda=1$. Calculate the probability that there is at least one error on this page.
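Solution: Letting $X$ denote the number of errors on this page, we have

$$P\{X\geq 1\}=1-P\{X=0\}=1-e^{-1}\approx 0.633$$ ■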

2.3 Continuous Random Variables

In this section, we shall concern ourselves with random variables whose set of possible values is uncountable. Let $X$ be such a random variable. We say that $X$ is a continuous random variable if there exists a nonnegative function $f(x)$, defined for all real $x\in(-\infty,\infty)$, having the property that for any set $B$ of real numbers

$$P\{X\in B\}=\int_B f(x)\,dx \tag{2.6}$$

The function $f(x)$ is called the probability density function of the random variable $X$.

In words, Equation (2.6) states that the probability that $X$ will be in $B$ may be obtained by integrating the probability density function over the set $B$. Since $X$ must assume some value, $f(x)$ must satisfy

$$1=P\{X\in(-\infty,\infty)\}=\int_{-\infty}^{\infty}f(x)\,dx$$

All probability statements about $X$ can be answered in terms of $f(x)$. For instance, letting $B=[a,b]$, we obtain from Equation (2.6) that

$$P\{a\leq X\leq b\}=\int_a^b f(x)\,dx \tag{2.7}$$

If we let $a=b$ in the preceding, then

$$P\{X=a\}=\int_a^a f(x)\,dx=0$$

In words, this equation states that the probability that a continuous random variable will assume any particular value is zero.

The relationship between the cumulative distribution $F(\cdot)$ and the probability density $f(\cdot)$ is expressed by

$$F(a)=P\{X\in(-\infty,a]\}=\int_{-\infty}^{a}f(x)\,dx$$

Differentiating both sides of the preceding yields

$$\frac{d}{da}F(a)=f(a)$$

That is, the density is the derivative of the cumulative distribution function. A somewhat more intuitive interpretation of the density function may be obtained from Equation (2.7) as follows:

$$P\left\{a-\frac{\varepsilon}{2}\leq X\leq a+\frac{\varepsilon}{2}\right\}=\int_{a-\varepsilon/2}^{a+\varepsilon/2}f(x)\,dx\approx\varepsilon f(a)$$

when $\varepsilon$ is small. In other words, the probability that $X$ will be contained in an interval of length $\varepsilon$ around the point $a$ is approximately $\varepsilon f(a)$. From this, we see that $f(a)$ is a measure of how likely it is that the random variable will be near $a$.
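A tiny numeric sketch, using a hypothetical density $f(x)=2x$ on $(0,1)$, illustrates the approximation:

```python
def f(x):
    """A hypothetical density on (0, 1): f(x) = 2x."""
    return 2 * x if 0 < x < 1 else 0

eps, a = 0.01, 0.6
# exact probability over the small interval, via the antiderivative x**2
prob = (a + eps / 2) ** 2 - (a - eps / 2) ** 2
print(prob, eps * f(a))   # 0.012 and 0.012
```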

There are several important continuous random variables that appear frequently in probability theory. The remainder of this section is devoted to a study of certain of these random variables.

2.3.1 The Uniform Random Variable

A random variable is said to be uniformly distributed over the interval $(0,1)$ if its probability density function is given by

$$f(x)=\begin{cases}1,&0<x<1\\0,&\text{otherwise}\end{cases}$$

Note that the preceding is a density function since $f(x)\geq 0$ and

$$\int_{-\infty}^{\infty}f(x)\,dx=\int_0^1 dx=1$$

Since $f(x)>0$ only when $x\in(0,1)$, it follows that $X$ must assume a value in $(0,1)$. Also, since $f(x)$ is constant for $x\in(0,1)$, $X$ is just as likely to be “near” any value in $(0,1)$ as any other value. To check this, note that, for any $0<a<b<1$,

$$P\{a\leq X\leq b\}=\int_a^b f(x)\,dx=b-a$$

In other words, the probability that $X$ is in any particular subinterval of $(0,1)$ equals the length of that subinterval.

In general, we say that $X$ is a uniform random variable on the interval $(\alpha,\beta)$ if its probability density function is given by

$$f(x)=\begin{cases}\dfrac{1}{\beta-\alpha},&\text{if }\alpha<x<\beta\\0,&\text{otherwise}\end{cases} \tag{2.8}$$

Example 2.14

If $X$ is uniformly distributed over $(0,10)$, calculate the probability that (a) $X<3$, (b) $X>7$, (c) $1<X<6$.
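Solution: Using Equation (2.8) with $\alpha=0$, $\beta=10$,

$$P\{X<3\}=\int_0^3\tfrac{1}{10}\,dx=\tfrac{3}{10},\qquad P\{X>7\}=\int_7^{10}\tfrac{1}{10}\,dx=\tfrac{3}{10},\qquad P\{1<X<6\}=\int_1^6\tfrac{1}{10}\,dx=\tfrac{1}{2}$$ ■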

2.3.2 Exponential Random Variables

A continuous random variable whose probability density function is given, for some $\lambda>0$, by

$$f(x)=\begin{cases}\lambda e^{-\lambda x},&\text{if }x\geq 0\\0,&\text{if }x<0\end{cases}$$

is said to be an exponential random variable with parameter $\lambda$. These random variables will be extensively studied in Chapter 5, so we will content ourselves here with just calculating the cumulative distribution function $F$:

$$F(a)=\int_0^a\lambda e^{-\lambda x}\,dx=1-e^{-\lambda a},\quad a\geq 0$$

Note that $F(\infty)=\int_0^{\infty}\lambda e^{-\lambda x}\,dx=1$, as, of course, it must.
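One common use of this cdf, sketched below as an illustration, is inverse-transform sampling: if $U$ is uniform on $(0,1)$, then solving $U=1-e^{-\lambda X}$ gives $X=-\ln(1-U)/\lambda$, which has the exponential cdf above.

```python
import math
import random

def sample_exponential(lam):
    """Inverse-transform sampling: solve U = 1 - exp(-lam * X) for X."""
    u = random.random()                 # uniform on [0, 1)
    return -math.log(1 - u) / lam

lam = 2.0                               # arbitrary rate for illustration
samples = [sample_exponential(lam) for _ in range(100_000)]
a = 1.0
empirical = sum(x <= a for x in samples) / len(samples)
print(f"F({a}) estimated: {empirical:.4f}, exact: {1 - math.exp(-lam * a):.4f}")
```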

2.3.3 Gamma Random Variables

A continuous random variable whose density is given by

$$f(x)=\begin{cases}\dfrac{\lambda e^{-\lambda x}(\lambda x)^{\alpha-1}}{\Gamma(\alpha)},&\text{if }x\geq 0\\0,&\text{if }x<0\end{cases}$$

for some $\lambda>0$, $\alpha>0$, is said to be a gamma random variable with parameters $\alpha$, $\lambda$. The quantity $\Gamma(\alpha)$ is called the gamma function and is defined by

$$\Gamma(\alpha)=\int_0^{\infty}e^{-x}x^{\alpha-1}\,dx$$

It is easy to show by induction that, for integral $\alpha$, say, $\alpha=n$,

$$\Gamma(n)=(n-1)!$$

since integration by parts yields the recursion $\Gamma(\alpha)=(\alpha-1)\Gamma(\alpha-1)$, and $\Gamma(1)=\int_0^{\infty}e^{-x}\,dx=1$.

2.3.4 Normal Random Variables

We say that $X$ is a normal random variable (or simply that $X$ is normally distributed) with parameters $\mu$ and $\sigma^2$ if the density of $X$ is given by

$$f(x)=\frac{1}{\sqrt{2\pi}\,\sigma}e^{-(x-\mu)^2/2\sigma^2},\quad -\infty<x<\infty$$

This density function is a bell-shaped curve that is symmetric around $\mu$ (see Figure 2.2).

An important fact about normal random variables is that if $X$ is normally distributed with parameters $\mu$ and $\sigma^2$ then $Y=\alpha X+\beta$ is normally distributed with parameters $\alpha\mu+\beta$ and $\alpha^2\sigma^2$. To prove this, suppose first that $\alpha>0$ and note that $F_Y(\cdot)$, the cumulative distribution function of the random variable $Y$, is given by

$$\begin{aligned}F_Y(a)&=P\{Y\leq a\}=P\{\alpha X+\beta\leq a\}=P\left\{X\leq\frac{a-\beta}{\alpha}\right\}=F_X\left(\frac{a-\beta}{\alpha}\right)\\&=\int_{-\infty}^{(a-\beta)/\alpha}\frac{1}{\sqrt{2\pi}\,\sigma}e^{-(x-\mu)^2/2\sigma^2}\,dx=\int_{-\infty}^{a}\frac{1}{\sqrt{2\pi}\,\alpha\sigma}\exp\left\{-\frac{(v-(\alpha\mu+\beta))^2}{2\alpha^2\sigma^2}\right\}dv\end{aligned} \tag{2.9}$$

where the last equality is obtained by the change in variables $v=\alpha x+\beta$. However, since $F_Y(a)=\int_{-\infty}^{a}f_Y(v)\,dv$, it follows from Equation (2.9) that the probability density function $f_Y(\cdot)$ is given by

$$f_Y(v)=\frac{1}{\sqrt{2\pi}\,\alpha\sigma}\exp\left\{-\frac{(v-(\alpha\mu+\beta))^2}{2(\alpha\sigma)^2}\right\},\quad -\infty<v<\infty$$

Hence, $Y$ is normally distributed with parameters $\alpha\mu+\beta$ and $(\alpha\sigma)^2$. A similar result is also true when $\alpha<0$.

One implication of the preceding result is that if $X$ is normally distributed with parameters $\mu$ and $\sigma^2$ then $Y=(X-\mu)/\sigma$ is normally distributed with parameters 0 and 1. Such a random variable $Y$ is said to have the standard or unit normal distribution.
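A small simulation sketch (assuming NumPy is available; the $\mu$ and $\sigma$ below are arbitrary) illustrates the standardization $Y=(X-\mu)/\sigma$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 5.0, 2.0                      # arbitrary parameters
x = rng.normal(mu, sigma, size=100_000)   # X ~ normal(mu, sigma^2)
y = (x - mu) / sigma                      # standardized variable

print(y.mean(), y.std())                  # approximately 0 and 1
```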

2.4 Expectation of a Random Variable

2.4.1 The Discrete Case

If $X$ is a discrete random variable having a probability mass function $p(x)$, then the expected value of $X$ is defined by

$$E[X]=\sum_{x:\,p(x)>0}x\,p(x)$$

In other words, the expected value of $X$ is a weighted average of the possible values that $X$ can take on, each value being weighted by the probability that $X$ assumes that value. For example, if the probability mass function of $X$ is given by

$$p(1)=\tfrac{1}{2}=p(2)$$

then

$$E[X]=1\left(\tfrac{1}{2}\right)+2\left(\tfrac{1}{2}\right)=\tfrac{3}{2}$$

is just an ordinary average of the two possible values 1 and 2 that $X$ can assume. On the other hand, if

$$p(1)=\tfrac{1}{3},\qquad p(2)=\tfrac{2}{3}$$

then

$$E[X]=1\left(\tfrac{1}{3}\right)+2\left(\tfrac{2}{3}\right)=\tfrac{5}{3}$$

is a weighted average of the two possible values 1 and 2 where the value 2 is given twice as much weight as the value 1 since $p(2)=2p(1)$.

Example 2.18

Expectation of a Geometric Random Variable

Calculate the expectation of a geometric random variable having parameter $p$.

Solution: By Equation (2.4), we have

$$E[X]=\sum_{n=1}^{\infty}np(1-p)^{n-1}=p\sum_{n=1}^{\infty}nq^{n-1}$$

where $q=1-p$,

$$E[X]=p\sum_{n=1}^{\infty}\frac{d}{dq}(q^n)=p\,\frac{d}{dq}\left(\sum_{n=1}^{\infty}q^n\right)=p\,\frac{d}{dq}\left(\frac{q}{1-q}\right)=\frac{p}{(1-q)^2}=\frac{1}{p}$$

In words, the expected number of independent trials we need to perform until we attain our first success equals the reciprocal of the probability that any one trial results in a success. ■
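A quick numerical check of this result, truncating the infinite sum and using an arbitrary $p$:

```python
p = 0.2
# E[X] = sum over n of n * (1 - p)^(n - 1) * p, truncated at a large n
approx = sum(n * (1 - p) ** (n - 1) * p for n in range(1, 10_000))
print(approx, 1 / p)   # both approximately 5.0
```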

Example 2.19

Expectation of a Poisson Random Variable

Calculate $E[X]$ if $X$ is a Poisson random variable with parameter $\lambda$.

Solution: From Equation (2.5), we have

$$E[X]=\sum_{i=0}^{\infty}\frac{ie^{-\lambda}\lambda^i}{i!}=\sum_{i=1}^{\infty}\frac{e^{-\lambda}\lambda^i}{(i-1)!}=\lambda e^{-\lambda}\sum_{i=1}^{\infty}\frac{\lambda^{i-1}}{(i-1)!}=\lambda e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}=\lambda e^{-\lambda}e^{\lambda}=\lambda$$

where we have used the identity $\sum_{k=0}^{\infty}\lambda^k/k!=e^{\lambda}$. ■

2.4.2 The Continuous Case

We may also define the expected value of a continuous random variable. This is done as follows. If $X$ is a continuous random variable having a probability density function $f(x)$, then the expected value of $X$ is defined by

$$E[X]=\int_{-\infty}^{\infty}xf(x)\,dx$$

Example 2.20

Expectation of a Uniform Random Variable

Calculate the expectation of a random variable uniformly distributed over $(\alpha,\beta)$.

Solution: From Equation (2.8) we have

$$E[X]=\int_{\alpha}^{\beta}\frac{x}{\beta-\alpha}\,dx=\frac{\beta^2-\alpha^2}{2(\beta-\alpha)}=\frac{\beta+\alpha}{2}$$

In other words, the expected value of a random variable uniformly distributed over the interval $(\alpha,\beta)$ is just the midpoint of the interval. ■

Example 2.21

Expectation of an Exponential Random Variable

Let $X$ be exponentially distributed with parameter $\lambda$. Calculate $E[X]$.