Chapter Six

Sample Path Generation for Continuous-Time Models

Many models of interest in financial engineering can be represented by continuous-time stochastic differential equations. The application of Monte Carlo methods to this class of models requires sample path generation, which in turn requires a way to discretize them. Numerical analysis offers several ways to discretize deterministic differential equations for simulation purposes, including some very ingenious ones; here we will consider rather simple strategies, but we note that, due to the stochastic character of our applications, we have to tackle both numerical and statistical issues. To motivate the content of the chapter practically, we will mostly consider simple asset pricing problems, leaving more complex examples to Chapter 11 on option pricing, where we also apply variance reduction techniques, and Chapter 13 on risk management and hedging. The risk-neutral pricing approach that we discussed in Section 3.9 allows us to price financial instruments by taking suitable expectations, which in turn requires Monte Carlo simulation when the model is complicated enough to preclude an analytical solution. The following cases, listed in increasing order of complexity, may occur, depending on the nature of the underlying stochastic process and the kind of derivative contract we are evaluating:

1. We just need the value of a random variable at a single point in time and we are able to sample paths exactly. This is the lucky case where we may just need to sample a single random variable, and we do not need to discretize time. A relevant example is a vanilla European-style call option in the Black–Scholes–Merton world.
2. We need the value of a random variable at multiple points in time, but we are able to sample paths exactly. Hence, we have to discretize time and the computational effort is increased. In some cases, we need not wonder about the most suitable time discretization, as this is written in the contract. Significant examples are a path-dependent European-style Asian option with discrete-time averaging and a Bermudan-style option, featuring early exercise opportunities but only at prespecified time instants. In both of these cases, the time instants at which we need to sample the value of the underlying asset are stated precisely in the contract. On the contrary, if we are dealing with an American-style derivative, the discretization step will have an impact on both accuracy and computational effort, as the option may be exercised anywhere before maturity.
3. We are not able to sample paths exactly. This may happen when we step out of the safe domain of geometric Brownian motion (GBM) and deal with mean reversion, stochastic volatility, non-Gaussian processes, etc. In such cases, we not only have to deal with the sampling and estimation errors that are common to all Monte Carlo applications, but with discretization issues as well.

Since here we are dealing with differential equations, the level of mathematical complexity is nontrivial, and we will take an informal view, keeping practical applications in mind. The necessary background on stochastic calculus was given in Section 3.7.

We introduce the essential issues of path generation in Section 6.1, where we outline trivial and less trivial discretization strategies, like Euler and Milstein, as well as predictor-corrector methods. In Section 6.2 we investigate the easy case of geometric Brownian motion. There, we also cover the generation of sample paths for correlated processes and the Brownian bridge construction. Simple option pricing examples, concerning derivatives written on non-dividend-paying stocks, will help us along the way. We also apply these basic ideas to the simulation of rather simple interest rate models, like Vasicek and Cox–Ingersoll–Ross short-rate models in Section 6.3; such models involve mean reversion and may result or not in Gaussian processes, but they still feature a constant volatility. We move on and allow for stochastic volatility in Section 6.4. Finally, in Section 6.5 we also introduce jumps. We defer to Chapter 9 the use of low-discrepancy sequences for path generation within the framework of quasi–Monte Carlo simulation.

6.1 Issues in path generation

Let us consider the stochastic process St describing, e.g., the price of a non-dividend-paying stock.1 A wide class of models can be cast into the form of an Itô stochastic differential equation:

(6.1) $dS_t = a(S_t, t)\,dt + b(S_t, t)\,dW_t$

where the function a(St, t) defines the drift component, the function b(St, t) defines the volatility component, and dWt is the differential of a standard Wiener process. We know that the standard Wiener process is a Gaussian process with stationary and independent increments such that

$W_0 = 0, \qquad W_t - W_s \sim \mathrm{N}(0,\, t - s) \quad \text{for } t > s$

Therefore, for a small time interval δt, we have

$\delta W_t \equiv W_{t+\delta t} - W_t \sim \mathrm{N}(0,\, \delta t)$

which in practice means that we may simulate small increments as follows:

$\delta W_t = \sqrt{\delta t}\,\epsilon_{t+\delta t}, \qquad \epsilon_{t+\delta t} \sim \mathrm{N}(0, 1)$

The notation $\epsilon_{t+\delta t}$ may be preferred to $\epsilon_t$, since it points out clearly that the increment of the Wiener process is generated after observing the state at time t; however, this is not quite essential, and in the following we will often just write $\epsilon$, since no confusion should arise. Sampling paths of a standard Wiener process with any time discretization involves no more than the generation of a stream of independent standard normals. When dealing with a simple generalized Wiener process like

(6.2) $dS_t = a\,dt + b\,dW_t$

where a and b are constant, we have just to integrate both sides:

(6.3) $S_{t+\delta t} = S_t + a\,\delta t + b\,\bigl(W_{t+\delta t} - W_t\bigr)$

This immediately yields the following discretization scheme:

(6.4) $S_{t+\delta t} = S_t + a\,\delta t + b\sqrt{\delta t}\,\epsilon_{t+\delta t}$

which allows for easy sample path generation. When the discretization step is uniform, it may be better to introduce integer time stamps k, k = 0, 1, 2, …, corresponding to time instants tk = k δt, and rewrite Eq. (6.4) as

(6.5) $S_{k+1} = S_k + a\,\delta t + b\sqrt{\delta t}\,\epsilon_{k+1}$

where we may insist on writing $\epsilon_{k+1}$ to point out that this shock is not known when we observe Sk. We should note that if we generate sample paths of the generalized Wiener process and collect sample statistics, we are prone to sampling errors, in the sense that sample statistics will be subject to statistical fluctuations calling for the evaluation of confidence intervals. Sampling error is due to the random nature of Monte Carlo methods, and it can be mitigated using variance reduction strategies. However, there is no discretization error, in the sense that the marginal distributions of the random variables Sk, both conditional and unconditional, are correct.

By analogy, when faced with the more general Itô process of Eq. (6.1), the simplest discretization approach that comes to mind is

(6.6) $S_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

Such a discretization is known as the Euler, or Euler–Maruyama, scheme. Note that we are assuming that functions a(St, t) and b(St, t) are constant over the time interval [t, t + δt). Since this is not really the case, it is clear that we are introducing a discretization error. The issue is faced also when dealing with an ordinary deterministic differential equation, and it can be kept under control by a wide array of numerical analysis techniques. Still, it might be argued that when δt → 0, the discretized solution should converge to the “true” solution of the stochastic differential equations. Unfortunately, convergence is a critical concept in stochastic differential equations, and a proper analysis would require a relatively deep mathematical machinery that is beyond the scope of this book.2 Nevertheless, it is quite easy to see that discretization errors are relevant and may even change the marginal distributions. For instance, consider the geometric Brownian motion described by

(6.7) $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$

The Euler scheme yields

$S_{t+\delta t} = S_t\bigl(1 + \mu\,\delta t\bigr) + \sigma S_t\sqrt{\delta t}\,\epsilon_{t+\delta t}$

This is very easy to grasp and to implement, but the marginal distribution of each variable St is normal, whereas we know from Section 3.7.3 that it should be lognormal. This is not quite a negligible issue when we model stock prices or interest rates that are supposed not to get negative. In this specific case, we show in Section 6.2 that we may use Itô’s lemma and get rid of the discretization error altogether, but this ideal state of affairs cannot always be achieved. When we have to cope with inexact discretization, the Euler scheme is just the simplest choice and an array of more refined discretization schemes has been proposed in the literature. Below we outline a couple of alternatives, namely, the Milstein scheme and predictor-corrector methods.
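
Before doing so, a minimal R sketch of the Euler update above may help fix ideas; the function name and signature are ours, not taken from the book's figures:

simEulerGBM <- function(S0, mu, sigma, T, numSteps) {
  dt <- T/numSteps
  S <- numeric(numSteps + 1)
  S[1] <- S0
  for (k in 1:numSteps) {
    eps <- rnorm(1)
    # drift and volatility are frozen at the left end point of each interval
    S[k+1] <- S[k] + mu*S[k]*dt + sigma*S[k]*sqrt(dt)*eps
  }
  S
}

Note that nothing prevents S[k+1] from turning negative for a large shock, which is precisely the distributional error discussed above.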

A final, and possibly less obvious, question is also worth mentioning: Time discretization may induce another form of bias. To see this, imagine pricing an American-style option, which may be exercised at any time instant before and including its expiration date. When pricing it by Monte Carlo, we are actually pricing a Bermudan-style option that can be exercised at a discrete set of time instants. Clearly, the value of the Bermudan-style option is a lower bound on the price of the corresponding American-style option; hence, we are introducing a low bias.3 One possible countermeasure is to adopt Richardson extrapolation, which is a rather common approach in numerical analysis to approximate a limit when δt → 0. Given a sequence of prices of Bermudan-style options that can be exercised with higher and higher frequencies, we may approximate a continuous-time exercise opportunity limit.

6.1.1 EULER VS. MILSTEIN SCHEMES

The Milstein scheme is based on a higher order Taylor expansion of Eq. (6.1):

(6.8) $S_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\,\delta W_t + \frac{1}{2}\,b(S_t, t)\,\frac{\partial b(S_t, t)}{\partial S}\,\bigl[(\delta W_t)^2 - \delta t\bigr]$

We will justify this expansion later; for now, we should note that the formal rule $(dW_t)^2 = dt$ shows that the last term disappears in the limit, so that we get back to the original differential equation.

Example 6.1 GBM processes by the Milstein scheme

As an immediate illustration, let us apply the scheme of Eq. (6.8) to a GBM process. Since b(S, t) = σS for this kind of process, we have

$b(S, t)\,\frac{\partial b(S, t)}{\partial S} = \sigma S \cdot \sigma = \sigma^2 S$

and Eq. (6.8) becomes

$S_{t+\delta t} = S_t + \mu S_t\,\delta t + \sigma S_t\,\delta W_t + \frac{1}{2}\,\sigma^2 S_t\,\bigl[(\delta W_t)^2 - \delta t\bigr]$

Now, from a computational point of view, we rewrite $\delta W_t = \sqrt{\delta t}\,\epsilon$, which also implies

$(\delta W_t)^2 = \delta t\,\epsilon^2$

Finally, we obtain

(6.9) $S_{t+\delta t} = S_t\Bigl[1 + \mu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon + \frac{1}{2}\,\sigma^2\,\delta t\,\bigl(\epsilon^2 - 1\bigr)\Bigr]$

It is interesting to compare Euler and Milstein approximations against the Taylor expansion of the exact formula based on an exponential:

$S_{t+\delta t} = S_t\exp\Bigl[\bigl(\mu - \tfrac{\sigma^2}{2}\bigr)\,\delta t + \sigma\sqrt{\delta t}\,\epsilon\Bigr]$
$\approx S_t\Bigl[1 + \bigl(\mu - \tfrac{\sigma^2}{2}\bigr)\,\delta t + \sigma\sqrt{\delta t}\,\epsilon + \tfrac{1}{2}\,\sigma^2\,\delta t\,\epsilon^2\Bigr]$
$= S_t\Bigl[1 + \mu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon + \tfrac{1}{2}\,\sigma^2\,\delta t\,\bigl(\epsilon^2 - 1\bigr)\Bigr]$

where in the last line we have neglected terms of order higher than δt. We understand that the Milstein scheme adds a term that is missing from the Euler approximation

$S_{t+\delta t} \approx S_t\bigl[1 + \mu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon\bigr]$

so that now the formula is correct and yields an exact expectation and variance up to order δt.
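
A small R sketch (ours, not the book's code) makes the comparison of the example concrete by applying the three one-step updates to the same shock:

gbmStep <- function(S, mu, sigma, dt, eps) {
  c(euler    = S*(1 + mu*dt + sigma*sqrt(dt)*eps),
    milstein = S*(1 + mu*dt + sigma*sqrt(dt)*eps + 0.5*sigma^2*dt*(eps^2 - 1)),
    exact    = S*exp((mu - sigma^2/2)*dt + sigma*sqrt(dt)*eps))
}
gbmStep(S = 50, mu = 0.1, sigma = 0.4, dt = 0.1, eps = rnorm(1))

For small δt the Milstein and exact updates are closer to each other than either is to the Euler update, reflecting the extra term of Eq. (6.9).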

For the sake of simplicity, we provide a justification of the Milstein scheme in the scalar case, assuming that functions a(·, ·) and b(·, ·) only depend on S, and not on time t. The starting point is recalling that a stochastic differential equation is actually a shorthand for an integral equation

$S_{t+\delta t} = S_t + \int_t^{t+\delta t} a(S_u)\,du + \int_t^{t+\delta t} b(S_u)\,dW_u$

The Euler scheme approximates integrals over the interval (t, t + δt) by “freezing” the integrand functions to their value at time t:

(6.10) $S_{t+\delta t} \approx S_t + a(S_t)\int_t^{t+\delta t} du + b(S_t)\int_t^{t+\delta t} dW_u = S_t + a(S_t)\,\delta t + b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr)$

We may try to improve on this by finding a better approximation of the diffusion term b(St) over the time interval (t, t + δt). In order to get a clue, let us consider Itô’s lemma for the process b(St) (here we do not need to denote derivatives as partial ones, since we have ruled out dependence on time):

$db(S_t) = \Bigl[a(S_t)\,b'(S_t) + \frac{1}{2}\,b^2(S_t)\,b''(S_t)\Bigr]dt + b(S_t)\,b'(S_t)\,dW_t \equiv \mu_b(S_t)\,dt + \sigma_b(S_t)\,dW_t$

where we introduce μb(St) and σb(St) as shorthands for the drift and volatility functions in the differential equation of b(St). If we apply the Euler approximation to b(St) over the time interval (t, u), we find

$b(S_u) \approx b(S_t) + \sigma_b(S_t)\,\bigl(W_u - W_t\bigr) = b(S_t) + b(S_t)\,b'(S_t)\,\bigl(W_u - W_t\bigr)$

where we have neglected the drift term, as it is of order (u − t), whereas the increment of the Wiener process is of order $\sqrt{u - t}$. Rather than freezing the integrand in Eq. (6.10), we may substitute the last approximation:

(6.11) $\int_t^{t+\delta t} b(S_u)\,dW_u \approx b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr) + b(S_t)\,b'(S_t)\int_t^{t+\delta t}\bigl(W_u - W_t\bigr)\,dW_u$

The missing piece of the puzzle is the last integral, which we may compute by recalling Example 3.18. Applying that result we immediately find:

$\int_t^{t+\delta t}\bigl(W_u - W_t\bigr)\,dW_u = \frac{1}{2}\Bigl[\bigl(W_{t+\delta t} - W_t\bigr)^2 - \delta t\Bigr]$

Plugging this into Eq. (6.11) yields

$\int_t^{t+\delta t} b(S_u)\,dW_u \approx b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr) + \frac{1}{2}\,b(S_t)\,b'(S_t)\Bigl[\bigl(W_{t+\delta t} - W_t\bigr)^2 - \delta t\Bigr]$

Substituting this refined approximation of the diffusion term into the Euler scheme finally yields the Milstein scheme:

$S_{t+\delta t} = S_t + a(S_t)\,\delta t + b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr) + \frac{1}{2}\,b(S_t)\,b'(S_t)\Bigl[\bigl(W_{t+\delta t} - W_t\bigr)^2 - \delta t\Bigr]$

Generalizations of this result are available, for which we refer to the chapter references.

6.1.2 PREDICTOR-CORRECTOR METHODS

When we apply the Euler scheme to the Itô differential equation

$dS_t = a(S_t, t)\,dt + b(S_t, t)\,dW_t$

over the time interval (t, t + δt), we “freeze” the drift and volatility functions to their initial values and sample

$S_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

The advantage of this scheme is that it is explicit, but we could try to account for the change in the functions a(·, ·) and b(·, ·) in some way. There is an array of ideas, originally developed for the numerical solution of deterministic (ordinary) differential equations, which can be adapted to the stochastic case, including implicit schemes. An implicit version of the Euler scheme could be

$S_{t+\delta t} = S_t + a(S_{t+\delta t}, t+\delta t)\,\delta t + b(S_{t+\delta t}, t+\delta t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

The scheme is implicit because now we have to solve an equation to find the new value $S_{t+\delta t}$. Since this approach could be numerically troublesome, as it may require division by a Wiener increment close to zero, an alternative form is

(6.12) $S_{t+\delta t} = S_t + a(S_{t+\delta t}, t+\delta t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

where the diffusion term is kept explicit.

Example 6.2 An implicit Euler scheme

If we consider the stochastic differential equation

equation

the scheme of Eq. (6.12) yields

equation

which may be rewritten as

equation

Needless to say, things are not always this easy, not to mention the open question of convergence. A possible alternative is to introduce a correction mechanism comparing the actual and the predicted value of the process in t + δt. For instance, the value obtained by straightforward application of the Euler scheme may be considered as a predictor

$\hat{S}_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

Since the values of the functions a(·, ·) and b(·, ·) at the end of the time step differ from those used in the prediction step, we may introduce a correction as follows:

(6.13) $S_{t+\delta t} = S_t + \bigl[\alpha\,\bar{a}(\hat{S}_{t+\delta t}, t+\delta t) + (1-\alpha)\,\bar{a}(S_t, t)\bigr]\,\delta t + \bigl[\eta\,b(\hat{S}_{t+\delta t}, t+\delta t) + (1-\eta)\,b(S_t, t)\bigr]\sqrt{\delta t}\,\epsilon_{t+\delta t}$

where α and η are weights in the interval [0, 1] and

(6.14) $\bar{a}(S, t) = a(S, t) - \eta\,b(S, t)\,\frac{\partial b(S, t)}{\partial S}$

The role played by the weights α and η is rather intuitive, as they blend the initial and terminal values of the functions a(·, ·) and b(·, ·), controlling the amount of correction; often, they are set close to 0.5. The further adjustment of Eq. (6.14) is less intuitive, but we refer the interested reader to [3] for further information, as well as a serious treatment of convergence issues. The aim of this short section was just to get readers acquainted with the richness of discretization approaches, but for the sake of simplicity we will refrain from using advanced path generation methods in what follows. We should also notice that, in the literature, the usefulness of some of these approaches with respect to a simpler Euler scheme with a suitably short time step has sometimes been questioned.
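
To make the recipe concrete, here is a rough sketch of a single predictor-corrector step for a scalar SDE, based on our reading of Eqs. (6.13) and (6.14) rather than on code from the book; the drift a, diffusion b, and the partial derivative bS of b with respect to S are passed as functions by the user:

pcStep <- function(S, t, dt, a, b, bS, alpha = 0.5, eta = 0.5) {
  dW <- sqrt(dt)*rnorm(1)
  # predictor: plain Euler step
  Shat <- S + a(S, t)*dt + b(S, t)*dW
  # adjusted drift, as in Eq. (6.14)
  abar <- function(S, t) a(S, t) - eta*b(S, t)*bS(S, t)
  # corrector, as in Eq. (6.13)
  S + (alpha*abar(Shat, t + dt) + (1 - alpha)*abar(S, t))*dt +
    (eta*b(Shat, t + dt) + (1 - eta)*b(S, t))*dW
}

For a GBM, for instance, one would pass a = function(S, t) mu*S, b = function(S, t) sigma*S, and bS = function(S, t) sigma.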

6.2 Simulating geometric Brownian motion

Sometimes, we may use suitable transformations of a differential equation to cast it into a more manageable form. From Section 3.7 we know that, by using Itô’s lemma, we may rewrite Eq. (6.7), which defines GBM, as

(6.15) $d\log S_t = \Bigl(\mu - \frac{\sigma^2}{2}\Bigr)dt + \sigma\,dW_t$

Now this is a generalized Wiener process that we may easily integrate as we have seen in the previous section. Taking exponentials to get rid of the logarithm yields

$S_t = S_0\exp\left(\nu t + \sigma\int_0^t dW_u\right)$

where ν = μ − σ²/2. Since the integral of a standard Wiener process is normally distributed, from a computational point of view we may recast the last expression as

$S_t = S_0\exp\bigl(\nu t + \sigma\sqrt{t}\,\epsilon\bigr), \qquad \epsilon \sim \mathrm{N}(0, 1)$

which shows lognormality of prices. From a path generation point of view, the last expression is used as

(6.16) $S_{t+\delta t} = S_t\exp\bigl(\nu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon_{t+\delta t}\bigr)$

from which it is easy to generate sample paths of GBMs. A straightforward code is given in Fig. 6.1. Experienced R programmers will certainly notice that the function includes two dreaded nested for loops, which does not help efficiency; we will come back to that in a moment. The following snapshot produces the sample paths in Fig. 6.2, showing the impact of volatility:

FIGURE 6.1 R code to generate sample paths of a geometric Brownian motion.

FIGURE 6.2 Sample paths of GBMs for different values of volatility.

set.seed(55555)
paths1 <- simGBM(50,0.1,0.1,1,300,5)
paths2 <- simGBM(50,0.1,0.4,1,300,5)
yBounds <- c(min(min(paths1),min(paths2)),
max(max(paths1),max(paths2)))
par(mfrow=c(1,2))
plot(1:301, paths1[1,], ylim=yBounds, type="l",
    xlab="sigma=10%", ylab="")
for (j in 2:5)
    lines(1:301, paths1[j,])
plot(1:301, paths2[1,], ylim=yBounds, type="l",
    xlab="sigma=40%", ylab="")
for (j in 2:5)
    lines(1:301, paths2[j,])
par(mfrow=c(1,1))
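
The code of Fig. 6.1 is not reproduced here; a minimal sketch consistent with the call simGBM(S0, mu, sigma, T, numSteps, numRepl) used above, based on two nested for loops, could be the following (our re-implementation, not the book's figure):

simGBM <- function(S0, mu, sigma, T, numSteps, numRepl) {
  dt <- T/numSteps
  nu <- mu - sigma^2/2
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- S0
  for (i in 1:numRepl)
    for (j in 1:numSteps)
      # exact lognormal update of Eq. (6.16), one step and one replication at a time
      paths[i, j+1] <- paths[i, j]*exp(nu*dt + sigma*sqrt(dt)*rnorm(1))
  paths
}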

The code in Fig. 6.1 is based on two nested for loops, contrary to common recommendations concerning R code efficiency. In order to vectorize the code, it is convenient to rewrite Eq. (6.16) as

$\log S_{t+\delta t} - \log S_t = \nu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon_{t+\delta t}$

We generate differences in the logarithm of the asset prices and then use the cumsum function to obtain cumulative sums of log-increments along each sample path. The resulting function simvGBM is illustrated in Fig. 6.3. We may compare the two implementations in terms of speed:

FIGURE 6.3 Vectorized code to generate sample paths of GBM.
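
A vectorized sketch along the lines just described, standing in for Fig. 6.3 and sharing the signature of simGBM above (again ours, hypothetical), could be:

simvGBM <- function(S0, mu, sigma, T, numSteps, numRepl) {
  dt <- T/numSteps
  nu <- mu - sigma^2/2
  # one row of log-increments per replication, generated in a single call
  incr <- matrix(nu*dt + sigma*sqrt(dt)*rnorm(numRepl*numSteps),
                 nrow = numRepl, ncol = numSteps)
  # cumulate log-increments along each path and prepend the initial price
  logPaths <- t(apply(incr, 1, cumsum))
  cbind(S0, S0*exp(logPaths))
}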

> system.time(simGBM(50,0.1,0.1,1,3000,100))
user system elapsed
2.54     0.00     2.54
> system.time(simvGBM(50,0.1,0.1,1,3000,100))
user system elapsed
0.15     0.00     0.16

6.2.1 AN APPLICATION: PRICING A VANILLA CALL OPTION

As an immediate application of path generation for GBMs, we may price a vanilla call option on a non-dividend-paying stock. Clearly, there is no point in using Monte Carlo methods in this case, as we may apply the Black–Scholes–Merton formula that we discussed in Section 3.9.2. However, it is quite useful to verify the accuracy of numerical methods in easily solved cases, to check their viability and possible limitations. In Fig. 6.4 we recall for convenience the function giving the exact option price, and we also propose a very simple function to estimate the price by Monte Carlo. We just need:

FIGURE 6.4 Exact formula and a simple Monte Carlo pricer for a vanilla call option.

1. To sample lognormal prices using Eq. (6.16).
2. To evaluate and discount the corresponding payoff (assuming a constant risk-free rate).
3. To return the estimated price along with a confidence interval, whose confidence level c.level is 95% by default. Note that we return a list with fields estimate and conf.int, containing the point and the interval estimate, respectively.4

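Since the Monte Carlo pricer of Fig. 6.4 is not listed here, a bare-bones sketch following the three steps above might look like this; the function name is ours, and the returned fields match the output shown below:

simMCEuCall <- function(S0, K, T, r, sigma, numRepl, c.level = 0.95) {
  nu <- r - sigma^2/2
  # step 1: sample lognormal terminal prices under the risk-neutral measure
  ST <- S0*exp(nu*T + sigma*sqrt(T)*rnorm(numRepl))
  # step 2: discounted payoffs
  dpayoff <- exp(-r*T)*pmax(ST - K, 0)
  # step 3: point estimate and confidence interval
  aux <- t.test(dpayoff, conf.level = c.level)
  list(estimate = mean(dpayoff), conf.int = as.numeric(aux$conf.int))
}
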
The code also includes a little snapshot to verify accuracy. The relevant output is

> EuVanillaCall(S0,K,T,r,sigma)
[1] 5.234659
> out$estimate
[1] 5.164366
> out$conf.int
[1] 5.056212 5.272520
> 100*(out$conf.int[2]-out$conf.int[1])/2/out$estimate
[1] 2.09423

We observe that the point estimate is not quite precise, even though the true value is included in the 95% confidence interval. Taking the ratio between the half-length of the confidence interval and the point estimate suggests an error above 2%. One crude way to improve the estimate would be to increase the number of replications. We will see in Chapter 8 that variance reduction strategies may be a much better option.

6.2.2 MULTIDIMENSIONAL GBM

When dealing with options on multiple assets or managing multiple risk factors, we may need to generate a multidimensional geometric Brownian motion. The key issue is how to deal with correlation among risk factors, and there are two modeling choices, which have an impact on how simulation is carried out. For the sake of simplicity we illustrate the two approaches in the bidimensional case, where two Wiener processes drive prices S1(t) and S2(t). We may consider two independent standard Wiener processes that enter in both equations, thus inducing a correlation between prices:

$dS_1(t) = \mu_1 S_1(t)\,dt + \sigma_{11} S_1(t)\,dW_1(t) + \sigma_{12} S_1(t)\,dW_2(t)$
$dS_2(t) = \mu_2 S_2(t)\,dt + \sigma_{21} S_2(t)\,dW_1(t) + \sigma_{22} S_2(t)\,dW_2(t)$

An alternative is to consider instantaneously correlated Wiener processes and write the equations in the somewhat more familiar form:

$dS_1(t) = \mu_1 S_1(t)\,dt + \sigma_1 S_1(t)\,dW_1(t)$
$dS_2(t) = \mu_2 S_2(t)\,dt + \sigma_2 S_2(t)\,dW_2(t)$

The role of instantaneous correlation ρ can be understood by referring to the following formal rule of stochastic calculus:

$dW_1(t)\,dW_2(t) = \rho\,dt$

We note that the analogous rule $(dW_t)^2 = dt$ is actually a specific case of this rule, which may be justified heuristically along the lines of Section 3.7.3. We will take the second approach, which requires the generation of correlated standard normals. We know from Section 5.5.2 that this may be accomplished by taking the Cholesky factorization of the covariance matrix, which in this case is actually a correlation matrix, since we are dealing with standard normals. We illustrate the idea first in the bidimensional case, to reinforce intuition. In fact, when only two processes are involved, we may generate increments based on the correlated standard normal variables $\tilde\epsilon_1$ and $\tilde\epsilon_2$, obtained as a transformation of independent variables Z1, Z2 ~ N(0, 1):

$\tilde\epsilon_1 = Z_1, \qquad \tilde\epsilon_2 = \rho\, Z_1 + \sqrt{1 - \rho^2}\,Z_2$

The simple code in Fig. 6.5 generates a bidimensional sample path using the above transformation. The following R snapshot produces the plots in Fig. 6.6, which illustrate the impact of correlation:

FIGURE 6.5 Simple R code to generate sample paths of bidimensional GBM.

FIGURE 6.6 Sample paths of a bidimensional GBM for different values of instantaneous correlation.
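
A possible re-implementation of the function of Fig. 6.5, matching the call sim2GBM(S0, mus, sigmas, rho, T, numSteps) used below (ours, hypothetical), is:

sim2GBM <- function(S0, mus, sigmas, rho, T, numSteps) {
  dt <- T/numSteps
  nus <- mus - sigmas^2/2
  paths <- matrix(0, nrow = 2, ncol = numSteps + 1)
  paths[, 1] <- S0
  for (j in 1:numSteps) {
    Z <- rnorm(2)
    # correlated standard normal shocks via the transformation above
    eps <- c(Z[1], rho*Z[1] + sqrt(1 - rho^2)*Z[2])
    paths[, j+1] <- paths[, j]*exp(nus*dt + sigmas*sqrt(dt)*eps)
  }
  paths
}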

> set.seed(55555)
> S0 <- c(30,50)
> mus <- c(0.1, 0.1)
> sigmas <- c(0.3,0.3)
> paths1 <- sim2GBM(S0,mus,sigmas,0.3,1,300)
> paths2 <- sim2GBM(S0,mus,sigmas,0.9,1,300)
> yBounds <- c(min(min(paths1),min(paths2)),
+ max(max(paths1),max(paths2)))
> par(mfrow=c(1,2))
> plot(1:301, paths1[1,], ylim=yBounds, type="l",
+ xlab="rho=0.4", ylab="")
> lines(1:301, paths1[2,])
> plot(1:301, paths2[1,], ylim=yBounds, type="l",
+ xlab="rho=0.9", ylab="")
> lines(1:301, paths2[2,])
> par(mfrow=c(1,1))

The previous implementation is rather simple as it involves a single replication. The code in Fig. 6.7 is a bit more involved since, to store multiple replications, we need a tridimensional array; this is created by the function array, which requires a vector specifying the array size along each dimension.

FIGURE 6.7 Generating sample paths of multidimensional GBMs.
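
The code of Fig. 6.7 is not reproduced; the following sketch conveys the essential idea, storing the replications in a three-dimensional array and using chol() on the correlation matrix (names and signature are our assumptions):

simMultiGBM <- function(S0, mus, sigmas, corrMat, T, numSteps, numRepl) {
  nAssets <- length(S0)
  dt <- T/numSteps
  nus <- mus - sigmas^2/2
  U <- chol(corrMat)   # upper triangular factor: t(U) %*% U = corrMat
  paths <- array(0, dim = c(numRepl, numSteps + 1, nAssets))
  paths[, 1, ] <- matrix(S0, nrow = numRepl, ncol = nAssets, byrow = TRUE)
  for (i in 1:numRepl) {
    # each row of eps is a vector of correlated standard normals
    eps <- matrix(rnorm(numSteps*nAssets), ncol = nAssets) %*% U
    for (k in 1:nAssets)
      paths[i, -1, k] <- S0[k]*exp(cumsum(nus[k]*dt + sigmas[k]*sqrt(dt)*eps[, k]))
  }
  paths
}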

6.2.3 THE BROWNIAN BRIDGE

In the previous sections we have generated asset paths according to a natural scheme, which proceeds forward in time. Actually, the Wiener process enjoys some peculiar properties which allow us to generate the sample paths in a different way. Consider a time interval with left and right end points tl and tr, respectively, and an intermediate time instant s, such that tl < s < tr. In standard path generation, we would generate the Wiener process in the natural order: W (tl), W (s), and finally W (tr). Using the so-called Brownian bridge, we may generate W(s) as the last value, conditional on the values W(tl) = wl and W(tr) = wr. The conditional value of W(s) is a normal variable with expected value

(6.17) $\mathrm{E}\bigl[W(s) \mid W(t_l) = w_l,\, W(t_r) = w_r\bigr] = \frac{(t_r - s)\,w_l + (s - t_l)\,w_r}{t_r - t_l}$

and variance

(6.18) $\mathrm{Var}\bigl[W(s) \mid W(t_l) = w_l,\, W(t_r) = w_r\bigr] = \frac{(s - t_l)(t_r - s)}{t_r - t_l}$

This is a consequence of the properties of the conditional distribution of a multivariate normal distribution, which we have stated in Theorem 3.7. Before proving Eqs. (6.17) and (6.18), it is important to realize that they are quite intuitive: The conditional expected value of W(s) is obtained by linear interpolation through wl and wr; the variance is low near the two end points tl and tr and is at its maximum in the middle of the interval.

A preliminary step to prove Eqs. (6.17) and (6.18) is to find the covariance between W(t) and W(u), when t < u:

$\mathrm{Cov}\bigl(W(t), W(u)\bigr) = \mathrm{Cov}\bigl(W(t),\, W(t) + [W(u) - W(t)]\bigr) = \mathrm{Var}\bigl(W(t)\bigr) + \mathrm{Cov}\bigl(W(t),\, W(u) - W(t)\bigr) = t$

The statement can be generalized as Cov(W(t), W(u)) = min{t, u}. Then, the joint distribution of the three values that we are considering is

$\begin{bmatrix} W(s) \\ W(t_l) \\ W(t_r) \end{bmatrix} \sim \mathrm{N}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},\; \begin{bmatrix} s & t_l & s \\ t_l & t_l & t_l \\ s & t_l & t_r \end{bmatrix} \right)$

Note that we have listed the time instants in this order, with s first, in view of the application of Theorem 3.7, since we need the value at s conditional on the values at tl and tr. Partitioning the vector of expected values and the covariance matrix accordingly yields

$\boldsymbol{\mu}_1 = 0, \qquad \boldsymbol{\mu}_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

and

$\Sigma_{11} = s, \qquad \Sigma_{12} = \Sigma_{21}^{\mathsf{T}} = \begin{bmatrix} t_l & s \end{bmatrix}, \qquad \Sigma_{22} = \begin{bmatrix} t_l & t_l \\ t_l & t_r \end{bmatrix}$

By plugging the inverse matrix

$\Sigma_{22}^{-1} = \frac{1}{t_l\,(t_r - t_l)}\begin{bmatrix} t_r & -t_l \\ -t_l & t_l \end{bmatrix}$

into the above formulas we obtain Eqs. (6.17) and (6.18).

By using the Brownian bridge, we may generate sample paths by a sort of bisection strategy. Given W(0) = 0, we sample W(T); then we sample W(T/2). Given W(0) and W(T/2) we sample W(T/4); given W(T/2) and W(T) we sample W(3T/4), etc. Actually, we may generate sample paths in any order we wish, with nonhomogeneous time steps. One could wonder why this complicated construction could be useful. There are at least two reasons:

1. It may help in using variance reduction by stratification, to be discussed in Chapter 8. It is difficult to use stratification in multiple dimensions, but we may use stratification just on the terminal value of the asset price, and maybe an intermediate point. Then we generate the remaining values along the sample path by using the bridge.
2. The Brownian bridge construction is also useful when used in conjunction with low-discrepancy sequences, as we shall see in Chapter 9. Such sequences may not work well in very high dimensional domains, but using Brownian bridge we may use high-quality sequences to outline the paths of the Wiener process, by sampling points acting as milestones; then we can fill the trajectory by Monte Carlo sampling.

In Fig. 6.8 we illustrate an R function to generate sample paths of the standard Wiener process using the Brownian bridge technique, but only in the specific case in which the time interval [0, T] is successively bisected (i.e., the number of intervals is a power of 2). In this case, we may simplify the formulas above to sample W(s): If we let δt = tr − tl, we obtain

FIGURE 6.8 Implementing path generation by a Brownian bridge for the standard Wiener process.

$W(s) = \frac{W(t_l) + W(t_r)}{2} + \sqrt{\frac{\delta t}{4}}\;\epsilon$

where $\epsilon$ is a standard normal variable. The function receives the length T of the time interval and the number numSteps of subintervals in which it must be partitioned, and it returns a vector containing one sample path. Assume that the number of intervals is 8 (a power of 2). Then we must carry out 3 bisections.

Given the initial condition W(t0) = 0, we must first sample W(t8), which means “jumping” over an interval of length T, which is the initial value TJump in the program. Since we store elements in a vector of nine elements (starting with index 1, and including W(t0)), we must jump eight places in this vector to store the new value. The number of places to jump is stored in IJump.
Then we start the outermost for loop. In the first pass we must only sample W(t4), given W(t0) and W(t8). Given positions left<-1 and right<-IJump+1, we must generate a new value and store it in position i<-IJump/2 + 1, which is 4+1 = 5 in this case. Here we generate only one value, and we divide both time and index jumps by 2.
In the second iteration we must sample W(t2), given W(t0) and W(t4), and W(t6), given W(t4) and W(t8). The nested loop will be executed twice, and indexes left, right, and i are incremented by 4.
In the third and final iteration we generate the remaining four values.
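
The code of Fig. 6.8 itself is not included here; a sketch implementing the bisection just described, using the variable names mentioned in the text (TJump, IJump, left, right, i) but otherwise our own, could be:

WienerBridge <- function(T, numSteps) {
  # numSteps must be a power of 2
  numBisections <- round(log2(numSteps))
  W <- numeric(numSteps + 1)              # W[1] corresponds to W(t0) = 0
  W[numSteps + 1] <- sqrt(T)*rnorm(1)     # sample the terminal value W(T) first
  TJump <- T
  IJump <- numSteps
  for (k in 1:numBisections) {
    left <- 1
    i <- IJump/2 + 1
    right <- IJump + 1
    for (j in 1:2^(k-1)) {
      # midpoint: average of the two end points plus a N(0, TJump/4) shock
      W[i] <- (W[left] + W[right])/2 + sqrt(TJump/4)*rnorm(1)
      left <- left + IJump
      i <- i + IJump
      right <- right + IJump
    }
    TJump <- TJump/2
    IJump <- IJump/2
  }
  W
}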

The construction above is illustrated in Fig. 6.9. Figure 6.10 includes a script to check that the marginal distributions of the stochastic process that we generate are the correct ones. Expected values should be zero, and standard deviation should be the square root of time:

FIGURE 6.9 Illustrating the Brownian bridge construction by successive bisection.

FIGURE 6.10 Checking the path generation for the standard Wiener process by a Brownian bridge.

> m <- colMeans(Wpaths[,-1]);m
[1] 5.234951e-05 1.624087e-03 3.127389e-03 1.542460e-03
> sdev <- apply(Wpaths[,-1],2,sd);sdev
[1] 0.4993618 0.7080760 0.8653553 0.9971483
> sqrt((1:numSteps)*T/numSteps)
[1] 0.5000000 0.7071068 0.8660254 1.0000000

We see that, within sampling errors, the result looks correct. Given a way to generate the standard Wiener process, it is easy to simulate geometric Brownian motion. The function is given in Fig. 6.11 and uses a similar approach as the vectorized version simvGBM of Fig. 6.3. One thing we should note is the use of the function diff to generate the vector of increments in the logarithmic asset price. In fact, in standard Monte Carlo we generate the underlying Wiener process by successive increments; with the Brownian bridge construction we build directly the values of the process at different time instants, and we must use the function diff to obtain the relative differences. In some sense, diff works in the opposite way to cumsum.

FIGURE 6.11 Simulating geometric Brownian motion by the Brownian bridge.
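
A sketch of how such a GBM path can be built on top of the bridge (ours, relying on the WienerBridge sketch above): generate the whole Wiener path, take its increments with diff, and cumulate the log-price increments.

simGBMBridge <- function(S0, mu, sigma, T, numSteps) {
  dt <- T/numSteps
  nu <- mu - sigma^2/2
  W <- WienerBridge(T, numSteps)          # whole Wiener path, including W(0) = 0
  logIncr <- nu*dt + sigma*diff(W)        # log-price increments via diff
  c(S0, S0*exp(cumsum(logIncr)))
}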

6.3 Sample paths of short-term interest rates

Modeling interest rates is a quite difficult endeavor, as it involves different rates applying to different maturities. The simplest class of models deals with only one interest rate applying to a very short time horizon. Let rt be the risk-free interest rate applying to the time interval (t,t + dt). This may be called the instantaneous interest rate, although it is often referred to as the short rate. There are different models for the short rate, but in this section we consider only the two that we have already mentioned in previous chapters:

1. The Vasicek model, characterized by a stochastic differential equation featuring mean reversion:

(6.19) $dr_t = \gamma\,(\bar{r} - r_t)\,dt + \sigma\,dW_t$

2. The Cox–Ingersoll–Ross (CIR) model, which is quite similar to the Vasicek model, but involves a slight change in the volatility term:

(6.20) $dr_t = \gamma\,(\bar{r} - r_t)\,dt + \sqrt{\alpha\, r_t}\,dW_t$

In the following, we show that this slight difference has a significant consequence:

Short rates are normally distributed in the Vasicek model; the CIR model involves a more complicated noncentral chi-square distribution (see Section 3.2.3.2).
The easier distribution of the Vasicek model results in better analytical tractability; we are able to price bonds and also some options analytically, whereas the CIR model involves more complicated formulas, when available.
However, the Vasicek model can be criticized as the normal distribution allows for negative interest rates, whereas the volatility term in the CIR model avoids this difficulty, provided that

$2\gamma\,\bar{r} \geq \alpha$

The intuition is that when the short rate rt gets close to zero, the volatility term in Eq. (6.20) vanishes; if the pull toward the long-term average rate is strong enough, the short rate will stay positive.

In both cases, generating sample paths is not too difficult, as we shall see, since we know the exact distribution of interest rates. More involved models call for more sophisticated path generation strategies, possibly involving discretization errors.

6.3.1 THE VASICEK SHORT-RATE MODEL

The process described in Eq. (6.19) is not a GBM; actually, it is an example of an Ornstein–Uhlenbeck process. To try and solve the equation, we may resort to a typical trick of the trade, related to the method of integrating factors in deterministic ordinary differential equations. The trick is to apply Itô’s lemma to the process

$f(r_t, t) = r_t\,e^{\gamma t}$

The usual drill requires the calculation of partial derivatives:

$\frac{\partial f}{\partial t} = \gamma\, r_t\, e^{\gamma t}, \qquad \frac{\partial f}{\partial r} = e^{\gamma t}, \qquad \frac{\partial^2 f}{\partial r^2} = 0$

Hence, the differential equation for f(rt, t) is

$df(r_t, t) = \bigl[\gamma\, r_t\, e^{\gamma t} + \gamma\,(\bar{r} - r_t)\,e^{\gamma t}\bigr]\,dt + \sigma\, e^{\gamma t}\,dW_t = \gamma\,\bar{r}\,e^{\gamma t}\,dt + \sigma\, e^{\gamma t}\,dW_t$

This equation does not look too bad and can be integrated over the interval (0,t):

$r_t\,e^{\gamma t} - r_0 = \bar{r}\,\bigl(e^{\gamma t} - 1\bigr) + \sigma\int_0^t e^{\gamma u}\,dW_u$

which can be rewritten as

$r_t = r_0\,e^{-\gamma t} + \bar{r}\,\bigl(1 - e^{-\gamma t}\bigr) + \sigma\int_0^t e^{-\gamma(t-u)}\,dW_u$

This expression includes a stochastic integral involving the Wiener process that, as we know, is normally distributed and has zero expected value. Thus, we may immediately conclude that the short rate rt under the Vasicek model is normally distributed with expected value:5

(6.21) $\mathrm{E}[r_t] = r_0\,e^{-\gamma t} + \bar{r}\,\bigl(1 - e^{-\gamma t}\bigr)$

Finding variance is a bit trickier and requires the following result.

THEOREM 6.1 (Itô’s isometry) Let Xt be a stochastic process adapted to the standard Wiener process Wt. Then,

$\mathrm{E}\left[\left(\int_0^t X_u\,dW_u\right)^2\right] = \mathrm{E}\left[\int_0^t X_u^2\,du\right]$

Actually, the result can be proven in a more general form, but it allows us to find the variance of the short rate:

(6.22) $\mathrm{Var}(r_t) = \sigma^2\int_0^t e^{-2\gamma(t-u)}\,du = \frac{\sigma^2}{2\gamma}\bigl(1 - e^{-2\gamma t}\bigr)$

It is important to get an intuitive feeling for Eqs. (6.21) and (6.22):

At time t = 0 the short rate is r0 and variance is zero.
When t → + ∞, the short rate tends to the long-term average value $\bar{r}$; the larger the speed of mean reversion γ, the faster the convergence.
Long-term variance is given by σ²/(2γ); as expected, it is influenced by the volatility σ; however, γ also plays a role, and a strong mean reversion tends to kill volatility.

In order to use the above knowledge to generate sample paths with time step δt, we should express $r_{t+\delta t}$ conditional on $r_t$:

$r_{t+\delta t} = r_t\,e^{-\gamma\,\delta t} + \bar{r}\,\bigl(1 - e^{-\gamma\,\delta t}\bigr) + \sigma\sqrt{\frac{1 - e^{-2\gamma\,\delta t}}{2\gamma}}\;\epsilon$

where $\epsilon$ ~ N(0,1).

Now we may generate sample paths by the code illustrated in Fig. 6.12. The code produces the plots in Fig. 6.13. We clearly see the impact of the mean reversion coefficient γ. In the plots for γ = 0.1 we also notice how short rates can indeed get negative in the Vasicek model.

FIGURE 6.12 Simulating short rates by the Vasicek model.

FIGURE 6.13 Illustrating the effect of mean reversion on short rates in the Vasicek model.
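
Since the code of Fig. 6.12 is not reproduced, here is a minimal sketch of exact path generation for the Vasicek model based on the conditional distribution above (function name and signature are ours):

simVasicek <- function(r0, rbar, gamma, sigma, T, numSteps, numRepl) {
  dt <- T/numSteps
  # exact conditional standard deviation of r(t+dt) given r(t)
  sd.dt <- sigma*sqrt((1 - exp(-2*gamma*dt))/(2*gamma))
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- r0
  for (j in 1:numSteps)
    paths[, j+1] <- paths[, j]*exp(-gamma*dt) + rbar*(1 - exp(-gamma*dt)) +
      sd.dt*rnorm(numRepl)
  paths
}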

6.3.2 THE COX–INGERSOLL–ROSS SHORT-RATE MODEL

The CIR model relies on a square-root diffusion process, which is not as easy to deal with as the Ornstein–Uhlenbeck process underlying the Vasicek model. However, by studying the transition density of the square-root diffusion process, a link with the noncentral chi-square distribution6 can be established. We state the following result without proof.

THEOREM 6.2 (CIR short-rate model) The transition law from r0 to rt for the CIR model can be expressed as

(6.23) $r_t \sim \frac{\alpha\,\bigl(1 - e^{-\gamma t}\bigr)}{4\gamma}\;\chi^{\prime 2}_{\nu}(\lambda)$

where the degrees of freedom are $\nu = 4\gamma\bar{r}/\alpha$ and the noncentrality parameter is

$\lambda = \frac{4\gamma\,e^{-\gamma t}}{\alpha\,\bigl(1 - e^{-\gamma t}\bigr)}\;r_0$

By applying the theorem to the transition from $r_t$ to $r_{t+\delta t}$, we immediately obtain a path generation mechanism. It is also useful to find the conditional expectation and variance of rt given r0, which are easily obtained by recalling Eq. (3.18). For the expected value we have

(6.24) $\mathrm{E}[r_t \mid r_0] = r_0\,e^{-\gamma t} + \bar{r}\,\bigl(1 - e^{-\gamma t}\bigr)$

We notice that the expected value has the same intuitive form as in the Vasicek model. Variance is a bit trickier:

(6.25) $\mathrm{Var}(r_t \mid r_0) = r_0\,\frac{\alpha}{\gamma}\bigl(e^{-\gamma t} - e^{-2\gamma t}\bigr) + \bar{r}\,\frac{\alpha}{2\gamma}\bigl(1 - e^{-\gamma t}\bigr)^2$

On the basis of Theorem 6.2 it is easy to build an R function to generate sample paths for the CIR short-rate model. It is also interesting to compare the exact sample paths against those generated by the straightforward Euler discretization

(6.26) $r_{t+\delta t} = r_t + \gamma\,(\bar{r} - r_t)\,\delta t + \sqrt{\alpha\, r_t\,\delta t}\;\epsilon$

where $\epsilon$ ~ N(0, 1) as usual. The code displayed in Fig. 6.14 includes two such functions. Figure 6.15 shows a script to compare the exact distribution against a one-step Euler approximation. For the cases T = 0.25 and T = 1 we sample the rates obtained in one step, with the following parameters:7

FIGURE 6.14 Simulating short rates by the CIR model: exact approach and Euler approximation.

FIGURE 6.15 Comparing kernel estimates of CIR interest rates by an exact approach and the Euler approximation.

equation

Note that this setting of α corresponds to a volatility coefficient σ = 0.1 (see Eq. (6.20)). In Fig. 6.16 we compare the kernel densities estimated by sampling 100,000 replications; the densities estimated on the basis of the Euler scheme are drawn with dashed lines. We observe that for T = 0.25 there is a close match, but for T = 1 the Euler rates, which are in fact normally distributed, may get negative.

FIGURE 6.16 Kernel densities for exact CIR rates (continuous lines) and rates sampled by the Euler scheme (dashed lines).
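
The two functions of Fig. 6.14 are not listed here; the following sketch conveys the essential logic under our own names and under the assumption that Eq. (6.20) is parameterized with diffusion term sqrt(alpha*r). The exact scheme samples the noncentral chi-square transition of Theorem 6.2 via rchisq, while the Euler scheme applies Eq. (6.26):

simCIRexact <- function(r0, rbar, gamma, alpha, T, numSteps, numRepl) {
  dt <- T/numSteps
  df <- 4*gamma*rbar/alpha
  scale <- alpha*(1 - exp(-gamma*dt))/(4*gamma)
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- r0
  for (j in 1:numSteps) {
    ncp <- paths[, j]*exp(-gamma*dt)/scale     # noncentrality parameter
    paths[, j+1] <- scale*rchisq(numRepl, df = df, ncp = ncp)
  }
  paths
}

simCIREuler <- function(r0, rbar, gamma, alpha, T, numSteps, numRepl) {
  dt <- T/numSteps
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- r0
  for (j in 1:numSteps)
    # truncating negative rates inside the square root is our guard, not part of Eq. (6.26)
    paths[, j+1] <- paths[, j] + gamma*(rbar - paths[, j])*dt +
      sqrt(alpha*pmax(paths[, j], 0)*dt)*rnorm(numRepl)
  paths
}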

6.4 Dealing with stochastic volatility

In geometric Brownian motion models, the volatility σ is constant. On the contrary, an empirical finding is that volatility is stochastic, and clusters of high vs. low volatility can be observed over time. In discrete time, this may be modeled along the lines of GARCH time series models that we have seen in Chapter 3. We may also consider continuous-time models, and one possible approach relies on the square-root diffusion process that we have introduced in Section 6.3.2 for short-term interest rates. One such model is the Heston model:

(6.27) $dS_t = \mu\, S_t\,dt + \sqrt{V_t}\,S_t\,dW_t^{(1)}$

(6.28) $dV_t = \gamma\,\bigl(\bar{V} - V_t\bigr)\,dt + \sqrt{\alpha\, V_t}\,dW_t^{(2)}$

Equation (6.27) is basically a generalization of GBM, and Eq. (6.28) models variance Vt as a mean-reverting square-root diffusion. An easy way to generate sample paths for this process relies on a simple Euler discretization; we have already seen that this may result in negative values for quantities that are non-negative by definition. The more advanced path generation strategies that we have discussed earlier in the chapter may be exploited to remedy the situation. We defer details to Chapter 11, where we will see how to price a path-dependent, lookback option under stochastic volatility.

6.5 Dealing with jumps

Diffusion processes are continuous (with probability 1), and they are often criticized for this very reason, as sudden jumps are indeed observed in financial markets. A simple approach to incorporate jumps is to add the increments of a compound Poisson process, which is characterized by a rate λ and a distribution of the jump size. In practice, this means that jump epochs are separated by a stream of independent exponential variables with rate λ, and that at each such epoch we have to add a random jump size to the increment of the diffusion component. For instance, we may consider a generalized Wiener process, described by the differential equation

$dX_t = a\,dt + b\,dW_t$

which can be easily solved:

$X_t = X_0 + a\,t + b\,W_t$

How can we add normally distributed jumps N(ν, ξ)? The theoretical framework we need is provided by the theory of Lévy processes, which generalize both diffusions and Poisson (pure jump) processes. The essential features of Lévy processes are:

1. Increments are independent.
2. Increments are stationary.
3. The process is stochastically continuous.
4. The initial value is zero (with probability 1).

These are the same properties as those of the standard Wiener process, and the only point worth mentioning is property 3, which may seem at odds with the discontinuous sample paths of Poisson processes. A correct statement of the property, in a one-dimensional case, is

$\lim_{h \to 0} \mathrm{P}\bigl\{\,|X_{t+h} - X_t| > \epsilon\,\bigr\} = 0 \quad \text{for all } \epsilon > 0$

Hence, jumps are not actually ruled out, but they cannot be “dense.” For instance, in a Poisson process the probability that there is more than one jump in an infinitesimal time interval dt is negligible.

A key result is the Lévy–Itô decomposition, whose proper statement is beyond the scope of this book.8 Loosely speaking, the theorem states that we may decompose a Lévy process into separate components, accounting for diffusion and jump subprocesses. Let us illustrate how we may take advantage of this theorem by adding a jump component to the above generalized Wiener process. To this aim, we shall take advantage of the following properties of the Poisson process:

The number N of jumps in the time interval [0,T] is a Poisson random variable with parameter λT.
Conditional on N, the jump epochs are a sequence of N i.i.d. uniform variables on [0, T].

Therefore, in order to generate a sample path of this jump–diffusion process, we may:

1. Generate the increments of the diffusion component as usual.
2. Sample the number of jumps as a Poisson variable.
3. Generate the time epochs as uniform random variables.
4. Sample the jump sizes and add them to the corresponding increments of the diffusion process.
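
A minimal sketch of these four steps (ours, standing in for the code of Fig. 6.17) for the generalized Wiener process with normally distributed jumps is given below; the jump size parameters nu.j and xi.j are illustrative names, interpreting ξ as the standard deviation of the jump size:

simJumpDiffusion <- function(X0, a, b, lambda, nu.j, xi.j, T, numSteps) {
  dt <- T/numSteps
  # step 1: diffusion increments on the regular grid
  incr <- a*dt + b*sqrt(dt)*rnorm(numSteps)
  # step 2: number of jumps over [0, T]
  N <- rpois(1, lambda*T)
  if (N > 0) {
    # step 3: jump epochs, uniform on [0, T]
    epochs <- runif(N, 0, T)
    # step 4: add each jump size to the increment of the interval containing it
    sizes <- rnorm(N, mean = nu.j, sd = xi.j)
    idx <- ceiling(epochs/dt)
    for (k in 1:N) incr[idx[k]] <- incr[idx[k]] + sizes[k]
  }
  c(X0, X0 + cumsum(incr))
}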

The code in Fig. 6.17 applies the above ideas and produces the sample path displayed in Fig. 6.18. Following the traditional convention to plot discontinuous functions, we have added a bullet–circle pair corresponding to each point of discontinuity. Note that the process is continuous from the right.

FIGURE 6.17 Simulating a jump–diffusion process.

FIGURE 6.18 Sample path of a jump–diffusion process.

For further reading

For a sound theoretical treatment of numerical methods to solve stochastic differential equations, see [3].
Several examples, at a more advanced level than this chapter, are described in [1] and [5]. See also [4], which requires some advanced measure theoretic background.
An interesting discussion of transformation methods and the Milstein scheme is given in [2].

References

1 P. Glasserman. Monte Carlo Methods in Financial Engineering. Springer, New York, 2004.

2 P. Jaeckel. Monte Carlo Methods in Finance. Wiley, Chichester, 2002.

3 P.E. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations. Springer, Berlin, 1992.

4 R. Korn, E. Korn, and G. Kroisandt. Monte Carlo Methods and Models in Finance and Insurance. CRC Press, Boca Raton, FL, 2010.

5 D.P. Kroese, T. Taimre, and Z.I. Botev. Handbook of Monte Carlo Methods. Wiley, Hoboken, NJ, 2011.

1To avoid notational clutter, we will mostly, but not exclusively, use the style St rather than S(t) in this chapter. The choice will be purely dictated by convenience.

2Suitable references are provided at the end of the chapter.

3We investigate the sources of bias in American-style option pricing in Section 11.3.1.

4We discuss confidence intervals and output analysis in Chapter 7.

5By the way, we can find this expectation by setting σ = 0 and solving the resulting deterministic equation.

6See Section 3.2.3.2.

7We are replicating the results discussed in [1, pp. 124–125].

8See, e.g., [5, p. 209].