Chapter Six

Sample Path Generation for Continuous-Time Models

Many models of interest in financial engineering can be represented by continuous-time stochastic differential equations. The application of Monte Carlo methods to this class of models requires sample path generation, which in turn requires a way to discretize them. Numerical analysis offers several ways to discretize deterministic differential equations for simulation purposes, including some very ingenious ones; here we will consider rather simple strategies, but we note that, due to the stochastic character of our applications, we have to tackle both numerical and statistical issues. To motivate the content of the chapter practically, we will mostly consider simple asset pricing problems, leaving more complex examples to Chapter 11 on option pricing, where we also apply variance reduction techniques, and Chapter 13 on risk management and hedging. The risk-neutral pricing approach that we discussed in Section 3.9 allows us to price financial instruments by taking suitable expectations, which in turn requires Monte Carlo simulation when the model is complicated enough to preclude an analytical solution. The following cases, listed in increasing order of complexity, may occur, depending on the nature of the underlying stochastic process and the kind of derivative contract we are evaluating:

1. We just need the value of a random variable at a single point in time and we are able to sample paths exactly. This is the lucky case where we may just need to sample a single random variable, and we do not need to discretize time. A relevant example is a vanilla European-style call option in the Black–Scholes–Merton world.
2. We need the value of a random variable at multiple points in time, but we are able to sample paths exactly. Hence, we have to discretize time and the computational effort is increased. In some cases, we need not wonder about the most suitable time discretization, as this is written in the contract. Significant examples are a path-dependent European-style Asian option with discrete-time averaging and a Bermudan-style option, featuring early exercise opportunities but only at prespecified time instants. In both of these cases, the time instants at which we need to sample the value of the underlying asset are stated precisely in the contract. On the contrary, if we are dealing with an American-style derivative, the discretization step will have an impact on both accuracy and computational effort, as the option may be exercised anywhere before maturity.
3. We are not able to sample paths exactly. This may happen when we step out of the safe domain of geometric Brownian motion (GBM) and deal with mean reversion, stochastic volatility, non-Gaussian processes, etc. In such cases, we not only have to deal with the sampling and estimation errors that are common to all Monte Carlo applications, but with discretization issues as well.

Since here we are dealing with differential equations, the level of mathematical complexity is nontrivial, and we will take an informal view, keeping practical applications in mind. The necessary background on stochastic calculus was given in Section 3.7.

We introduce the essential issues of path generation in Section 6.1, where we outline trivial and less trivial discretization strategies, like Euler and Milstein, as well as predictor-corrector methods. In Section 6.2 we investigate the easy case of geometric Brownian motion. There, we also cover the generation of sample paths for correlated processes and the Brownian bridge construction. Simple option pricing examples, concerning derivatives written on non-dividend-paying stocks, will help us along the way. We also apply these basic ideas to the simulation of rather simple interest rate models, like Vasicek and Cox–Ingersoll–Ross short-rate models in Section 6.3; such models involve mean reversion and may result or not in Gaussian processes, but they still feature a constant volatility. We move on and allow for stochastic volatility in Section 6.4. Finally, in Section 6.5 we also introduce jumps. We defer to Chapter 9 the use of low-discrepancy sequences for path generation within the framework of quasi–Monte Carlo simulation.

6.1 Issues in path generation

Let us consider the stochastic process St describing, e.g., the price of a non-dividend-paying stock.1 A wide class of models can be cast into the form of an Itô stochastic differential equation:

(6.1) $dS_t = a(S_t, t)\,dt + b(S_t, t)\,dW_t$

where the function a(St, t) defines the drift component, the function b(St, t) defines the volatility component, and dWt is the differential of a standard Wiener process. We know that the standard Wiener process is a Gaussian process with stationary and independent increments such that

$W_0 = 0, \qquad W_t - W_s \sim \mathrm{N}(0,\, t - s) \quad \text{for } t > s$

Therefore, for a small time interval δt, we have

$\delta W_t \equiv W_{t+\delta t} - W_t \sim \mathrm{N}(0,\, \delta t)$

which in practice means that we may simulate small increments as follows:

$\delta W_t = \sqrt{\delta t}\,\epsilon_{t+\delta t}, \qquad \epsilon_{t+\delta t} \sim \mathrm{N}(0, 1)$

The notation $\epsilon_{t+\delta t}$ may be preferred to $\epsilon_t$, since it points out clearly that the increment of the Wiener process is generated after observing the state at time t; however, this is not quite essential, and in the following we will often just write $\epsilon$, since no confusion should arise. Sampling paths of a standard Wiener process with any time discretization involves no more than the generation of a stream of independent standard normals. When dealing with a simple generalized Wiener process like

(6.2) $dS_t = a\,dt + b\,dW_t$

where a and b are constant, we have just to integrate both sides:

(6.3) $S_{t+\delta t} = S_t + a\,\delta t + b\,\bigl(W_{t+\delta t} - W_t\bigr)$

This immediately yields the following discretization scheme:

(6.4) $S_{t+\delta t} = S_t + a\,\delta t + b\sqrt{\delta t}\,\epsilon_{t+\delta t}$

which allows for easy sample path generation. When the discretization step is uniform, it may be better to introduce integer time stamps k, k = 0, 1, 2, …, corresponding to time instants tk = k δt, and rewrite Eq. (6.4) as

(6.5) $S_{k+1} = S_k + a\,\delta t + b\sqrt{\delta t}\,\epsilon_{k+1}$

where we may insist on writing $\epsilon_{k+1}$ to point out that this shock is not known when we observe Sk. We should note that if we generate sample paths of the generalized Wiener process and collect sample statistics, we are prone to sampling errors, in the sense that sample statistics will be subject to statistical fluctuations calling for the evaluation of confidence intervals. Sampling error is due to the random nature of Monte Carlo methods, and it can be mitigated using variance reduction strategies. However, there is no discretization error, in the sense that the marginal distributions of the random variables Sk, both conditional and unconditional, are correct.

By analogy, when faced with the more general Itô process of Eq. (6.1), the simplest discretization approach that comes to mind is

(6.6) $S_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

Such a discretization is known as the Euler, or Euler–Maruyama, scheme. Note that we are assuming that functions a(St, t) and b(St, t) are constant over the time interval [t, t + δt). Since this is not really the case, it is clear that we are introducing a discretization error. The issue is faced also when dealing with an ordinary deterministic differential equation, and it can be kept under control by a wide array of numerical analysis techniques. Still, it might be argued that when δt → 0, the discretized solution should converge to the “true” solution of the stochastic differential equations. Unfortunately, convergence is a critical concept in stochastic differential equations, and a proper analysis would require a relatively deep mathematical machinery that is beyond the scope of this book.2 Nevertheless, it is quite easy to see that discretization errors are relevant and may even change the marginal distributions. For instance, consider the geometric Brownian motion described by

(6.7) $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$

The Euler scheme yields

$S_{t+\delta t} = S_t\bigl(1 + \mu\,\delta t\bigr) + \sigma S_t\sqrt{\delta t}\,\epsilon_{t+\delta t}$

This is very easy to grasp and to implement, but the marginal distribution of each variable St is normal, whereas we know from Section 3.7.3 that it should be lognormal. This is not quite a negligible issue when we model stock prices or interest rates that are supposed not to get negative. In this specific case, we show in Section 6.2 that we may use Itô’s lemma and get rid of the discretization error altogether, but this ideal state of affairs cannot always be achieved. When we have to cope with inexact discretization, the Euler scheme is just the simplest choice and an array of more refined discretization schemes has been proposed in the literature. Below we outline a couple of alternatives, namely, the Milstein scheme and predictor-corrector methods.
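
Before doing so, a minimal R sketch of the Euler update above may help fix ideas; the function name and signature are ours, not taken from the book's figures:

simEulerGBM <- function(S0, mu, sigma, T, numSteps) {
  dt <- T/numSteps
  S <- numeric(numSteps + 1)
  S[1] <- S0
  for (k in 1:numSteps) {
    eps <- rnorm(1)
    # drift and volatility are frozen at the left end point of each interval
    S[k+1] <- S[k] + mu*S[k]*dt + sigma*S[k]*sqrt(dt)*eps
  }
  S
}

Note that nothing prevents S[k+1] from turning negative for a large shock, which is precisely the distributional error discussed above.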

A final, and possibly less obvious, question is also worth mentioning: Time discretization may induce another form of bias. To see this, imagine pricing an American-style option, which may be exercised at any time instant before and including its expiration date. When pricing it by Monte Carlo, we are actually pricing a Bermudan-style option that can be exercised at a discrete set of time instants. Clearly, the value of the Bermudan-style option is a lower bound on the price of the corresponding American-style option; hence, we are introducing a low bias.3 One possible countermeasure is to adopt Richardson extrapolation, which is a rather common approach in numerical analysis to approximate a limit when δt → 0. Given a sequence of prices of Bermudan-style options that can be exercised with higher and higher frequencies, we may approximate a continuous-time exercise opportunity limit.

6.1.1 EULER VS. MILSTEIN SCHEMES

The Milstein scheme is based on a higher order Taylor expansion of Eq. (6.1):

(6.8) $S_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\,\delta W_t + \frac{1}{2}\,b(S_t, t)\,\frac{\partial b(S_t, t)}{\partial S}\,\bigl[(\delta W_t)^2 - \delta t\bigr]$

We will justify this expansion later; for now, we should note that the formal rule $(dW_t)^2 = dt$ shows that the last term disappears in the limit, so that we get back to the original differential equation.

Example 6.1 GBM processes by the Milstein scheme

As an immediate illustration, let us apply the scheme of Eq. (6.8) to a GBM process. Since b(S, t) = σS for this kind of process, we have

$b(S, t)\,\frac{\partial b(S, t)}{\partial S} = \sigma S \cdot \sigma = \sigma^2 S$

and Eq. (6.8) becomes

$S_{t+\delta t} = S_t + \mu S_t\,\delta t + \sigma S_t\,\delta W_t + \frac{1}{2}\,\sigma^2 S_t\,\bigl[(\delta W_t)^2 - \delta t\bigr]$

Now, from a computational point of view, we rewrite $\delta W_t = \sqrt{\delta t}\,\epsilon$, which also implies

$(\delta W_t)^2 = \delta t\,\epsilon^2$

Finally, we obtain

(6.9) $S_{t+\delta t} = S_t\Bigl[1 + \mu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon + \frac{1}{2}\,\sigma^2\,\delta t\,\bigl(\epsilon^2 - 1\bigr)\Bigr]$

It is interesting to compare Euler and Milstein approximations against the Taylor expansion of the exact formula based on an exponential:

$S_{t+\delta t} = S_t\exp\Bigl[\bigl(\mu - \tfrac{\sigma^2}{2}\bigr)\,\delta t + \sigma\sqrt{\delta t}\,\epsilon\Bigr]$
$\approx S_t\Bigl[1 + \bigl(\mu - \tfrac{\sigma^2}{2}\bigr)\,\delta t + \sigma\sqrt{\delta t}\,\epsilon + \tfrac{1}{2}\,\sigma^2\,\delta t\,\epsilon^2\Bigr]$
$= S_t\Bigl[1 + \mu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon + \tfrac{1}{2}\,\sigma^2\,\delta t\,\bigl(\epsilon^2 - 1\bigr)\Bigr]$

where in the last line we have neglected terms of order higher than δt. We understand that the Milstein scheme adds a term that is missing from the Euler approximation

$S_{t+\delta t} \approx S_t\bigl[1 + \mu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon\bigr]$

so that now the formula is correct and yields an exact expectation and variance up to order δt.
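
A small R sketch (ours, not the book's code) makes the comparison of the example concrete by applying the three one-step updates to the same shock:

gbmStep <- function(S, mu, sigma, dt, eps) {
  c(euler    = S*(1 + mu*dt + sigma*sqrt(dt)*eps),
    milstein = S*(1 + mu*dt + sigma*sqrt(dt)*eps + 0.5*sigma^2*dt*(eps^2 - 1)),
    exact    = S*exp((mu - sigma^2/2)*dt + sigma*sqrt(dt)*eps))
}
gbmStep(S = 50, mu = 0.1, sigma = 0.4, dt = 0.1, eps = rnorm(1))

For small δt the Milstein and exact updates are closer to each other than either is to the Euler update, reflecting the extra term of Eq. (6.9).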

For the sake of simplicity, we provide a justification of the Milstein scheme in the scalar case, assuming that functions a(·, ·) and b(·, ·) only depend on S, and not on time t. The starting point is recalling that a stochastic differential equation is actually a shorthand for an integral equation

$S_{t+\delta t} = S_t + \int_t^{t+\delta t} a(S_u)\,du + \int_t^{t+\delta t} b(S_u)\,dW_u$

The Euler scheme approximates integrals over the interval (t, t + δt) by “freezing” the integrand functions to their value at time t:

(6.10) $S_{t+\delta t} \approx S_t + a(S_t)\int_t^{t+\delta t} du + b(S_t)\int_t^{t+\delta t} dW_u = S_t + a(S_t)\,\delta t + b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr)$

We may try to improve on this by finding a better approximation of the diffusion term b(St) over the time interval (t, t + δt). In order to get a clue, let us consider Itô’s lemma for the process b(St) (here we do not need to denote derivatives as partial ones, since we have ruled out dependence on time):

$db(S_t) = \Bigl[a(S_t)\,b'(S_t) + \frac{1}{2}\,b^2(S_t)\,b''(S_t)\Bigr]dt + b(S_t)\,b'(S_t)\,dW_t \equiv \mu_b(S_t)\,dt + \sigma_b(S_t)\,dW_t$

where we introduce μb(St) and σb(St) as shorthands for the drift and volatility functions in the differential equation of b(St). If we apply the Euler approximation to b(St) over the time interval (t, u), we find

$b(S_u) \approx b(S_t) + \sigma_b(S_t)\,\bigl(W_u - W_t\bigr) = b(S_t) + b(S_t)\,b'(S_t)\,\bigl(W_u - W_t\bigr)$

where we have neglected the drift term, as it is of order (u − t), whereas the increment of the Wiener process is of order $\sqrt{u - t}$. Rather than freezing the integrand in Eq. (6.10), we may substitute the last approximation:

(6.11) $\int_t^{t+\delta t} b(S_u)\,dW_u \approx b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr) + b(S_t)\,b'(S_t)\int_t^{t+\delta t}\bigl(W_u - W_t\bigr)\,dW_u$

The missing piece of the puzzle is the last integral, which we may compute by recalling Example 3.18. Applying that result we immediately find:

$\int_t^{t+\delta t}\bigl(W_u - W_t\bigr)\,dW_u = \frac{1}{2}\Bigl[\bigl(W_{t+\delta t} - W_t\bigr)^2 - \delta t\Bigr]$

Plugging this into Eq. (6.11) yields

$\int_t^{t+\delta t} b(S_u)\,dW_u \approx b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr) + \frac{1}{2}\,b(S_t)\,b'(S_t)\Bigl[\bigl(W_{t+\delta t} - W_t\bigr)^2 - \delta t\Bigr]$

Substituting this refined approximation of the diffusion term into the Euler scheme finally yields the Milstein scheme:

$S_{t+\delta t} = S_t + a(S_t)\,\delta t + b(S_t)\,\bigl(W_{t+\delta t} - W_t\bigr) + \frac{1}{2}\,b(S_t)\,b'(S_t)\Bigl[\bigl(W_{t+\delta t} - W_t\bigr)^2 - \delta t\Bigr]$

Generalizations of this result are available, for which we refer to the chapter references.

6.1.2 PREDICTOR-CORRECTOR METHODS

When we apply the Euler scheme to the Itô differential equation

$dS_t = a(S_t, t)\,dt + b(S_t, t)\,dW_t$

over the time interval (t, t + δt), we “freeze” the drift and volatility functions to their initial values and sample

$S_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

The advantage of this scheme is that it is explicit, but we could try to account for the change in the functions a(·, ·) and b(·, ·) in some way. There is an array of ideas, originally developed for the numerical solution of deterministic (ordinary) differential equations, which can be adapted to the stochastic case, including implicit schemes. An implicit version of the Euler scheme could be

$S_{t+\delta t} = S_t + a(S_{t+\delta t}, t+\delta t)\,\delta t + b(S_{t+\delta t}, t+\delta t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

The scheme is implicit because now we have to solve an equation to find the new value $S_{t+\delta t}$. Since this approach could be numerically troublesome, as it may require division by a Wiener increment close to zero, an alternative form is

(6.12) $S_{t+\delta t} = S_t + a(S_{t+\delta t}, t+\delta t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

where the diffusion term is kept explicit.

Example 6.2 An implicit Euler scheme

If we consider the stochastic differential equation

equation

the scheme of Eq. (6.12) yields

equation

which may be rewritten as

equation

Needless to say, things are not always this easy, not to mention the open question of convergence. A possible alternative is to introduce a correction mechanism comparing the actual and the predicted value of the process in t + δt. For instance, the value obtained by straightforward application of the Euler scheme may be considered as a predictor

$\hat{S}_{t+\delta t} = S_t + a(S_t, t)\,\delta t + b(S_t, t)\sqrt{\delta t}\,\epsilon_{t+\delta t}$

Since the values of the functions a(·, ·) and b(·, ·) at the end of the time step differ from those used in the prediction step, we may introduce a correction as follows:

(6.13) $S_{t+\delta t} = S_t + \bigl[\alpha\,\bar{a}(\hat{S}_{t+\delta t}, t+\delta t) + (1-\alpha)\,\bar{a}(S_t, t)\bigr]\,\delta t + \bigl[\eta\,b(\hat{S}_{t+\delta t}, t+\delta t) + (1-\eta)\,b(S_t, t)\bigr]\sqrt{\delta t}\,\epsilon_{t+\delta t}$

where α and η are weights in the interval [0, 1] and

(6.14) $\bar{a}(S, t) = a(S, t) - \eta\,b(S, t)\,\frac{\partial b(S, t)}{\partial S}$

The role played by the weights α and η is rather intuitive, as they blend the initial and terminal values of the functions a(·, ·) and b(·, ·), controlling the amount of correction; often, they are set close to 0.5. The further adjustment of Eq. (6.14) is less intuitive, but we refer the interested reader to [3] for further information, as well as a serious treatment of convergence issues. The aim of this short section was just to get readers acquainted with the richness of discretization approaches, but for the sake of simplicity we will refrain from using advanced path generation methods in what follows. We should also notice that, in the literature, the usefulness of some of these approaches with respect to a simpler Euler scheme with a suitably short time step has sometimes been questioned.
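
To make the recipe concrete, here is a rough sketch of a single predictor-corrector step for a scalar SDE, based on our reading of Eqs. (6.13) and (6.14) rather than on code from the book; the drift a, diffusion b, and the partial derivative bS of b with respect to S are passed as functions by the user:

pcStep <- function(S, t, dt, a, b, bS, alpha = 0.5, eta = 0.5) {
  dW <- sqrt(dt)*rnorm(1)
  # predictor: plain Euler step
  Shat <- S + a(S, t)*dt + b(S, t)*dW
  # adjusted drift, as in Eq. (6.14)
  abar <- function(S, t) a(S, t) - eta*b(S, t)*bS(S, t)
  # corrector, as in Eq. (6.13)
  S + (alpha*abar(Shat, t + dt) + (1 - alpha)*abar(S, t))*dt +
    (eta*b(Shat, t + dt) + (1 - eta)*b(S, t))*dW
}

For a GBM, for instance, one would pass a = function(S, t) mu*S, b = function(S, t) sigma*S, and bS = function(S, t) sigma.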

6.2 Simulating geometric Brownian motion

Sometimes, we may use suitable transformations of a differential equation to cast it into a more manageable form. From Section 3.7 we know that, by using Itô’s lemma, we may rewrite Eq. (6.7), which defines GBM, as

(6.15) $d\log S_t = \Bigl(\mu - \frac{\sigma^2}{2}\Bigr)dt + \sigma\,dW_t$

Now this is a generalized Wiener process that we may easily integrate as we have seen in the previous section. Taking exponentials to get rid of the logarithm yields

$S_t = S_0\exp\left(\nu t + \sigma\int_0^t dW_u\right)$

where ν = μ − σ²/2. Since the integral of a standard Wiener process is normally distributed, from a computational point of view we may recast the last expression as

$S_t = S_0\exp\bigl(\nu t + \sigma\sqrt{t}\,\epsilon\bigr), \qquad \epsilon \sim \mathrm{N}(0, 1)$

which shows lognormality of prices. From a path generation point of view, the last expression is used as

(6.16) $S_{t+\delta t} = S_t\exp\bigl(\nu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon_{t+\delta t}\bigr)$

from which it is easy to generate sample paths of GBMs. A straightforward code is given in Fig. 6.1. Experienced R programmers will certainly notice that the function includes two dreaded nested for loops, which does not help efficiency; we will come back to that in a moment. The following snapshot produces the sample paths in Fig. 6.2, showing the impact of volatility:

FIGURE 6.1 R code to generate sample paths of a geometric Brownian motion.

FIGURE 6.2 Sample paths of GBMs for different values of volatility.

set.seed(55555)
paths1 <- simGBM(50,0.1,0.1,1,300,5)
paths2 <- simGBM(50,0.1,0.4,1,300,5)
yBounds <- c(min(min(paths1),min(paths2)),
max(max(paths1),max(paths2)))
par(mfrow=c(1,2))
plot(1:301, paths1[1,], ylim=yBounds, type="l",
    xlab="sigma=10%", ylab="")
for (j in 2:5)
    lines(1:301, paths1[j,])
plot(1:301, paths2[1,], ylim=yBounds, type="l",
    xlab="sigma=40%", ylab="")
for (j in 2:5)
    lines(1:301, paths2[j,])
par(mfrow=c(1,1))
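
The code of Fig. 6.1 is not reproduced here; a minimal sketch consistent with the call simGBM(S0, mu, sigma, T, numSteps, numRepl) used above, based on two nested for loops, could be the following (our re-implementation, not the book's figure):

simGBM <- function(S0, mu, sigma, T, numSteps, numRepl) {
  dt <- T/numSteps
  nu <- mu - sigma^2/2
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- S0
  for (i in 1:numRepl)
    for (j in 1:numSteps)
      # exact lognormal update of Eq. (6.16), one step and one replication at a time
      paths[i, j+1] <- paths[i, j]*exp(nu*dt + sigma*sqrt(dt)*rnorm(1))
  paths
}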

The code in Fig. 6.1 is based on two nested for loops, contrary to common recommendations concerning R code efficiency. In order to vectorize the code, it is convenient to rewrite Eq. (6.16) as

$\log S_{t+\delta t} - \log S_t = \nu\,\delta t + \sigma\sqrt{\delta t}\,\epsilon_{t+\delta t}$

We generate differences in the logarithm of the asset prices and then use the cumsum function to obtain cumulative sums of log-increments along each sample path. The resulting function simvGBM is illustrated in Fig. 6.3. We may compare the two implementations in terms of speed:

FIGURE 6.3 Vectorized code to generate sample paths of GBM.
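
A vectorized sketch along the lines just described, standing in for Fig. 6.3 and sharing the signature of simGBM above (again ours, hypothetical), could be:

simvGBM <- function(S0, mu, sigma, T, numSteps, numRepl) {
  dt <- T/numSteps
  nu <- mu - sigma^2/2
  # one row of log-increments per replication, generated in a single call
  incr <- matrix(nu*dt + sigma*sqrt(dt)*rnorm(numRepl*numSteps),
                 nrow = numRepl, ncol = numSteps)
  # cumulate log-increments along each path and prepend the initial price
  logPaths <- t(apply(incr, 1, cumsum))
  cbind(S0, S0*exp(logPaths))
}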

> system.time(simGBM(50,0.1,0.1,1,3000,100))
user system elapsed
2.54     0.00     2.54
> system.time(simvGBM(50,0.1,0.1,1,3000,100))
user system elapsed
0.15     0.00     0.16

6.2.1 AN APPLICATION: PRICING A VANILLA CALL OPTION

As an immediate application of path generation for GBMs, we may price a vanilla call option on a non-dividend-paying stock. Clearly, there is no point in using Monte Carlo methods in this case, as we may apply the Black–Scholes–Merton formula that we discussed in Section 3.9.2. However, it is quite useful to verify the accuracy of numerical methods in easily solved cases, to check their viability and possible limitations. In Fig. 6.4 we recall for convenience the function giving the exact option price, and we also propose a very simple function to estimate the price by Monte Carlo. We just need:

FIGURE 6.4 Exact formula and a simple Monte Carlo pricer for a vanilla call option.

1. To sample lognormal prices using Eq. (6.16).
2. To evaluate and discount the corresponding payoff (assuming a constant risk-free rate).
3. To return the estimated price along with a confidence interval, whose confidence level c.level is 95% by default. Note that we return a list with fields estimate and conf.int, containing the point and the interval estimate, respectively.4

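Since the Monte Carlo pricer of Fig. 6.4 is not listed here, a bare-bones sketch following the three steps above might look like this; the function name is ours, and the returned fields match the output shown below:

simMCEuCall <- function(S0, K, T, r, sigma, numRepl, c.level = 0.95) {
  nu <- r - sigma^2/2
  # step 1: sample lognormal terminal prices under the risk-neutral measure
  ST <- S0*exp(nu*T + sigma*sqrt(T)*rnorm(numRepl))
  # step 2: discounted payoffs
  dpayoff <- exp(-r*T)*pmax(ST - K, 0)
  # step 3: point estimate and confidence interval
  aux <- t.test(dpayoff, conf.level = c.level)
  list(estimate = mean(dpayoff), conf.int = as.numeric(aux$conf.int))
}
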
The code also includes a little snapshot to verify accuracy. The relevant output is

> EuVanillaCall(S0,K,T,r,sigma)
[1] 5.234659
> out$estimate
[1] 5.164366
> out$conf.int
[1] 5.056212 5.272520
> 100*(out$conf.int[2]-out$conf.int[1])/2/out$estimate
[1] 2.09423

We observe that the point estimate is not quite precise, even though the true value is included in the 95% confidence interval. Taking the ratio between the half-length of the confidence interval and the point estimate suggests an error above 2%. One crude way to improve the estimate would be to increase the number of replications. We will see in Chapter 8 that variance reduction strategies may be a much better option.

6.2.2 MULTIDIMENSIONAL GBM

When dealing with options on multiple assets or managing multiple risk factors, we may need to generate a multidimensional geometric Brownian motion. The key issue is how to deal with correlation among risk factors, and there are two modeling choices, which have an impact on how simulation is carried out. For the sake of simplicity we illustrate the two approaches in the bidimensional case, where two Wiener processes drive prices S1(t) and S2(t). We may consider two independent standard Wiener processes that enter in both equations, thus inducing a correlation between prices:

$dS_1(t) = \mu_1 S_1(t)\,dt + \sigma_{11} S_1(t)\,dW_1(t) + \sigma_{12} S_1(t)\,dW_2(t)$
$dS_2(t) = \mu_2 S_2(t)\,dt + \sigma_{21} S_2(t)\,dW_1(t) + \sigma_{22} S_2(t)\,dW_2(t)$

An alternative is to consider instantaneously correlated Wiener processes and write the equations in the somewhat more familiar form:

$dS_1(t) = \mu_1 S_1(t)\,dt + \sigma_1 S_1(t)\,dW_1(t)$
$dS_2(t) = \mu_2 S_2(t)\,dt + \sigma_2 S_2(t)\,dW_2(t)$

The role of instantaneous correlation ρ can be understood by referring to the following formal rule of stochastic calculus:

$dW_1(t)\,dW_2(t) = \rho\,dt$

We note that the analogous rule $(dW_t)^2 = dt$ is actually a specific case of this rule, which may be justified heuristically along the lines of Section 3.7.3. We will take the second approach, which requires the generation of correlated standard normals. We know from Section 5.5.2 that this may be accomplished by taking the Cholesky factorization of the covariance matrix, which in this case is actually a correlation matrix, since we are dealing with standard normals. We illustrate the idea first in the bidimensional case, to reinforce intuition. In fact, when only two processes are involved, we may generate increments based on the correlated standard normal variables $\tilde\epsilon_1$ and $\tilde\epsilon_2$, obtained as a transformation of independent variables Z1, Z2 ~ N(0, 1):

$\tilde\epsilon_1 = Z_1, \qquad \tilde\epsilon_2 = \rho\, Z_1 + \sqrt{1 - \rho^2}\,Z_2$

The simple code in Fig. 6.5 generates a bidimensional sample path using the above transformation. The following R snapshot produces the plots in Fig. 6.6, which illustrate the impact of correlation:

FIGURE 6.5 Simple R code to generate sample paths of bidimensional GBM.

FIGURE 6.6 Sample paths of a bidimensional GBM for different values of instantaneous correlation.
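
A possible re-implementation of the function of Fig. 6.5, matching the call sim2GBM(S0, mus, sigmas, rho, T, numSteps) used below (ours, hypothetical), is:

sim2GBM <- function(S0, mus, sigmas, rho, T, numSteps) {
  dt <- T/numSteps
  nus <- mus - sigmas^2/2
  paths <- matrix(0, nrow = 2, ncol = numSteps + 1)
  paths[, 1] <- S0
  for (j in 1:numSteps) {
    Z <- rnorm(2)
    # correlated standard normal shocks via the transformation above
    eps <- c(Z[1], rho*Z[1] + sqrt(1 - rho^2)*Z[2])
    paths[, j+1] <- paths[, j]*exp(nus*dt + sigmas*sqrt(dt)*eps)
  }
  paths
}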

> set.seed(55555)
> S0 <- c(30,50)
> mus <- c(0.1, 0.1)
> sigmas <- c(0.3,0.3)
> paths1 <- sim2GBM(S0,mus,sigmas,0.3,1,300)
> paths2 <- sim2GBM(S0,mus,sigmas,0.9,1,300)
> yBounds <- c(min(min(paths1),min(paths2)),
+ max(max(paths1),max(paths2)))
> par(mfrow=c(1,2))
> plot(1:301, paths1[1,], ylim=yBounds, type="l",
+ xlab="rho=0.4", ylab="")
> lines(1:301, paths1[2,])
> plot(1:301, paths2[1,], ylim=yBounds, type="l",
+ xlab="rho=0.9", ylab="")
> lines(1:301, paths2[2,])
> par(mfrow=c(1,1))

The previous implementation is rather simple as it involves a single replication. The code in Fig. 6.7 is a bit more involved since, to store multiple replications, we need a tridimensional array; this is created by the function array, which requires a vector specifying the array size along each dimension.

FIGURE 6.7 Generating sample paths of multidimensional GBMs.
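
The code of Fig. 6.7 is not reproduced; the following sketch conveys the essential idea, storing the replications in a three-dimensional array and using chol() on the correlation matrix (names and signature are our assumptions):

simMultiGBM <- function(S0, mus, sigmas, corrMat, T, numSteps, numRepl) {
  nAssets <- length(S0)
  dt <- T/numSteps
  nus <- mus - sigmas^2/2
  U <- chol(corrMat)   # upper triangular factor: t(U) %*% U = corrMat
  paths <- array(0, dim = c(numRepl, numSteps + 1, nAssets))
  paths[, 1, ] <- matrix(S0, nrow = numRepl, ncol = nAssets, byrow = TRUE)
  for (i in 1:numRepl) {
    # each row of eps is a vector of correlated standard normals
    eps <- matrix(rnorm(numSteps*nAssets), ncol = nAssets) %*% U
    for (k in 1:nAssets)
      paths[i, -1, k] <- S0[k]*exp(cumsum(nus[k]*dt + sigmas[k]*sqrt(dt)*eps[, k]))
  }
  paths
}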

6.2.3 THE BROWNIAN BRIDGE

In the previous sections we have generated asset paths according to a natural scheme, which proceeds forward in time. Actually, the Wiener process enjoys some peculiar properties which allow us to generate the sample paths in a different way. Consider a time interval with left and right end points tl and tr, respectively, and an intermediate time instant s, such that tl < s < tr. In standard path generation, we would generate the Wiener process in the natural order: W (tl), W (s), and finally W (tr). Using the so-called Brownian bridge, we may generate W(s) as the last value, conditional on the values W(tl) = wl and W(tr) = wr. The conditional value of W(s) is a normal variable with expected value

(6.17) $\mathrm{E}\bigl[W(s) \mid W(t_l) = w_l,\, W(t_r) = w_r\bigr] = \frac{(t_r - s)\,w_l + (s - t_l)\,w_r}{t_r - t_l}$

and variance

(6.18) $\mathrm{Var}\bigl[W(s) \mid W(t_l) = w_l,\, W(t_r) = w_r\bigr] = \frac{(s - t_l)(t_r - s)}{t_r - t_l}$

This is a consequence of the properties of the conditional distribution of a multivariate normal distribution, which we have stated in Theorem 3.7. Before proving Eqs. (6.17) and (6.18), it is important to realize that they are quite intuitive: The conditional expected value of W(s) is obtained by linear interpolation through wl and wr; the variance is low near the two end points tl and tr and is at its maximum in the middle of the interval.

A preliminary step to prove Eqs. (6.17) and (6.18) is to find the covariance between W(t) and W(u), when t < u:

$\mathrm{Cov}\bigl(W(t), W(u)\bigr) = \mathrm{Cov}\bigl(W(t),\, W(t) + [W(u) - W(t)]\bigr) = \mathrm{Var}\bigl(W(t)\bigr) + \mathrm{Cov}\bigl(W(t),\, W(u) - W(t)\bigr) = t$

The statement can be generalized as Cov(W(t), W(u)) = min{t, u}. Then, the joint distribution of the three values that we are considering is

$\begin{bmatrix} W(s) \\ W(t_l) \\ W(t_r) \end{bmatrix} \sim \mathrm{N}\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},\; \begin{bmatrix} s & t_l & s \\ t_l & t_l & t_l \\ s & t_l & t_r \end{bmatrix} \right)$

Note that we have listed the time instants in this order, with s first, in view of the application of Theorem 3.7, since we need the value at s conditional on the values at tl and tr. Partitioning the vector of expected values and the covariance matrix accordingly yields

$\boldsymbol{\mu}_1 = 0, \qquad \boldsymbol{\mu}_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

and

$\Sigma_{11} = s, \qquad \Sigma_{12} = \Sigma_{21}^{\mathsf{T}} = \begin{bmatrix} t_l & s \end{bmatrix}, \qquad \Sigma_{22} = \begin{bmatrix} t_l & t_l \\ t_l & t_r \end{bmatrix}$

By plugging the inverse matrix

$\Sigma_{22}^{-1} = \frac{1}{t_l\,(t_r - t_l)}\begin{bmatrix} t_r & -t_l \\ -t_l & t_l \end{bmatrix}$

into the above formulas we obtain Eqs. (6.17) and (6.18).

By using the Brownian bridge, we may generate sample paths by a sort of bisection strategy. Given W(0) = 0, we sample W(T); then we sample W(T/2). Given W(0) and W(T/2) we sample W(T/4); given W(T/2) and W(T) we sample W(3T/4), etc. Actually, we may generate sample paths in any order we wish, with nonhomogeneous time steps. One could wonder why this complicated construction could be useful. There are at least two reasons:

1. It may help in using variance reduction by stratification, to be discussed in Chapter 8. It is difficult to use stratification in multiple dimensions, but we may use stratification just on the terminal value of the asset price, and maybe an intermediate point. Then we generate the remaining values along the sample path by using the bridge.
2. The Brownian bridge construction is also useful when used in conjunction with low-discrepancy sequences, as we shall see in Chapter 9. Such sequences may not work well in very high dimensional domains, but using Brownian bridge we may use high-quality sequences to outline the paths of the Wiener process, by sampling points acting as milestones; then we can fill the trajectory by Monte Carlo sampling.

In Fig. 6.8 we illustrate an R function to generate sample paths of the standard Wiener process using the Brownian bridge technique, but only in the specific case in which the time interval [0, T] is successively bisected (i.e., the number of intervals is a power of 2). In this case, we may simplify the formulas above to sample W(s): If we let δt = tr − tl, we obtain

FIGURE 6.8 Implementing path generation by a Brownian bridge for the standard Wiener process.

$W(s) = \frac{W(t_l) + W(t_r)}{2} + \sqrt{\frac{\delta t}{4}}\;\epsilon$

where $\epsilon$ is a standard normal variable. The function receives the length T of the time interval and the number numSteps of subintervals in which it must be partitioned, and it returns a vector containing one sample path. Assume that the number of intervals is 8 (a power of 2). Then we must carry out 3 bisections.

Given the initial condition W(t0) = 0, we must first sample W(t8), which means “jumping” over an interval of length T, which is the initial value TJump in the program. Since we store elements in a vector of nine elements (starting with index 1, and including W(t0)), we must jump eight places in this vector to store the new value. The number of places to jump is stored in IJump.
Then we start the outermost for loop. In the first pass we must only sample W(t4), given W(t0) and W(t8). Given positions left<-1 and right<-IJump+1, we must generate a new value and store it in position i<-IJump/2 + 1, which is 4+1 = 5 in this case. Here we generate only one value, and we divide both time and index jumps by 2.
In the second iteration we must sample W(t2), given W(t0) and W(t4), and W(t6), given W(t4) and W(t8). The nested loop will be executed twice, and indexes left, right, and i are incremented by 4.
In the third and final iteration we generate the remaining four values.
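
The code of Fig. 6.8 itself is not included here; a sketch implementing the bisection just described, using the variable names mentioned in the text (TJump, IJump, left, right, i) but otherwise our own, could be:

WienerBridge <- function(T, numSteps) {
  # numSteps must be a power of 2
  numBisections <- round(log2(numSteps))
  W <- numeric(numSteps + 1)              # W[1] corresponds to W(t0) = 0
  W[numSteps + 1] <- sqrt(T)*rnorm(1)     # sample the terminal value W(T) first
  TJump <- T
  IJump <- numSteps
  for (k in 1:numBisections) {
    left <- 1
    i <- IJump/2 + 1
    right <- IJump + 1
    for (j in 1:2^(k-1)) {
      # midpoint: average of the two end points plus a N(0, TJump/4) shock
      W[i] <- (W[left] + W[right])/2 + sqrt(TJump/4)*rnorm(1)
      left <- left + IJump
      i <- i + IJump
      right <- right + IJump
    }
    TJump <- TJump/2
    IJump <- IJump/2
  }
  W
}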

The construction above is illustrated in Fig. 6.9. Figure 6.10 includes a script to check that the marginal distributions of the stochastic process that we generate are the correct ones. Expected values should be zero, and standard deviation should be the square root of time:

FIGURE 6.9 Illustrating the Brownian bridge construction by successive bisection.

FIGURE 6.10 Checking the path generation for the standard Wiener process by a Brownian bridge.

> m <- colMeans(Wpaths[,-1]);m
[1] 5.234951e-05 1.624087e-03 3.127389e-03 1.542460e-03
> sdev <- apply(Wpaths[,-1],2,sd);sdev
[1] 0.4993618 0.7080760 0.8653553 0.9971483
> sqrt((1:numSteps)*T/numSteps)
[1] 0.5000000 0.7071068 0.8660254 1.0000000

We see that, within sampling errors, the result looks correct. Given a way to generate the standard Wiener process, it is easy to simulate geometric Brownian motion. The function is given in Fig. 6.11 and uses a similar approach as the vectorized version simvGBM of Fig. 6.3. One thing we should note is the use of the function diff to generate the vector of increments in the logarithmic asset price. In fact, in standard Monte Carlo we generate the underlying Wiener process by successive increments; with the Brownian bridge construction we build directly the values of the process at different time instants, and we must use the function diff to obtain the relative differences. In some sense, diff works in the opposite way to cumsum.

FIGURE 6.11 Simulating geometric Brownian motion by the Brownian bridge.
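
A sketch of how such a GBM path can be built on top of the bridge (ours, relying on the WienerBridge sketch above): generate the whole Wiener path, take its increments with diff, and cumulate the log-price increments.

simGBMBridge <- function(S0, mu, sigma, T, numSteps) {
  dt <- T/numSteps
  nu <- mu - sigma^2/2
  W <- WienerBridge(T, numSteps)          # whole Wiener path, including W(0) = 0
  logIncr <- nu*dt + sigma*diff(W)        # log-price increments via diff
  c(S0, S0*exp(cumsum(logIncr)))
}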

6.3 Sample paths of short-term interest rates

Modeling interest rates is a quite difficult endeavor, as it involves different rates applying to different maturities. The simplest class of models deals with only one interest rate applying to a very short time horizon. Let rt be the risk-free interest rate applying to the time interval (t,t + dt). This may be called the instantaneous interest rate, although it is often referred to as the short rate. There are different models for the short rate, but in this section we consider only the two that we have already mentioned in previous chapters:

1. The Vasicek model, characterized by a stochastic differential equation featuring mean reversion:

(6.19) $dr_t = \gamma\,(\bar{r} - r_t)\,dt + \sigma\,dW_t$

2. The Cox–Ingersoll–Ross (CIR) model, which is quite similar to the Vasicek model, but involves a slight change in the volatility term:

(6.20) $dr_t = \gamma\,(\bar{r} - r_t)\,dt + \sqrt{\alpha\, r_t}\,dW_t$

In the following, we show that this slight difference has a significant consequence:

Short rates are normally distributed in the Vasicek model; the CIR model involves a more complicated noncentral chi-square distribution (see Section 3.2.3.2).
The easier distribution of the Vasicek model results in better analytical tractability; we are able to price bonds and also some options analytically, whereas the CIR model involves more complicated formulas, when available.
However, the Vasicek model can be criticized as the normal distribution allows for negative interest rates, whereas the volatility term in the CIR model avoids this difficulty, provided that

$2\gamma\,\bar{r} \geq \alpha$

The intuition is that when the short rate rt gets close to zero, the volatility term in Eq. (6.20) vanishes; if the pull toward the long-term average rate is strong enough, the short rate will stay positive.

In both cases, generating sample paths is not too difficult, as we shall see, since we know the exact distribution of interest rates. More involved models call for more sophisticated path generation strategies, possibly involving discretization errors.

6.3.1 THE VASICEK SHORT-RATE MODEL

The process described in Eq. (6.19) is not a GBM; actually, it is an example of an Ornstein–Uhlenbeck process. To try and solve the equation, we may resort to a typical trick of the trade, related to the method of integrating factors in deterministic ordinary differential equations. The trick is to apply Itô’s lemma to the process

$f(r_t, t) = r_t\,e^{\gamma t}$

The usual drill requires the calculation of partial derivatives:

$\frac{\partial f}{\partial t} = \gamma\, r_t\, e^{\gamma t}, \qquad \frac{\partial f}{\partial r} = e^{\gamma t}, \qquad \frac{\partial^2 f}{\partial r^2} = 0$

Hence, the differential equation for f(rt, t) is

$df(r_t, t) = \bigl[\gamma\, r_t\, e^{\gamma t} + \gamma\,(\bar{r} - r_t)\,e^{\gamma t}\bigr]\,dt + \sigma\, e^{\gamma t}\,dW_t = \gamma\,\bar{r}\,e^{\gamma t}\,dt + \sigma\, e^{\gamma t}\,dW_t$

This equation does not look too bad and can be integrated over the interval (0,t):

$r_t\,e^{\gamma t} - r_0 = \bar{r}\,\bigl(e^{\gamma t} - 1\bigr) + \sigma\int_0^t e^{\gamma u}\,dW_u$

which can be rewritten as

$r_t = r_0\,e^{-\gamma t} + \bar{r}\,\bigl(1 - e^{-\gamma t}\bigr) + \sigma\int_0^t e^{-\gamma(t-u)}\,dW_u$

This expression includes a stochastic integral involving the Wiener process that, as we know, is normally distributed and has zero expected value. Thus, we may immediately conclude that the short rate rt under the Vasicek model is normally distributed with expected value:5

(6.21) $\mathrm{E}[r_t] = r_0\,e^{-\gamma t} + \bar{r}\,\bigl(1 - e^{-\gamma t}\bigr)$

Finding variance is a bit trickier and requires the following result.

THEOREM 6.1 (Itô’s isometry) Let Xt be a stochastic process adapted to the standard Wiener process Wt. Then,

$\mathrm{E}\left[\left(\int_0^t X_u\,dW_u\right)^2\right] = \mathrm{E}\left[\int_0^t X_u^2\,du\right]$

Actually, the result can be proven in a more general form, but it allows us to find the variance of the short rate:

(6.22) $\mathrm{Var}(r_t) = \sigma^2\int_0^t e^{-2\gamma(t-u)}\,du = \frac{\sigma^2}{2\gamma}\bigl(1 - e^{-2\gamma t}\bigr)$

It is important to get an intuitive feeling for Eqs. (6.21) and (6.22):

At time t = 0 the short rate is r0 and variance is zero.
When t → + ∞, the short rate tends to the long-term average value $\bar{r}$; the larger the speed of mean reversion γ, the faster the convergence.
Long-term variance is given by σ²/(2γ); as expected, it is influenced by the volatility σ; however, γ also plays a role, and a strong mean reversion tends to kill volatility.

In order to use the above knowledge to generate sample paths with time step δt, we should express $r_{t+\delta t}$ conditional on $r_t$:

$r_{t+\delta t} = r_t\,e^{-\gamma\,\delta t} + \bar{r}\,\bigl(1 - e^{-\gamma\,\delta t}\bigr) + \sigma\sqrt{\frac{1 - e^{-2\gamma\,\delta t}}{2\gamma}}\;\epsilon$

where $\epsilon$ ~ N(0,1).

Now we may generate sample paths by the code illustrated in Fig. 6.12. The code produces the plots in Fig. 6.13. We clearly see the impact of the mean reversion coefficient γ. In the plots for γ = 0.1 we also notice how short rates can indeed get negative in the Vasicek model.

FIGURE 6.12 Simulating short rates by the Vasicek model.

FIGURE 6.13 Illustrating the effect of mean reversion on short rates in the Vasicek model.
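
Since the code of Fig. 6.12 is not reproduced, here is a minimal sketch of exact path generation for the Vasicek model based on the conditional distribution above (function name and signature are ours):

simVasicek <- function(r0, rbar, gamma, sigma, T, numSteps, numRepl) {
  dt <- T/numSteps
  # exact conditional standard deviation of r(t+dt) given r(t)
  sd.dt <- sigma*sqrt((1 - exp(-2*gamma*dt))/(2*gamma))
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- r0
  for (j in 1:numSteps)
    paths[, j+1] <- paths[, j]*exp(-gamma*dt) + rbar*(1 - exp(-gamma*dt)) +
      sd.dt*rnorm(numRepl)
  paths
}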

6.3.2 THE COX–INGERSOLL–ROSS SHORT-RATE MODEL

The CIR model relies on a square-root diffusion process, which is not as easy to deal with as the Ornstein–Uhlenbeck process underlying the Vasicek model. However, by studying the transition density of the square-root diffusion process, a link with the noncentral chi-square distribution6 can be established. We state the following result without proof.

THEOREM 6.2 (CIR short-rate model) The transition law from r0 to rt for the CIR model can be expressed as

(6.23) $r_t \sim \frac{\alpha\,\bigl(1 - e^{-\gamma t}\bigr)}{4\gamma}\;\chi^{\prime 2}_{\nu}(\lambda)$

where the degrees of freedom are $\nu = 4\gamma\bar{r}/\alpha$ and the noncentrality parameter is

$\lambda = \frac{4\gamma\,e^{-\gamma t}}{\alpha\,\bigl(1 - e^{-\gamma t}\bigr)}\;r_0$

By applying the theorem to the transition from $r_t$ to $r_{t+\delta t}$, we immediately obtain a path generation mechanism. It is also useful to find the conditional expectation and variance of rt given r0, which are easily obtained by recalling Eq. (3.18). For the expected value we have

(6.24) $\mathrm{E}[r_t \mid r_0] = r_0\,e^{-\gamma t} + \bar{r}\,\bigl(1 - e^{-\gamma t}\bigr)$

We notice that the expected value has the same intuitive form as in the Vasicek model. Variance is a bit trickier:

(6.25) $\mathrm{Var}(r_t \mid r_0) = r_0\,\frac{\alpha}{\gamma}\bigl(e^{-\gamma t} - e^{-2\gamma t}\bigr) + \bar{r}\,\frac{\alpha}{2\gamma}\bigl(1 - e^{-\gamma t}\bigr)^2$

On the basis of Theorem 6.2 it is easy to build an R function to generate sample paths for the CIR short-rate model. It is also interesting to compare the exact sample paths against those generated by the straightforward Euler discretization

(6.26) $r_{t+\delta t} = r_t + \gamma\,(\bar{r} - r_t)\,\delta t + \sqrt{\alpha\, r_t\,\delta t}\;\epsilon$

where $\epsilon$ ~ N(0, 1) as usual. The code displayed in Fig. 6.14 includes two such functions. Figure 6.15 shows a script to compare the exact distribution against a one-step Euler approximation. For the cases T = 0.25 and T = 1 we sample the rates obtained in one step, with the following parameters:7

FIGURE 6.14 Simulating short rates by the CIR model: exact approach and Euler approximation.

FIGURE 6.15 Comparing kernel estimates of CIR interest rates by an exact approach and the Euler approximation.

equation

Note that this setting of α corresponds to a volatility coefficient σ = 0.1 (see Eq. (6.20)). In Fig. 6.16 we compare the kernel densities estimated by sampling 100,000 replications; the densities estimated on the basis of the Euler scheme are drawn with dashed lines. We observe that for T = 0.25 there is a close match, but for T = 1 the Euler rates, which are in fact normally distributed, may get negative.

FIGURE 6.16 Kernel densities for exact CIR rates (continuous lines) and rates sampled by the Euler scheme (dashed lines).
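
The two functions of Fig. 6.14 are not listed here; the following sketch conveys the essential logic under our own names and under the assumption that Eq. (6.20) is parameterized with diffusion term sqrt(alpha*r). The exact scheme samples the noncentral chi-square transition of Theorem 6.2 via rchisq, while the Euler scheme applies Eq. (6.26):

simCIRexact <- function(r0, rbar, gamma, alpha, T, numSteps, numRepl) {
  dt <- T/numSteps
  df <- 4*gamma*rbar/alpha
  scale <- alpha*(1 - exp(-gamma*dt))/(4*gamma)
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- r0
  for (j in 1:numSteps) {
    ncp <- paths[, j]*exp(-gamma*dt)/scale     # noncentrality parameter
    paths[, j+1] <- scale*rchisq(numRepl, df = df, ncp = ncp)
  }
  paths
}

simCIREuler <- function(r0, rbar, gamma, alpha, T, numSteps, numRepl) {
  dt <- T/numSteps
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- r0
  for (j in 1:numSteps)
    # truncating negative rates inside the square root is our guard, not part of Eq. (6.26)
    paths[, j+1] <- paths[, j] + gamma*(rbar - paths[, j])*dt +
      sqrt(alpha*pmax(paths[, j], 0)*dt)*rnorm(numRepl)
  paths
}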

6.4 Dealing with stochastic volatility

In geometric Brownian motion models, the volatility σ is constant. On the contrary, an empirical finding is that volatility is stochastic, and clusters of high vs. low volatility can be observed over time. In discrete time, this may be modeled along the lines of GARCH time series models that we have seen in Chapter 3. We may also consider continuous-time models, and one possible approach relies on the square-root diffusion process that we have introduced in Section 6.3.2 for short-term interest rates. One such model is the Heston model:

(6.27) $dS_t = \mu\, S_t\,dt + \sqrt{V_t}\,S_t\,dW_t^{(1)}$

(6.28) $dV_t = \gamma\,\bigl(\bar{V} - V_t\bigr)\,dt + \sqrt{\alpha\, V_t}\,dW_t^{(2)}$

Equation (6.27) is basically a generalization of GBM, and Eq. (6.28) models variance Vt as a mean-reverting square-root diffusion. An easy way to generate sample paths for this process relies on a simple Euler discretization; we have already seen that this may result in negative values for quantities that are non-negative by definition. The more advanced path generation strategies that we have discussed earlier in the chapter may be exploited to remedy the situation. We defer details to Chapter 11, where we will see how to price a path-dependent, lookback option under stochastic volatility.

6.5 Dealing with jumps

Diffusion processes are continuous (with probability 1), and they are often criticized for this very reason, as sudden jumps are indeed observed in financial markets. A simple approach to incorporate jumps is to add the increments of a compound Poisson process, which is characterized by a rate λ and a distribution of the jump size. In practice, this means that jump epochs are separated by a stream of independent exponential variables with rate λ, and that at each such epoch we have to add a random jump size to the increment of the diffusion component. For instance, we may consider a generalized Wiener process, described by the differential equation

$dX_t = a\,dt + b\,dW_t$

which can be easily solved:

$X_t = X_0 + a\,t + b\,W_t$

How can we add normally distributed jumps N(ν, ξ)? The theoretical framework we need is provided by the theory of Lévy processes, which generalize both diffusions and Poisson (pure jump) processes. The essential features of Lévy processes are:

1. Increments are independent.
2. Increments are stationary.
3. The process is stochastically continuous.
4. The initial value is zero (with probability 1).

These are the same properties as those of the standard Wiener process, and the only point worth mentioning is property 3, which may seem at odds with the discontinuous sample paths of Poisson processes. A correct statement of the property, in a one-dimensional case, is

$\lim_{h \to 0} \mathrm{P}\bigl\{\,|X_{t+h} - X_t| > \epsilon\,\bigr\} = 0 \quad \text{for all } \epsilon > 0$

Hence, jumps are not actually ruled out, but they cannot be “dense.” For instance, in a Poisson process the probability that there is more than one jump in an infinitesimal time interval dt is negligible.

A key result is the Lévy–Itô decomposition, whose proper statement is beyond the scope of this book.8 Loosely speaking, the theorem states that we may decompose a Lévy process into separate components, accounting for diffusion and jump subprocesses. Let us illustrate how we may take advantage of this theorem by adding a jump component to the above generalized Wiener process. To this aim, we shall take advantage of the following properties of the Poisson process:

The number N of jumps in the time interval [0,T] is a Poisson random variable with parameter λT.
Conditional on N, the jump epochs are a sequence of N i.i.d. uniform variables on [0, T].

Therefore, in order to generate a sample path of this jump–diffusion process, we may:

1. Generate the increments of the diffusion component as usual.
2. Sample the number of jumps as a Poisson variable.
3. Generate the time epochs as uniform random variables.
4. Sample the jump sizes and add them to the corresponding increments of the diffusion process.
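
A minimal sketch of these four steps (ours, standing in for the code of Fig. 6.17) for the generalized Wiener process with normally distributed jumps is given below; the jump size parameters nu.j and xi.j are illustrative names, interpreting ξ as the standard deviation of the jump size:

simJumpDiffusion <- function(X0, a, b, lambda, nu.j, xi.j, T, numSteps) {
  dt <- T/numSteps
  # step 1: diffusion increments on the regular grid
  incr <- a*dt + b*sqrt(dt)*rnorm(numSteps)
  # step 2: number of jumps over [0, T]
  N <- rpois(1, lambda*T)
  if (N > 0) {
    # step 3: jump epochs, uniform on [0, T]
    epochs <- runif(N, 0, T)
    # step 4: add each jump size to the increment of the interval containing it
    sizes <- rnorm(N, mean = nu.j, sd = xi.j)
    idx <- ceiling(epochs/dt)
    for (k in 1:N) incr[idx[k]] <- incr[idx[k]] + sizes[k]
  }
  c(X0, X0 + cumsum(incr))
}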

The code in Fig. 6.17 applies the above ideas and produces the sample path displayed in Fig. 6.18. Following the traditional convention to plot discontinuous functions, we have added a bullet–circle pair corresponding to each point of discontinuity. Note that the process is continuous from the right.

FIGURE 6.17 Simulating a jump–diffusion process.

FIGURE 6.18 Sample path of a jump–diffusion process.

For further reading

For a sound theoretical treatment of numerical methods to solve stochastic differential equations, see [3].
Several examples, at a more advanced level than this chapter, are described in [1] and [5]. See also [4], which requires some advanced measure theoretic background.
An interesting discussion of transformation methods and the Milstein scheme is given in [2].

References

1 P. Glasserman. Monte Carlo Methods in Financial Engineering. Springer, New York, 2004.

2 P. Jaeckel. Monte Carlo Methods in Finance. Wiley, Chichester, 2002.

3 P.E. Kloeden and E. Platen. Numerical Solution of Stochastic Differential Equations. Springer, Berlin, 1992.

4 R. Korn, E. Korn, and G. Kroisandt. Monte Carlo Methods and Models in Finance and Insurance. CRC Press, Boca Raton, FL, 2010.

5 D.P. Kroese, T. Taimre, and Z.I. Botev. Handbook of Monte Carlo Methods. Wiley, Hoboken, NJ, 2011.

1To avoid notational clutter, we will mostly, but not exclusively, use the style St rather than S(t) in this chapter. The choice will be purely dictated by convenience.

2Suitable references are provided at the end of the chapter.

3We investigate the sources of bias in American-style option pricing in Section 11.3.1.

4We discuss confidence intervals and output analysis in Chapter 7.

5By the way, we can find this expectation by setting σ = 0 and solving the resulting deterministic equation.

6See Section 3.2.3.2.

7We are replicating the results discussed in [1, pp. 124–125].

8See, e.g., [5, p. 209].