Chapter Eleven: Option Pricing

Option pricing is one of the most important application fields for Monte Carlo methods, since option prices may be expressed as expected values under a suitable probability measure. Monte Carlo sampling is a quite flexible strategy that can be adopted when alternative numerical methods, like binomial/trinomial lattices or finite difference methods for solving partial differential equations, are inefficient or impossible to apply. For low-dimensional and path-independent options, Monte Carlo methods are not quite competitive with these alternatives, but for multidimensional or path-dependent options they are often the only viable computational strategy. Even more so when one has to deal with more complicated processes than an innocent geometric Brownian motion (GBM), or even a full-fledged term structure of interest rates. In the past, it was claimed that a significant limitation of Monte Carlo methods was their inability to cope with early exercise features. More recently, ideas from stochastic optimization and stochastic dynamic programming have been adapted to option pricing, paving the way to some Monte Carlo methods for American- and Bermudan-style options.

In this chapter we rely on the path generation methods of Chapter 6, where we have first seen how to price a vanilla call option by random sampling in Section 6.2.1. We will also make good use of the variance reduction strategies of Chapter 8, which were also illustrated in the simple case of a vanilla call option. Finally, when dealing with early exercise features, we use concepts of stochastic programming, like scenario trees, and approximate dynamic programming, which we introduced in Sections 10.5 and 10.8.

To make things a bit more interesting and realistic, we will consider simple examples of multidimensional and path-dependent options, as well as extremely basic interest rate derivatives. As a first example, we consider bidimensional spread options in Section 11.1, where we apply both naive Monte Carlo and stratified sampling. Then, in Section 11.2 we consider path-dependent derivatives, like barrier and arithmetic average Asian options. In these first two sections we consider European-style options, which lend themselves very well to sampling methods, since only the estimation of an expectation is required. The case of American- or Bermudan-style options is more challenging, as their early exercise features require the solution of an optimization problem. In Section 11.3 we show how certain optimization concepts can be adapted to option pricing. We introduce these fundamental concepts within the easy and comfortable BSM (Black–Scholes–Merton) world. This is a sensible choice from a pedagogical point of view, and sometimes allows us to compare the results obtained by sampling methods against exact prices. However, the BSM world does not capture all of the intricacies of financial markets. In Section 11.4 we illustrate an example of option pricing with stochastic volatility. Finally, we consider extremely simple interest rate derivatives Section 11.5, i.e., zero coupon bonds and vanilla options on them, in the cases of the Vasicek and Cox–Ingersoll–Ross models.

We only consider option pricing in this chapter, but related issues are dealt with in Chapters 12, where we estimate option greeks, and 13, where we measure the risk of a portfolio of options. Some background about risk-neutral pricing is provided in Section 3.9.

11.1 European-style multidimensional options in the BSM world

Let us consider what is arguably the simplest option depending on two risk factors, namely, a European-style spread option. This is an option written on two stocks, whose price dynamics under the risk-neutral measure are modeled by the following GBMs:

where the two Wiener processes have instantaneous correlation ρ. Thus, the driving process is a bidimensional geometric Brownian motion, for which we can generate sample paths as shown in Section 6.2.2. The option payoff is

i.e., there is a positive payoff when the spread between the two prices exceeds the threshold K. When K = 0 the option is also called exchange option. To see where the name comes from, consider the value at maturity of a portfolio consisting of one share of U and one option:

Hence, the option allows us to exchange one asset for the other at maturity. The exchange option is actually fairly simple to deal with, and indeed there is an analytical pricing formula which is a fairly straightforward generalization of the Black–Scholes–Merton formula:

where Φ(·) is the CDF of the standard normal distribution, and V₀ = V(0), U₀ = U(0). The reason why we get this type of formula is that the payoff has a homogeneous form, which allows to simplify the corresponding partial differential equation by considering the ratio V/U of the two prices.¹ R code implementing this formula is shown in Fig. 11.1.

In this simple case, we have a benchmark against which to assess the quality of Monte Carlo estimates. In order to apply Monte Carlo, we need a path generation strategy, which is rather simple in this case. We recall from Section 6.2.2 that, to generate sample paths for two correlated Wiener processes, we need correlated standard normals. One possible approach relies on the Cholesky factor for the covariance matrix, which in this case is actually a correlation matrix:

Hence, to simulate tridimensional correlated Wiener processes, we must generate two independent standard normal variates Z₁ and Z₂ and use

to drive path generation. However, for the sake of generality, let us take advantage of the function simMVGBM that we developed in Chapter 6. The resulting R code is displayed in Fig. 11.2 for a spread option. The code is fairly simple, and the only thing worth mentioning is that the array storing the sample paths is tridimensional. To price the exchange option and check the results against the exact formula, we just need to select K = 0:

Now, can we improve the quality of the above estimate? In this bidimensional case, it is fairly easy to resort to stratification with respect to both random variables. However, for illustration purposes, let us pursue both single and double stratification. By single stratification, we mean stratifying with respect to one random variable, e.g., the standard normal

₁ that drives the price of the first stock. Given Eqs. (11.1) and (11.2), we immediately see that the conditional distribution of

₂ given

₁ is normal with

However, we can obtain the same result by using a more general result about conditional distributions for the multivariate normal, which was given in Section 3.3.4.1 and can be applied in more general settings. The details are given in Example 3.7, and we urge the reader to check that we indeed obtain the same results. When stratifying a standard normal, it may be convenient to stratify the underlying uniform variable. As we have seen in Example 8.4, if we have m strata, indexed by j = 1, …, m, and N replications, we may generate N_j = N/m observations. Each stratum is an interval of the form

1. To generate uniform random variables U_k, k = 1, …, N_j, for each stratum j, and to transform them into

equation

the observation k within stratum j.

2. To invert the CDF of the standard normal and generate

equation

3. To sample

_2jk ~ N(ρ

_1jk, 1 − ρ²).

4. To generate the corresponding lognormal variables and evaluate the option payoff.

All of this is accomplished by the R code in Fig. 11.3. Note that, in order to compute the overall estimate and the confidence interval, according to Eq. (8.13), we have to compute the sample mean and the sample variance within each stratum.

Double stratification requires to generate two uniform variables in a subsquare of the unit square:

The rest of the procedure is quite similar to the one for single stratification and can be implemented as shown in Fig. 11.4.

Now we may compare the accuracy of naive Monte Carlo against the two stratifications. We choose K = 0 in order to use the exact formula for the exchange option, and we also note that, for a fair comparison, if we set the number of strata to 25 in simple stratification, we should set it to 5 in double stratification, since the parameter numStrat specifies the number of strata along each dimension:

The effect of stratification is rather evident. Clearly, this is a fairly trivial example, and when dealing with several random variables, full stratification is not feasible. Nevertheless, in the case of multivariate normal distributions, the aforementioned results about conditioning can be exploited to stratify with respect to the most relevant dimensions. We also note that these results are the foundation of the Brownian bridge technique. In other cases, we may lack explicit results, but rejection sampling strategies have been proposed to apply the idea.

11.2 European-style path-dependent options in the BSM world

Pricing European-style, path-dependent options is, in principle, quite easy within a Monte Carlo framework. All we need to do is generate sample paths, under the risk-neutral measure, and collect the discounted payoff. However, when prices have to be sampled at many points in time, the process can be rather inefficient. Hence, we may try to improve the accuracy/performance trade-off by applying the variance reduction strategies of Chapter 8, or sample path generation by low-discrepancy sequences. In this section we consider two cases of path-dependent options:

1. An example of barrier option, namely, a down-and-out put option, to which we apply importance sampling

2. An arithmetic average Asian call option, to which we apply control variates and path generation by scrambled Sobol sequences

11.2.1 PRICING A BARRIER OPTION

In barrier options, a specific asset price S_b is selected as a barrier value. During the life of the option, this barrier may be crossed or not, an event that may activate or cancel the option. In knock-out options, the contract is canceled if the barrier value is crossed at any time during the whole life; on the contrary, knock-in options are activated only if the barrier is crossed. The barrier S_b may be above or below the initial asset price S₀: if S_b ≥ S₀, we have an up-option; if S_b < S₀, we have a down-option. These features may be combined with the payoffs of call and put options to define an array of barrier options. For instance, a down-and-out put option is a put option that becomes void if the asset price falls below the barrier S_b; in this case we need S_b < S₀ and S_b < K for the option to make sense. The rationale behind such an option is that the risk for the option writer is reduced and, therefore, it is reasonable to expect that a down-and-out put option is cheaper than a vanilla one. From the option holder’s viewpoint, this means that the potential payoff is reduced; however, if you are interested in options to manage risk, and not as a speculator, this also means that you may get cheaper insurance. By the same token, other combinations of features may be devised, like up-and-out call and down-and-in put options. Actually, we do not need the ability of pricing all of them, as precise relationships between their prices may be observed. Let us consider, for instance a down-and-in put option. Holding both a down-and-out and a down-and-in put option, with the same barrier level and strike, is equivalent to holding a vanilla put option. To see this, note that the barrier is either crossed or not; hence, one of the two options will be active at maturity. Thus, we have the following parity relationship:

where P is the price of the vanilla put, and P_di and P_do are the prices for the down-and-in and the down-and-out options, respectively. Sometimes a rebate is paid to the option holder if the barrier is crossed and option is canceled; in such a case the parity relationship above is not correct.

In principle, the barrier might be monitored continuously; in practice, periodic monitoring is applied (e.g., the price could be checked each day at the close of trading). This may affect the price, as a lower monitoring frequency makes the detection of barrier crossing less likely. As a general rule for path-dependent options, continuous-time monitoring is more amenable to analytical treatment. Indeed, exact formulas are available for certain barrier options, such as a down-and-out put option with continuous-time monitoring. The following formula, where S₀, r, σ, K, and T have the usual meaning, gives the price of this option:²

R code implementing these formulas is given in Fig. 11.5. When monitoring occurs in discrete time, we should expect that the price for a down-and-out option is increased, since breaching the barrier is less likely. An approximate correction has been suggested,³ based on the idea of applying the analytical formula above with a correction on the barrier level:

where the term 0.5826 derives from the Riemann zeta function, δt is time elapsing between two consecutive monitoring time instants, and the sign ± depends on the option type. For a down-and-out put we should choose the minus sign, as the barrier level should be lowered to reflect the reduced likelihood of crossing the barrier. Notice that this barrier option is less expensive than the corresponding vanilla; the price of the barrier option tends to that of the vanilla option if S_b → 0. By lowering the barrier according to Eq. (11.4) we increase the option price. The following snapshot, where we assume daily monitoring for an option maturing in 2 months consisting of 30 days each, illustrates the difference between a vanilla and a down-and-out put, and the effect of the correction:

If we want to price the option using Monte Carlo simulation, we must generate sample paths like

where the integer time subscript refers to time step of length δt, and T = M δt. The naive Monte Carlo estimator is

By relying on the function simvGBM to sample a GBM process, it is easy to write R code pricing the option by Monte Carlo, as shown in Fig. 11.6. Note the use of any to avoid a for loop to inspect the path. The function returns the point estimate and the confidence interval as usual, but also the number of sample paths on which the barrier is crossed, for reasons that we will clarify shortly. With the same parameters as before we obtain

We notice that the barrier has been crossed in 7835 replications out of 50,000. If we have to price a down-and-in put option, we may price the down-and-out and use the parity relationship of Eq. (11.3), or we may adapt the code, which is very easy to do.

The choice between these alternatives may by a matter of convenience. Let us suppose that the barrier is low and that crossing it is a rare event. If we consider pricing a down-and-in option, crossing the barrier is the “interesting” event, and if it is rare, then in most replications the payoff is just zero. Hence, we may consider using importance sampling to improve the performance. In this case, it is more convenient to price the down-and-in directly, and then use parity to price the down-and-out. One possible idea is changing the drift of the asset price in such a way that crossing the barrier is more likely.⁴ In this case we should decrease the drift, simulate GBM under this distorted measure, and correct the estimator by a likelihood ratio, just as we did in Section 8.6.1. Here, however, the likelihood ratio involves several random variables, and in order to figure it out we should step back and consider what we do to generate the sample path S. For each time step, we generate a normal variate Z_j with expected value

and variance σ² δt. All of these variates are mutually independent, and the asset price path is generated by setting

Let Z be the vector of the normal variates, and let f(Z) be its joint density. Now, let us assume that we lower the drift r by an amount b and generate normals with expected value

Note mat we interpret b as an annualized drift component, with continuous compounding just like the risk-free rate r, and we have to multiply it by the time step δt. Let g(Z) be the joint density for the normal variates generated with this modified expected value. Thanks to independence, both multivariate densities are just the product of M univariate normal densities, and their ratio can be manipulated as follows:

where in the last equality we replace the generic drift μ according to Eq. (11.5). R code implementing importance sampling for a down-and-in put option is reported in Fig. 11.7. By setting the parameter b to 0, we essentially run naive Monte Carlo, so we may just use this function to assess the variance reduction in pricing the down-and-in put option:

It seems that we do gain something by applying importance sampling. However, apart from the intuition that rare and interesting events should be made more likely, it is not quite clear how the drift should be selected. In Section 13.4 we will consider a more detailed analysis to justify the selection of an importance sampling measure.

11.2.2 PRICING AN ARITHMETIC AVERAGE ASIAN OPTION

Barrier options are path-dependent, but to a weak degree, since the only relevant information is whether the barrier has been crossed or not. A stronger degree of path dependency is typical of Asian options, as the payoff depends on the average asset price over the option life. Different Asian options may be devised, depending on how the average is calculated. Sampling may be carried out in discrete or (in principle) continuous time, and the average itself may be arithmetic or geometric. The discrete arithmetic average is

where t_j, j = 1, …, M, are the discrete sampling times. The corresponding geometric average is

The chosen average A, whatever it is, defines the option payoff by playing the role of either a rate or a strike. For instance, an average rate call has a payoff given by

Early exercise features may also be defined in the contract. Hence, we may devise American- or Bermudan-style Asian options, even though the name does sound a bit weird.

Pricing an Asian option is, in general, a nontrivial task. The problem is not trivial not only because the payoff is more complicated than in a vanilla option, but also because when pricing the option after its inception, we must account for the average cumulated so far, and the fact that even if the sampling period δt is uniform, the time to the next price observation is different. Nevertheless, there are some analytical pricing formulas for Asian option. Consider the payoff of a discrete-time, geometric average Asian option:

Since the product of lognormal random variables is still lognormal, it is possible to find an analytical formula for this option, which looks like a modified Black–Scholes–Merton formula.⁵ In the formula, we are pricing the option at time t, and G_t is the current geometric average, if any observation has been already collected. If so, t_m is the last time instant at which we observed the price of the underlying asset, m is the corresponding index, t_m+1 is the time instant of the next observation, and t_M = T is the option maturity:

The formula gets considerably simplified if we just consider the option price at its inception, i.e., at time t = 0. In such a case m = 0, and the resulting R implementation is illustrated in Fig. 11.8.

In the case of an arithmetic average option, since the sum of lognormals is not lognormal, we have to use numerical methods. Pricing a European-style Asian option by Monte Carlo is quite easy in principle, since we have just to generate sample paths and take an average. In Fig. 11.9 we show the R code for a discrete, arithmetic average rate call option:

It is interesting to note that the Asian option is cheaper than the corresponding vanilla, as there is less volatility in the average price than in the price at maturity. This explains why Asian options may provide cheaper insurance when multiple time instants are relevant for a hedger. In order to improve the quality of estimates, we may pursue variance reduction strategies, most notably control variates, or adopt path generation by low-discrepancy sequences.

11.2.2.1 Control variates for an arithmetic average Asian option

We know from Section 8.3 that, in order to apply this kind of variance reduction strategy, we need a suitable control variate, i.e., a random variable, with a known expected value, which is correlated with the option payoff. In the BSM framework, the following possibilities can be considered:

Using the expected value of the arithmetic average, which is known in the case of geometric Brownian motion. Let us recall the formula

equation

The expected value of the arithmetic average of stock prices is (under the risk-neutral measure):

equation

This control variate does capture an essential feature of the option payoff, but not its nonlinearity.

Using a vanilla call option, whose price is given by the BSM formula. This control variate, unlike the previous one, does capture nonlinearity in the option payoff, but it disregards the sample path.

Using the geometric average Asian, whose price is given in Eq. (11.6). This control variate is definitely the most sophisticated and it embodies very deep knowledge, even though the kind of average is different.

The R function code in Fig. 11.10 is a generic function to implement control variates for the arithmetic average Asian option. The input argument sampleCV is actually a function that returns the value of the control variate for a given sample path; expCV is the expected value of the control variate. The script of Fig. 11.11 shows how to use the generic function to implement the three proposed strategies and produced the following (edited) output:

We clearly see that the geometric average option does an excellent job, even though the two alternatives also reduce variance a bit with respect to naive Monte Carlo. This is easily explained by checking the correlation between the naive Monte Carlo estimator and the three control variates. This can be done by using the function of Fig. 11.12, which yields the following estimates of correlation:

Indeed, there is an extremely strong correlation between the arithmetic and geometric average payoffs, which explains why this control variates does such a good job.

11.2.2.2 Using low-discrepancy sequences

Armed with the function lowDiscrGBM to generate sample paths of a GBM process (see Fig. 9.9), it is easy to price the arithmetic average Asian option using low-discrepancy sequences. The code is given in Fig. 11.13, and we may check its performance on the same example that we have used with control variates. We may consider the value provided by the most powerful control variate, 5.394813, as a quite accurate estimate. In the following snapshot we compare the performance of Halton and Sobol sequences with different scramblings:

We see that, with a limited sample size, scrambled Sobol sequences provide us with a satisfactory result, whereas Halton sequences are a bit less effective. See the related discussion in Section 9.5.

11.3 Pricing options with early exercise features

In this section we illustrate the basic issues of pricing derivatives with early exercise features, such as American- and Bermudan-style options. Strictly speaking, since Monte Carlo methods rely on time discretization, we can only price the latter kind of options. Anyway, the price of a Bermudan-style option can be considered as useful approximation for the corresponding American-style option, and it can be refined by Richardson extrapolation. The price of this kind of derivative is related to the following optimization problem:⁶

where the function f(·) is the option payoff, the expectation is taken under a risk-neutral measure

, and τ is a stopping time. Essentially, a stopping time is a random variable representing the time at which the option is exercised. This random variable is adapted to the available information, which is the sample path observed so far; in other words, the stopping time is associated with a nonanticipative exercise policy. The early exercise decision should be made by comparing the intrinsic value of the option, i.e., the payoff obtained by exercising the option immediately, and the value of continuation, i.e., the value of keeping the option alive. This has the effect of partitioning the state space in a continuation and an exercise region; see Fig. 10.8. Finding the optimal exercise policy is not trivial in general. When dealing with low-dimensional problems, alternative approaches can be applied, like binomial lattices and finite differences, which determine the optimal exercise policy by going backward in time. These methods are not feasible for high-dimensional and possibly path-dependent problems, which is the typical range of problems well-suited to Monte Carlo methods, in the case of European-style options. Indeed, years ago it was believed that Monte Carlo methods were not suitable to price options featuring early exercise opportunities. The situation has changed to a certain degree, thanks to the application of an array of approximation methods related to stochastic optimization, which we have introduced in Chapter 10:

Simulation-based optimization. One strategy is to specify a decision rule depending on a set of parameters, and solve a simulation-based optimization problem based on a set of sample paths. For instance, one might decide to exercise the option when it is enough in-the-money, i.e., when the intrinsic value exceeds a time-varying threshold. We may define the threshold by a set of parameters that can be optimized by the approaches of Sections 10.4. This approach may work fairly well in some cases, but it suffers from definite limitations when the shape of the early exercise region is complicated and disconnected.

Scenario trees. Another approach relies on a tree representation of uncertainty. As we have seen in Section 10.5, scenario trees are the standard way to represent uncertainty in stochastic programming with recourse, as they allow to deal with nonanticipative decisions. Trees are discretized stochastic processes, just like binomial lattices, but they do not recombine. Hence, the bad side of the coin is that their size tends to explode exponentially with respect to the number of time steps, but the upside is that they can deal with path-dependent and high-dimensional problems, provided that the set of early exercise opportunities is limited.

Approximate dynamic programming. Finally, we may rely on approximate dynamic programming methods, most notably regression-based approaches, to learn the value function; see Section 10.8. The value function is, in this case, the value of the continuation. With respect to full-fledged stochastic optimization approaches, in option pricing we do not face one of the three curses of dimensionality, since the decision variable is pretty simple: exercise or continue.

These three categories are not exhaustive, since there are alternatives, such as the stochastic mesh method. Furthermore, the boundaries among them need not be clear-cut. For instance, learning a policy by simulation can be interpreted as a form of approximate dynamic programming, focused on the policy itself rather than the value function determining the policy implicitly. Relying on the value function, in principle, allows for more flexibility, and in the following we will describe in detail only examples of the two latter approaches. However, first we have to investigate bias issues. Indeed, in Section 7.1.1 we have illustrated the pricing of a chooser option, which involves an optimal decision, and we discovered that a naive Monte Carlo approach may not work at all because of induced bias.

11.3.1 SOURCES OF BIAS IN PRICING OPTIONS WITH EARLY EXERCISE FEATURES

When pricing European-style options, we usually take for granted that our estimator is unbiased and concentrate on reducing its variance. When dealing with early exercise features, the picture is not so simple. To begin with, there are two clear sources of low bias:

If we have to price an American-style option and discretize time to generate sample paths, we are restricting the exercise opportunities, which reduces the value of the option.

When making early exercise decisions, we typically use suboptimal rules obtained by an approximate solution of an optimization problem. Hence, this also contributes to reduce the estimate of the option value.

On the other hand, however there is also a potential source of high bias. When making decisions based on a scenario tree, like the one illustrated in Fig. 11.14, we are actually using more information in the model than we actually have in the real world, as only a set of sampled scenarios is used, ruling out other realizations of the random variables involved. As a more formal clue, let us consider the problem of estimating

where α is a given number, playing the role of the intrinsic value, and X is a random variable, whose expectation plays the role of the continuation value; we estimate the maximum on the basis of a sample X₁, …, X_m and the sample mean

. It is easy to see that estimator max {α,

} is biased high:⁷

In general, the different sources of bias interact in unclear ways, possibly resulting in an undefined situation. It is quite recommended to run the simulation in such a way that we have a clear bias, in one way or another. The scenario tree approach, that we illustrate in Section 11.3.2, yields both a low- and a high-biased estimator, which can be merged into a valid confidence interval. The value-function-based approach that we illustrate in Section 11.3.3 is a potentially more powerful approach, but its bias is unclear. What we can do is learn an approximate policy, using a set of replications, and then evaluate the option by an independent set of replications, which yields a low-biased estimator. There are other sophisticated strategies to obtain upper bounds, but they require a deeper background on stochastic processes.⁸

11.3.2 THE SCENARIO TREE APPROACH

Pricing European-style options can be accomplished by the generation of a bunch of independent scenarios, sharing only the initial state. When dealing with early exercise features, we need more care, as we cannot make any exercise decision on the basis of the future states visited along each individual sample path. As we have seen in Section 10.5, a scenario tree can be generated in order to enforce nonanticipativity of decisions in stochastic programming models. Broadie and Glasserman (BG) have proposed a pricing approach that relies on such a structure.⁹ The R function in Fig. 11.15 shows how a tree can be built for a trivial one-dimensional GBM process. The approach can be easily extended to multidimensional and more realistic models. A possible disadvantage of trees is that they suffer from an exponential explosion in their size when one has to branch at many time instants. Let b denote the branching factor, i.e., the number of nodes branched for each node in the tree; for the sake of simplicity, we assume that b is constant along the tree. If we branch b = 100 nodes from each node at each time layer in the structure, we have 1 node at time t = 0, 100 nodes at time t = T₁, 100² nodes at time t = T₂, and so on. Indeed, the BG tree approach is best suited to derivatives with limited time epochs for early exercise. An advantage, however, is that the idea works very well also for high-dimensional derivatives. A further, and possibly even more significant advantage, is that the authors have proposed an approach to find both high- and low-biased estimators.

To understand the approach, it is important to choose a sensible notation.¹⁰ We generate the tree starting from the initial state X₀. Then:

From the initial node X₀ we generate b successor nodes X₁¹, …, X₁^b at time T₁.

From each of these nodes X₁ⁱ, i = 1, …, b, we branch b successors at time T₂, denoted by X₂ⁱ¹, …, X₂^ib.

The process is repeated recursively to generate the successors at time T₃, X₃^i₁i₂1, …, X₃^i₁i₂b, up to option expiration at time T_m.

We observe that the subscript for each node is associated with its time period, whereas the superscript is associated with the path. Indeed, this approach can be applied to path-dependent options, unlike binomial lattices. Since we are sampling the tree by straightforward Monte Carlo, all branches have the same conditional probability, 1/b; alternative scenario generation procedures, like Gaussian quadrature, would assign different probabilities to successor nodes. The early exercise decision when in state x at time t has to be made by a comparison between the intrinsic value h_t(x) and the discounted expectation of the value at future states:

where we assume for simplicity that the length of the time steps is always δt. For a vanilla put, the intrinsic value is h_t(x) = max{K − x, 0}, where the state x is the current asset price S_t and K is the asset price; clearly, we do not care about negative intrinsic values. This is a recursive dynamic programming equation defining the option value at each state. At the terminal nodes of the tree, the exercise decision is trivial, since there is no value in waiting and the form of the option payoff is given. Going backward, we may estimate the option value by replacing the expectation by a sample mean, as successor nodes are equiprobable:

For the reasons that we have outlined in Section 11.3.1, this estimator is biased high for each node of the tree. In building this estimator, the same information is used twice: first, to decide when to exercise early, and then to estimate the value of this decision. In order come up with a corresponding low-biased estimator, we should break this link and use a sample to learn the rule, and then another independent sample to estimate its value. Since this requires sampling more and more nodes, the BG approach relies on a common trick in statistics. Given a node (i₁i₂ ··· i_t) at time t, we set apart a single successor node (i₁i₂ ··· i_tk) in turn, k = 1, …, b, and use the remaining b − 1 nodes to find a low-biased estimate

of the option value at (i₁i₂ ··· i_t). The estimate is low-biased because, as we show below in detail, we use the other b − 1 successor nodes to decide whether we exercise or not, but we assess the value of the decision on the successor node that was set apart. This results in b estimates, which are then averaged to find the low-biased estimate

. Note the use of a lowercase v to distinguish this estimate from the high-biased value estimate, denoted by uppercase V. In detail, given node i₁i₂ ··· i_t at time t, we set

Then, all of the above quantities are averaged and yield the low-biased estimate

These low-biased estimates are initialized at the last time layer of the tree by setting them to option payoff. The R code of Fig. 11.16 shows how to generate lower and upper bounds based on a tree generated by the function sampleBGMTree.

11.3.3 THE REGRESSION-BASED APPROACH

In Sections 10.7 and 10.8 we have illustrated different strategies for the numerical solution of the recursive equation of stochastic dynamic programming by approximating value functions. In particular, approximate dynamic programming (ADP) lends itself to pricing options with early exercise features by Monte Carlo sampling. In this section we describe an approach due to Longstaff and Schwartz,¹¹ which should be interpreted as a way to approximate the value function of dynamic programming by a linear regression against a set of basis functions. Since we approximate the value function, what we expect is a suboptimal solution; furthermore, time is discretized. Hence, we should expect some low bias in our estimate of price. Actually, to find an accurate estimate with a well-defined bias, it is recommended to use the procedure that we outline below to learn first a set of approximate value functions, which embody a strategy for early exercise decisions; then, a more extensive set of replications should be used to assess the value of the decision rules. This is not needed when ADP is used in other optimization problems whose main output is the set of decision variables, defined by a suitable strategy; here, however, it is the value of the objective function that we are interested in.

For the sake of simplicity, we will just consider a vanilla American put option on a single non-dividend-paying stock. Clearly, the approach makes sense in more complex settings. As usual with Monte Carlo simulation, we generate sample paths (S₀, S₁, …, S_t, …, S_T), where we use t as a discrete time index, leaving the discretization step δt implicit. If we denote by h_t(S_t) the intrinsic value¹² of the option at time t, the dynamic programming recursion for the value function V_t(S_t) is

The expectation is taken under a risk-neutral measure

and is conditional on the current asset price S_t. In the case of a vanilla American put, we have h_t(S_t) = max{K − S_t, 0}. Having to cope with continuous states is the only difficulty we have here, as time is discretized and the set of control actions is finite: Either you exercise immediately the option, or you cross your fingers and continue. Equation (11.9) is identical to Eq. (11.8). The difference between the regression-based and the scenario tree approaches lies in the scenario generation process: Here we do not branch a tree, which suffers from an exponential increase in complexity, but we generate independent sample paths, like we do with European-style options. Then, however, we must find a way to enforce nonanticipativity. It is important to realize that we cannot make the exercise decision along individual sample paths; if we are at a given point of a sample path generated by Monte Carlo sampling, we cannot exploit knowledge of future prices along that path, as this would imply clairvoyance.¹³ What we can do is use our set of scenarios to build an approximation of the conditional expectation in Eq. (11.9), for some choice of basis functions

_k(S_t), k = 1, …, K. The simplest choice we can think of is regressing the conditional expectation against a basis of monomials:

₁(S) = 1,

₂(S) = S,

(S) = S², etc. In practice, orthogonal polynomials can also be used, as well as well as problem-specific basis functions capturing important features of each state. Note that we are using the same set of basis functions for each time instant, but the coefficients in the linear combination do depend on time:

Since the coefficients α_kt are not associated with specific sample paths, decisions are nonanticipative. The coefficients α_kt can be found by linear regression, going backward in time as customary with dynamic programming; note that the approximation is nonlinear in S_t, but it is linear in terms of the coefficients.

In order to illustrate the method, we should start from the last time period. Assume that we have generated N sample paths, and let us denote by S_ti the price at time t along sample path i = 1, …, N. At option expiration, the value function is just the option payoff:

These values can be used, in a sense, as the Y values in a linear regression, where the X values are the prices at time T − 1. More precisely, we estimate the following regression model for the continuation value:

where e_i is the residual for each sample path. We may find the weights α_k,T−1 by the usual least-squares approach, minimizing the sum of squared residuals. Note that we are considering the discounted payoff, so that we may then compare it directly against the intrinsic value. Furthermore, we are using the state at time T − 1 as the regressor. As we clarify later, we should interpret this as a post-decision state, a fundamental concept in approximate dynamic programming that we have discussed in Section 10.8.2.

In the regression above, we have considered all of the generated sample paths. Actually, it is sometimes suggested to consider only the subset of sample paths for which we have a decision to make, i.e., the subset of sample paths in which the option is in the money at time T − 1. In fact, if the option is not in the money, we have no reason to exercise; using only the sample paths for which the option is in the money is called the “moneyness” criterion and it may improve the performance of the overall approach.¹⁴ Denoting this subset by I_T−1 and assuming K = 3, we would have to solve the following least-squares problem:

The output of this optimization problem is a set of coefficients in the approximation of the continuation value. Note that the weights are linked to time periods, and not to sample paths. Using the same approximation for each sample path in I_T−1, we may decide if we exercise or not. We stress again that this is how nonanticipativity is enforced in the regression-based approach, without the need to resort to scenario trees.

We should pause and illustrate what we have seen so far by a little numerical example. We will use the same example as the original reference, where the eight sample paths given in Table 11.1 are considered for a vanilla American put with strike price K = 1.1. For each sample path, we also have a set of cash flows at expiration; cash flows are positive where the option is in the money. Cash flows are discounted back to time t = 2 and used for the first linear regression. Assuming a risk free rate of 6% per period, the discount factor is e^−0.06 = 0.94176. The data for the regression are given in Table 11.2; X corresponds to current underlying asset price and Y corresponds to discounted cash flows in the future. We see that only the sample paths in which the option is in the money at time t = 2 are used. The following approximation is obtained:

Table 11.1 Sample path and cash flows at option expiration for a vanilla American put

Now, based on this approximation, we may compare at time t = 2 the intrinsic value and the continuation value. This is carried out in Table 11.3. Given the exercise decisions, we update the cash flow matrix. Note that the exercise decision does not exploit knowledge of the future. Consider, for instance, sample path 4: we exercise, earning $0.13; along that sample path, later we would regret our decision, because we could have earned $0.18 at time t = 3. This is what nonanticipativity is all about. We should also note that on some paths we exercise at time t = 2, and this is reflected by the updated cash flow matrix in the table.

Table 11.3 Comparing intrinsic and continuation value at time t = 2, and resulting cash flow matrix

The process is repeated going backward in time. To carry out the regression, we must consider the cash flows on each path, resulting from the early exercise decisions. Assume that we are at time step t, and consider path i. For each sample path i, there is an exercise time t_e^*, which we might set conventionally to T + 1 if the option will never be exercised in the future. Then the regression problem (11.10) should be rewritten, for the generic time period t, as:

Since there can be at most one exercise time for each path, it may be the case that after comparing the intrinsic value with the continuation value on a path, the exercise time t^*_e is reset to a previous period. Stepping back to time t = 1, we have the regression data of Table 11.4. The discount factor e^−2·0.06 = 0.88692 is applied on paths 1 and 8. Since the cash flow there is zero, the discount factor is irrelevant, but we prefer using this to point out that we are discounting cash flows from time period t = 3; if we had a positive cash flow at t = 3 and zero cash flow at t = 2, this is the discount factor we should use. Least squares yield the approximation:

This approximation may seem unreasonable, as we expect smaller payoffs for larger asset prices, yet the highest power of the polynomial has a positive coefficient here. It can be verified that, for the range of X values we are considering, the function is in fact decreasing. Based on this approximation of the continuation value, we obtain the exercise decisions illustrated in Table 11.5. Discounting all cash flows back to time t = 0 and averaging over the eight sample paths, we get an estimate of the continuation value of $0.1144, which is larger than the intrinsic value $0.1; hence, the option should not be exercised immediately.

Table 11.5 Comparing intrinsic and continuation value at time t = 1, and resulting cash flow matrix

Figure 11.17 shows R code to implement this regression-based approach in general. The function is modular in that it takes scenarios generated outside by another function. Figure 11.18 shows a script to check the toy example we have considered with the generic function. We insist again on the fact that a safer approach would first learn the exercise policy, on the basis of a limited number of replications. Then, using a brand new and possibly much larger set of replications, we should find the low-biased estimator.

Finally, it is important to interpret the regression-based approach within the framework of approximate dynamic programming. Unlike other numerical approaches for stochastic dynamic programming, which we have considered in Section 10.7, we did not consider the expectation of the value functions. The expectation is directly built into the conditional expectation, as linear regression yields a function of S_t:

In this approach, there is no need to tackle the expectation of V_t+1(S_t+1) by, say, Gaussian quadrature. As we have seen in Section 10.8, this corresponds to the concept of value function for the post-decision state variable. In this case the post-decision state variable is just the current asset price, which in practice is the same as the pre-decision state variable, but it should be interpreted as the state after the decision to continue. The point is that our decision is whether we exercise or not, and this does not influence the state variable, unlike other dynamic programming models. Indeed, linear regression yields the value of the continuation, which is then compared against the value of immediate exercise to find the option value.

We close the section by noting that in some ADP approaches, a recursive approach to linear regression is adopted, whereby the coefficients are incrementally updated by considering one sample path at time. Here we are using a batch regression, whereby all sample paths are generated at once and we run a linear regression per time step on all of them. The relative merits of the two strategies may depend on the specific problem at hand.

11.4 A look outside the BSM world: Equity options under the Heston model

Option pricing within the BSM world is only a first step to grasp the remarkable potential of Monte Carlo methods for financial engineering. We have considered some more general models than geometric Brownian motion in Chapter 3, such as the Heston stochastic volatility model and mean-reverting square-root diffusions for short rates. Path generation mechanisms for such processes have been introduced in Chapter 6. All of these tools may be integrated to relax the restrictive assumptions of the BSM formula and price options in a more general and, hopefully, more realistic setting. We should not forget that in doing so we may lose market completeness. Hence, as we have pointed out in Section 3.9.3, we may have to face a nontrivial model calibration task. In this and the next section we do not consider calibration, but we just illustrate the flexibility of Monte Carlo methods, which indeed are often (not always) the only viable pricing strategy.

In Section 3.7.5 we have considered the Heston stochastic volatility model, reported below for convenience, under the risk-neutral measure:

The model integrates a GBM with nonconstant volatility and a square-root diffusion modeling squared volatility, where

is a long-term value, α measures the speed of reversion to the mean, and ξ is the volatility of the square-root diffusion. Different assumptions can be made about the instantaneous correlation ρ of the two driving Wiener processes W_t¹ and W_t². Suppose that we are interested in pricing a vanilla call option. Unlike the case of GBM, we cannot sample the price S_T at maturity directly, but we have to go through a whole sample path generation to keep the discretization error under control. A straightforward approach to discretize the above equations is the Euler scheme

where

_t¹ and

_t² are standard normals with correlation ρ. Since the Euler discretization does not guarantee non-negativity, we may heuristically patch the above expressions by taking the maximum between the result and 0, as shown in the code of Fig. 11.19. Here is one sample run of the function, and the reader is invited to try and play with it, in order to see the effect of the different parameters:

Actually, the Heston model allows for some semianalytical solutions in simple cases, but the Monte Carlo code can be adapted to more complicated options. As we mentioned, we have to generate a whole sample path, with a corresponding increase in computational effort with respect to the GBM case. Furthermore, since we are using a plain Euler scheme, the time step δt must be small enough to ensure accuracy. However, this additional cost may be not quite relevant if we want to price more exotic, possibly path-dependent, options. Asian options feature a payoff depending on the average underlying asset price. Another class of options, called lookback options, feature a payoff depending on the maximum or the minimum observed price. Consider, for instance, a lookback call option, which is characterized by the payoff

where t₁, t₂, …, t_n ≡ T is the collection of discrete time instants at which the price is monitored. It is very easy to adapt the code of Fig. 11.19 to price this option.

11.5 Pricing interest rate derivatives

In the first sections of this chapter we have used the risk-neutral valuation principle to price options on stocks, under the assumption of a constant risk-free rate. Using this principle, the price at time t of a path-independent, European-style option maturing in T can be written as

where f(S_T) is the payoff at maturity, depending on the stock price S_T,

is the pricing measure, and the expectation is conditional on the price S_t of the underlying asset in t. If we step into the domain of interest rate derivatives, there are substantial complications.

The first issue is that modeling interest rates is an intrinsically more difficult endeavor than modeling stock prices, since we should model a whole term structure, rather than a single stochastic process. Here we will skip this difficulty by adopting single-factor models depending only on the short-term (instantaneous) rate, like the Vasicek and the Cox–Ingersoll–Ross (CIR) models that we have mentioned in Chapter 3. These models rely on stochastic differential equations that are slightly more complicated than geometric Brownian motion, but we have seen in Section 6.3 how to generate sample paths for them.

Since we are dealing with stochastic interest rates, we cannot take the discount factor outside the expectation, as is done in Eq. (11.12). Furthermore, since we are dealing with a short rate, we should account for the whole path of the process r_t, even if the option payoff is path-independent. The risk-neutral pricing approach leads here to a formula like

(11.13) equation

Another quite tricky point is that, as we have pointed out in Section 3.9.3, the market here is incomplete. The implication is that there is not a unique measure

for pricing, but several ones. This means that we have to calibrate the short-rate model against the prices of quoted derivatives. In this chapter, we assume that such calibration has already been done, and when we use a model like

equation

we assume that the parameters define the risk-neutral measure for pricing purposes.

In order to calibrate an interest rate model, we need quoted prices of interest rate derivatives, and the simplest such assets are, in fact, bonds. This may sound odd, as bond pricing given a term structure of interest rates is an easy task, but the point here is that we want to estimate a risk-neutral model for pricing purposes, accounting for the stochastic character of the interest rates. In fact, bonds are interest rate derivatives in the sense that they are assets whose value depends on that underlying risk factor. Since a coupon-bearing bond can be priced as a portfolio of zero-coupon bonds, the first interest rate derivative that we should price is a zero-coupon bond. In such a case, the option payoff is just 1, if we assume a bond with $1 face value. Hence, the price at time t of a zero-coupon bond maturing at time T, when the short rate in t is r_t, can be expressed as

This expression is fairly simple, and indeed there are analytical formulas to price zero-coupon bonds under both Vasicek and CIR models. From a pedagogical viewpoint, though, it is quite useful to find those prices by Monte Carlo simulation in order to build skills that can be put to good use when dealing with more complex (and interesting) cases. From Eq. (11.14) we immediately see that to use Monte Carlo in this setting, we should generate sample paths of the short rate process and then approximate the integral by a sum:

where δt is the discretization step, and r_{t_k} is the short rate at the discretized time instants t_k, k = 0, …, m, such that t₀ = t and t_m = T. Note that in the sum we multiply by δt the rate observed at the beginning of each time slice.

The next step in the learning process is to price a relatively simple option, namely, a call option on a zero-coupon bond. The risk-neutral principle must be applied exactly in the same way, but we have to realize that there are two maturities:

The option maturity, T_O, i.e., the time at which the option can be exercised, by purchasing a zero-coupon bond at the strike price K.

The bond maturity, T_B; clearly, for the option to make sense, we must have T_O < T_B.

11.5.1 PRICING BONDS AND BOND OPTIONS UNDER THE VASICEK MODEL

The price at time t of a zero-coupon bond maturing in T, with face value $1, under the Vasicek short-rate model

We insist once again that the parameters of the Vasicek model are not the parameters of the real world, which could be estimated by analyzing time series of short rates, but those in the risk-neutral world. These formulas can be proved by solving the partial differential equation that we have given in Eq. (3.125),¹⁵ and are easily implemented R, as shown in Fig. 11.20.

To price the bond using Monte Carlo, we rely on the function simVasicek, listed in Fig. 6.12, to generate sample paths. In Fig. 11.20, we also show code to price the zero-coupon bond using random sampling, and a snapshot to compare the results. Running the code yields the following output:

We see that the results are fairly reasonable. The price at time 0 of a European-style call option maturing at time T_O on a zero-coupon bond maturing in T_B is, under the Vasicek model:

where r₀ is the current value of the short rate, Φ(·) is the CDF of the standard normal distribution, and

To interpret this formula, it is quite useful to note its deep similarity with the Black–Scholes–Merton price of a vanilla call option on a stock, given in Eq. (3.108), and note the following:

In Eq. (11.18), Z(0, r₀; T_B) should be interpreted as the price of the underlying asset, i.e., the bond maturing in T_B, whereas Z(0, r₀; T_O) should be interpreted as a discount factor from the option maturity T₀ to time 0.

The terms d₁ and d₂ in Eqs. (11.19) and (11.20) look much like the similar terms in the BSM formula.

The term S(T_O), where B(T_O, T_B) is just the function given in Eq. (11.16), plays the role of a volatility.

It is also useful to note that short-rates under the Vasicek model, which relies on an Ornstein–Uhlenbeck process, are normally distributed, and that the price of a zero-coupon bond is an exponential of these rates. Hence, the bond price is lognormally distributed, just like stock prices under the BSM model, and this is the essential reason behind the observed similarity.

In Fig. 11.21 we give R code to evaluate the exact call price, and a function to estimate it by straightforward Monte Carlo. We are cheating a bit there, as you may notice that to evaluate the payoff we are using the exact bond price. Otherwise, we should simulate sample paths of r_t from t = 0 to t = T_O to evaluate the discount factor, and then up to r = T_B to evaluate the bond price. We use a little short-cut to avoid this nightmare, but the reader will immediately appreciate the complexity of pricing a really difficult interest rate derivative.¹⁶

11.5.2 PRICING A ZERO-COUPON BOND UNDER THE CIR MODEL

We have noted in Chapter 6 that, since the CIR model relies on a square-root diffusion,

the short rates are no longer normal. Nevertheless, a zero-coupon bond can be priced analytically under the CIR model, even though the formulas are more complicated than those for the Vasicek model:

In Fig. 11.22 we show R code that implements these formulas, as well as a Monte Carlo pricer relying on the function simCIR of Fig. 6.14 to generate sample paths of the square-root diffusion. As usual, we also verify the sensibility of the results:

For further reading

A friendly introduction to numerical methods for option pricing can be found in [4].

The most comprehensive reference on Monte Carlo methods for option pricing is [9].

Other books that include interesting discussions of specific points are [11] and [12].

An early survey can be found in [2], which was later updated in [3].

An interesting collection of papers is proposed in [8].

The background on the interest rate derivatives that we have discussed in Section 11.5 can be found, e.g., in [16].

References

1 T. Björk. Arbitrage Theory in Continuous Time (2nd ed.). Oxford University Press, Oxford, 2004.

2 P. Boyle. Options: A Monte Carlo approach. Journal of Financial Economics, 4:323–338, 1977.

3 P. Boyle, M. Broadie, and P. Glasserman. Monte Carlo methods for security pricing. Journal of Economics Dynamics and Control, 21:1267–1321, 1997.

4 P. Brandimarte. Numerical Methods in Finance and Economics: A MATLAB-Based Introduction (2nd ed.). Wiley, Hoboken, NJ, 2006.

5 M. Broadie and P. Glasserman. Pricing American-style securities using simulation. Journal of Economic Dynamics and Control, 21:1323–1352, 1997.

6 M. Broadie, P. Glasserman, and S.G. Kou. A continuity correction for discrete barrier options. Mathematical Finance, 7:325–349, 1997.

7 L. Clewlow and C. Strickland. Implementing Derivatives Models. Wiley, Chichester, West Sussex, England, 1998.

8 B. Dupire, editor. Monte Carlo. Methodologies and Applications for Pricing and Risk Management. Risk Books, London, 1998.

9 P. Glasserman. Monte Carlo Methods in Financial Engineering. Springer, New York, 2004.

10 M.B. Haugh and L. Kogan. Pricing American options: A duality approach. Operations Research, 52:258–270, 2004.

12 R. Korn, E. Korn, and G. Kroisandt. Monte Carlo Methods and Models in Finance and Insurance. CRC Press, Boca Raton, FL, 2010.

13 Y.K. Kwok. Mathematical Models of Financial Derivatives. Springer, Berlin, 1998.

14 F.A. Longstaff and E.S. Schwartz. Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies, 14:113–147, 2001.

15 S.M. Ross and J.G. Shanthikumar. Monotonicity in volatility and efficient simulation. Probability in the Engineering and Informational Sciences, 14:317–326, 2000.

16 P. Veronesi. Fixed Income Securities: Valuation, Risk, and Risk Management. Wiley, Hoboken, NJ, 2010.

17 P. Wilmott. Quantitative Finance (vols. I and II). Wiley, Chichester, West Sussex, England, 2000.

⁷This is a consequence of Jensen’s inequality: E[f(X)] ≥ f(E[X]) for a random vector X and a convex function f(·). See [9, Chapter 8] for an excellent discussion of bias issues in pricing options with early exercise.

¹²Again, we identify the intrinsic value with the value of immediate exercise.

¹³This point was also emphasized when we discussed the role of nonanticipativity in multistage stochastic programming.

¹⁴The criterion was suggested in the original paper by Longstaff and Schwartz, and we follow the idea, even though some authors have argued against its use.

¹⁶Suitable changes of measure are in fact used to price interest rate derivatives with Monte Carlo, but this requires deeper concepts that are beyond the scope of this book.

Path	Y	X
1	.00 × .94176	1.08
2	–	–
3	.07 × .94176	1.07
4	.18 × .94176	0.97
5	–	–
6	.20 × .94176	0.77
7	.09 × .94176	0.84
8	–	–