M9 Simulations for Rates of Return

Topics in Insurance, Risk, and Finance

Author: Yuyu Chen

Affiliation: Department of Economics, University of Melbourne

Published: 2024

Introduction

Learning outcomes

Understand why Excel is needed for evaluating cash flow series.
Can describe the process of simulation.
Understand the theoretical basis for simulations and the Monte Carlo estimator.
Can compute the Monte Carlo estimator and its confidence interval.
Implement simulations to model rates of return and calculate related risk metrics in Excel.

Why Excel I

As we have seen previously, besides moments of the accumulated/present value of cash flows, we want to learn more about these quantities (like VaR), mainly for risk management purposes.
To do so, we need to know the corresponding distribution functions.
However, deriving an explicit expression of the distributions is extremely challenging. Given lognormal rates, explicit results are available only in the case of a single cash flow (i.e., $S_{n}$ and $V_{n}$ ).
In practice, cash flows can be much more complicated.

Why Excel II

We can use Excel (also other programming languages) to simulate random variables, and use these to generate random observations of cash flow series.
These generations will result in an empirical distribution of the cash flow series. We can then use this distribution to estimate moments, quantiles, ranges, and probabilities.
Simulation is widely used in actuarial work.
Watching someone else use Excel has limited value, however, so you must attempt the exercises in your own time.

Theoretical preparation

Generating random numbers: uniform distribution

A computer cannot generate true random numbers.
Instead, Excel (and other programs) generate approximations that are called pseudorandom numbers.
Once the starting parameter of the generator (the seed) is fixed, the same sequence of random numbers will be generated each time.
We will refer to pseudorandom numbers as random for brevity.
There are multiple ways to generate random numbers in Excel.
To generate random numbers from a uniform distribution on $(0, 1)$ , we can use the RAND() (or RANDARRAY) function, which gives us realizations of independent and identically distributed uniform random variables.
Generating uniform random numbers is at the core of simulations.
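As an illustration outside Excel, the reproducibility of pseudorandom numbers can be sketched in Python. This is only a minimal sketch: the function name `uniform_sample` and the seed value 2024 are assumptions made for this example, and Python's generator is not the one Excel uses.

```python
import random

def uniform_sample(n, seed):
    """Return n pseudorandom draws from U(0, 1) for a given seed."""
    rng = random.Random(seed)  # private generator, so the seed fully determines the output
    return [rng.random() for _ in range(n)]

first = uniform_sample(5, seed=2024)
second = uniform_sample(5, seed=2024)
print(first == second)  # the same seed reproduces the same sample
```

Changing the seed gives a different, but equally reproducible, sequence.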

Inverse transformation

Questions: How to show probability transformation? Does the result hold for discrete distributions?

Inverse transformation: proof

We have that $F (x) \geq y \Rightarrow x \geq F^{- 1} (y) .$ This is by the definition of $F^{- 1}$ .

Next, note that $F^{- 1} (y) \leq x$ implies $y \leq F (F^{- 1} (y)) \leq F (x)$ . Here, the first inequality uses the fact that $F$ is right-continuous. Then $F (x) \geq y ⟺ x \geq F^{- 1} (y) .$

Thus we have $\p (F^{- 1} (U) \leq x) = \p (U \leq F (x)) = F (x) .$

Generating random numbers: inverse transformation

Therefore, we can use inverse transformation to generate random numbers of a distribution as long as the (generalized) inverse of the distribution is known (i.e., VaR).
That is, for a random variable $X$ , we first generate a sample of a standard uniform random variable $u_{1}, \dots, u_{n}$ . Then, by inverse transformation, a sample of $X \sim F$ is $F^{- 1} (u_{1}), \dots, F^{- 1} (u_{n}) .$
Note that not all distributions have a closed-form generalized inverse (e.g., normal distributions). By hand, one may need to rely on a $z$-table for normal distributions.
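The two steps above (draw uniforms, then apply $F^{- 1}$) can be sketched in Python as follows. The helper `inverse_transform_sample` is a hypothetical name, and the exponential distribution is an illustrative choice that does not appear elsewhere in these slides; it is used only because its inverse has a simple formula.

```python
import math
import random

def inverse_transform_sample(inv_cdf, n, rng):
    """Draw n values with distribution F by applying F^{-1} to U(0, 1) draws."""
    return [inv_cdf(rng.random()) for _ in range(n)]

def exponential_inv_cdf(p, rate=1.0):
    # Illustrative assumption: F(x) = 1 - exp(-rate * x), so F^{-1}(p) = -log(1 - p) / rate.
    return -math.log(1.0 - p) / rate

rng = random.Random(1)
sample = inverse_transform_sample(exponential_inv_cdf, 10_000, rng)
print(sum(sample) / len(sample))  # should be near the true mean 1 / rate = 1
```

Any other distribution can be plugged in by passing its own inverse in place of `exponential_inv_cdf`.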

Generating random numbers: inverse functions in Excel

The inverse functions of many commonly used distributions are available in Excel.
The general form of such inverse function is F.INV(p,...): If p = F.DIST(x,...), then F.INV(p,...) = x.
Here, F.DIST is the distribution function.

Some inverse functions are summarized below

BETA.INV for Beta distribution
LOGNORM.INV for lognormal distribution
NORM.INV for normal distribution
NORM.S.INV for standard normal distribution

To see more inverse/probability functions in Excel, go to Formulas $\to$ More Functions $\to$ Statistical.

Generating random numbers: example A

Generate random numbers from normal distributions.

To draw a random number from the standard normal distribution, we apply the inverse standard normal distribution function to a uniform random number on $(0, 1)$ . In Excel, NORM.S.INV(RAND()) does this; the older name NORMSINV also works.
To get a large sample of random numbers, one can use NORM.S.INV(RANDARRAY(n)), where n is the number of rows required.
Plot a histogram of the sample.
To get a general normal random variable $X \sim N (μ, σ^{2})$ , from a standard normal random variable $Z$ , we simply take $X = μ + σ Z .$
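For comparison, the same construction can be sketched in Python, where `statistics.NormalDist().inv_cdf` plays the role of the inverse standard normal function. The function name `normal_sample` and the parameter values $μ = 0.08$, $σ = 0.02$ are assumptions chosen only for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_sample(mu, sigma, n, rng):
    """Simulate N(mu, sigma^2) as mu + sigma * Z, with Z = Phi^{-1}(U)."""
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    draws = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # inv_cdf requires 0 < u < 1
            u = rng.random()
        draws.append(mu + sigma * std_normal.inv_cdf(u))
    return draws

rng = random.Random(7)
x = normal_sample(mu=0.08, sigma=0.02, n=10_000, rng=rng)
print(mean(x), stdev(x))  # should be close to 0.08 and 0.02
```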

Generating random numbers: example B

Generate random numbers from a uniform distribution $F$ on the interval $(a, b)$ where $b > a$ .

We first find the inverse function of $F$ , which is $F^{- 1} (p) = a + (b - a) p .$
To get a random number from the uniform distribution, we use $a + (b - a) \times$ RAND().
To get a large sample of random numbers, one can use $a + (b - a) \times$ RANDARRAY().
Plot a histogram of the sample.

Generating random numbers: example C I

Generate random numbers from the following distribution $F (x) = \begin{cases} 1 - \frac{1}{x + 1}, & 0 \leq x < 9, \\ 0.95, & 9 \leq x < 10, \\ 1, & x \geq 10. \end{cases}$

Generating random numbers: example C II

We first find the inverse function $F^{- 1} (p) = \begin{cases} \frac{1}{1 - p} - 1, & 0 < p \leq 0.9, \\ 9, & 0.9 < p \leq 0.95, \\ 10, & 0.95 < p \leq 1. \end{cases}$

Generating random numbers: example C III

We first generate a random number from a standard uniform distribution using RAND (or a sample using RANDARRAY). Store it in A1.
To write the inverse function, we use IF function: IF(condition, value1 if the condition is true, value2 if the condition is false)
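Logical operators: AND(condition1,condition2), OR(condition1,condition2)
The inverse function can be written as IF(A1<=0.9,1/(1-A1)-1,IF(A1<=0.95,9,10)) or, with the range stated explicitly, IF(A1<=0.9,1/(1-A1)-1,IF(AND(A1>0.9,A1<=0.95),9,10)). A chained comparison such as 0.9<A1<=0.95 does not test a range in Excel, so use AND instead.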
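The piecewise inverse of Example C can also be checked numerically. The Python sketch below uses the same branch logic as the nested IF; the function name `example_c_inverse` and the sample size are assumptions for this illustration. Since $F$ jumps by $0.05$ at both $x = 9$ and $x = 10$, roughly 5% of the sample should land on each of those values.

```python
import random

def example_c_inverse(p):
    """Generalized inverse of the Example C distribution, for 0 <= p <= 1."""
    if p <= 0.9:
        return 1.0 / (1.0 - p) - 1.0  # continuous part, values in [0, 9]
    if p <= 0.95:
        return 9.0   # jump of F at x = 9
    return 10.0      # jump of F at x = 10

rng = random.Random(3)
sample = [example_c_inverse(rng.random()) for _ in range(20_000)]
share_9 = sum(x == 9.0 for x in sample) / len(sample)
share_10 = sum(x == 10.0 for x in sample) / len(sample)
print(share_9, share_10)  # each should be close to 0.05
```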

Monte Carlo estimator

Given a sample of a random variable $X$ , we are now able to approximate $θ = \E (X) .$
Let $X_{1}, \dots, X_{n}$ be iid copies of $X$ . The Monte Carlo estimator of $θ$ can be obtained by $\hat{θ} = \frac{1}{n} \sum_{i = 1}^{n} X_{i} .$
Note that this estimator also works for functions of random variables, i.e., $f (X)$ .
If $f (x) = x^{k}$ , we can approximate the $k$ th moment.
If $f (x) = \id_{\{x \in [a, b]\}}$ , we can approximate the probability of $X$ landing in $[a, b]$ .
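A minimal Python sketch of the estimator for $\E (f (X))$ is given below. The helper name `mc_estimate`, the choice $X \sim U (0, 1)$ , and the interval $[0.2, 0.5]$ are assumptions made only so that the true values ( $1 / 3$ and $0.3$ ) are known and can be compared with the estimates.

```python
import random

def mc_estimate(f, draw, n):
    """Monte Carlo estimate of E[f(X)]: the average of f over n iid draws of X."""
    return sum(f(draw()) for _ in range(n)) / n

rng = random.Random(11)
draw_uniform = rng.random  # X ~ U(0, 1), chosen only for illustration

# k-th moment with k = 2, and the probability of landing in [0.2, 0.5].
second_moment = mc_estimate(lambda x: x ** 2, draw_uniform, 50_000)
prob_interval = mc_estimate(lambda x: 1.0 if 0.2 <= x <= 0.5 else 0.0, draw_uniform, 50_000)
print(second_moment)  # true value is 1/3
print(prob_interval)  # true value is 0.3
```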

Theoretical basis of simulation: Why does it work?

$\hat{θ}$ is unbiased, i.e., $\E (\hat{θ}) = θ$ .
The Law of Large Numbers: for iid random variables $X_{1}, \dots, X_{n}$ , $\frac{1}{n} \sum_{i = 1}^{n} X_{i} \to \E (X_{1})$ as $n \to \infty$ with probability 1. Moreover, for iid random variables $Y_{1}, \dots, Y_{n}$ , $\frac{1}{n} \sum_{i = 1}^{n} f (Y_{i}) \to \E (f (Y_{1}))$ as $n \to \infty$ with probability 1.
Also $\var (\hat{θ}) = \var (\frac{1}{n} \sum_{i = 1}^{n} X_{i}) = \frac{1}{n} \var (X) .$

Hence, if $n$ is large, $\hat{θ}$ can be very close to $θ$ .

Theoretical basis of simulation: standard error

By the Central Limit Theorem, the distribution of $\sum_{i = 1}^{n} X_{i} / n$ can be approximated by $N (\E (X_{1}), \var (X_{1}) / n)$ for large $n$ .
So the error of the simulation, i.e., $\sum_{i = 1}^{n} X_{i} / n - \E (X_{1})$ , is approximately normally distributed with mean $0$ and variance $\var (X_{1}) / n$ . The standard deviation $\sqrt{\var (X_{1}) / n}$ is called the standard error.
A problem is that we do not know the variance of $X$ (even the expectation is unknown, otherwise simulation is redundant).
For this, we can replace $\var (X)$ by sample variance $s_{n}^{2} = \frac{1}{n - 1} \sum_{i = 1}^{n} (X_{i} - \hat{θ})^{2},$ which is an unbiased estimator of $\var (X)$ .

Theoretical basis of simulation: confidence interval I

By the Law of Large Numbers, $s_{n}^{2} \to \var (X)$ with probability 1. (why?)
Then with the Central Limit Theorem and Slutsky’s Theorem, we have the following result: as $n \to \infty$ , $\sqrt{n} (\hat{θ} - θ) / s_{n}$ converges in distribution to $N (0, 1)$ .

Theoretical basis of simulation: confidence interval II

By the previous theorem, a $1 - α$ confidence interval of $θ$ is $(\hat{θ} - z_{c} \frac{s_{n}}{\sqrt{n}}, \hat{θ} + z_{c} \frac{s_{n}}{\sqrt{n}})$ where $\p (- z_{c} < Z < z_{c}) = 1 - α$ . Here $Z \sim N (0, 1)$ .
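The interval can be computed directly from a simulated sample. The Python sketch below is one way to do so under the normal approximation; the function name `mc_confidence_interval` and the test sample from $U (0, 1)$ are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def mc_confidence_interval(sample, alpha):
    """Return (estimate, lower, upper) for a 1 - alpha normal-approximation interval."""
    n = len(sample)
    theta_hat = mean(sample)
    std_error = stdev(sample) / math.sqrt(n)       # s_n / sqrt(n), with the n - 1 divisor in s_n
    z_c = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # P(-z_c < Z < z_c) = 1 - alpha
    return theta_hat, theta_hat - z_c * std_error, theta_hat + z_c * std_error

rng = random.Random(5)
sample = [rng.random() for _ in range(10_000)]  # illustrative X ~ U(0, 1), so theta = 0.5
print(mc_confidence_interval(sample, alpha=0.10))
```

Because the half-width is $z_{c} s_{n} / \sqrt{n}$ , quadrupling the number of simulations halves the width of the interval.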

Example: MC estimate I

Suppose that interest rates $i_{1}, i_{2}, i_{3} \sim U (0, 0.5)$ are independent. An agent simulates 3 paths of $i_{1}, i_{2}, i_{3}$ using inverse transformation. The random numbers, generated from a standard uniform distribution, for $i_{1}, i_{2}, i_{3}$ are documented sequentially below:

Compute a Monte Carlo estimate of the mean of $S_{3}$ with a $90\%$ confidence interval.

Example: MC estimate II

The simulated values of $i_{1}, \dots, i_{3}$ are:

Hence $S_{3}$ in each path is:

Example: MC estimate III

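Therefore, the Monte Carlo estimate of $\E (S_{3})$ is $\frac{1}{3} (1.629 + 1.679 + 1.687) = 1.665,$ and $s_{n}^{2} = 0.000988$ , so the standard error is $\sqrt{0.000988 / 3} = 0.01815$ . With $z_{c} = 1.65$ , the $90\%$ confidence interval is $(1.665 - 1.65 \times 0.01815, 1.665 + 1.65 \times 0.01815) = (1.635, 1.695) .$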
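The arithmetic of this example can be reproduced with a few lines of Python, taking the three simulated values of $S_{3}$ as given. The variable names are illustrative, and $z_{c} = 1.65$ is the rounded value used in the calculation above.

```python
import math
from statistics import mean, variance

s3_paths = [1.629, 1.679, 1.687]  # simulated values of S_3 from the three paths

theta_hat = mean(s3_paths)
s2 = variance(s3_paths)                     # sample variance, divisor n - 1
std_error = math.sqrt(s2 / len(s3_paths))   # s_n / sqrt(n)
z_c = 1.65                                  # rounded 95th percentile of N(0, 1)
lower, upper = theta_hat - z_c * std_error, theta_hat + z_c * std_error
print(round(theta_hat, 3), round(s2, 6), round(std_error, 5))
print(round(lower, 3), round(upper, 3))
```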

Question: Is this a good confidence interval?

Some basic Excel functions

Summary statistics

Some functions:

mean: AVERAGE()
variance of a sample: VAR()
variance of a population: VARP()
standard deviation of a sample: STDEV()
standard deviation of a population: STDEVP()
percentile: PERCENTILE()

Analysis ToolPak

Load the Analysis ToolPak in Excel for Mac

Click the Tools menu, and then click Excel Add-ins.
In the Add-Ins available box, select the Analysis ToolPak check box, and then click OK.
If Analysis ToolPak is not listed in the Add-Ins available box, click Browse to locate it.
If you get a prompt that the Analysis ToolPak is not currently installed on your computer, click Yes to install it.
Quit and restart Excel.

Now the Data Analysis command is available on the Data tab.

Analysis ToolPak

Data Analysis can complete basic statistical tasks including:

generate random numbers (we can use seed here)
summary statistics
frequency table

Exercise A: $S_{n}$ I

Recall that if rates are iid, even if the accumulation factors do not follow a lognormal distribution, one can still use a lognormal distribution to approximate $S_{n}$ or $V_{n}$ .
This can be seen by noting that $\log S_{n} = \sum_{t = 1}^{n} \log (1 + i_{t}) .$ By the central limit theorem, $\log S_{n}$ can be approximated by a normal distribution for large $n$ .
Next, we will assess how well the lognormal distribution performs as an approximation.

Exercise A: $S_{n}$ II

Suppose that in a varying rate model, $i_{1}, \dots, i_{30} \sim U (0.06, 0.10)$ . Complete the following tasks:

Generate a random sample of $(1 + i_{t})$ with size $6000 \times 30$ .
Compute the first and second moments of $S_{30}$ numerically.
Plot the mean of $S_{30}$ against the simulation number $k = 1, \dots, 6000$ .
Compute the first and second moments of $S_{30}$ analytically and compare them with the numerical result.
Use the analytical results for moments of $\log S_{30}$ as the corresponding parameters of a lognormal distribution. Compare the histogram of the sample of $S_{30}$ and the density of the lognormal distribution. Explain the observation.
Use FREQUENCY to generate a frequency table from a given series of bins.

Compare the simulated results with lognormal distribution.

Exercise A: $S_{n}$ III

We compute the first and second moments of $S_{30}$ analytically. We have $\E (1 + i_{t}) = 1 + \frac{0.06 + 0.1}{2} = 1.08$ and $\begin{aligned} \E ((1 + i_{t})^{2}) & = 1 + 2 \E (i_{t}) + \E (i_{t}^{2}) \\ & = 1 + 2 \cdot \frac{0.06 + 0.1}{2} + \int_{0.06}^{0.1} x^{2} \frac{1}{0.1 - 0.06} d x = 1.16653 . \end{aligned}$ Then, by independence of the rates, $\E (S_{30}) = {1.08}^{30} = 10.0627$ and $\E (S_{30}^{2}) = {1.16653}^{30} = 101.6049$ (computed with the unrounded base $1.166533\ldots$ ).
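The analytical moments can be double-checked numerically. The Python sketch below evaluates the moments of a $U (a, b)$ rate in closed form and raises them to the 30th power, which relies on the rates being independent; the variable names are illustrative.

```python
a, b, n_periods = 0.06, 0.10, 30

mean_rate = (a + b) / 2                                 # E(i_t) for U(a, b)
second_moment_rate = (b ** 3 - a ** 3) / (3 * (b - a))  # E(i_t^2) for U(a, b)

m1 = 1 + mean_rate                           # E(1 + i_t)
m2 = 1 + 2 * mean_rate + second_moment_rate  # E((1 + i_t)^2)

# Independence across periods lets the moments of the product factorise.
mean_s30 = m1 ** n_periods
second_moment_s30 = m2 ** n_periods
print(round(m1, 4), round(m2, 4))
print(round(mean_s30, 4), round(second_moment_s30, 4))
```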

Exercise A: $S_{n}$ IV

Some observations:

The mean and second moment of the sample are close to the analytical solution.
As the simulation number increases, the mean of $S_{30}$ becomes more and more stable and it is closer to the true value.
The simulated distribution of $S_{30}$ is very close to the lognormal distribution. This is due to the central limit theorem: as $\log S_{30}$ is close to a normal distribution, $S_{30}$ is close to a lognormal distribution.
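For readers who want to cross-check the spreadsheet outside Excel, the simulation part of this exercise can be sketched in Python. The function name `simulate_s` and the seed are assumptions; the running mean is evaluated at only a few values of $k$ as a coarse stand-in for the plot.

```python
import math
import random
from statistics import mean

def simulate_s(n_periods, a, b, n_paths, rng):
    """Simulate n_paths values of S_n = prod(1 + i_t) with iid i_t ~ U(a, b)."""
    paths = []
    for _ in range(n_paths):
        s = 1.0
        for _ in range(n_periods):
            s *= 1.0 + a + (b - a) * rng.random()  # inverse transform for U(a, b)
        paths.append(s)
    return paths

rng = random.Random(2024)
s30 = simulate_s(30, 0.06, 0.10, 6000, rng)

sample_mean = mean(s30)
sample_second_moment = mean(x * x for x in s30)
running_mean = [sum(s30[:k]) / k for k in (10, 100, 1000, 6000)]  # coarse running-mean check
log_mean = mean(math.log(x) for x in s30)  # sample mean of log S_30
print(sample_mean, sample_second_moment)
print(running_mean)
```

The sample moments should land close to the analytical values of about $10.06$ and $101.60$ , with the running mean settling as $k$ grows.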

Exercise A: $S_{n}$ Remarks

Some remarks:

To fill a column/row, use the fill function (Home $\to$ Editing $\to$ Fill $\to$ Series).
To select a large range of cells, use the name box in the top left of the workbook (e.g., A1:B2).
When using the autofill function, the columns should be adjacent.
Although Excel states the lognormal function as LOGNORM.DIST(x,mean,standard dev), the mean and standard deviation in this function refer to the equivalent normal distribution mean and standard deviation.

Exercise B: $A_{n}$ I

Suppose that $1 + i_{1}, \dots, 1 + i_{10} \sim L N (0.09, {0.04}^{2})$ are independent. Complete the following tasks.

Calculate the mean and standard deviation of $A_{10}$ using recursive formulae.
Calculate the mean and standard deviation of $A_{10}$ numerically using 5000 simulations.
Estimate the $5\%$ and $95\%$ percentiles of $A_{10}$ .
Estimate $\p (A_{10} < 14.75)$ .
Get summary statistics of $A_{10}$ .

Exercise B: $A_{n}$ II

The recursive formulae: Since $A_{n} = (1 + i_{n}) (1 + A_{n - 1})$ , we have $\E (A_{n}) = \E ((1 + i_{n})) (1 + \E (A_{n - 1})),$ and $\E (A_{n}^{2}) = \E ((1 + i_{n})^{2}) (1 + 2 \E (A_{n - 1}) + \E (A_{n - 1}^{2})) .$
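The recursion can be run alongside a simulation as a sanity check. The Python sketch below assumes $A_{0} = 0$ and uses the standard lognormal moments $\E (1 + i_{t}) = e^{μ + σ^{2} / 2}$ and $\E ((1 + i_{t})^{2}) = e^{2 μ + 2 σ^{2}}$ . Note that it draws the lognormal factors with Python's built-in `random.lognormvariate` rather than by inverse transformation; the function name `simulate_a` and the seed are illustrative.

```python
import math
import random
from statistics import mean, stdev

mu, sigma, n_years = 0.09, 0.04, 10

# Moments of 1 + i_t when log(1 + i_t) ~ N(mu, sigma^2).
m1 = math.exp(mu + sigma ** 2 / 2)      # E(1 + i_t)
m2 = math.exp(2 * mu + 2 * sigma ** 2)  # E((1 + i_t)^2)

# Recursion for the first two moments of A_n, starting from A_0 = 0.
ea, ea2 = 0.0, 0.0
for _ in range(n_years):
    ea, ea2 = m1 * (1 + ea), m2 * (1 + 2 * ea + ea2)
sd_recursive = math.sqrt(ea2 - ea ** 2)

def simulate_a(rng):
    """One simulated value of A_n via A_k = (1 + i_k) * (1 + A_{k-1})."""
    a_n = 0.0
    for _ in range(n_years):
        a_n = rng.lognormvariate(mu, sigma) * (1 + a_n)
    return a_n

rng = random.Random(10)
sample = [simulate_a(rng) for _ in range(5000)]
prob_below = sum(x < 14.75 for x in sample) / len(sample)  # estimate of P(A_10 < 14.75)
print(ea, sd_recursive)
print(mean(sample), stdev(sample), prob_below)
```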

Simulation in VBA

VBA I

We can use VBA to automate the process of simulation (i.e., the simulation can be done automatically for a different set of parameters).

To do so, we need to put our code in a macro.
We first enable the Developer tab in Excel.
Then go to Developer $\to$ Visual Basic $\to$ project folder to open a module.
If there is no module, right click the project folder to insert a module.
We will learn some basic functions to complete simulation tasks.

VBA II

Each automated task starts with Sub taskName() and ends with End Sub
To declare variables in the task: Dim variableName As variableType
Types of variables can be Long (usually for large integer values), Double (usually for values with decimals), Range, and so on. We mainly consider numeric variables.
To assign a value from cell, e.g., A1 from worksheet B to a variable C: C = Sheets("B").Range("A1").Value
To output the value of variable C to cell A1 from worksheet B: Sheets("B").Range("A1").Value = C
The above Range("A1") can be replaced by Cells(1,1)
To assign a range, use Set variableName = Range("")

VBA III

We can call Excel worksheet functions through WorksheetFunction:

sum and average: WorksheetFunction.Sum and WorksheetFunction.Average
max and min: WorksheetFunction.Max and WorksheetFunction.Min
count and countif: WorksheetFunction.Count and WorksheetFunction.CountIf

VBA IV

Simulations:

To initialize random number generator: Randomize (each run gives a different simulation).
To simulate a number from a uniform distribution on $(0, 1)$ : Rnd
To simulate a number from a normal distribution $N (μ, σ^{2})$ : WorksheetFunction.Norm_Inv(Rnd, $μ$ , $σ$ )
To simulate a number from a lognormal distribution $L N (μ, σ^{2})$ : WorksheetFunction.LogNorm_Inv(Rnd, $μ$ , $σ$ )

Loop: start with For i = 1 To maxNumberOfLoops and end with Next i.

Exercise C

Let $n$ be the number of simulations. Automate the following task.

Generate $n$ $N (0, 1)$ random numbers in a worksheet. When the task is run, the cells containing the previous simulations should be cleared first.
Calculate the average of the random numbers.

Exercise D

Suppose that in a varying rate model, $i_{1}, \dots, i_{t} \sim U (a, b)$ . Let $n$ be the number of simulations. Automate the following tasks.

Generate $S_{t}$ , $S_{t}^{2}$ , and $\log S_{t}$ given arbitrary $t$ , $a$ , $b$ , and $n$ .
Numerically calculate the first two moments of $S_{t}$ and $\log S_{t}$ .
Enable one button to run the above two tasks and another button to erase the simulation results for $S_{t}$ , $S_{t}^{2}$ , and $\log S_{t}$ .
