R Programming for Actuarial Science - Peter McQuire - E-Book

R Programming for Actuarial Science E-Book

Peter McQuire

Description

R Programming for Actuarial Science

Professional resource providing an introduction to R coding for actuarial and financial mathematics applications, with real-life examples

R Programming for Actuarial Science provides a grounding in R programming applied to the mathematical and statistical methods that are of relevance for actuarial work.

In R Programming for Actuarial Science, readers will find:

* Basic theory for each chapter to complement other actuarial textbooks which provide foundational theory in depth.
* Topics covered include compound interest, statistical inference, asset-liability matching, time series, loss distributions, contingencies, mortality models, and option pricing plus many more typically covered in university courses.
* More than 400 coding examples and exercises, most with solutions, to enable students to gain a better understanding of underlying mathematical and statistical principles.
* An overall basic to intermediate level of coverage in respect of numerous actuarial applications, and real-life examples included with every topic.

Providing a highly useful combination of practical discussion and basic theory, R Programming for Actuarial Science is an essential reference for BSc/MSc students in actuarial science, trainee actuaries studying privately, and qualified actuaries with little programming experience, along with undergraduate students studying finance, business, and economics.

Page count: 881

Year of publication: 2023

R Programming for Actuarial Science

Peter McQuire, University of Kent, Canterbury, UK

Alfred Kume, University of Kent, Canterbury, UK

This edition first published 2024

© 2024 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Peter McQuire and Alfred Kume to be identified as the authors of this work has been asserted in accordance with law.

Registered Offices

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

A catalogue record for this book is available from the Library of Congress

Hardback ISBN: 9781119754978; ePub ISBN: 9781119754992; ePDF ISBN: 9781119754985; oBook ISBN: 9781119755005

Cover Design: Wiley

Cover Image: © Peter McQuire

Set in 9.5/12.5pt STIXTwoText by Integra Software Services Pvt. Ltd, Pondicherry, India

To my wife, Jenny, and daughter, Lauren, for their constant support and encouragement. (Peter McQuire)

To my wife Ortenca, for her support throughout the process. (Alfred Kume)

Contents

Cover

Title Page

Copyright Page

Dedication

About the Companion Website

Introduction

1 Main Objectives of This Book

2 Who Is This Book For?

3 How to Use This Book

4 Book Structure

5 Chapter Style

6 Examples and Exercises

7 Verification of Code and Calculations – Best Practice

8 Website: www.wiley.com/go/rprogramming

9 R or Microsoft Excel?

10 Caveats

11 Acknowledgements

1 R: What You Need to Know to Get Started

1.1 Introduction

1.2 Getting Started: Installation of R and RStudio

1.2.1 Installing R

1.2.2 What Is RStudio?

1.2.3 Inputting R Commands

1.3 Assigning Values

1.4 Help in R

1.5 Data Objects in R

1.6 Vectors

1.6.1 Numeric Vectors

1.6.2 Logical Vectors

1.6.3 Character Vectors

1.6.4 Factor Vectors

1.7 Matrices

1.8 Dataframes

1.9 Lists

1.10 Simple Plots and Histograms

1.11 Packages

1.12 Script Files

1.13 Workspace, Saving Objects, and Miscellany

1.14 Setting Your Working Directory

1.15 Importing and Exporting Data

1.15.1 Importing Data

1.15.2 Exporting Data

1.16 Common Errors Made in Coding

1.17 Next Steps

1.18 Recommended Reading

1.19 Appendix: Coercion

2 Functions in R

2.1 Introduction

2.1.1 Objectives

2.1.2 Core and Package Functions

2.1.3 User-Defined Functions

2.2 An Introduction to Applying Core and Package Functions

2.2.1 Examples of Simple, Common Functions

2.3 User-Defined Functions

2.3.1 What does a “udf” consist of?

2.3.2 Naming Conventions

2.3.3 Examples and Exercises

2.4 Using Loops in R – the “for” Function

2.5 Integral Calculus in R

2.5.1 The “Integrate” Function

2.5.2 Numerical Integration

2.6 Recommended Reading

3 Financial Mathematics (1): Interest Rates and Valuing Cashflows

3.1 Introduction

3.2 The Force of Interest

3.3 Present Value of Future Cashflows

3.4 Instantaneous Forward Rates and Spot Rates

3.5 Non-Constant Force of Interest

3.5.1 Discrete Cashflows

3.5.2 Cashflows Which Are Continuous

3.6 Effective and Nominal Rates of Interest

3.6.1 Effective Rates of Interest

3.6.2 Why Do We Use Effective Rates?

3.6.3 Nominal Interest Rates

3.7 Appendix: Force of Interest – An Analogy with Mortality Rates

3.8 Recommended Reading

4 Financial Mathematics (2): Miscellaneous Examples

4.1 Introduction

4.2 Writing Annuity Functions

4.2.1 Writing a function for an annuity certain

4.3 The ‘presentValue’ Function

4.4 Annuity Function

4.5 Bonds – Pricing and Yield Calculations

4.6 Bond Pricing: Non-Constant Interest Rates

4.7 The Effect of Future Yield Changes on Bond Prices Throughout the Term of the Bond

4.8 Loan Schedules

4.8.1 Introduction

4.8.2 Method 1

4.8.3 Method 2

4.9 Recommended Reading

5 Fundamental Statistics: A Selection of Key Topics - Dr A Kume

5.1 Introduction

5.2 Basic Distributions in Statistics

5.3 Some Useful Functions for Descriptive Statistics

5.3.1 Introduction

5.3.2 Bivariate or Higher Order Data Structure

5.4 Statistical Tests

5.4.1 Exploring for Normality or Any Other Distribution in the Data

5.4.2 Goodness-of-fit Testing for Fitted Distributions to Data

5.4.2.1 Continuous distributions

5.4.2.2 Discrete distributions

5.4.3 T-tests

5.4.3.1 One sample test for the mean

5.4.3.2 Two sample tests for the mean

5.4.4 F-test for Equal Variances

5.5 Main Principles of Maximum Likelihood Estimation

5.5.1 Introduction

5.5.2 MLE of the Exponential Distribution

5.5.2.1 Obtaining the MLE numerically using R

5.5.2.2 Obtaining the MLE analytically

5.5.3 Large Sample (Asymptotic) Properties of MLE

5.5.4 Fitting Distributions to Data in R Using MLE

5.5.5 Likelihood Ratio Test, LRT

5.6 Regression: Basic Principles

5.6.1 Simple Linear Regression

5.6.2 Quantifying Uncertainty on

5.6.3 Analysis of Variance in Regression

5.6.3.1 R² and adjusted R² Coefficient of Determination

5.6.4 Some Visual Diagnostics for the Proposed Simple Regression Model

5.7 Multiple Regression

5.7.1 Introduction

5.7.2 Regression and MLE

5.7.2.1 Multivariate Regression

5.7.3 Tests

5.7.3.1 Likelihood Ratio Test in Regression

5.7.3.2 Akaike Information Criterion: AIC

5.7.3.3 AIC and Regression model selection

5.7.3.4 Bayesian Information Criterion: BIC

5.7.4 Variable Selection, Finding the Most Appropriate Sub-Model

5.7.5 Backward Elimination

5.7.6 Forward Selection

5.7.7 Using AIC/BIC Criteria

5.7.8 LRT in Model Selection

5.7.9 Automatic Search Using R-squared Criteria

5.7.10 Concluding Remarks on Test Data

5.7.11 Modelling Beyond Linearity

5.8 Dummy/Indicator Variable Regression

5.8.1 Introducing Categorical Variables

5.8.2 Continuous and Indicator Variable Predictors – Including Load in the Model

5.9 Recommended Reading

6 Multivariate Distributions, and Sums of Random Variables

6.1 Multivariate Distributions – Examples in Finance

6.2 Simulating Multivariate Normal Variables

6.3 The Summation of a Number of Random Variables

6.4 Conclusion

6.5 Recommended Reading

7 Benefits of Diversification

7.1 Introduction

7.2 Background

7.3 Key Mathematical Ideas

7.4 Running Simulations

7.5 Recommended Reading

8 Modern Portfolio Theory

8.1 Introduction

8.2 2-Asset Portfolio

8.3 3-Asset Portfolio

8.4 Introduction of a Risk-free Asset to the Portfolio

8.4.1 Adding a Risk-free Asset

8.4.2 Capital Market Line and the Sharpe Ratio

8.4.3 Borrowing to Obtain Higher Returns

8.5 Appendix: Lagrange Multiplier Method

8.6 Recommended Reading

9 Duration – A Measure of Interest Rate Sensitivity

9.1 Introduction

9.2 Duration – Definitions and Interpretation

9.3 Duration Function in R

9.4 Practical Applications of Duration

9.5 Recommended Reading

10 Asset-Liability Matching: An Introduction

10.1 Introduction

10.2 What Interest Rates Do Institutions Use To Measure Their Liabilities?

10.3 Variance of the Solvency Position

10.4 Characteristics of Various Asset Classes and Liabilities

10.5 Our Scenarios

10.6 Results

10.7 Simulations

10.8 Exercise and Discussion – an Insurer With Predominately Short-Term Liabilities

10.9 Potential Exercise

10.10 Conclusions

10.11 Recommended Reading

11 Hedging: Protecting Against a Fall in Equity Markets

11.1 Introduction

11.2 Our Example

11.2.1 Futures Contracts – A Brief Explanation

11.2.2 Our Task

11.3 Adopting a Better Hedge

11.4 Allowance for Contract and Portfolio Sizes

11.5 Negative Hedge Ratio

11.6 Parameter and Model Risk

11.7 A Final Reminder on Hedging

11.8 Recommended Reading

12 Immunisation – Redington and Beyond

12.1 Introduction

12.2 Outline of Redington Theory and Alternatives

12.3 Redington’s Theory of Immunisation

12.4 Changes in the Shape of the Yield Curve

12.5 A More Realistic Example

12.5.1 Determining a Suitable Bond Allocation

12.5.2 Change in Yield Curve Shape

12.5.3 Liquidity Risk

12.6 Conclusion

12.7 Recommended Reading

13 Copulas

13.1 Introduction

13.2 Copula Theory – The Basics

13.3 Commonly Used Copulas

13.3.1 The Independent Copula

13.3.2 The Gaussian Copula

13.3.3 Archimedean Copulas

13.3.4 Clayton Copula

13.3.5 Gumbel Copula

13.4 Copula Density Functions

13.5 Mapping from Copula Space to Data Space

13.6 Multi-dimensional Data and Copulas

13.7 Further Insight into the Gaussian Copula: A Non-rigorous View

13.8 The Real Power of Copulas

13.9 General Method of Fitting Distributions and Simulations – A Copula Approach

13.9.1 Fitting the Model

13.9.2 Simulating Data Using the mvdc and rMvdc Functions

13.10 How Non-Gaussian Copulas Can Improve Modelling

13.11 Tail Correlations

13.12 Exercise (Challenging)

13.13 Appendix 1 – Copula Properties

13.14 Appendix 2 – Rank Correlation and Kendall’s Tau, τ

13.15 Recommended Reading

14 Copulas – A Modelling Exercise

14.1 Introduction

14.2 Modelling Future Claims

14.2.1 Data

14.2.2 Fitting Appropriate Marginal Distributions

14.2.3 Fitting The Copula

14.2.4 Assessing Risk From the Analysis of Simulated Values

14.2.5 Comparison with the Gaussian Copula Model

14.2.6 Comparison of the Models with the Data

14.3 Another Example: Banking Regulator

14.4 Conclusion

15 Bond Portfolio Valuation: A Simple Credit Risk Model

15.1 Introduction

15.2 Our Example Bond Portfolio

15.2.1 Description

15.2.2 The Transition Matrix

15.2.3 Correlation Matrix

15.2.4 Simulations and Results

15.2.5 Incorporating Interest Rate Risk – A Simple Adjustment

15.2.6 Portfolio Consisting of Highly Correlated Bonds

15.3 Further Development of this Model

15.4 Recommended Reading

16 The Markov 2-State Mortality Model

16.1 Introduction

16.2 Markov 2-State Model

16.3 Simple Applications of the 2-State Model

16.4 Estimating Mortality Rates from Data

16.5 An Example: Calculating Mortality Rates for One Age Band

16.6 Uncertainty in Our Estimates

16.7 Next Steps?

16.8 Appendices

16.8.1 Informal Discussion of μ

16.8.2 Intuitive meaning of f_x(t)

16.9 Recommended Reading

17 Approaches to Fitting Mortality Models: The Markov 2-state Model and an Introduction to Splines

17.1 Introduction

17.2 Graduation of Mortality Rates

17.3 Fitting Our Data

17.3.1 Objective

17.3.2 Summarised Data

17.4 Model Fitting with Least Squares

17.5 Individual Member Data

17.6 Comparing Life Tables with a Parametric Formula

17.7 Splines: An Introduction

17.7.1 Overview

17.7.2 Data

17.7.3 Fitting the Model: Spline regression

17.7.4 Adjusted Dataset

17.8 Summary

17.9 Recommended Reading

18 Assessing the Suitability of Mortality Models: Statistical Tests

18.1 Introduction

18.2 Theory

18.3 Our Mortality Data and Various Proposed Mortality Rates

18.4 Testing the Standard Table Rates – Table 1,

18.4.1 Data and initial plot

18.4.2 χ² test

18.4.3 Signs Test – for Overall Bias

18.4.4 Serial Correlations Test; Testing for Bias Over Age Ranges

18.4.5 Analysing the Distribution of Deviances

18.4.6 logL, AIC Calculations

18.4.7 Conclusions on

18.5 Graduation of Mortality Rates by Adjusting a Standard Table

18.5.1 Testing Table 2,

18.5.2 Adjusting Table 2

18.6 Testing Graduated Rates Obtained from a Parametric Formula,

18.7 Comparing Our Candidate Rates

18.8 Over-fitting

18.9 Other Thoughts

18.10 Appendix – Alternative Calculations of LogL’s

18.11 Recommended Reading

19 The Lee-Carter Model

19.1 Introduction

19.2 Using the L-C Model to Create Data and Fit the Model

19.2.1 Introducing the Lee-Carter Model

19.2.2 Calculating the Parameter Values

19.2.3 Interpretation of a_x, b_x, and k_t

19.3 Using L-C to Model Actual Mortality Data from HMD

19.4 Using the lca Function in the Demography Package

19.5 Constructing Your Own Demogdata Object

19.6 Forecasting Mortality Rates

19.7 Case Study: The Impact of the HIV Virus on Mortality Rates

19.8 Recommended Reading

20 The Kaplan-Meier Estimator

20.1 Introduction

20.2 What Is Censoring?

20.2.1 Non-Informative Censoring

20.3 Defining the Relevant Event

20.4 K-M Theory

20.5 Introductory Example: Monitoring Delays in Making Claim Payments

20.6 Lung Cancer Example

20.6.1 Basic Results

20.6.2 Comparison of Male and Female Rates

20.6.3 Doctor Assessment Scores – ph.ecog

20.7 Issues with the Kaplan-Meier Model

20.8 Recommended Reading

21 Cox Proportionate Hazards Regression Model

21.1 Introduction

21.2 Cox Model Equation

21.3 Applications

21.3.1 Smokers’ Mortality: Small Data Set

21.3.2 Smokers’ Mortality: Larger Data Set

21.3.3 Multiple covariates and interactions

21.4 Comparison of Cox and Kaplan Meier Analyses of Lung Cancer Data

21.5 Recommended Reading

22 Markov Multiple State Models: Applications to Life Contingencies

22.1 Introduction

22.2 The Markov Property

22.3 Markov Chains and Jump Models

22.3.1 Examples

22.3.2 Differences between Markov Chain and Markov Jump Models

22.4 Markov Chains (Discrete Time)

22.4.1 Applying Markov Chains to Estimate Future Probabilities

22.4.2 Markov Chain Model – NCD

22.4.3 Coding Exercise for Markov Chains

22.5 Markov Jump Models

22.5.1 Example – Simple 3-State Model (All Transitions Possible)

22.5.2 Example – H-S-D Model

22.6 Non-Constant Rates

22.7 Premium Calculations

22.8 Transition Rate Estimation

22.9 Multiple Decrement Models

22.9.1 Introduction

22.9.2 Using a Numerical Approach for the above Fixed Rate Problems

22.9.3 An Exact Approach

22.9.4 Age-Dependent Rates

22.10 Recommended Reading

23 Contingencies I

23.1 Introduction

23.2 What is Meant by “Contingencies” in an Actuarial Context?

23.3 The Life Table

23.4 Expected Present Values of the Key Contingency Functions

23.5 Writing Our Own Code – Some Introductory Exercises

23.6 The Lifecontingencies Package

23.6.1 The Lifetable and Actuarialtable Objects

23.6.2 Application to Actual Mortality Tables: AM92 and AF92

23.6.3 Annuities

23.6.4 Annuities Paid more Frequently than Annually

23.6.5 Increasing Annuities

23.6.6 Reversionary Annuities

23.6.7 Example: Annuity Company Valuation

23.6.8 Life Assurance functions

23.6.9 Assurance Policies with immediate Payment on Death: Ax

23.7 Simulation of Future Lifetimes

23.8 Recommended Reading

24 Contingencies II

24.1 Introduction

24.2 Mortality Tables: AM92

24.3 Uncertainty in Present Values: Variance

24.4 Simulations

24.4.1 Single Policy

24.4.2 Portfolios with 100 Policies – Portfolio Claim Distribution from Simulations

24.5 Simulation of Annuities

24.6 Premium Calculations

24.7 Profits – Probability Distributions of Single Policies and Portfolios

24.8 Progression of expected profits throughout the lifetime of a policy: no reserves held

24.9 Policy Values

24.9.1 Calculating Policy Values

24.9.2 Recursive Formulae – Discrete and Continuous (Thiele)

24.9.3 Recursive Equation with 3 States – HSD Model

24.10 Profits from Policies where Reserves Are Held

24.10.1 Calculating the Profit Vector

24.10.2 Measures of Profit and Profit Testing

24.11 Profit Uncertainty: Interest Rate and Mortality Risk

24.12 Risk Capital and Risk-adjusted Return Measures

24.13 Unit-linked Policies

24.13.1 Introduction

24.13.2 Example with Deterministic and Stochastic Projections

24.14 Additional Exercises

24.15 Appendix: Dependent and Independent Rates

24.16 Recommended Reading

25 Actuarial Risk Theory – An Introduction: Collective and Individual Risk Models

25.1 Introduction

25.2 Collective Risk Model

25.3 Poisson Compound Collective Risk Model

25.4 Applications of the Model

25.4.1 Setting Appropriate Reserves and Premium Pricing

25.4.2 Increasing the Number of Independent Policies

25.4.3 Adopting a Normal Distribution Approximation

25.4.4 Return on Capital

25.4.5 Skewness of the Compound Poisson Model

25.4.6 Sum of Compound Poisson Distributions

25.5 Compound Binomial Collective Risk Model

25.6 Compound Negative Binomial Distribution

25.7 Panjer’s Recursion Formula

25.8 Closing Thoughts on Collective Risks Models

25.9 Individual Risk Model

25.9.1 Standard Individual Risk Model

25.9.2 Alternative Model – ‘The Poisson Individual Risk Model’

25.10 Issues with Heterogeneity

25.11 Policies Which Are Not Independent

25.12 Incorporating Parameter Uncertainty in the Models

25.13 Claim Amount Distributions: Alternatives to the Gamma Distribution

25.14 Conclusions

25.15 Recommended Reading

26 Collective Risk Models: Exercise

26.1 Introduction

26.2 Analysis of Claims Data

26.3 Running Simulations

26.4 Tails of the Distribution

26.5 Allowing for Parameter Uncertainty

26.6 Conclusions

26.7 Recommended Reading

27 Generalised Linear Models: Poisson Regression

27.1 Introduction

27.2 Examples/Exercises/Data

27.3 Brief Recap on Multiple Linear Regression

27.4 Generalised Linear Models (“GLMs”)

27.5 Goodness of Fit of GLMs

27.6 Poisson Regression

27.6.1 Introduction

27.6.2 Using Poisson Regression to Model Claim Numbers

27.7 Data with Varying Exposure Periods

27.7.1 Claim Rates and the Offset

27.7.2 Application to Aggregated Data in Section 27.1

27.8 Categorical and Continuous Variables

27.8.1 Problem with Continuous Variables

27.8.2 Categorical Variables

27.9 Interaction between Variables

27.10 Over-dispersion

27.11 Miscellaneous Exercises

27.12 Further Study / Next Steps

27.13 Recommended Reading

28 Extreme Value Theory

28.1 Introduction

28.2 Why Use EVT?

28.3 Generalised Pareto Distribution – “GPD”

28.4 EVT Analysis of Historic Daily Equity Market Returns (S&P 500)

28.4.1 Basic EVT Analysis

28.4.2 Will a Normal Distribution (and Other Alternatives) Do Just as Well?

28.5 Data for Further EVT Analysis

28.6 Recommended Reading

29 Introduction to Machine Learning: k-Nearest Neighbours (kNN)

29.1 Introduction

29.2 Example 1 – Identifying a Fruit Type

29.2.1 Data

29.2.2 Overview of the Process

29.2.3 How does the kNN Algorithm Work?

29.2.4 Normalising Our Data

29.2.5 Varying k

29.2.6 Using Our Model

29.3 Analysis of Our Model – the Confusion Matrix

29.4 Example 2 – Cancer Diagnoses

29.5 Conclusion

29.6 Recommended Reading

30 Time Series Modelling in R – Dr A Kume

30.1 Introduction

30.2 Linear Regression Versus Autoregressive Model

30.3 Three Components for Time Series Modelling

30.4 Stationarity

30.5 Main Tools in R for ARIMA Modelling

30.5.1 PACF as a Derivation of ACF and Their General Behaviour for ARMA(p,q) Models

30.5.2 How to Simulate and Obtain the Theoretical Values of ACF and PACF for ARMA Models

30.6 Identifying a Set of Possible Models to the Data Including the Order of Differencing

30.6.1 Model Fitting to Time Series Data

30.6.2 Parameter Estimation for Pure Auto-Regressive Models

30.6.3 Diagnostic Plots

30.6.4 Forecasting

30.7 Dealing with Real Data far from Stationary

30.7.1 Non Parametric Approaches

30.7.2 Airline Data Modelling Using Multiplicative Seasonal Models

30.8 Recommended Reading

31 Volatility Models – GARCH

31.1 Introduction

31.2 Why Use GARCH Models?

31.3 Outline of the Chapter

31.4 Key Theoretical Concepts with GARCH

31.5 Simulation of Data Using a GARCH Model

31.6 Fitting a GARCH Model to Data, and Analysis

31.6.1 Fitting a GARCH Model

31.6.2 Further Analysis of the Data; Comparison with the Normal Distribution

31.6.3 Further Analysis of the Data; Volatility Clustering

31.7 A Note on Correlation and Dependency

31.8 GARCH Long-Term Variance

31.9 Exercise: Shocks to Global Equity Markets – The Global Financial Crisis 2008, and COVID-19

31.10 Extensions to the GARCH Model

31.11 Appendix – A Mixture of Normal Distributions

31.12 Recommended Reading

32 Modelling Future Stock Prices Using Geometric Brownian Motion: An Introduction

32.1 Introduction

32.1.1 Discrete Gaussian Random Walk

32.2 Geometric Brownian Motion

32.3 Applications of GBM, and Simulating Prices

32.4 Recommended Reading

33 Financial Options: Pricing, Characteristics, and Strategies

33.1 Introduction

33.2 What is a Financial Option?

33.3 What are Financial Options Used for?

33.4 Black, Scholes and Merton Differential Equation

33.4.1 Assumptions Underlying B-S-M Formulation

33.4.2 Solution to B-S-M Equation for European Call Options

33.4.3 Call Option Price Function

33.5 Calculating the Option Price Using Simulations

33.6 Factors Which Affect the Price of a Call Option

33.6.1 Share Price

33.6.2 Time to Expiry

33.6.3 Combined Effect of Share Price and Time to Expiry

33.6.4 Other Factors

33.7 Greeks

33.8 Volatility of Call Option Positions

33.9 Put Options

33.10 Delta Hedging

33.11 Sketch of the B-S-M Derivation

33.12 Further Tasks

33.13 Appendix

33.14 Recommended Reading

Index

End User License Agreement

List of Tables

CHAPTER 05

Table 5.1 Some distributions and...

Table 5.2 A few entries...

CHAPTER 30

Table 30.1 The derivation of...

Table 30.2 The general behaviour...

List of Illustrations

CHAPTER 01

Figure 1.1 Downloading R.

Figure 1.2 Downloading R-studio.

Figure 1.3 Starting R-studio, script...

Figure 1.4 Script window is...

Figure 1.5 Histograms : 2,000...

CHAPTER 02

Figure 2.1 Effect of gravit..

Figure 2.2 Plotting a functio..

CHAPTER 03

Figure 3.1 Non-constant forward...

Figure 3.2 Cashflow payments...

CHAPTER 04

Figure 4.1 Effect of interest...

Figure 4.2 Effect of term...

Figure 4.3 Effect on bond...

Figure 4.4 Effect on bond...

Figure 4.5 Simple bond valuation...

Figure 4.6 Effect of payments...

Figure 4.7 Solution from Exercise...

CHAPTER 05

Figure 5.1 Histogram of simulated...

Figure 5.2 Empirical and theoretical...

Figure 5.3 Some realisation of...

Figure 5.4 Normal and t...

Figure 5.5 CLT for sample...

Figure 5.6 Histograms and a...

Figure 5.7 Two options for...

Figure 5.8 Exploring sample quantiles...

Figure 5.9 Exploring sample quantiles...

Figure 5.10 Exploring the Likelihood...

Figure 5.11 Likelihood (and Log...

Figure 5.12 Fitting Exponential and...

Figure 5.13 A bivariate plot...

Figure 5.14 A bivariate plot...

Figure 5.15 Some examples of...

Figure 5.16 Model 3: Analysis...

Figure 5.17 Pairwise plots for...

Figure 5.18 Plots for various...

Figure 5.19 Box plots of...

Figure 5.20 The fitted regression...

CHAPTER 06

Figure 6.1 Simulations from a...

Figure 6.2 Bi-variate Normal...

Figure 6.3 Bi-variate Normal...

CHAPTER 07

Figure 7.1 The effect of...

Figure 7.2 The effect of...

Figure 7.3 Simulated returns...

CHAPTER 08

Figure 8.2 3-Asset portfolio...

Figure 8.3 3 Assets (including...

Figure 8.4 Capital market line...

Figure 8.5 Using a Lagrange...

CHAPTER 10

Figure 10.1 Simulated solvency positions...

Figure 10.2 Simulated solvency position...

CHAPTER 11

Figure 11.1 The effect of...

Figure 11.2 Simulations (assuming a...

Figure 11.3 High-quality hedge...

CHAPTER 12

Figure 12.1 Funding level as...

Figure 12.2 Comparison of assets...

Figure 12.3 Solvency level evolution...

Figure 12.4 Liability cashflows...

Figure 12.6 Yield curve scenarios...

Figure 12.7 Asset and liability...

CHAPTER 13

Figure 13.1 Two realisations of...

Figure 13.2 Gaussian copulas with...

Figure 13.3 Clayton copula simulation...

Figure 13.4 Realisations of Gumbel...

Figure 13.5 Copula density functions...

Figure 13.6 Mapping from copula...

Figure 13.7 Clayton copula mapped...

Figure 13.8 Simulated data using...

Figure 13.9 Representation of lower...

Figure 13.10 The rectangle inequality...

Figure 13.11 Calculation of Kendall..u

CHAPTER 14

Figure 14.1 UK and US..a

Figure 14.2 Comparing data with...

Figure 14.3 Comparison of simulated...

Figure 14.4 Comparison of simulations...

CHAPTER 15

Figure 15.1 Yield curves...

CHAPTER 16

Figure 16.1 2-state Markov...

Figure 16.2 Survival probability with...

Figure 16.3 Mortality following Gompertz...

Figure 16.4 Mortality characteristics of...

Figure 16.5 Mortality rates, survival...

CHAPTER 17

Figure 17.1 Crude rates calculated...

Figure 17.2 Comparing crude rates...

Figure 17.3 Lives aged 50...

Figure 17.4 Comparing the Gompertz...

Figure 17.5 Chosen basis splines...

Figure 17.6 B(60) values...

Figure 17.7 Two knots at..0

Figure 17.8 Pronounced accident hump...

CHAPTER 18

Figure 18.1 Standard table vs..e

Figure 18.2 Graduation using a..e

Figure 18.3 2nd differences: Assessing..g

CHAPTER 19

Figure 19.1 Changes in UK...

Figure 19.2 Fitted rates using...

Figure 19.3 HMD and L...

Figure 19.4 Fitted rates using...

Figure 19.5 k parameter values...

Figure 19.6 Projected k’...

Figure 19.7 Projection of rates...

Figure 19.8 Comparison of 2018...

Figure 19.9 Comparison of actual...

Figure 19.10 Changes in US...

Figure 19.11 k vector analysis...

CHAPTER 20

Figure 20.1 K-M survival...

Figure 20.2 K-M plot...

Figure 20.3 Survival Distributions using...

CHAPTER 21

Figure 21.1 Non-constant ratio..s

Figure 21.2 Loglikelihood function value..

CHAPTER 22

Figure 22.1 2 examples of...

Figure 22.2 Weather forecast using...

Figure 22.3 NCD system...

Figure 22.5 Probabilities of sleeping...

Figure 22.6 Healthy - sick - dead...

Figure 22.7 Solutions to H...

Figure 22.8 Solution to 2...

Figure 22.9 SIRD model...

Figure 22.11 Probabilities from models...

Figure 22.12 Multiple decrement model...

CHAPTER 23

Figure 23.1 Simulated Future Lifetimes...

CHAPTER 24

Figure 24.1 Probability distributions of...

Figure 24.2 Portfolios (100 policies...

Figure 24.3 Probability distribution of...

Figure 24.4 Single whole life...

Figure 24.5 Present value of...

Figure 24.6 Present value of...

Figure 24.7 Present value of...

Figure 24.8 Simulated profits from...

Figure 24.9 Present value of...

Figure 24.10 Evolution of reserves...

Figure 24.11 Evolution of reserves...

Figure 24.12 Evolution of Thiele...

Figure 24.13 Profit vectors with...

Figure 24.14 Distribution of profits...

Figure 24.15 Unit-linked fund...

CHAPTER 25

Figure 25.1 Simulations from Compound...

Figure 25.2 Effect on capital...

Figure 25.3 Skewed claims distribution...

Figure 25.4 Sum of independent...

Figure 25.5 Sum of Compound...

Figure 25.6 Panjer vs Normal...

Figure 25.7 Q-Q Plot...

CHAPTER 26

Figure 26.1 Claim amounts data...

Figure 26.2 Comparing total monthly...

Figure 26.3 QQ plot: comparing...

CHAPTER 27

Figure 27.1 Fitting of GLM...

Figure 27.2 Comparing rates using...

Figure 27.3 With and without...

CHAPTER 28

Figure 28.1 Cumulative distribution function...

Figure 28.2 Probability of exceeding...

Figure 28.3 Visual checks on...

Figure 28.4 Daily equity returns...

Figure 28.5 Analysis of daily...

CHAPTER 29

Figure 29.1 kNN analysis...

CHAPTER 30

Figure 30.1 Daily exchange rate...

Figure 30.2 Plots of the...

Figure 30.3 Some time series...

Figure 30.4 Realisations of AR1...

Figure 30.5 A few more...

Figure 30.6 Some Random Walk...

Figure 30.7 ACF and PACF...

Figure 30.8 Standard output of...

Figure 30.9 Theoretical ACF and...

Figure 30.10 A random walk...

Figure 30.11 output of x3...

Figure 30.12 output of model...

Figure 30.13 output of model...

Figure 30.14 Forecasting values of...

Figure 30.15 Forecasting values and...

Figure 30.16 Decomposition of the...

Figure 30.17 Holt-Winters algorithm...

Figure 30.18 A set of...

Figure 30.19 output of ...

CHAPTER 31

Figure 31.1 US equity returns...

Figure 31.2 Fat tails...

Figure 31.4 Histogram of GARCH...

Figure 31.5 ACFs...

Figure 31.7 Regression to the...

CHAPTER 32

Figure 32.1 Simulated discrete Gaussian...

Figure 32.2 Two Simulated discrete...

Figure 32.3 Simulated BM with...

Figure 32.4 Simulated Share prices...

Figure 32.5 Share price distributions...

Figure 32.6 Five simulated share...

Figure 32.7 Evolution of the...

CHAPTER 33

Figure 33.1 Call Option Price...

Figure 33.2 Option price as...

Figure 33.3 3-dimensional plot...

Figure 33.4 Simulated share price...

Figure 33.5 Distribution of prices...

Figure 33.6 Distribution of profit...

Figure 33.7 Delta hedging: Portfolio...

About the Companion Website

This book is accompanied by a companion website:

www.wiley.com/go/rprogramming

The website contains much of the R code used in this book, allowing copying of the suggested code, thus saving the reader significant time.

The website also includes numerous data files, such as investment data and mortality data. These data files will be analyzed using R code in several chapters of the book.

Introduction

1 Main Objectives of This Book

The overriding objective of this book is to help students of actuarial mathematics, and of related disciplines such as financial mathematics, to develop programming skills which will enhance their understanding of actuarial, financial, and statistical concepts, enabling them to solve real-world problems encountered in these fields. Breaking this down further, the purpose of the book is two-fold:

To provide an introduction to the programming language, R. This is achieved using worked examples and undertaking exercises commonly seen in the fields of actuarial and financial mathematics.

Secondly, to improve the reader’s level of understanding of actuarial and financial topics by using these programming skills. We believe that most students can develop a deeper understanding of mathematical material by solving problems using a programming language. From our experience of teaching actuarial mathematics and statistics, students often confirm that their understanding of a topic has vastly improved following the completion of a computer-based exercise or project.

A similar effect is noted in students who opt to take a year out from their studies to work in the financial industry, often applying extensive programming skills to solve real-world problems. Such students invariably notice a similar level of improvement in their understanding of concepts. It is hoped, to some extent, that this learning experience can be mirrored throughout this book.

The authors have significant teaching experience at both undergraduate and postgraduate levels, enhanced with experience in assessment processes for universities and the actuarial profession. This has given insights into the typical issues students experience with actuarial mathematics – problems often arise from a fundamental misunderstanding of introductory material. For example, a final year undergraduate may only fully understand a concept introduced in their first year whilst undertaking programming coursework in their final year on a specific application of the material taught in the first year, experiencing that “Eureka” moment.

The reader should not underestimate the extent to which learning a programming language, such as R, to a level such that most exercises in this book can be completed, will help the reader in the employment market. Having a good working knowledge of R or a similar language should improve the career prospects of the graduate.

A further motivating factor for writing this book originates from the decision taken by the Institute and Faculty of Actuaries (IFoA) in 2018 to choose the R programming language as an integral part of its syllabus. Indeed, much of the IFoA’s syllabi for subjects CM1, CM2, CS1, and, in particular, CS2 are covered in the book.

2 Who Is This Book For?

This book is aimed at two main groups:

It has been written principally for university level actuarial and financial maths students, together with graduates undertaking professional actuarial exams (e.g. with the IFoA and SOA), and more generally to anyone aspiring to careers in actuarial mathematics and finance. The book should be useful to the student throughout their studies, whether first-year undergraduate or postgraduate, spanning topics from fundamentals of financial mathematics and Brownian motion, to a variety of mortality models and analysing investment strategies such as asset–liability matching and hedging.

Secondly, we hope the book appeals to more experienced professionals in related disciplines wishing to develop skills in a programming language, who may have had limited opportunities to do so earlier in their career. By undertaking examples and exercises related to material with which they are already familiar, this book provides an efficient journey to acquiring such programming skills. Such users of this book may therefore wish to review Chapters 3, 4, 23, and 25, which include traditional material most actuaries would be familiar with.

In writing this book, we have attempted to cater for a wide range of experiences and abilities. The overall style of the book aims to ensure that the basics of each topic are covered, with appropriate text, examples, and exercises, whilst including several more advanced tasks. As noted elsewhere, the reader should aim at expanding on the tasks included in this book.

It is assumed that the reader will have a knowledge of statistics and mathematics at a level expected from that of a first year undergraduate in a maths-based university degree.

The book includes the majority of topics covered in a typical undergraduate course in actuarial science. There is also perhaps a greater emphasis placed on a number of actuarial concepts which may not directly be assessed in traditional university courses; indeed, several examples involve addressing practical problems which the student will see in the workplace. For example, we introduce models which may help in improving how correlations are dealt with by insurance companies, and develop an understanding of fundamental risk management techniques such as hedging, asset-liability matching, and diversification. Ultimately, we hope the reader develops a good understanding of the problem-solving approaches used in the workplace.

3 How to Use This Book

To get the most from this book it is anticipated that during each study session the user will simultaneously:

study the material in this book,

access the book’s website (code and data), and

write and run code on their computer.

It is expected that the user will proceed to write their own code and duplicate the results. The suggested code for each example/exercise is one of many possible solutions; it may be quite reasonable, depending on the scenario, for your code to be quite different to that set out in this book. It is important that the user practises writing their own, independent code, and does not try to learn, by rote, the code in the book. As noted in Chapter 1, the reader may wish to save a script file in respect of each chapter. Indeed, the reader may wish to write functions incorporating and combining several sections of code from the website, improving the efficiency of their code.

We would expect most users to have had some prior exposure to, and knowledge of, the material in a chapter before embarking on it, either following an initial period of independent study, or attendance at related university lectures or tutorials; it is anticipated that readers will have access to alternative study material for each topic.

The website contains the majority of the R code included in the book, together with suggested code relating to the exercises. It is intended that the reader will treat the book and website as companions; it is not expected that most users use the book and website separately (for the most part at least). Note that a small amount of code is not included on the website (the missing code can simply be copied from the book) – this is to encourage more active learning of the material.

The vast majority of students will gain most benefit from frequent practice of writing code; occasional engagement is likely to end in less satisfactory results. It is hoped that the style of the book will lend itself to encouraging a greater level of creativity from the student, developing their own examples and exercises as their skills and knowledge increase.

4 Book Structure

We start by covering the fundamentals of R in Chapter 1, “R: What you need to know to get started”, and Chapter 2, “Functions in R”. If you are new to R we recommend that you first read these two chapters, and revisit them when required. Chapter 1 explains the key aspects of R, e.g. writing your first code in R, how objects are used etc. From experience, most students find it beneficial to initially read this chapter relatively quickly, referring back to it frequently. Readers new to R should benefit from spending some time digesting the examples in Chapter 2 to get a feel for writing basic R code and applying existing functions.

The typical actuarial and financial mathematics student is then likely to cover Chapters 3 and 4 – “Financial Mathematics I” (and “II”); the material included in these two chapters is usually covered in the first year of actuarial mathematics programmes at university.

As noted above, we think most readers will benefit from only a relatively brief study of Chapters 1 and 2 before moving on to the main chapters and starting to practise! It is unlikely to be beneficial to spend days memorising the material in these introductory chapters.

Most chapters are largely self-contained, with a few obvious exceptions, e.g. Financial Mathematics I and II, Contingencies I and II, the chapters on copulas, and Markov mortality models. There is a certain amount of grouping of chapters where the material is strongly related, and it is likely that most readers will tend to read a group of topics together.

A number of chapters lend themselves particularly to a relatively brief initial study, subsequently re-visiting them when studying a later chapter which uses that material. For example, the material in Chapters 5 and 6 is applied in several later chapters of the book.

5 Chapter Style

Most chapters begin with setting key objectives and a broad discussion of the main ideas behind the topic of the chapter. This is usually followed with a certain amount of theory, the length of which is based on our experience of how well students generally tend to grasp the concepts. Compared to other texts, there will, in general, be less theory included in this book. Many topics covered in this book already have a wealth of excellent texts – repetition of the same theory is not warranted here. The importance of mathematical rigour should be stressed at this point; the student will benefit greatly over the long-term by developing a deeper understanding of the material which can be adapted to various scenarios (such comparisons are highlighted in several chapters of the book). Each chapter ends with a Recommended Reading list.

As noted in Section 3, most readers will require additional principal reading material on each topic to supplement the material in this book. This book focuses on solving actuarial problems by using the R programming language, and is not intended to be used as a student’s sole source of learning for each subject.

The reader may also find it beneficial to own a copy of the “Formulae and Tables” issued by the Institute and Faculty of Actuaries (2002) (also freely available online at the time of publishing).

6 Examples and Exercises

There are over 400 examples and exercises included in this book which readers can use to develop their programming skills and understanding of the mathematical concepts. The book includes two main types of tasks:

Analysis of datasets (such as claims data, investment data, mortality data etc.), fitting various models to data, and testing the results. You will find these data sets on the book’s website.

Other tasks do not require data sets. Code is used to develop a better understanding of actuarial concepts, often with the use of simulations. The book includes, for the most part, the use of relatively simple code, aimed at communicating the fundamental ideas of the mathematics involved – it is the intention that the reader will develop their coding skills through self-study.

It is hoped that readers will also combine code from various parts of the book, developing their own more advanced models. For example, by combining code from various chapters on asset modelling, claims models, and mortality models, one could develop a model for an insurance company.

Ultimately, actuaries are involved in the management of risk – much of this book relates to measuring risk and uncertainty, and how to manage the risks identified. Indeed, inadequate risk management has contributed to many corporate failures, both on the macro or global level, and also within firms and industries. Many of the examples and exercises aim to develop these analytical skills. We mainly discuss risk in the context of financial risk (such as interest rate risk and market price risk), and demographic risk (such as mortality risk), although many of the principles could be applied to operational-type risks. Most of these discussions will relate to the fields of finance and insurance (both life and non-life).

A word of warning – the material in this book mainly relates to the quantitative management of risk, that is, analysing data and proposing statistical distributions and models to predict financial outcomes. It is important when analysing real-world risk that a qualitative approach is taken alongside such a quantitative approach – the relative weights assigned to the two approaches depending on the particular scenario. A risk in itself is an over-reliance on quantitative financial models, at the expense of any qualitative analysis and exercise of judgement.

The student is likely to benefit from a review of case study material relating to risk management. Study of such cases will provide a more rounded education and knowledge-base of risk management, rather than solely understanding the mathematical approach discussed in this book. Examples of such case studies include: Robert Maxwell and the Mirror Group newspapers, Barings Bank, Equitable Life Assurance Company, Long-Term Capital Management, GFC 2008, Northern Rock, Lehman Brothers, UK pension schemes/LDI crisis 2022, Silicon Valley Bank; and relevant regulations, such as: Basel Accords, Solvency II, The Dodd-Frank Act, The Sarbanes-Oxley Act.

7 Verification of Code and Calculations – Best Practice

A key skill of the actuary is to verify complex calculations efficiently. For example, actuarial valuations of insurance companies and company pension schemes typically involve millions of calculations; clearly it is not sensible to check all of them. The actuary must be able to check calculations in an appropriate, cost-effective manner such that they, and other stakeholders, have sufficient confidence in them and can rely on their accuracy. Errors in these calculations may result in advice which has a significant impact on company balance sheets, solvency levels, profits, amount of additional funding required, dividend payouts, and even future career prospects. We will often provide more than one coded solution to a problem, e.g. by performing an alternative, approximate calculation. This is a skill the authors believe most undergraduates would benefit from improving prior to entering the workplace.
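As a minimal sketch of this kind of cross-check (the interest rate, term, and payment below are illustrative assumptions, not values taken from the book), the present value of a level annuity payable annually in arrears can be calculated in two independent ways and the results compared:

```r
# Illustrative inputs (assumptions chosen for this sketch only)
i <- 0.04        # effective annual interest rate
n <- 20          # term in years
payment <- 1000  # level payment at the end of each year

v <- 1 / (1 + i) # one-year discount factor

# Method 1: closed-form annuity-certain formula, (1 - v^n) / i
pv_formula <- payment * (1 - v^n) / i

# Method 2: discount each cash flow separately and add them up
times <- 1:n
pv_cashflows <- sum(payment * v^times)

# The two answers should agree to within floating-point tolerance
all.equal(pv_formula, pv_cashflows)
```

Agreement between the two methods does not prove that either is right, but a disagreement is a quick and cheap signal that one of them needs investigating.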

8 Website: www.wiley.com/go/rprogramming

The book’s website includes code from each chapter of the book, together with all data files used. It will also, periodically, be updated with extra coding exercises and solutions. We welcome feedback from our readers regarding areas which require further examples.

As referred to earlier, the website is fundamental in using this book efficiently. As well as providing solutions to exercises in the book, it allows copying of the suggested R code thus saving significant time.

9 R or Microsoft Excel?

It is expected that many readers will have some level of experience in using Microsoft Excel. Excel is a fantastic calculation tool. A significant benefit of Excel is its intuitive nature, making it relatively straightforward to learn the basics and quickly reach a reasonable level of competency. Indeed many financial institutions use Excel as a principal piece of software. Programming languages such as R have a significantly steeper learning curve than Excel; most new users, particularly those with no programming language background, will take several days to familiarise themselves with the basic workings of the R language.

In general, Excel is likely to be preferred to R for simpler tasks, or for more involved calculations which are unlikely to require numerous re-runs with various adjustments; the extra cost and time involved in writing R code may not be justified in such cases given the relatively small savings over the long term. In the same way that there are occasions when a calculator is a preferable tool to Excel, there are many occasions when Excel will be preferable to R.

It is important for the reader who has experience with Excel to develop an understanding of whether a programming language such as R will be better suited to solving a particular problem, or set of problems, than Excel. This should be achieved as the reader makes progress with this book. The reader is encouraged to tackle exercises in Excel (where possible) and to compare the process with R. An obvious example is that of the Loan Schedule discussed in Chapter 4 (where the reader is specifically encouraged to reproduce the calculations in Excel); it may well be the case here that using R code is not warranted. A number of calculation schedules involving Life Contingency examples (Chapters 23 and 24) may also prove more user-friendly with Excel. However, as these models become more complex (e.g. incorporating stochastic interest rate models) at some point it is likely that R will become more efficient.

For example, a pension scheme’s valuation calculations may take several minutes to run in Excel compared to a few seconds in R; such calculations may be re-run hundreds of times throughout the analysis and verification process of the valuation, thus benefiting from faster running speeds. A decision is often required therefore regarding programming and running times and related costs when comparing Excel and a programming language such as R.

For a basic level of statistical analysis, Excel may be the preferred choice; however, R will be the preferred route for tasks which require anything more than basic analysis. To understand the benefits of using a programming language such as R requires a certain amount of practice and application; all the above should become clearer with experience.

For the casual data user, Excel is the better choice, given the steeper learning curve required to learn most programming languages such as R. Excel is indeed used in several of our actuarial maths classes to demonstrate small-scale, simplified calculations; however, these calculations often tend not to be particularly realistic, and the same material ultimately provides a better learning and teaching experience when carried out in R. Indeed, with students migrating towards languages like R and Python, a greater proportion of our assessments now involve R programming.

There are tasks which are particularly unsuited to, or simply not possible in, Excel. For example, it is not a simple task to calculate eigenvectors in Excel, but these can be calculated almost instantly with one line of R code; similarly, running large numbers of simulations using complex models, large matrix calculations, complex regression analysis etc. are problematic. Many student projects require the use of R (or a similar language), and are simply not possible using Excel. R has many statistical functions which run significantly quicker than their Excel equivalents, or which are unavailable in Excel. We will see many examples of tasks in this book where using Excel is extremely slow and impractical, such as when running large tasks; Excel also does not deal particularly well with huge datasets, which often cause it to slow down and crash. Tasks which are simple in R, such as analysing millions of rows of data across several databases, can be extremely time-consuming in Excel, and prone to calculation errors. Several R script files (the files which contain the code) can be used to combine various tasks; with Excel the solution is significantly less elegant, and more difficult to verify and audit.
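To illustrate the eigenvector remark, the following short sketch uses the base R function eigen() on a small matrix; the matrix itself is an arbitrary assumption chosen so that the answer is easy to check by hand:

```r
# A small symmetric matrix, chosen purely for illustration
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

# The one line referred to in the text: eigenvalues and eigenvectors together
e <- eigen(A)

e$values   # eigenvalues in decreasing order (here 3 and 1)
e$vectors  # the corresponding eigenvectors, one per column

# Check the defining property A v = lambda v for the first pair;
# every entry of the result should be (numerically) zero
A %*% e$vectors[, 1] - e$values[1] * e$vectors[, 1]
```

The final line is an example of the verification habit discussed in Section 7: the output of a built-in function is checked against the mathematical definition rather than taken on trust.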

We are also frequently required to solve problems numerically, e.g. where an exact solution may not be possible or easy to obtain. Such a numerical approach is usually more suited to R than Excel.
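A typical case is finding the interest rate at which a set of cash flows has zero net present value, for which there is generally no closed-form solution. The sketch below uses the base R root-finder uniroot(); the cash flows and the search interval are hypothetical values assumed for this example only:

```r
# Hypothetical cash flows: pay 100 now, receive 30 at the end of years 1 to 4
cashflows <- c(-100, 30, 30, 30, 30)
times <- 0:4

# Net present value as a function of the effective annual interest rate i
npv <- function(i) sum(cashflows * (1 + i)^(-times))

# npv() is positive at i = 0 and negative at i = 1, so a root lies between;
# uniroot() searches that interval for the rate at which npv() is zero
sol <- uniroot(npv, interval = c(0, 1), tol = 1e-10)

sol$root       # the yield, between 7% and 8% for these cash flows
npv(sol$root)  # should be very close to zero
```

Substituting the root back into npv() is a simple check that the numerical search has converged to a genuine solution.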

The use of R is also likely to reduce the risk of data corruption and other errors being made. Physically manipulating data and formulae (cutting, copying, deleting, pasting) in Excel is generally quick and easy, but not particularly robust. Human error introduces mouse-slips, moving to incorrect cells etc. Such processes may need to be performed several times on many similar data sets; with R we run the same program, requiring no manual interaction with the data. Less experienced users may suggest that taking care when handling data is all that is required; eventually, however, errors will be made. The experienced Excel user will be only too aware of such problems and the potential for calculation disaster. Ultimately, R is likely to be more robust, and less prone to manual errors.
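The idea of running one program over many similar data sets can be sketched as follows. The three data sets are simulated here purely as stand-ins (an assumption; in practice they might be read from files), and the function name summarise_claims is hypothetical:

```r
# Simulated claim amounts standing in for three similar data sets
# (hypothetical data, generated only so that the sketch is self-contained)
set.seed(2023)
datasets <- list(
  portfolio_a = rexp(200, rate = 1 / 1000),
  portfolio_b = rexp(150, rate = 1 / 1500),
  portfolio_c = rexp(300, rate = 1 / 800)
)

# The calculation is defined once, in one place
summarise_claims <- function(x) {
  c(n = length(x), mean = mean(x), q95 = unname(quantile(x, 0.95)))
}

# sapply() applies exactly the same steps to every data set,
# with no copying, pasting, or other manual handling of the data
results <- sapply(datasets, summarise_claims)
round(results, 1)
```

Because the calculation lives in a single function, a correction made there applies to every data set on the next run, which is much harder to guarantee across several hand-edited worksheets.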

A similar comparison could be made with the requirement for database systems. For small-scale data scenarios a spreadsheet may perform the required tasks adequately; similar to the comparison with R, Excel will tend to be initially more user-friendly than a database programme. However, with added complexity a robust, dedicated database system is required. Readers may wish to review the recent high-profile case relating to COVID-19 data held on a UK Government (Public Health England) spreadsheet “database” where a spreadsheet of data reached its maximum size resulting in data errors.

In short, if you are intending to become a serious user of data, or data analyst, learning R or a similar programming language is a priority. At the risk of repeating a message from earlier, it is important for the reader to experience the advantages and disadvantages of R compared to Excel for themselves; please experiment! We would advise the reader to become competent with both tools.

Remark 1.1   For most readers of this book, learning R should really be about the process of learning a programming language; as is the case with a foreign language, once one language has been learnt to a reasonable level the hurdle to learning a second language is somewhat reduced. Thus, a further objective of the reader may be to subsequently learn other programming languages.

10 Caveats

All code, results, and analyses included in this book are provided following significant review by peers and colleagues. However, the authors cannot guarantee their accuracy, and material from this book should not be used without further appropriate checking and review by its subsequent users.

11 Acknowledgements

Much of the inspiration for this book has come from teaching students of actuarial science and statistics, and discussing material with interested students, without which this book would not have been possible. We would like to thank all those students who have indirectly contributed to this book.

We would like to thank all the reviewers of chapters of this book for their important and significant feedback: Shaun Parsley, Kevin Yuen, Dr Eduard Campillo-Funollet, Dhruv Gavde, Dr Daniel Bearup, Dr Pradip Tapadar, Dr James Bentham, Dr Peng Liu, John Millett, Professor Martin Readout, Professor Enkelejd Hashorva and Professor Malcolm Brown.

We would also like to acknowledge those involved with writing and developing the R programming language, and its predecessor, S. This includes all the authors of the packages used in this book of which there are too many to mention.