R Programming for Actuarial Science
Professional resource providing an introduction to R coding for actuarial and financial mathematics applications, with real-life examples
R Programming for Actuarial Science provides a grounding in R programming applied to the mathematical and statistical methods that are of relevance for actuarial work.
In R Programming for Actuarial Science, readers will find:
* Basic theory for each chapter to complement other actuarial textbooks which provide foundational theory in depth.
* Topics covered include compound interest, statistical inference, asset-liability matching, time series, loss distributions, contingencies, mortality models, and option pricing plus many more typically covered in university courses.
* More than 400 coding examples and exercises, most with solutions, to enable students to gain a better understanding of underlying mathematical and statistical principles.
* An overall basic to intermediate level of coverage in respect of numerous actuarial applications, and real-life examples included with every topic.
Providing a highly useful combination of practical discussion and basic theory, R Programming for Actuarial Science is an essential reference for BSc/MSc students in actuarial science, trainee actuaries studying privately, and qualified actuaries with little programming experience, along with undergraduate students studying finance, business, and economics.
Page count: 881
Year of publication: 2023
Peter McQuire University of Kent Canterbury UK
Alfred Kume University of Kent Canterbury UK
This edition first published 2024
© 2024 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Peter McQuire and Alfred Kume to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
A catalogue record for this book is available from the Library of Congress
Hardback ISBN: 9781119754978; ePub ISBN: 9781119754992; ePDF ISBN: 9781119754985; oBook ISBN: 9781119755005
Cover Design: Wiley
Cover Image: © Peter McQuire
Set in 9.5/12.5pt STIXTwoText by Integra Software Services Pvt. Ltd, Pondicherry, India
To my wife, Jenny, and daughter, Lauren, for their constant support and encouragement. (Peter McQuire)
To my wife Ortenca, for her support throughout the process. (Alfred Kume)
Cover
Title Page
Copyright Page
Dedication
About the Companion Website
Introduction
1 Main Objectives of This Book
2 Who Is This Book For?
3 How to Use This Book
4 Book Structure
5 Chapter Style
6 Examples and Exercises
7 Verification of Code and Calculations – Best Practice
8 Website: www.wiley.com/go/rprogramming
9 R or Microsoft Excel?
10 Caveats
11 Acknowledgements
1 R: What You Need to Know to Get Started
1.1 Introduction
1.2 Getting Started: Installation of R and RStudio
1.2.1 Installing R
1.2.2 What Is RStudio?
1.2.3 Inputting R Commands
1.3 Assigning Values
1.4 Help in R
1.5 Data Objects in R
1.6 Vectors
1.6.1 Numeric Vectors
1.6.2 Logical Vectors
1.6.3 Character Vectors
1.6.4 Factor Vectors
1.7 Matrices
1.8 Dataframes
1.9 Lists
1.10 Simple Plots and Histograms
1.11 Packages
1.12 Script Files
1.13 Workspace, Saving Objects, and Miscellany
1.14 Setting Your Working Directory
1.15 Importing and Exporting Data
1.15.1 Importing Data
1.15.2 Exporting Data
1.16 Common Errors Made in Coding
1.17 Next Steps
1.18 Recommended Reading
1.19 Appendix: Coercion
2 Functions in R
2.1 Introduction
2.1.1 Objectives
2.1.2 Core and Package Functions
2.1.3 User-Defined Functions
2.2 An Introduction to Applying Core and Package Functions
2.2.1 Examples of Simple, Common Functions
2.3 User-Defined Functions
2.3.1 What does a “udf” consist of?
2.3.2 Naming Conventions
2.3.3 Examples and Exercises
2.4 Using Loops in R – the “for” Function
2.5 Integral Calculus in R
2.5.1 The “Integrate” Function
2.5.2 Numerical Integration
2.6 Recommended Reading
3 Financial Mathematics (1): Interest Rates and Valuing Cashflows
3.1 Introduction
3.2 The Force of Interest
3.3 Present Value of Future Cashflows
3.4 Instantaneous Forward Rates and Spot Rates
3.5 Non-Constant Force of Interest
3.5.1 Discrete Cashflows
3.5.2 Cashflows Which Are Continuous
3.6 Effective and Nominal Rates of Interest
3.6.1 Effective Rates of Interest
3.6.2 Why Do We Use Effective Rates?
3.6.3 Nominal Interest Rates
3.7 Appendix: Force of Interest – An Analogy with Mortality Rates
3.8 Recommended Reading
4 Financial Mathematics (2): Miscellaneous Examples
4.1 Introduction
4.2 Writing Annuity Functions
4.2.1 Writing a function for an annuity certain
4.3 The ‘presentValue’ Function
4.4 Annuity Function
4.5 Bonds – Pricing and Yield Calculations
4.6 Bond Pricing: Non-Constant Interest Rates
4.7 The Effect of Future Yield Changes on Bond Prices Throughout the Term of the Bond
4.8 Loan Schedules
4.8.1 Introduction
4.8.2 Method 1
4.8.3 Method 2
4.9 Recommended Reading
5 Fundamental Statistics: A Selection of Key Topics - Dr A Kume
5.1 Introduction
5.2 Basic Distributions in Statistics
5.3 Some Useful Functions for Descriptive Statistics
5.3.1 Introduction
5.3.2 Bivariate or Higher Order Data Structure
5.4 Statistical Tests
5.4.1 Exploring for Normality or Any Other Distribution in the Data
5.4.2 Goodness-of-fit Testing for Fitted Distributions to Data
5.4.2.1 Continuous distributions
5.4.2.2 Discrete distributions
5.4.3 T-tests
5.4.3.1 One sample test for the mean
5.4.3.2 Two sample tests for the mean
5.4.4 F-test for Equal Variances
5.5 Main Principles of Maximum Likelihood Estimation
5.5.1 Introduction
5.5.2 MLE of the Exponential Distribution
5.5.2.1 Obtaining the MLE numerically using R
5.5.2.2 Obtaining the MLE analytically
5.5.3 Large Sample (Asymptotic) Properties of MLE
5.5.4 Fitting Distributions to Data in R Using MLE
5.5.5 Likelihood Ratio Test, LRT
5.6 Regression: Basic Principles
5.6.1 Simple Linear Regression
5.6.2 Quantifying Uncertainty on
5.6.3 Analysis of Variance in Regression
5.6.3.1 R² and adjusted R² Coefficient of Determination
5.6.4 Some Visual Diagnostics for the Proposed Simple Regression Model
5.7 Multiple Regression
5.7.1 Introduction
5.7.2 Regression and MLE
5.7.2.1 Multivariate Regression
5.7.3 Tests
5.7.3.1 Likelihood Ratio Test in Regression
5.7.3.2 Akaike Information Criterion: AIC
5.7.3.3 AIC and Regression model selection
5.7.3.4 Bayesian Information Criterion: BIC
5.7.4 Variable Selection, Finding the Most Appropriate Sub-Model
5.7.5 Backward Elimination
5.7.6 Forward Selection
5.7.7 Using AIC/BIC Criteria
5.7.8 LRT in Model Selection
5.7.9 Automatic Search Using R-squared Criteria
5.7.10 Concluding Remarks on Test Data
5.7.11 Modelling Beyond Linearity
5.8 Dummy/Indicator Variable Regression
5.8.1 Introducing Categorical Variables
5.8.2 Continuous and Indicator Variable Predictors – Including Load in the Model
5.9 Recommended Reading
6 Multivariate Distributions, and Sums of Random Variables
6.1 Multivariate Distributions – Examples in Finance
6.2 Simulating Multivariate Normal Variables
6.3 The Summation of a Number of Random Variables
6.4 Conclusion
6.5 Recommended Reading
7 Benefits of Diversification
7.1 Introduction
7.2 Background
7.3 Key Mathematical Ideas
7.4 Running Simulations
7.5 Recommended Reading
8 Modern Portfolio Theory
8.1 Introduction
8.2 2-Asset Portfolio
8.3 3-Asset Portfolio
8.4 Introduction of a Risk-free Asset to the Portfolio
8.4.1 Adding a Risk-free Asset
8.4.2 Capital Market Line and the Sharpe Ratio
8.4.3 Borrowing to Obtain Higher Returns
8.5 Appendix: Lagrange Multiplier Method
8.6 Recommended Reading
9 Duration – A Measure of Interest Rate Sensitivity
9.1 Introduction
9.2 Duration – Definitions and Interpretation
9.3 Duration Function in R
9.4 Practical Applications of Duration
9.5 Recommended Reading
10 Asset-Liability Matching: An Introduction
10.1 Introduction
10.2 What Interest Rates Do Institutions Use To Measure Their Liabilities?
10.3 Variance of the Solvency Position
10.4 Characteristics of Various Asset Classes and Liabilities
10.5 Our Scenarios
10.6 Results
10.7 Simulations
10.8 Exercise and Discussion – an Insurer With Predominately Short-Term Liabilities
10.9 Potential Exercise
10.10 Conclusions
10.11 Recommended Reading
11 Hedging: Protecting Against a Fall in Equity Markets
11.1 Introduction
11.2 Our Example
11.2.1 Futures Contracts – A Brief Explanation
11.2.2 Our Task
11.3 Adopting a Better Hedge
11.4 Allowance for Contract and Portfolio Sizes
11.5 Negative Hedge Ratio
11.6 Parameter and Model Risk
11.7 A Final Reminder on Hedging
11.8 Recommended Reading
12 Immunisation – Redington and Beyond
12.1 Introduction
12.2 Outline of Redington Theory and Alternatives
12.3 Redington’s Theory of Immunisation
12.4 Changes in the Shape of the Yield Curve
12.5 A More Realistic Example
12.5.1 Determining a Suitable Bond Allocation
12.5.2 Change in Yield Curve Shape
12.5.3 Liquidity Risk
12.6 Conclusion
12.7 Recommended Reading
13 Copulas
13.1 Introduction
13.2 Copula Theory – The Basics
13.3 Commonly Used Copulas
13.3.1 The Independent Copula
13.3.2 The Gaussian Copula
13.3.3 Archimedean Copulas
13.3.4 Clayton Copula
13.3.5 Gumbel Copula
13.4 Copula Density Functions
13.5 Mapping from Copula Space to Data Space
13.6 Multi-dimensional Data and Copulas
13.7 Further Insight into the Gaussian Copula: A Non-rigorous View
13.8 The Real Power of Copulas
13.9 General Method of Fitting Distributions and Simulations – A Copula Approach
13.9.1 Fitting the Model
13.9.2 Simulating Data Using the mvdc and rMvdc Functions
13.10 How Non-Gaussian Copulas Can Improve Modelling
13.11 Tail Correlations
13.12 Exercise (Challenging)
13.13 Appendix 1 – Copula Properties
13.14 Appendix 2 – Rank Correlation and Kendall’s Tau, τ
13.15 Recommended Reading
14 Copulas – A Modelling Exercise
14.1 Introduction
14.2 Modelling Future Claims
14.2.1 Data
14.2.2 Fitting Appropriate Marginal Distributions
14.2.3 Fitting The Copula
14.2.4 Assessing Risk From the Analysis of Simulated Values
14.2.5 Comparison with the Gaussian Copula Model
14.2.6 Comparison of the Models with the Data
14.3 Another Example: Banking Regulator
14.4 Conclusion
15 Bond Portfolio Valuation: A Simple Credit Risk Model
15.1 Introduction
15.2 Our Example Bond Portfolio
15.2.1 Description
15.2.2 The Transition Matrix
15.2.3 Correlation Matrix
15.2.4 Simulations and Results
15.2.5 Incorporating Interest Rate Risk – A Simple Adjustment
15.2.6 Portfolio Consisting of Highly Correlated Bonds
15.3 Further Development of this Model
15.4 Recommended Reading
16 The Markov 2-State Mortality Model
16.1 Introduction
16.2 Markov 2-State Model
16.3 Simple Applications of the 2-State Model
16.4 Estimating Mortality Rates from Data
16.5 An Example: Calculating Mortality Rates for One Age Band
16.6 Uncertainty in Our Estimates
16.7 Next Steps?
16.8 Appendices
16.8.1 Informal Discussion of μ
16.8.2 Intuitive meaning of f_x(t)
16.9 Recommended Reading
17 Approaches to Fitting Mortality Models: The Markov 2-state Model and an Introduction to Splines
17.1 Introduction
17.2 Graduation of Mortality Rates
17.3 Fitting Our Data
17.3.1 Objective
17.3.2 Summarised Data
17.4 Model Fitting with Least Squares
17.5 Individual Member Data
17.6 Comparing Life Tables with a Parametric Formula
17.7 Splines: An Introduction
17.7.1 Overview
17.7.2 Data
17.7.3 Fitting the Model: Spline regression
17.7.4 Adjusted Dataset
17.8 Summary
17.9 Recommended Reading
18 Assessing the Suitability of Mortality Models: Statistical Tests
18.1 Introduction
18.2 Theory
18.3 Our Mortality Data and Various Proposed Mortality Rates
18.4 Testing the Standard Table Rates – Table 1,
18.4.1 Data and initial plot
18.4.2 χ² test
18.4.3 Signs Test – for Overall Bias
18.4.4 Serial Correlations Test; Testing for Bias Over Age Ranges
18.4.5 Analysing the Distribution of Deviances
18.4.6 logL, AIC Calculations
18.4.7 Conclusions on
18.5 Graduation of Mortality Rates by Adjusting a Standard Table
18.5.1 Testing Table 2,
18.5.2 Adjusting Table 2
18.6 Testing Graduated Rates Obtained from a Parametric Formula,
18.7 Comparing Our Candidate Rates
18.8 Over-fitting
18.9 Other Thoughts
18.10 Appendix – Alternative Calculations of LogL’s
18.11 Recommended Reading
19 The Lee-Carter Model
19.1 Introduction
19.2 Using the L-C Model to Create Data and Fit the Model
19.2.1 Introducing the Lee-Carter Model
19.2.2 Calculating the Parameter Values
19.2.3 Interpretation of a_x, b_x, and k_t
19.3 Using L-C to Model Actual Mortality Data from HMD
19.4 Using the lca Function in the Demography Package
19.5 Constructing Your Own Demogdata Object
19.6 Forecasting Mortality Rates
19.7 Case Study: The Impact of the HIV Virus on Mortality Rates
19.8 Recommended Reading
20 The Kaplan-Meier Estimator
20.1 Introduction
20.2 What Is Censoring?
20.2.1 Non-Informative Censoring
20.3 Defining the Relevant Event
20.4 K-M Theory
20.5 Introductory Example: Monitoring Delays in Making Claim Payments
20.6 Lung Cancer Example
20.6.1 Basic Results
20.6.2 Comparison of Male and Female Rates
20.6.3 Doctor Assessment Scores – ph.ecog
20.7 Issues with the Kaplan-Meier Model
20.8 Recommended Reading
21 Cox Proportionate Hazards Regression Model
21.1 Introduction
21.2 Cox Model Equation
21.3 Applications
21.3.1 Smokers’ Mortality: Small Data Set
21.3.2 Smokers’ Mortality: Larger Data Set
21.3.3 Multiple covariates and interactions
21.4 Comparison of Cox and Kaplan Meier Analyses of Lung Cancer Data
21.5 Recommended Reading
22 Markov Multiple State Models: Applications to Life Contingencies
22.1 Introduction
22.2 The Markov Property
22.3 Markov Chains and Jump Models
22.3.1 Examples
22.3.2 Differences between Markov Chain and Markov Jump Models
22.4 Markov Chains (Discrete Time)
22.4.1 Applying Markov Chains to Estimate Future Probabilities
22.4.2 Markov Chain Model – NCD
22.4.3 Coding Exercise for Markov Chains
22.5 Markov Jump Models
22.5.1 Example – Simple 3-State Model (All Transitions Possible)
22.5.2 Example – H-S-D Model
22.6 Non-Constant Rates
22.7 Premium Calculations
22.8 Transition Rate Estimation
22.9 Multiple Decrement Models
22.9.1 Introduction
22.9.2 Using a Numerical Approach for the above Fixed Rate Problems
22.9.3 An Exact Approach
22.9.4 Age-Dependent Rates
22.10 Recommended Reading
23 Contingencies I
23.1 Introduction
23.2 What is Meant by “Contingencies” in an Actuarial Context?
23.3 The Life Table
23.4 Expected Present Values of the Key Contingency Functions
23.5 Writing Our Own Code – Some Introductory Exercises
23.6 The Lifecontingencies Package
23.6.1 The Lifetable and Actuarialtable Objects
23.6.2 Application to Actual Mortality Tables: AM92 and AF92
23.6.3 Annuities
23.6.4 Annuities Paid more Frequently than Annually
23.6.5 Increasing Annuities
23.6.6 Reversionary Annuities
23.6.7 Example: Annuity Company Valuation
23.6.8 Life Assurance functions
23.6.9 Assurance Policies with immediate Payment on Death: Ax
23.7 Simulation of Future Lifetimes
23.8 Recommended Reading
24 Contingencies II
24.1 Introduction
24.2 Mortality Tables: AM92
24.3 Uncertainty in Present Values: Variance
24.4 Simulations
24.4.1 Single Policy
24.4.2 Portfolios with 100 Policies – Portfolio Claim Distribution from Simulations
24.5 Simulation of Annuities
24.6 Premium Calculations
24.7 Profits – Probability Distributions of Single Policies and Portfolios
24.8 Progression of expected profits throughout the lifetime of a policy: no reserves held
24.9 Policy Values
24.9.1 Calculating Policy Values
24.9.2 Recursive Formulae – Discrete and Continuous (Thiele)
24.9.3 Recursive Equation with 3 States – HSD Model
24.10 Profits from Policies where Reserves Are Held
24.10.1 Calculating the Profit Vector
24.10.2 Measures of Profit and Profit Testing
24.11 Profit Uncertainty: Interest Rate and Mortality Risk
24.12 Risk Capital and Risk-adjusted Return Measures
24.13 Unit-linked Policies
24.13.1 Introduction
24.13.2 Example with Deterministic and Stochastic Projections
24.14 Additional Exercises
24.15 Appendix: Dependent and Independent Rates
24.16 Recommended Reading
25 Actuarial Risk Theory – An Introduction: Collective and Individual Risk Models
25.1 Introduction
25.2 Collective Risk Model
25.3 Poisson Compound Collective Risk Model
25.4 Applications of the Model
25.4.1 Setting Appropriate Reserves and Premium Pricing
25.4.2 Increasing the Number of Independent Policies
25.4.3 Adopting a Normal Distribution Approximation
25.4.4 Return on Capital
25.4.5 Skewness of the Compound Poisson Model
25.4.6 Sum of Compound Poisson Distributions
25.5 Compound Binomial Collective Risk Model
25.6 Compound Negative Binomial Distribution
25.7 Panjer’s Recursion Formula
25.8 Closing Thoughts on Collective Risks Models
25.9 Individual Risk Model
25.9.1 Standard Individual Risk Model
25.9.2 Alternative Model – ‘The Poisson Individual Risk Model’
25.10 Issues with Heterogeneity
25.11 Policies Which Are Not Independent
25.12 Incorporating Parameter Uncertainty in the Models
25.13 Claim Amount Distributions: Alternatives to the Gamma Distribution
25.14 Conclusions
25.15 Recommended Reading
26 Collective Risk Models: Exercise
26.1 Introduction
26.2 Analysis of Claims Data
26.3 Running Simulations
26.4 Tails of the Distribution
26.5 Allowing for Parameter Uncertainty
26.6 Conclusions
26.7 Recommended Reading
27 Generalised Linear Models: Poisson Regression
27.1 Introduction
27.2 Examples/Exercises/Data
27.3 Brief Recap on Multiple Linear Regression
27.4 Generalised Linear Models (“GLMs”)
27.5 Goodness of Fit of GLMs
27.6 Poisson Regression
27.6.1 Introduction
27.6.2 Using Poisson Regression to Model Claim Numbers
27.7 Data with Varying Exposure Periods
27.7.1 Claim Rates and the Offset
27.7.2 Application to Aggregated Data in Section 27.1
27.8 Categorical and Continuous Variables
27.8.1 Problem with Continuous Variables
27.8.2 Categorical Variables
27.9 Interaction between Variables
27.10 Over-dispersion
27.11 Miscellaneous Exercises
27.12 Further Study / Next Steps
27.13 Recommended Reading
28 Extreme Value Theory
28.1 Introduction
28.2 Why Use EVT?
28.3 Generalised Pareto Distribution – “GPD”
28.4 EVT Analysis of Historic Daily Equity Market Returns (S&P 500)
28.4.1 Basic EVT Analysis
28.4.2 Will a Normal Distribution (and Other Alternatives) Do Just as Well?
28.5 Data for Further EVT Analysis
28.6 Recommended Reading
29 Introduction to Machine Learning: k-Nearest Neighbours (kNN)
29.1 Introduction
29.2 Example 1 – Identifying a Fruit Type
29.2.1 Data
29.2.2 Overview of the Process
29.2.3 How does the kNN Algorithm Work?
29.2.4 Normalising Our Data
29.2.5 Varying k
29.2.6 Using Our Model
29.3 Analysis of Our Model – the Confusion Matrix
29.4 Example 2 – Cancer Diagnoses
29.5 Conclusion
29.6 Recommended Reading
30 Time Series Modelling in R – Dr A Kume
30.1 Introduction
30.2 Linear Regression Versus Autoregressive Model
30.3 Three Components for Time Series Modelling
30.4 Stationarity
30.5 Main Tools in R for ARIMA Modelling
30.5.1 PACF as a Derivation of ACF and Their General Behaviour for ARMA(p,q) Models
30.5.2 How to Simulate and Obtain the Theoretical Values of ACF and PACF for ARMA Models
30.6 Identifying a Set of Possible Models for the Data Including the Order of Differencing
30.6.1 Model Fitting to Time Series Data
30.6.2 Parameter Estimation for Pure Auto-Regressive Models
30.6.3 Diagnostic Plots
30.6.4 Forecasting
30.7 Dealing with Real Data far from Stationary
30.7.1 Non Parametric Approaches
30.7.2 Airline Data Modelling Using Multiplicative Seasonal Models
30.8 Recommended Reading
31 Volatility Models – GARCH
31.1 Introduction
31.2 Why Use GARCH Models?
31.3 Outline of the Chapter
31.4 Key Theoretical Concepts with GARCH
31.5 Simulation of Data Using a GARCH Model
31.6 Fitting a GARCH Model to Data, and Analysis
31.6.1 Fitting a GARCH Model
31.6.2 Further Analysis of the Data; Comparison with the Normal Distribution
31.6.3 Further Analysis of the Data; Volatility Clustering
31.7 A Note on Correlation and Dependency
31.8 GARCH Long-Term Variance
31.9 Exercise: Shocks to Global Equity Markets – The Global Financial Crisis 2008, and COVID-19
31.10 Extensions to the GARCH Model
31.11 Appendix – A Mixture of Normal Distributions
31.12 Recommended Reading
32 Modelling Future Stock Prices Using Geometric Brownian Motion: An Introduction
32.1 Introduction
32.1.1 Discrete Gaussian Random Walk
32.2 Geometric Brownian Motion
32.3 Applications of GBM, and Simulating Prices
32.4 Recommended Reading
33 Financial Options: Pricing, Characteristics, and Strategies
33.1 Introduction
33.2 What is a Financial Option?
33.3 What are Financial Options Used for?
33.4 Black, Scholes and Merton Differential Equation
33.4.1 Assumptions Underlying B-S-M Formulation
33.4.2 Solution to B-S-M Equation for European Call Options
33.4.3 Call Option Price Function
33.5 Calculating the Option Price Using Simulations
33.6 Factors Which Affect the Price of a Call Option
33.6.1 Share Price
33.6.2 Time to Expiry
33.6.3 Combined Effect of Share Price and Time to Expiry
33.6.4 Other Factors
33.7 Greeks
33.8 Volatility of Call Option Positions
33.9 Put Options
33.10 Delta Hedging
33.11 Sketch of the B-S-M Derivation
33.12 Further Tasks
33.13 Appendix
33.14 Recommended Reading
Index
End User License Agreement
CHAPTER 05
Table 5.1 Some distributions and...
Table 5.2 A few entries...
CHAPTER 30
Table 30.1 The derivation of...
Table 30.2 The general behaviour...
CHAPTER 01
Figure 1.1 Downloading R.
Figure 1.2 Downloading R-studio.
Figure 1.3 Starting R-studio, script...
Figure 1.4 Script window is...
Figure 1.5 Histograms : 2,000...
CHAPTER 02
Figure 2.1 Effect of gravit..
Figure 2.2 Plotting a functio..
CHAPTER 03
Figure 3.1 Non-constant forward...
Figure 3.2 Cashflow payments...
CHAPTER 04
Figure 4.1 Effect of interest...
Figure 4.2 Effect of term...
Figure 4.3 Effect on bond...
Figure 4.4 Effect on bond...
Figure 4.5 Simple bond valuation...
Figure 4.6 Effect of payments...
Figure 4.7 Solution from Exercise...
CHAPTER 05
Figure 5.1 Histogram of simulated...
Figure 5.2 Empirical and theoretical...
Figure 5.3 Some realisation of...
Figure 5.4 Normal and t...
Figure 5.5 CLT for sample...
Figure 5.6 Histograms and a...
Figure 5.7 Two options for...
Figure 5.8 Exploring sample quantiles...
Figure 5.9 Exploring sample quantiles...
Figure 5.10 Exploring the Likelihood...
Figure 5.11 Likelihood (and Log...
Figure 5.12 Fitting Exponential and...
Figure 5.13 A bivariate plot...
Figure 5.14 A bivariate plot...
Figure 5.15 Some examples of...
Figure 5.16 Model 3: Analysis...
Figure 5.17 Pairwise plots for...
Figure 5.18 Plots for various...
Figure 5.19 Box plots of...
Figure 5.20 The fitted regression...
CHAPTER 06
Figure 6.1 Simulations from a...
Figure 6.2 Bi-variate Normal...
Figure 6.3 Bi-variate Normal...
CHAPTER 07
Figure 7.1 The effect of...
Figure 7.2 The effect of...
Figure 7.3 Simulated returns...
Figure 8.2 3-Asset portfolio...
Figure 8.3 3 Assets (including...
Figure 8.4 Capital market line...
Figure 8.5 Using a Lagrange...
CHAPTER 10
Figure 10.1 Simulated solvency positions...
Figure 10.2 Simulated solvency position...
CHAPTER 11
Figure 11.1 The effect of...
Figure 11.2 Simulations (assuming a...
Figure 11.3 High-quality hedge...
CHAPTER 12
Figure 12.1 Funding level as...
Figure 12.2 Comparison of assets...
Figure 12.3 Solvency level evolution...
Figure 12.4 Liability cashflows...
Figure 12.6 Yield curve scenarios...
Figure 12.7 Asset and liability...
CHAPTER 13
Figure 13.1 Two realisations of...
Figure 13.2 Gaussian copulas with...
Figure 13.3 Clayton copula simulation...
Figure 13.4 Realisations of Gumbel...
Figure 13.5 Copula density functions...
Figure 13.6 Mapping from copula...
Figure 13.7 Clayton copula mapped...
Figure 13.8 Simulated data using...
Figure 13.9 Representation of lower...
Figure 13.10 The rectangle inequality...
Figure 13.11 Calculation of Kendall..u
CHAPTER 14
Figure 14.1 UK and US..a
Figure 14.2 Comparing data with...
Figure 14.3 Comparison of simulated...
Figure 14.4 Comparison of simulations...
CHAPTER 15
Figure 15.1 Yield curves...
CHAPTER 16
Figure 16.1 2-state Markov...
Figure 16.2 Survival probability with...
Figure 16.3 Mortality following Gompertz...
Figure 16.4 Mortality characteristics of...
Figure 16.5 Mortality rates, survival...
CHAPTER 17
Figure 17.1 Crude rates calculated...
Figure 17.2 Comparing crude rates...
Figure 17.3 Lives aged 50...
Figure 17.4 Comparing the Gompertz...
Figure 17.5 Chosen basis splines...
Figure 17.6 B(60) values...
Figure 17.7 Two knots at..0
Figure 17.8 Pronounced accident hump...
CHAPTER 18
Figure 18.1 Standard table vs..e
Figure 18.2 Graduation using a..e
Figure 18.3 2nd differences: Assessing..g
CHAPTER 19
Figure 19.1 Changes in UK...
Figure 19.2 Fitted rates using...
Figure 19.3 HMD and L...
Figure 19.4 Fitted rates using...
Figure 19.5 k parameter values...
Figure 19.6 Projected k’...
Figure 19.7 Projection of rates...
Figure 19.8 Comparison of 2018...
Figure 19.9 Comparison of actual...
Figure 19.10 Changes in US...
Figure 19.11 k vector analysis...
CHAPTER 20
Figure 20.1 K-M survival...
Figure 20.2 K-M plot...
Figure 20.3 Survival Distributions using...
CHAPTER 21
Figure 21.1 Non-constant ratio..s
Figure 21.2 Loglikelihood function value..
CHAPTER 22
Figure 22.1 2 examples of...
Figure 22.2 Weather forecast using...
Figure 22.3 NCD system...
Figure 22.5 Probabilities of sleeping...
Figure 22.6 Healthy - sick - dead...
Figure 22.7 Solutions to H...
Figure 22.8 Solution to 2...
Figure 22.9 SIRD model...
Figure 22.11 Probabilities from models...
Figure 22.12 Multiple decrement model...
CHAPTER 23
Figure 23.1 Simulated Future Lifetimes...
CHAPTER 24
Figure 24.1 Probability distributions of...
Figure 24.2 Portfolios (100 policies...
Figure 24.3 Probability distribution of...
Figure 24.4 Single whole life...
Figure 24.5 Present value of...
Figure 24.6 Present value of...
Figure 24.7 Present value of...
Figure 24.8 Simulated profits from...
Figure 24.9 Present value of...
Figure 24.10 Evolution of reserves...
Figure 24.11 Evolution of reserves...
Figure 24.12 Evolution of Thiele...
Figure 24.13 Profit vectors with...
Figure 24.14 Distribution of profits...
Figure 24.15 Unit-linked fund...
CHAPTER 25
Figure 25.1 Simulations from Compound...
Figure 25.2 Effect on capital...
Figure 25.3 Skewed claims distribution...
Figure 25.4 Sum of independent...
Figure 25.5 Sum of Compound...
Figure 25.6 Panjer vs Normal...
Figure 25.7 Q-Q Plot...
CHAPTER 26
Figure 26.1 Claim amounts data...
Figure 26.2 Comparing total monthly...
Figure 26.3 QQ plot: comparing...
CHAPTER 27
Figure 27.1 Fitting of GLM...
Figure 27.2 Comparing rates using...
Figure 27.3 With and without...
CHAPTER 28
Figure 28.1 Cumulative distribution function...
Figure 28.2 Probability of exceeding...
Figure 28.3 Visual checks on...
Figure 28.4 Daily equity returns...
Figure 28.5 Analysis of daily...
CHAPTER 29
Figure 29.1 kNN analysis...
CHAPTER 30
Figure 30.1 Daily exchange rate...
Figure 30.2 Plots of the...
Figure 30.3 Some time series...
Figure 30.4 Realisations of AR1...
Figure 30.5 A few more...
Figure 30.6 Some Random Walk...
Figure 30.7 ACF and PACF...
Figure 30.8 Standard output of...
Figure 30.9 Theoretical ACF and...
Figure 30.10 A random walk...
Figure 30.11 output of x3...
Figure 30.12 output of model...
Figure 30.13 output of model...
Figure 30.14 Forecasting values of...
Figure 30.15 Forecasting values and...
Figure 30.16 Decomposition of the...
Figure 30.17 Holt-Winters algorithm...
Figure 30.18 A set of...
Figure 30.19 output of ...
CHAPTER 31
Figure 31.1 US equity returns...
Figure 31.2 Fat tails...
Figure 31.4 Histogram of GARCH...
Figure 31.5 ACFs...
Figure 31.7 Regression to the...
CHAPTER 32
Figure 32.1 Simulated discrete Gaussian...
Figure 32.2 Two Simulated discrete...
Figure 32.3 Simulated BM with...
Figure 32.4 Simulated Share prices...
Figure 32.5 Share price distributions...
Figure 32.6 Five simulated share...
Figure 32.7 Evolution of the...
CHAPTER 33
Figure 33.1 Call Option Price...
Figure 33.2 Option price as...
Figure 33.3 3-dimensional plot...
Figure 33.4 Simulated share price...
Figure 33.5 Distribution of prices...
Figure 33.6 Distribution of profit...
Figure 33.7 Delta hedging: Portfolio...
This book is accompanied by a companion website:
www.wiley.com/go/rprogramming
The website contains much of the R code used in this book, allowing copying of the suggested code, thus saving the reader significant time.
The website also includes numerous data files, such as investment data and mortality data. These data files will be analyzed using R code in several chapters of the book.
The overriding objective of this book is to help students of actuarial mathematics and related disciplines, such as financial mathematics, develop programming skills which will enhance their understanding of actuarial, financial, and statistical concepts, enabling them to solve real-world problems encountered in these fields. Breaking this down further, the purpose of the book is two-fold:
To provide an introduction to the programming language, R. This is achieved using worked examples and undertaking exercises commonly seen in the fields of actuarial and financial mathematics.
Secondly, to improve the reader’s level of understanding of actuarial and financial topics by using these programming skills. We believe that most students can develop a deeper understanding of mathematical material by solving problems using a programming language. From our experience of teaching actuarial mathematics and statistics, students often confirm that their understanding of a topic has vastly improved following the completion of a computer-based exercise or project.
A similar effect is noted in students who opt to take a year out from their studies to work in the financial industry, often applying extensive programming skills to solve real-world problems. Such students invariably notice a similar level of improvement in their understanding of concepts. It is hoped, to some extent, that this learning experience can be mirrored throughout this book.
The authors have significant teaching experience at both undergraduate and postgraduate levels, enhanced with experience in assessment processes for universities and the actuarial profession. This has given insights into the typical issues students experience with actuarial mathematics – problems often arise from a fundamental misunderstanding of introductory material. For example, a final year undergraduate may only fully understand a concept introduced in their first year whilst undertaking programming coursework in their final year on a specific application of the material taught in the first year, experiencing that “Eureka” moment.
The reader should not underestimate the extent to which learning a programming language, such as R, to a level such that most exercises in this book can be completed, will help the reader in the employment market. Having a good working knowledge of R or a similar language should improve the career prospects of the graduate.
A further motivating factor for writing this book originates from the decision taken by the Institute and Faculty of Actuaries (IFoA) in 2018 to choose the R programming language as an integral part of its syllabus. Indeed, much of the IFoA’s syllabi for subjects CM1, CM2, CS1, and, in particular, CS2 are covered in the book.
This book is aimed at two main groups:
It has been written principally for university level actuarial and financial maths students, together with graduates undertaking professional actuarial exams (e.g. with the IFoA and SOA), and more generally to anyone aspiring to careers in actuarial mathematics and finance. The book should be useful to the student throughout their studies, whether first-year undergraduate or postgraduate, spanning topics from fundamentals of financial mathematics and Brownian motion, to a variety of mortality models and analysing investment strategies such as asset–liability matching and hedging.
Secondly, we hope the book appeals to more experienced professionals in related disciplines wishing to develop skills in a programming language, who may have had limited opportunities to do so earlier in their career. By undertaking examples and exercises related to material with which they are already familiar, this book provides an efficient journey to acquiring such programming skills. Such users of this book may therefore wish to review Chapters 3, 4, 23, and 25, which include traditional material most actuaries would be familiar with.
In writing this book, we have attempted to cater for a wide range of experiences and abilities. The overall style of the book aims to ensure that the basics of each topic are covered, with appropriate text, examples, and exercises, whilst including several more advanced tasks. As noted elsewhere, the reader should aim at expanding on the tasks included in this book.
It is assumed that the reader will have a knowledge of statistics and mathematics at a level expected from that of a first year undergraduate in a maths-based university degree.
The book includes the majority of topics covered in a typical undergraduate course in actuarial science. There is also perhaps a greater emphasis placed on a number of actuarial concepts which may not directly be assessed in traditional university courses; indeed, several examples involve addressing practical problems which the student will see in the workplace. For example, we introduce models which may help in improving how correlations are dealt with by insurance companies, and develop an understanding of fundamental risk management techniques such as hedging, asset-liability matching, and diversification. Ultimately, we hope the reader develops a good understanding of the problem-solving approaches used in the workplace.
To get the most from this book it is anticipated that during each study session the user will simultaneously:
study the material in this book,
access the book’s website (code and data), and
write and run code on their computer.
It would be expected that the user proceeds to write their own code and duplicate the results. The suggested code for each example/exercise is one of many possible solutions; it may be quite reasonable, depending on the scenario, for your code to be quite different to that set out in this book. It is important that the user practises writing their own, independent code, and does not try to learn, by rote, the code in the book. As noted in Chapter 1, the reader may wish to save a script file in respect of each chapter. Indeed the reader may wish to write functions incorporating and combining several sections of code from the website, improving the efficiency of their code.
We would expect most users to have had some prior exposure to, and knowledge of, the material in a chapter before embarking on it, either following an initial period of independent study, or attendance at related university lectures or tutorials; it is anticipated that readers will have access to alternative study material for each topic.
The website contains the majority of the R code included in the book, together with suggested code relating to the exercises. It is intended that the reader will treat the book and website as companions; it is not expected that most users use the book and website separately (for the most part at least). Note that a small amount of code is not included on the website (the missing code can simply be copied from the book) – this is to encourage more active learning of the material.
The vast majority of students will gain most benefit from frequent practice of writing code; occasional engagement is likely to end in less satisfactory results. It is hoped that the style of the book will lend itself to encouraging a greater level of creativity from the student, developing their own examples and exercises as their skills and knowledge increase.
We start by covering the fundamentals of R in Chapter 1, “R: What you need to know to get started”, and Chapter 2, “Functions in R”. If you are new to R we recommend that you first read these two chapters, and revisit them when required. Chapter 1 explains the key aspects of R, e.g. writing your first code in R, how objects are used etc. From experience, most students find it beneficial to initially read this chapter relatively quickly, referring back to it frequently. Readers new to R should benefit from spending some time digesting the examples in Chapter 2 to get a feel for writing basic R code and applying existing functions.
The typical actuarial and financial mathematics student is then likely to cover Chapters 3 and 4 – “Financial Mathematics I” (and “II”); the material included in these two chapters is usually covered in the first year of actuarial mathematics programmes at university.
As noted above, we think most readers will benefit from only a relatively brief study of Chapters 1 and 2, and to move onto the main chapters and start practising! It is unlikely to be beneficial to spend days memorising the material in these introductory chapters.
Most chapters are largely self-contained, with a few obvious exceptions, e.g. Financial Mathematics I and II, Contingencies I and II, the chapters on copulas, and Markov mortality models. There is a certain amount of grouping of chapters where the material is strongly related, and it is likely that most readers will tend to read a group of topics together.
A number of chapters lend themselves particularly to a relatively brief initial study, subsequently re-visiting them when studying a later chapter which uses that material. For example, application of the material in Chapters 5 and 6 is used in several later chapters of the book.
Most chapters begin with setting key objectives and a broad discussion of the main ideas behind the topic of the chapter. This is usually followed with a certain amount of theory, the length of which is based on our experience of how well students generally tend to grasp the concepts. Compared to other texts, there will, in general, be less theory included in this book. Many topics covered in this book already have a wealth of excellent texts – repetition of the same theory is not warranted here. The importance of mathematical rigour should be stressed at this point; the student will benefit greatly over the long-term by developing a deeper understanding of the material which can be adapted to various scenarios (such comparisons are highlighted in several chapters of the book). Each chapter ends with a Recommended Reading list.
As noted in Section 3, most readers will require additional principal reading material on each topic to supplement the material in this book. This book focuses on solving actuarial problems by using the R programming language, and is not intended to be used as a student’s sole source of learning for each subject.
The reader may also find it beneficial to own a copy of the “Formulae and Tables” issued by the Institute and Faculty of Actuaries (2002) (also freely available online at the time of publishing).
There are over 400 examples and exercises included in this book which readers can use to develop their programming skills and understanding of the mathematical concepts. The book includes two main types of tasks:
Analysis of datasets (such as claims data, investment data, mortality data etc.), fitting various models to data, and testing the results. You will find these data sets on the book’s website.
Other tasks do not require data sets. Code is used to develop a better understanding of actuarial concepts, often with the use of simulations. The book includes, for the most part, the use of relatively simple code, aimed at communicating the fundamental ideas of the mathematics involved – it is the intention that the reader will develop their coding skills through self-study.
It is hoped that readers will also combine code from various parts of the book, developing their own more advanced models. For example, by combining code from various chapters on asset modelling, claims models, and mortality models, one could develop a model for an insurance company.
Ultimately, actuaries are involved in the management of risk – much of this book relates to measuring risk and uncertainty, and how to manage the risks identified. Indeed, inadequate risk management has contributed to many corporate failures, both on the macro or global level, and also within firms and industries. Many of the examples and exercises aim to develop these analytical skills. We mainly discuss risk in the context of financial risk (such as interest rate risk and market price risk), and demographic risk (such as mortality risk), although many of the principles could be applied to operational-type risks. Most of these discussions will relate to the fields of finance and insurance (both life and non-life).
A word of warning – the material in this book mainly relates to the quantitative management of risk, that is, analysing data and proposing statistical distributions and models to predict financial outcomes. It is important when analysing real-world risk that a qualitative approach is taken alongside such a quantitative approach – the relative weights assigned to the two approaches depending on the particular scenario. A risk in itself is an over-reliance on quantitative financial models, at the expense of any qualitative analysis and exercise of judgement.
The student is likely to benefit from a review of case study material which relates to risk management cases. Study of such cases will provide a more rounded education and knowledge-base of risk management, rather than solely understanding the mathematical approach discussed in this book. Examples of such case studies include: Robert Maxwell and the Mirror Group newspapers, Barings Bank, Equitable Life Assurance Company, Long-Term Capital Management, GFC 2008, Northern Rock, Lehman Brothers, UK pension schemes/LDI crisis 2022, Silicon Valley Bank; and relevant regulations, such as: Basel Accords, Solvency II, The Dodd-Frank Act, The Sarbanes-Oxley Act.
A key skill of the actuary is to verify complex calculations efficiently. For example, actuarial valuations of insurance companies and company pension schemes typically involve millions of calculations; clearly it is not sensible to check all of them. The actuary must be able to check calculations in an appropriate, cost-effective manner such that they, and other stakeholders, have sufficient confidence in them and can rely on their accuracy. Errors in these calculations may result in advice which has a significant impact on company balance sheets, solvency levels, profits, amount of additional funding required, dividend payouts, and even future career prospects. We will often provide more than one coded solution to a problem e.g. by performing an alternative, approximate calculation. This is a skill the authors believe most undergraduates would benefit from improving prior to entering the workplace.
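As a minimal sketch of this kind of cross-check, the R code below values a level annuity payable annually in arrear in two independent ways: with the closed-form formula, and by summing the discounted payments directly. The interest rate and term are hypothetical values chosen purely for illustration, and the object names are our own.

```r
# Hypothetical inputs, chosen only for illustration
i <- 0.05          # effective annual interest rate
n <- 20            # term in years
v <- 1 / (1 + i)   # one-year discount factor

# Method 1: closed-form formula for an annuity payable in arrear
pv_formula <- (1 - v^n) / i

# Method 2: independent check, summing unit payments discounted from times 1..n
pv_sum <- sum(v^(1:n))

# The two methods should agree to within floating-point tolerance
all.equal(pv_formula, pv_sum)
```

If the two figures disagreed, that would indicate an error in at least one of the calculations; agreement does not prove correctness, but it does give some comfort at very little cost.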
The book’s website includes code from each chapter of the book, together with all data files used. It will also, periodically, be updated with extra coding exercises and solutions. We welcome feedback from our readers regarding areas which require further examples.
As referred to earlier, the website is fundamental in using this book efficiently. As well as providing solutions to exercises in the book, it allows copying of the suggested R code thus saving significant time.
It is expected that many readers will have some level of experience in using Microsoft Excel. Excel is a fantastic calculation tool. A significant benefit of Excel is its intuitive nature, making it relatively straightforward to learn the basics and quickly reach a reasonable level of competency. Indeed many financial institutions use Excel as a principal piece of software. Programming languages such as R have a significantly steeper learning curve than Excel; most new users, particularly those with no programming language background, will take several days to familiarise themselves with the basic workings of the R language.
In general, Excel is likely to be preferred to R for simpler tasks, or for more involved calculations which are unlikely to require numerous re-runs with various adjustments; the extra cost and time involved in writing R code may not be justified in such cases given the relatively small savings over the long term. In the same way that there are occasions when a calculator is a preferable tool to Excel, there are many occasions when Excel will be preferable to R.
It is important for the reader who has experience with Excel to develop an understanding of whether a programming language such as R will be better suited to solving a particular problem, or set of problems, than Excel. This should be achieved as the reader makes progress with this book. The reader is encouraged to tackle exercises in Excel (where possible) and to compare the process with R. An obvious example is that of the Loan Schedule discussed in Chapter 4 (where the reader is specifically encouraged to reproduce the calculations in Excel); it may well be the case here that using R code is not warranted. A number of calculation schedules involving Life Contingency examples (Chapters 23 and 24) may also prove more user-friendly with Excel. However, as these models become more complex (e.g. incorporating stochastic interest rate models) at some point it is likely that R will become more efficient.
For example, a pension scheme’s valuation calculations may take several minutes to run in Excel compared to a few seconds in R; such calculations may be re-run hundreds of times throughout the analysis and verification process of the valuation, thus benefiting from faster running speeds. A decision is often required therefore regarding programming and running times and related costs when comparing Excel and a programming language such as R.
For a basic level of statistical analysis, Excel may be the preferred choice; however, R will be the preferred route for tasks which require anything more than basic analysis. To understand the benefits of using a programming language such as R requires a certain amount of practice and application; all the above should become clearer with experience.
For the casual data user, Excel is better, given the steeper learning curve required to learn most programming languages such as R. Excel is indeed used in several of our actuarial maths classes to demonstrate small-scale, simplified calculations; however, these tend not to be particularly realistic, and the same calculations ultimately provide a better learning and teaching experience when carried out in R. Indeed, with students migrating towards languages like R and Python, a greater proportion of our assessments now involve R programming.
There are tasks which are particularly unsuited to, or just not possible in, Excel. For example, it is not a simple task to calculate eigenvectors in Excel, but these can be calculated almost instantly with one line of R code; similarly, running large numbers of simulations using complex models, large matrix calculations, complex regression analysis etc. are problematic. Many student projects require the use of R (or a similar language), and are simply not possible using Excel. R has many statistical functions which run significantly quicker than their Excel equivalents, or which are unavailable in Excel. We will see many examples of tasks in this book for which Excel is extremely slow and impractical, such as running large tasks; Excel also does not deal particularly well with huge datasets, which often cause it to slow down and crash. Tasks that are simple in R, such as analysing millions of rows of data across several databases, can be extremely time-consuming in Excel, and prone to calculation errors. Several R script files (a script file is the file which contains the code) can be used to combine various tasks; with Excel the solution is significantly less elegant, and more difficult to verify and audit.
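To illustrate the eigenvector remark, the short sketch below uses the base R function eigen() on a small symmetric matrix. The matrix itself is made up for illustration; any square numeric matrix could be used in its place.

```r
# A small symmetric matrix, invented purely for illustration
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

# One line of code returns both the eigenvalues and the eigenvectors
e <- eigen(A)

e$values    # eigenvalues, in decreasing order
e$vectors   # eigenvectors, one per column, each scaled to unit length
```

Each column of e$vectors can be checked against its eigenvalue by confirming that A %*% e$vectors[, 1] equals e$values[1] * e$vectors[, 1].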
We are also frequently required to solve problems numerically, e.g. where an exact solution is not possible or not easy to obtain. Such a numerical approach is usually more suited to R than Excel.
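A simple illustration of a numerical solution is finding the yield on a bond, for which no convenient closed-form expression exists. The sketch below uses the base R root-finder uniroot(); the price, coupon, and term are hypothetical figures, and the function name npv is our own.

```r
# Hypothetical bond: price 95, annual coupon 5, redeemed at 100 after 10 years
npv <- function(i) {
  times <- 1:10
  cash  <- c(rep(5, 9), 105)          # coupons, with redemption added at time 10
  sum(cash / (1 + i)^times) - 95      # present value of receipts less price paid
}

# Search for the rate at which the net present value is zero.
# uniroot() needs an interval over which npv() changes sign.
sol <- uniroot(npv, interval = c(0.0001, 1), tol = 1e-10)

sol$root        # the yield, as an effective annual rate
npv(sol$root)   # should be very close to zero
```

Since the bond is priced below its redemption value, the yield found should be a little above the 5% coupon rate, which provides a quick reasonableness check on the result.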
The use of R is also likely to reduce the risk of data corruption and other errors being made. Physically manipulating data and formulae (cutting, copying, deleting, pasting) in Excel is generally quick and easy, but not particularly robust. Human error introduces mouse-slips, moving to incorrect cells etc. Such processes may be required to be performed several times on many similar data sets – with R we run the same program requiring no manual interaction with the data. Less experienced users may suggest that all that is required is care when handling data; eventually, however, errors will be made. The experienced Excel user will be only too aware of such problems and the potential for calculation disaster. Ultimately, R is likely to be more robust, and less prone to manual errors.
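The idea of running the same program over many similar data sets can be sketched as follows. The claims data here are simulated rather than real, and the names summarise_claims and portfolio_a to portfolio_c are hypothetical; in practice the data sets would more likely be read in from files.

```r
# A single function holding the analysis, written once and reused unchanged
summarise_claims <- function(claims) {
  c(n = length(claims), mean = mean(claims), sd = sd(claims))
}

# Simulated claim amounts for three portfolios (illustrative data only)
set.seed(1)
datasets <- list(
  portfolio_a = rexp(1000, rate = 1 / 500),
  portfolio_b = rexp(2000, rate = 1 / 800),
  portfolio_c = rexp(500,  rate = 1 / 300)
)

# Apply the same function to every data set with no manual handling of the data
results <- t(sapply(datasets, summarise_claims))
results
```

Because every data set passes through identical code, a correction made to summarise_claims is automatically applied to all of them on the next run.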
A similar comparison could be made with the requirement for database systems. For small-scale data scenarios a spreadsheet may perform the required tasks adequately; similar to the comparison with R, Excel will tend to be initially more user-friendly than a database programme. However, with added complexity a robust, dedicated database system is required. Readers may wish to review the recent high-profile case relating to COVID-19 data held on a UK Government (Public Health England) spreadsheet “database” where a spreadsheet of data reached its maximum size resulting in data errors.
In short, if you are intending to become a serious user of data, or data analyst, learning R or a similar programming language is a priority. At the risk of repeating a message from earlier, it is important for the reader to experience the advantages and disadvantages of R compared to Excel for themselves – please experiment! We would advise the reader to become competent with both tools.
Remark 1.1 For most readers of this book, learning R should really be about the process of learning a programming language; as is the case with a foreign language, once one language has been learnt to a reasonable level the hurdle to learning a second language is somewhat reduced. Thus, a further objective of the reader may be to subsequently learn other programming languages.
All code, results, and analyses included in this book are provided following significant review by peers and colleagues. However the authors cannot guarantee their accuracy, and material from this book should not be used without further appropriate checking and review by their subsequent users.
Much of the inspiration for this book has come from teaching students of actuarial science and statistics, and discussing material with interested students, without which this book would not have been possible. We would like to thank all those students who have indirectly contributed to this book.
We would like to thank all the reviewers of chapters of this book for their important and significant feedback: Shaun Parsley, Kevin Yuen, Dr Eduard Campillo-Funollet, Dhruv Gavde, Dr Daniel Bearup, Dr Pradip Tapadar, Dr James Bentham, Dr Peng Liu, John Millett, Professor Martin Readout, Professor Enkelejd Hashorva and Professor Malcolm Brown.
We would also like to acknowledge those involved with writing and developing the R programming language, and its predecessor, S. This includes all the authors of the packages used in this book, of whom there are too many to mention.