Data Analysis and Applications 4



Table of Contents

Cover

Preface

PART 1: Financial Data Analysis and Methods

1 Forecasting Methods in Extreme Scenarios and Advanced Data Analytics for Improved Risk Estimation

1.1. Introduction

1.2. The low price effect and correction

1.3. Application

1.4. Conclusion

1.5. Acknowledgements

1.6. References

2 Credit Portfolio Risk Evaluation with Non-Gaussian One-factor Merton Models and its Application to CDO Pricing

2.1. Introduction

2.2. Model and assumptions

2.3. Asymptotic evaluation of credit risk measures

2.4. Data analysis

2.5. Conclusion

2.6. Acknowledgements

2.7. References

3 Towards an Improved Credit Scoring System with Alternative Data: the Greek Case

3.1. Introduction

3.2. Literature review: stages of credit scoring

3.3. Performance definition

3.4. Data description

3.5. Models’ comparison

3.6. Out-of-time and out-of-sample validation

3.7. Conclusion

3.8. References

4 EM Algorithm for Estimating the Parameters of the Multivariate Stable Distribution

4.1. Introduction

4.2. Estimators of maximum likelihood approach

4.3. Quadrature formulas

4.4. Computer modeling

4.5. Conclusion

4.6. References

PART 2: Statistics and Stochastic Data Analysis and Methods

5 Methods for Assessing Critical States of Complex Systems

5.1. Introduction

5.2. Heart rate variability

5.3. Time-series processing methods

5.4. Conclusion

5.5. References

6 Resampling Procedures for a More Reliable Extremal Index Estimation

6.1. Introduction and motivation

6.2. Properties and difficulties of classical estimators

6.3. Resampling procedures in extremal index estimation

6.4. Some overall comments

6.5. Acknowledgements

6.6. References

7 Generalizations of Poisson Process in the Modeling of Random Processes Related to Road Accidents

7.1. Introduction

7.2. Non-homogeneous Poisson process

7.3. Model of the road accident number in Poland

7.4. Non-homogeneous compound Poisson process

7.5. Data analysis

7.6. Anticipation of the accident consequences

7.7. Conclusion

7.8. References

8 Dependability and Performance Analysis for a Two Unit Multi-state System with Imperfect Switch

8.1. Introduction

8.2. Description of the system under maintenance and imperfect switch

8.3. Dependability and performance measures

8.4. Optimal maintenance policy

8.5. Numerical results

8.6. Conclusion and future work

8.7. Appendix

8.8. References

9 Models for Time Series Whose Trend Has Local Maximum and Minimum Values

9.1. Introduction

9.2. Models

9.3. Simulation

9.4. Estimation of the piecewise linear trend

9.5. Conclusion

9.6. References

10 How to Model the Covariance Structure in a Spatial Framework: Variogram or Correlation Function?

10.1. Introduction

10.2. Universal Krige setup

10.3. The variogram matrix

10.4. Inverse variogram matrix Γ⁻¹

10.5. Projecting on span (1)

10.6. Elliptope

10.7. Conclusion

10.8. Acknowledgements

10.9. References

11 Comparison of Stochastic Processes

11.1. Introduction

11.2. Preliminaries

11.3. Application to linguistic data

11.4. Conclusion

11.5. References

PART 3: Demographic Methods and Data Analysis

12 Conjoint Analysis of Gross Annual Salary Re-evaluation: Evidence from Lombardy ELECTUS Data

12.1. Introduction

12.2. Methodology

12.3. Application and results

12.4. Conclusion

12.5. References

13 Methodology for an Optimum Health Expenditure Allocation

13.1. Introduction

13.2. The Greek case

13.3. The basic table for calculations

13.4. The health expenditure in hospitals

13.5. Conclusion

13.6. References

14 Probabilistic Models for Clinical Pathways: The Case of Chronic Patients

14.1. Introduction

14.2. Models and clinical practice

14.3. The Markov models in medical diagnoses

14.4. Conclusion

14.5. References

15 On Clustering Techniques for Multivariate Demographic Health Data

15.1. Introduction

15.2. Literature review

15.3. Classification characteristics

15.4. Data analysis

15.5. Conclusion

15.6. References

16 Tobacco-related Mortality in Greece: The Effect of Malignant Neoplasms, Circulatory and Respiratory Diseases, 1994–2016

16.1. Introduction

16.2. Data and methods

16.3. Results

16.4. Discussion and conclusion

16.5. References

List of Authors

Index

End User License Agreement


Big Data, Artificial Intelligence and Data Analysis Set

coordinated by

Jacques Janssen

Volume 6

Data Analysis and Applications 4

Financial Data Analysis and Methods

Edited by

Andreas Makrides

Alex Karagrigoriou

Christos H. Skiadas

First published 2020 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address.

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK

www.iste.co.uk

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.wiley.com

© ISTE Ltd 2020

The rights of Andreas Makrides, Alex Karagrigoriou and Christos H. Skiadas to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2020930629

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-78630-624-1

Preface

Thanks to the important work of the authors and contributors, we have developed this collective volume on “Data Analysis and Applications: Computational, Classification, Financial, Statistical and Stochastic Methods”.

Data analysis as an area of importance has grown exponentially, especially during the past couple of decades. This can be attributed to a rapidly growing computer industry and the wide applicability of computational techniques, in conjunction with new advances in analytic tools. This being the case, the need for literature that addresses these developments is self-evident. New publications appear, as printed books or e-books, covering the need for information from all fields of science and engineering, thanks to the wide applicability of data analysis and statistics packages.

The book is a collective work by a number of leading scientists, analysts, engineers, mathematicians and statisticians who have been working at the forefront of data analysis. The chapters included in this collective volume represent a cross-section of current concerns and research interests in the above-mentioned scientific areas. This volume is divided into three parts, with a total of 16 chapters, so as to provide the reader with both theoretical and applied information on data analysis methods, models and techniques, along with appropriate applications.

Part 1 focuses on Financial Data Analysis and Methods and contains four chapters on “Forecasting Methods in Extreme Scenarios and Advanced Data Analytics for Improved Risk Estimation” by George-Jason Siouris, Despoina Skilogianni and Alex Karagrigoriou, “Credit Portfolio Risk Evaluation with non-Gaussian One-factor Merton Models and its Application to CDO Pricing” by Takuya Fujii and Takayuki Shiohama, “Towards an Improved Credit Scoring System with Alternative Data: the Greek Case” by Panagiota Giannouli and Christos E. Kountzakis and “EM Algorithm for Estimating the Parameters of the Multivariate Stable Distribution” by Leonidas Sakalauskas and Ingrida Vaiciulyte.

In Part 2, the interest lies in Statistics and Stochastic Data Analysis and Methods which includes seven papers on “Methods for Assessing Critical States of Complex Systems” by Valery Antonov, “Resampling Procedures for a More Reliable Extremal Index Estimation” by Dora Prata Gomes and M. Manuela Neves, “Generalizations of Poisson Process in the Modeling of Random Processes Related to Road Accidents” by Franciszek Grabski, “Dependability and Performance Analysis for a Two Unit Multi-State System with Imperfect Switch” by Vasilis P. Koutras, Sonia Malefaki and Agapios N. Platis, “Models for Time Series Whose Trend has Local Maximum and Minimum Values” by Norio Watanabe, “How to Model the Covariance Structure in a Spatial Framework: Variogram or Correlation Function?” by Giovanni Pistone and Grazia Vicario and “Comparison of Stochastic Processes” by Jesús E. García, Ramin Gholizadeh and Verónica Andrea González-López.

Finally, in Part 3, the interest is directed towards Demographic Methods and Data Analysis which includes five chapters on “Conjoint Analysis of Gross Annual Salary Re-evaluation: Evidence from Lombardy Electus Data” by Paolo Mariani, Andrea Marletta and Mariangela Zenga, “Methodology for an Optimum Health Expenditure Allocation” by George Matalliotakis, “Probabilistic Models for Clinical Pathways: The Case of Chronic Patients” by Stergiani Spyrou, Anatoli Kazektsidou and Panagiotis Bamidis, “On Clustering Techniques for Multivariate Demographic Health Data” by Achilleas Anastasiou, George Mavridoglou, Petros Hatzopoulos and Alex Karagrigoriou and “Tobacco Related Mortality in Greece: The Effect of Malignant Neoplasms, Circulatory and Respiratory Diseases, 1994–2016” by Konstantinos N. Zafeiris.

We wish to thank all the authors for their insights and excellent contributions to this book. We would like to acknowledge the assistance of all involved in the reviewing process of the book, without whose support this could not have been successfully completed. Finally, we wish to express our thanks to the secretariat and, of course, the publishers. It was a great pleasure to work with them in bringing to life this collective volume.

Andreas MAKRIDES

Rouen, France

Alex KARAGRIGORIOU

Samos, Greece

Christos H. SKIADAS

Athens, Greece

January 2020

PART 1Financial Data Analysis and Methods

1Forecasting Methods in Extreme Scenarios and Advanced Data Analytics for Improved Risk Estimation

After extensive investigation of the statistical properties of financial returns, a discrete nature has surfaced whenever the low price effect is present. This is rather logical, since every market operates at a specific accuracy. In order for our models to take this discrete nature of returns into consideration, a discretization of the tail density function is applied. As a result of this discretization process, it is possible to improve the percentage value at risk (PVaR) and expected percentage shortfall (EPS) estimations on which we focus in this work. Finally, in order to evaluate the improvement provided by our proposed methodology, adjusted evaluation measures are presented, capable of evaluating percentile estimations like PVaR. These adjusted evaluation measures are not limited to evaluating percentiles; they apply to any scenario where the data do not bear the same amount of information and, consequently, do not all carry the same degree of importance, as is the case in risk analysis.

1.1. Introduction

The quantification of risk is an important issue in finance that becomes even more important during periods of financial crisis. The estimation of volatility is the main financial characteristic associated with risk analysis and management since, in financial modeling, the view has been consolidated that asset returns are preferred over prices due to their stationarity [MEU 09, SHI 99]. Moreover, it is the volatility of returns that can be successfully forecast, and this is essential for risk management [POO 03, AND 01].

The increase in complexity of both the financial system and the nature of financial risk over the last decades results in models with limited reliability. Hence, under extreme economic events, such models become less reliable and, in some instances, fail completely to measure the underlying risk. Consequently, they produce inaccurate risk measurements, which has a great impact on many financial applications, such as asset allocation and portfolio management in general, derivatives pricing, risk management, economic capital and financial stability (based on the Basel III accords).

To make things even worse, stock exchange markets (and, in general, every financial market) operate with a certain accuracy. In most European markets the accuracy is 0.1 cents (0.001 euros), while US markets operate with 1 cent (USD 0.01) accuracy, except for securities priced below USD 1.00, for which the market accuracy or minimum price variation (MPV) is USD 0.0001. Despite this readjustment for extremely low price assets, the associated fluctuation (variation) is considerable and the corresponding volatility is automatically increased. The phenomenon is magnified considerably in periods of extreme economic events (economic collapses, bankruptcies, depressions, etc.), and as a result the typical market accuracy fails to handle assets of extremely low price smoothly.

However, any attempt to increase the reliability of the model identification procedure for volatility estimation unavoidably results in even more complex models, which will still be unable to identify and fully capture the governing set of rules of the global economy. The more complicated the model we use for volatility estimation, the larger the number of parameters that need to be estimated. Hence, in order to have larger data sets, we go further back in the past to collect older observations for the analysis, which may, however, be less representative of current reality. Lastly, risk models use only the returns of an asset while ignoring the prices.

In order to address all of the above, we discuss, in this chapter, the concept of the low price effect (lpe) [FRI 36] and recommend a low price correction (lpc) for improved forecasts. The low price effect is the increase in variation for stocks with low prices due to the existence of a minimum possible return, produced when the asset price changes by the MPV. The lpe is frequently overlooked, mainly because of the lack of theoretical background on the reasons behind this phenomenon, which surfaces primarily in periods of economic instability or extreme economic events. The novelty of the proposed correction is that it does not require any additional parameters and that it takes the asset price into account. The proposed correction amounts to a rationalization of the estimated asset returns, since each estimate is rounded to the next integer multiple of the minimum possible return. Beyond the proposed correction, in this work we also provide a mathematical reasoning for the increase in volatility.

Inspired by the above, we came to the same conclusion as many before us: in risk analysis, the returns of an asset do not all bear the same amount of information and do not all carry the same degree of significance [SOK 09, ASA 17, ALH 08, ALA 10]. In the absence of a formal mathematical notion, the term asymmetry in the importance describes the above phenomenon relatively satisfactorily; it is also apparent in other scientific areas, related to medical, epidemiological, climatological, geophysical or meteorological phenomena. For a mathematical interpretation, we may consider a proper cost function that takes different values depending on different regions of the dataset or of the entire period of observation. For instance, in biosurveillance, the importance (and the cost) associated with an illness incidence rate is much higher in the so-called epidemic periods associated with an extreme rate increase. In the case of risk analysis, risk measures like the value at risk (VaR) and expected shortfall (ES) concentrate only on the left tail of the distribution of returns, with the attention being paid to the days of violation rather than to those of no violation. Consequently, failures in fitting a model on the right tail are not considered to be important. Thus, the asymmetry in the importance of information is crucial both in choosing the most appropriate model, by assigning a proper weight to each region of the dataset, and in evaluating its forecasting ability.

In order to judge the forecasting quality of the proposed methodology applied to typical risk measurements like VaR and ES, we have appropriately adjusted a number of popular evaluation measures so that they take into account the asymmetry in the importance of the information borne by the data. Since the proposed low price correction is applied to a subset of the dataset with a specific property (in this case, a very low price), it is logical for the adjusted evaluation measures to be computed on the same subset. Thus, the risk estimations can be evaluated with and without the implementation of the low price correction and then compared. A decrease in the values of the adjusted evaluation measures will show the improved forecasting quality of the proposed methodology.

In addition, for the evaluation of the proposed correction, backtesting methods such as the violation ratio (VR) for the value at risk and the normalized shortfall (NS) for the expected shortfall can also be used (see, for example, [BRO 11]). For real-life examples and applications, see [SIO 17, SIO 19a, SIO 19b].

1.2. The low price effect and correction

The rules that govern the accuracy of financial markets around the world fail to smoothly handle assets of extremely low price, resulting in considerable fluctuation (variation) and increased volatility. As expected, the phenomenon is magnified considerably in periods of extreme economic events (economic collapses, bankruptcies, depressions, etc.). Indeed, since all possible (logarithmic) returns on a specific day are integer multiples of the minimum possible return, the lower the price, the more nervously the stock movement will fluctuate. As a result, violations will occur more frequently and forecasts will turn out to be irrational, in the sense that such returns cannot be materialized. The resulting volatility increase is quite often overlooked, primarily because we take into account only the returns of an asset, neglecting the prices entirely.

In order to accommodate different accuracies, we introduce below a broad definition of the minimum possible return.

DEFINITION 1.1.– Let pt be the asset value at time t and c(pt) be the minimum price variation (market accuracy) associated with the value of the asset at time t. Then the minimum possible return (mpr) of an asset at time t, denoted by mprt, is the logarithmic return that the asset will produce if its value changes by c(pt) and is given by:

mprt = log((pt + c(pt))/pt).

Note that mprt is the same for both upward and downward movements due to the symmetry of logarithmic returns. For the special case where a market has a constant accuracy, say c, irrespective of the stock price, Definition 1.1 simplifies to mprt = log((pt + c)/pt) (see [SIO 17]).

EXAMPLE 1.1.– Let us assume that the value of an asset is equal to €0.19 and that the market operates under €0.001 accuracy. Then the minimum possible return for the asset is approximately 0.5%. Moreover, all possible (logarithmic) returns for the asset on this specific day are the integer multiples of the minimum possible return. As a result, the stock movement becomes even more nervous and model failures increase. Consequently, PVaR violations will occur more frequently and our model will almost always derive forecasts that are irrational, since the stock cannot produce such returns.
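As a quick numerical check of Example 1.1, the computation of mprt can be sketched in a few lines of Python (the function name mpr is ours, introduced purely for illustration):

```python
import math

def mpr(price: float, accuracy: float) -> float:
    """Minimum possible return (Definition 1.1): the logarithmic return
    produced when the asset price changes by the market accuracy c(p_t)."""
    return math.log((price + accuracy) / price)

# Example 1.1: an asset worth EUR 0.19 in a market with EUR 0.001 accuracy
print(f"{mpr(0.19, 0.001):.3%}")  # 0.525%, i.e. roughly the 0.5% quoted above
```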

Now that we have defined the mpr, we can provide a strict definition of lpe as well as the mathematical reasoning behind this.

DEFINITION 1.2.– Low price effect (lpe) is the inevitable increase of variance in stocks with low prices due to the existence of a minimum possible return.

Considering that the probability mass will be concentrated on the next integer multiple of the minimum possible return, we have that the volatility, provided that it exists, is increased, since

Var(D(Rt)) = Σ_k (k · mprt − μ)² P(D(Rt) = k · mprt) ≥ ∫ (x − μ)² f(x) dx = Var(Rt),

where D(Rt) denotes Rt rounded to the next integer multiple of mprt, P(D(Rt) = k · mprt) is the probability mass that f(·) assigns to the returns rounded to k · mprt, Rt is the random variable of returns at time t and f(·), μ are the density and the mean of returns, respectively.

Here it must be noted that although Definition 1.2 is in the same spirit, it is slightly different from the original concept introduced by [FRI 36], who observed that low-priced stocks were characterized by higher returns and higher volatility. Such high volatility, whenever observed, was attributed to the increased returns, although the discrete nature is the one to be blamed for both the increased returns and the volatility. According to [CLE 51], the low price effect (or low price anomaly) is attributed to the low quality of stocks as perceived by investors. Most results on the lpe ([CHR 82, DES 97, HWA 08], etc.) deal with the US market, although in the literature we can find some examples from other markets. Such markets include the Warsaw Stock Exchange, WSE [ZAR 14], the Johannesburg Stock Exchange, JSE [GIL 82, WAE 97], and the Athens Stock Exchange, ASE [SIO 17]. A general conclusion that can be drawn from these results is that the decision on the cut-off point for a low-price share may be a subjective one, but the researcher is free to consider and explore various cut-off points in search of an ideal threshold, if one exists. For both practical and theoretical purposes, any value can be adopted, provided that above a predefined threshold the market is assumed or believed to operate not as efficiently as it should. For the purposes of the current work, the value Θ = 0.001, or 0.1%, is arbitrarily chosen to play the role of the pre-assigned threshold or cut-off point.

REMARK 1.1.– It should be noted that in the JSE, the lpe was present for shares priced below 30 cents, but the performance was not equally good for “super-low priced shares” (0–19 cents). For the WSE, the analysis was based on the 30% of stocks with the lowest prices.

DEFINITION 1.3.– Low price effect area is the range of prices for which the mpr is greater than a pre-specified threshold Θ.

It must be noted that the minimum possible return is a direct result of the combination of a low price with the minimum price variation for the stock.

EXAMPLE 1.2.– We present below two examples of the low price effect area, which is easily determined by applying Definition 1.1 for specific values of the accuracy and the threshold.

a) For a stock exchange market which operates under accuracy c = c(pt) = 0.001, and a predefined threshold Θ = 0.001, we get pt ≤ 0.999001. Thus, the low price effect area is given by pt ≤ 0.999, and an appropriate technique to resolve the low price effect should be implemented.

b) Equivalently, for stock exchange markets like the NYSE and NASDAQ, we would have, for the same threshold Θ = 0.001, two different cases: c(pt) = 0.01 for pt ≥ 1, which yields pt ≤ 10 (approximately), and c(pt) = 0.0001 for pt < 1, which yields pt ≤ 0.100.

Thus, for 1 ≤ pt ≤ 10 and for pt ≤ 0.100 the low price effect is present, and proper actions should be taken to minimize it. While in the range of 1–10 dollars numerous real-life examples can be found, the same is not true for prices under 0.1 dollars, in which case the stock has already been halted from the stock market; hence, the lpe is only of theoretical interest there.
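The low price effect area under such a tiered accuracy rule can be sketched as follows; this is a minimal illustration, with the helper names us_accuracy and in_lpe_area ours, and the tier boundaries assumed as described above:

```python
import math

def us_accuracy(price: float) -> float:
    """Tiered minimum price variation assumed above for US markets:
    USD 0.01 for prices of at least USD 1.00 and USD 0.0001 below that."""
    return 0.01 if price >= 1.0 else 0.0001

def in_lpe_area(price: float, accuracy: float, theta: float = 0.001) -> bool:
    """Definition 1.3: the price lies in the low price effect area when
    the minimum possible return exceeds the threshold theta."""
    return math.log((price + accuracy) / price) > theta

for p in (0.05, 0.50, 5.00, 9.00, 50.00):
    print(f"{p:>6.2f}  {in_lpe_area(p, us_accuracy(p))}")
# Prices in roughly [1, 10] and below USD 0.10 flag True, matching Example 1.2 b).
```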

For the resulting low price effect area, the models considered should be appropriately adapted. This adaptation is done through the so-called low price correction of both the estimation of the percentage value at risk, denoted by PVaR, and the expected percentage shortfall, denoted by EPS, introduced respectively in the following two sections.

1.2.1. Percentage value at risk and low price correction

The percentage value at risk, which is a risk measure for estimating the possible percentage losses from trading assets within a set time period, is defined as follows:

DEFINITION 1.4.– a) Percentage value at risk p (PVaR(p)) is the 100pth percentile of the distribution of returns.

b) Percentage value at risk p at time t (PVaRt(p)) is the above-mentioned risk measure at time t.

The probability of a PVaRt(p) violation, namely of a return below the 100p-th percentile of the distribution, is given by:

P(Rt ≤ PVaRt(p)) = ∫_{−∞}^{PVaRt(p)} f(x) dx = p,

where Rt is the random variable of returns at time t, and f(·) is the probability density function of returns.

Since we defined PVaRt(p) based on the return distribution, no additional specification of the equation is needed as to whether simple or logarithmic returns are available. Usually, the computation is done over standardized returns. Thus,

p = P(Rt ≤ PVaRt(p)) = P(Rt/σ ≤ PVaRt(p)/σ) = F(PVaRt(p)/σ),

where the distribution of the standardized returns Rt/σ, with σ the standard deviation of returns, is denoted by F(·). Hence, the PVaR of an asset takes the form:

PVaRt(p) = σ F⁻¹(p),

where F⁻¹(p) is the 100p-th percentile of the assumed distribution.

Note that for the evaluation of PVaRt(p), we may consider any econometric model and then apply an estimation technique for the model parameters. Consider, for instance, the general Asymmetric Power ARCH (APARCH) model [DIN 93]:

Rt = σt εt,

with

σt^δ = a0 + Σ_{i=1}^{p} ai (|Rt−i| − γi Rt−i)^δ + Σ_{j=1}^{q} bj σ_{t−j}^δ,

where Rt is the (logarithmic) return at day t, εt a series of iid random variables, σt² the conditional variance of the model, a0 > 0, ai, bj, δ ≥ 0, i = 1, …, p, j = 1, …, q and γi ∈ [−1, 1], i = 1, …, p.

We observe that for γi = 0, ∀i and δ = 2, the model reduces to the GARCH model [BOL 86], which, in turn, reduces further to the EWMA model for p = q = 1 and for special values of the parameters involved. For the distribution F of the series {εt}, we consider in this work (see section 1.3) the normal, the Student-t and the skewed Student-t distribution [LAM 01]. Based on the available data, the estimator σ̂t of the conditional standard deviation σt is obtained by numerical maximization of the log-likelihood according to the distribution chosen. If, without loss of generality, we assume that the mean of the conditional distribution of the (logarithmic) returns is zero (otherwise the mean-corrected logarithmic returns could be used), then the estimator PVaR̂t(p) of the PVaR at day t is obtained as follows (see, for instance, [BRA 16]):

PVaR̂t(p) = σ̂t qp(F),

where qp(F) is the 100p-th percentile of the assumed conditional distribution F of εt.

In case the standardized returns are generated from a Student-t distribution with ν degrees of freedom, the variance is equal to ν/(ν − 2) and hence is never equal to 1. If that sample variance were used in the calculation of the PVaR, the PVaR would be overestimated. Volatility effectively shows up twice, both in F⁻¹(p) and in the estimation of σ, which is obtained by numerical maximization of the log-likelihood. Hence, we need to scale volatility as follows:

σ̃t = σ̂t / √(ν/(ν − 2)),

where ν/(ν − 2) is the variance, in excess of 1, implied by the standard Student-t.
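As an illustration of this scaling, the following minimal sketch computes a one-day PVaR under a Student-t assumption; for simplicity, the sample standard deviation stands in for a fitted conditional σ̂t from a GARCH/APARCH model, so this is a stand-in rather than the chapter’s full estimation procedure:

```python
import numpy as np
from scipy import stats

def pvar_student_t(returns: np.ndarray, p: float = 0.01, nu: float = 5.0) -> float:
    """One-day PVaR = sigma_hat * q_p(F) for a Student-t F, with the
    volatility scaling discussed above: the raw t quantile is multiplied
    by sqrt((nu - 2) / nu) so that the implied innovation variance is 1
    instead of nu / (nu - 2)."""
    sigma_hat = returns.std(ddof=1)   # stand-in for a fitted conditional sigma_t
    q_p = stats.t.ppf(p, df=nu) * np.sqrt((nu - 2.0) / nu)
    return sigma_hat * q_p

rng = np.random.default_rng(0)
r = 0.02 * rng.standard_t(df=5, size=1_000)  # simulated daily log returns
print(pvar_student_t(r, p=0.01))             # negative: the 1st return percentile
```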

We conclude this section with the definition of the low price correction for the percentage value at risk. A PVaRt(p) estimate can take any real value, not necessarily equal to an integer multiple of mprt. Under the low price effect, asset movements inevitably become more nervous, and any continuous model used would produce, more often than it should, forecasts that are irrational in the sense that the asset cannot produce such returns. To resolve this “inconsistency”, we propose the low price correction, by rounding the PVaRt(p) estimate to the closest legitimate value, namely the next integer multiple of mprt.

DEFINITION 1.5.– Let PVaR̂t(p) be the estimation of the PVaR on day t for a specific asset. The low price correction of the estimation, denoted by PVaRᶜt(p), is given by:

PVaRᶜt(p) = ⌊PVaR̂t(p)/mprt⌋ · mprt,    [1.1]

where ⌊w⌋ is the floor function (integer part) of w.

Note that we prefer to deal with the percentage value at risk for reasons of comparability between assets of the same portfolio with different allocations. We observe that under the low price correction, the market’s accuracy is passed on to the evaluation of the percentage value at risk, resulting in a more reasonable number of violations.
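A direct transcription of equation [1.1], assuming the floor-based rounding of Definition 1.5, might look as follows:

```python
import math

def lpc_pvar(pvar_hat: float, mpr_t: float) -> float:
    """Low price correction of equation [1.1] (floor-based reading):
    round the PVaR estimate down to the next integer multiple of the
    minimum possible return, a value the asset can actually realize."""
    return math.floor(pvar_hat / mpr_t) * mpr_t

# A raw estimate of -1.23% in a market whose mpr is 0.5%:
print(lpc_pvar(-0.0123, 0.005))  # -0.015, i.e. -3 * mpr_t
```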

1.2.2. Expected percentage shortfall and low price correction

Having obtained the PVaRt(p) in the previous section, we now calculate the conditional expectation under PVaRt(p), which is given by the following definition:

DEFINITION 1.6.– The expected percentage loss conditional on PVaRt(p) being violated is defined by:

EPSt = E[Rt | Rt ≤ PVaRt(p)].

We observe that the area under f(·) in the interval (−∞, PVaRt(p)] is less than 1, implying that f(·) restricted to that interval is not a proper density function. This can be resolved by defining the tail density (right-truncated) function fPVaR(·), obtained by truncation on the right, so that the area below this density becomes exactly equal to 1. Thus:

fPVaR(x) = f(x)/p for x ≤ PVaRt(p), and fPVaR(x) = 0 otherwise.

The EPS is then given by:

EPSt = ∫_{−∞}^{PVaRt(p)} x fPVaR(x) dx.

In order to provide the discrete expression of the EPS, we present below the discretization fDPVaR(·) of fPVaR(·). fDPVaR(x) is the probability function of a discrete random variable, having as values all the possible returns (the integer multiples k · mprt) under PVaRᶜt(p), and probabilities given as follows:

fDPVaR(k · mprt) = ∫_{k·mprt}^{(k+1)·mprt} fPVaR(x) dx, for every integer k with k · mprt ≤ PVaRᶜt(p).

Then, the definition of the discretization of EPS follows naturally:

DEFINITION 1.7.– Let EPSt be the estimation of the EPS on day t for a specific asset. The discrete approximation of the estimation EPSt, denoted by DEPSt, is given by:

DEPSt = Σ_{k: k·mprt ≤ PVaRᶜt(p)} (k · mprt) fDPVaR(k · mprt),

where fDPVaR(·) is the discretization of fPVaR(·).

REMARK 1.2.– Even though, in the previous definition, we call DEPSt the discrete approximation of EPSt, the truth is that the nature of f, and by extension of fPVaR, is discrete, since there always exists a minimum possible return. We may treat returns as a continuous random variable when the mpr is extremely small, but the discrete nature of returns still exists.

DEFINITION 1.8.– Let EPSt be the estimation of the EPS on day t for a specific asset. The low price correction of the estimation EPSt, denoted by EPSᶜt, is given by the discrete approximation DEPSt of Definition 1.7, evaluated at the corrected value PVaRᶜt(p).

REMARK 1.3.– For historical simulation, we have that

DEPSt = −Σ_x x fEPVaR(x),

where fEPVaR(·) is the discrete empirical probability function of fPVaR(·) given by

fEPVaR(x) = #{i : Ri = x and Ri ≤ PVaR̂t(p)} / #{i : Ri ≤ PVaR̂t(p)},

where the Ri are the realizations of returns.
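Under this historical-simulation reading of Remark 1.3, the discrete EPS reduces to an empirical tail mean; the following minimal sketch (with our own function name, and the sign convention noted in the comments) illustrates the computation:

```python
import numpy as np

def deps_historical(returns: np.ndarray, pvar_hat: float) -> float:
    """Historical-simulation sketch of Remark 1.3: the empirical tail
    density places equal mass on each of the returns violating PVaR,
    so the discrete EPS reduces to the mean of the violating returns
    (Remark 1.3 reports its negative, i.e. a positive loss figure)."""
    tail = returns[returns <= pvar_hat]
    return -tail.mean() if tail.size else float("nan")

rng = np.random.default_rng(1)
r = 0.01 * rng.standard_normal(500)
pvar_hat = np.quantile(r, 0.01)      # empirical 1st percentile as the PVaR
print(deps_historical(r, pvar_hat))  # positive expected percentage loss
```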

1.2.3. Adjusted evaluation measures

For the evaluation of the performance of the competing models, we will use statistical methods as well as backtesting. Popular evaluation measures used in the literature include the mean square error (MSE), the mean absolute error (MAE) and the mean absolute percent error (MAPE). Since the main interest lies in returns, which mostly (except for a very few extreme cases) take values in (−0.5, 0.5), we prefer the MAE and MAPE, because the square in the MSE would shrink the errors even further. These measures should be appropriately adapted in order to capture the needs of the problem at hand.

Let L1, …, LT be a sample of a time series corresponding to daily logarithmic losses on a trading portfolio, with T the length of the time series and PVaRt the estimation of the PVaR on day t. If on a particular day the logarithmic loss exceeds the PVaR forecast, then the PVaR limit is said to have been violated. For a given PVaRt, we define the indicator ηt as follows:

ηt = 1 if the PVaR limit is violated on day t, and ηt = 0 otherwise.

Under the above setting, it is easily seen that the MSE of the violation days is defined as follows:

MSEv = (1/Σ_{t=1}^{T} ηt) Σ_{t=1}^{T} ηt (Lt − PVaRt)².

We can easily observe that the above is a special weighted mean squared error expression. Indeed, we note that the above can be written as

MSEv = Σ_{t=1}^{T} wt (Lt − PVaRt)²,

where

wt = ηt / Σ_{s=1}^{T} ηs.
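A minimal sketch of such an adjusted measure, here the violation-day MAE rather than the MSE written above, could read as follows (the arrays in the usage example are hypothetical loss-scale forecasts, introduced purely for illustration):

```python
import numpy as np

def violation_mae(losses: np.ndarray, pvar: np.ndarray) -> float:
    """Adjusted MAE: mean absolute error computed over violation days
    only (eta_t = 1), so non-violation days receive zero weight, in the
    spirit of the weighted expression above."""
    eta = losses > pvar                       # indicator of a PVaR violation
    if not eta.any():
        return float("nan")
    return float(np.abs(losses[eta] - pvar[eta]).mean())

losses = np.array([0.012, 0.003, 0.021, -0.004])
pvar_f = np.array([0.010, 0.010, 0.010, 0.010])  # hypothetical forecasts
print(violation_mae(losses, pvar_f))             # averages the two violations
```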

REMARK 1.4.– A more in-depth analysis of the adjusted evaluation measures and their theoretical background can be found in [SIO 19a, SIO 19b], available at <http://actuarweb.aegean.gr/labstada/publications.html>, which due to space limitations cannot be provided here.

1.2.4. Backtesting and the method’s advantages

The comparison between the observed and the expected number of violations provides the primary tool for backtesting, known as the violation ratio [CAM 06]. The VR can be used to evaluate the forecasting ability of the model through its PVaR estimations, while the normalized shortfall (NS) is used for backtesting the EPS estimations.

Note that if the corrected version of the PVaR is used, then PVaRt should be replaced by PVaRᶜt(p) in the indicator ηt of section 1.2.3. A PVaR violation is said to have occurred whenever the indicator ηt is equal to 1.

DEFINITION 1.9.– Let v(T) be the observed number of violations, p the probability of a violation and WT the testing window. Then, the violation ratio VR is defined by:

VR = v(T) / (p · WT).

Intuitively, if the violation ratio is greater than 1, the risk is underforecast, while if it is smaller than 1, the risk is overforecast. Acceptable values for the VR according to the Basel III accords lie in the interval (0.8, 1.2), while values over 1.5 or below 0.5 indicate imperfect modeling (see [DAN 11]).
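A one-line transcription of Definition 1.9, with a hypothetical 1,000-day window for illustration:

```python
def violation_ratio(v_T: int, p: float, W_T: int) -> float:
    """Definition 1.9: observed number of violations over the expected
    number p * W_T in the testing window."""
    return v_T / (p * W_T)

# 12 violations of a 1% PVaR over a 1,000-day window:
print(violation_ratio(12, 0.01, 1_000))  # 1.2, at the edge of the Basel band
```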

It is harder to backtest expected percentage shortfall (EPS) than PVaR because we are testing an expectation rather than a single quantile. Fortunately, there exists a simple methodology for backtesting EPS that is analogous to the use of violation ratios for PVaR.

For each day t on which the PVaR is violated, the normalized shortfall NS is calculated as follows:

NSt = Rt / EPSt,

where EPSt is the EPS forecast for day t. From the definition of the EPS, the expected return, given that the PVaR is violated, is:

E[Rt | Rt ≤ PVaRt(p)] = EPSt.

Therefore, the average NS, denoted by NS̄ and given by:

NS̄ = (1/v(T)) Σ_{t: ηt = 1} NSt,

should be equal to 1, which, in turn, formulates the null hypothesis:

H0 : NS̄ = 1.

With the EPS, we are testing whether the mean of the returns on days when the PVaR is violated is the same as the expected EPS on those days. Clearly, it is much harder to create a formal test to ascertain whether the normalized EPS equals 1 or not. Such a test would have to simultaneously test the accuracy of the PVaR and the expectation beyond the PVaR. This means that the reliability of any EPS backtesting procedure is likely to be much lower than that of PVaR backtesting procedures.
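A minimal sketch of the NS backtest under the above conventions follows; the constant forecast arrays are hypothetical, chosen only to keep the illustration short:

```python
import numpy as np

def average_ns(returns: np.ndarray, pvar: np.ndarray, eps: np.ndarray) -> float:
    """Average normalized shortfall over violation days: values close
    to 1 support H0 that the EPS forecasts match the realized tail mean."""
    viol = returns <= pvar                 # days on which PVaR is violated
    if not viol.any():
        return float("nan")
    return float((returns[viol] / eps[viol]).mean())

rng = np.random.default_rng(2)
r = 0.01 * rng.standard_normal(1_000)
pvar_f = np.full(1_000, np.quantile(r, 0.01))     # constant PVaR forecasts
eps_f = np.full(1_000, r[r <= pvar_f[0]].mean())  # hypothetical EPS forecasts
print(average_ns(r, pvar_f, eps_f))               # equals 1 by construction here
```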

REMARK 1.5.– For a better understanding of the proposed methodology, the reader may refer to the examples for the Athens Exchange (ATHEX) and the American Stock Exchange (NYSE MKT) analyzed and discussed in [SIO 19a, SIO 19b], available at <http://actuarweb.aegean.gr/labstada/publications.html>, which due to space limitations cannot be provided here.

REMARK 1.6.– The importance of the proposed technique lies in the fact that under the low price correction, fewer violations are expected to occur. Note that a VaR estimate can take any real value, not necessarily equal to an integer multiple of the mpr, since it is not controlled by the market’s accuracy. This observation simply implies that any