Geometry Driven Statistics (E-Book)

Description

A timely collection of advanced, original material in the area of statistical methodology motivated by geometric problems, dedicated to the influential work of Kanti V. Mardia

This volume celebrates Kanti V. Mardia's long and influential career in statistics. A common theme unifying much of Mardia’s work is the importance of geometry in statistics, and to highlight the areas emphasized in his research this book brings together 16 contributions from high-profile researchers in the field.

Geometry Driven Statistics covers a wide range of application areas including directional data, shape analysis, spatial data, climate science, fingerprints, image analysis, computer vision and bioinformatics. The book will appeal to statisticians and others with an interest in data motivated by geometric considerations.

Summarizing the state of the art, examining some new developments and presenting a vision for the future, Geometry Driven Statistics will enable the reader to broaden knowledge of important research areas in statistics and gain a new appreciation of the work and influence of Kanti V. Mardia.

Page count: 800

Year of publication: 2015

Table of Contents

Cover

Title Page

Preface

List of Contributors

Part I: Kanti Mardia

Chapter 1: A Conversation with Kanti Mardia

1.1 Family background

1.2 School days

1.3 College life

1.4 Ismail Yusuf College—University of Bombay

1.5 University of Bombay

1.6 A taste of the real world

1.7 Changes in the air

1.8 University of Rajasthan

1.9 Commonwealth scholarship to England

1.10 University of Newcastle

1.11 University of Hull

1.12 Book writing at the University of Hull

1.13 Directional data analysis

1.14 Chair Professorship of Applied Statistics, University of Leeds

1.15 Leeds annual workshops and conferences

1.16 High profile research areas

1.17 Center of Medical Imaging Research (CoMIR)

1.18 Visiting other places

1.19 Collaborators, colleagues and personalities

1.20 Logic, statistics and Jain religion

1.21 Many hobbies

1.22 Immediate family

1.23 Retirement 2000

Acknowledgments

References

Chapter 2: A Conversation with Kanti Mardia: Part II

2.1 Introduction

2.2 Leeds, Oxford, and other affiliations

2.3 Book writing: revising and new ones

2.4 Research: bioinformatics and protein structure

2.5 Research: not necessarily linked directly with bioinformatics

2.6 Organizing centers and conferences

2.7 Memorable conference trips

2.8 A select group of special colleagues

2.9 High honors

2.10 Statistical science: thoughts and predictions

2.11 Immediate family

2.12 Jain thinking

2.13 What the future may hold

Acknowledgment

References

Chapter 3: Selected publications

1. Monographs

2. Edited Volumes

3. Journal Research Papers

4. Articles in Edited Volumes (other than edited by Mardia)

Part II: Directional Data Analysis

Chapter 4: Some advances in constrained inference for ordered circular parameters in oscillatory systems

4.1 Introduction

4.2 Oscillatory data and the problems of interest

4.3 Estimation of angular parameters under order constraint

4.4 Inferences under circular restrictions in von Mises models

4.5 The estimation of a common circular order from multiple experiments

4.6 Application: analysis of cell cycle gene expression data

4.7 Concluding remarks and future research

Acknowledgment

References

Chapter 5: Parametric circular–circular regression and diagnostic analysis

5.1 Introduction

5.2 Review of models

5.3 Parameter estimation and inference

5.4 Diagnostic analysis

5.5 Examples

5.6 Discussion

References

Chapter 6: On two-sample tests for circular data based on spacing-frequencies

6.1 Introduction

6.2 Spacing-frequencies tests for circular data

6.3 Rao's spacing-frequencies test for circular data

6.4 Monte Carlo power comparisons

Acknowledgments

References

Chapter 7: Barycentres and hurricane trajectories

7.1 Introduction

7.2 Barycentres

7.3 Hurricanes

7.4 Using k-means and non-parametric statistics

7.5 Results

7.6 Conclusion

Acknowledgment

References

Part III: Shape Analysis

Chapter 8: Beyond Procrustes: a proposal to save morphometrics for biology

8.1 Introduction

8.2 Analytic preliminaries

8.3 The core maneuver

8.4 Two examples

8.5 Some final thoughts

8.6 Summary

Acknowledgments

References

Chapter 9: Nonparametric data analysis methods in medical imaging

9.1 Introduction

9.2 Shape analysis of the optic nerve head

9.3 Extraction of 3D data from CT scans

9.4 Means on manifolds

9.5 3D size-and-reflection shape manifold

9.6 3D size-and-reflection shape analysis of the human skull

9.7 DTI data analysis

9.8 MRI data analysis of corpus callosum image

Acknowledgments

References

Chapter 10: Some families of distributions on higher shape spaces

10.1 Introduction

10.2 Shape distributions of angular central Gaussian type

10.3 Distributions without reflective symmetry

10.4 A test of reflective symmetry

10.5 Appendix: derivation of normalising constants

References

Chapter 11: Elastic registration and shape analysis of functional objects

11.1 Introduction

11.2 Registration in FDA: phase-amplitude separation

11.3 Elastic shape analysis of curves

11.4 Elastic shape analysis of surfaces

11.5 Metric-based image registration

11.6 Summary and future work

References

Part IV: Spatial, Image and Multivariate Analysis

Chapter 12: Evaluation of diagnostics for hierarchical spatial statistical models

12.1 Introduction

12.2 Example: Sudden Infant Death Syndrome (SIDS) data for North Carolina

12.3 Diagnostics as instruments of discovery

12.4 Evaluation of diagnostics

12.5 Discussion and conclusions

Acknowledgments

References

Chapter 13: Bayesian forecasting using spatiotemporal models with applications to ozone concentration levels in the Eastern United States

13.1 Introduction

13.2 Test data set

13.3 Forecasting methods

13.4 Forecast calibration methods

13.5 Results from a smaller data set

13.6 Analysis of the full Eastern US data set

13.7 Conclusion

References

Chapter 14: Visualisation

14.1 Introduction

14.2 The problem

14.3 A possible solution: self-explanatory visualisations

References

Chapter 15: Fingerprint image analysis: role of orientation patch and ridge structure dictionaries

15.1 Introduction

15.2 Dictionary construction

15.3 Orientation field estimation using orientation patch dictionary

15.4 Latent segmentation and enhancement using ridge structure dictionary

15.5 Conclusions and future work

References

Part V: Bioinformatics

Chapter 16: Do protein structures evolve around ‘anchor’ residues?

16.1 Introduction

16.2 Exploratory data analysis

16.3 Are the anchor residues artefacts?

16.4 Effect of gap-closing method on structure shape

16.5 Alternative to multiple structure alignment

16.6 Discussion

References

Chapter 17: Individualised divergences

17.1 The past: genealogy of divergences and the man of Anekāntavāda

17.2 The present: divergences and profile shape

17.3 The future: challenging data

References

Chapter 18: Proteins, physics and probability kinematics: a Bayesian formulation of the protein folding problem

18.1 Introduction

18.2 Overview of the article

18.3 Probabilistic formulation

18.4 Local and non-local structure

18.5 The local model

18.6 The non-local model

18.7 The formulation of the joint model

18.8 Kullback–Leibler optimality

18.9 Link with statistical potentials

18.10 Conclusions and outlook

Acknowledgments

References

Chapter 19: MAD-Bayes matching and alignment for labelled and unlabelled configurations

19.1 Introduction

19.2 Modelling protein matching and alignment

19.3 Gap priors and related models

19.4 MAD-Bayes

19.5 MAD-Bayes for unlabelled matching and alignment

19.6 Omniparametric optimisation of the objective function

19.7 MAD-Bayes in the sequence-labelled case

19.8 Other kinds of labelling

19.9 Simultaneous alignment of multiple configurations

19.10 Beyond MAD-Bayes to posterior approximation?

19.11 Practical uses of MAD-Bayes approximations

Acknowledgments

References

Index

Wiley Series in Probability and Statistics

End User License Agreement

List of Illustrations

Chapter 1: A Conversation with Kanti Mardia

Figure 1.1 Kanti Mardia on his uncle's lap, Bombay, 1940.

Chapter 4: Some advances in constrained inference for ordered circular parameters in oscillatory systems

Figure 4.1 Advertisement costs (dashed curve), sales of airline tickets (dotted curve) and income revenues (solid curve) in dollars over time.

Figure 4.2 Peak costs plotted on a circle. Angle from 0 to the dashed line for advertisement costs, from 0 to the dotted line for sales of airline tickets and from 0 to the solid line for the income revenues.

Figure 4.3 Unconstrained estimates.

Figure 4.4 Possible constrained estimates with the CIRE appearing at the bottom circle.

Chapter 5: Parametric circular–circular regression and diagnostic analysis

Figure 5.1 Top: plot of the wind direction at 6.00 a.m. and 12.00 a.m. with fitted regression line (a), and histogram of the residuals with the kernel density estimate (b). Bottom: Q-Q plot (c) and P-P plot (d).

Figure 5.2 The continuous curve is the fitted regression model for all observations, where . The dashed fitted regression curves; (a) when omitting the observation 101, (b) when omitting the observation 102.

Chapter 6: On two-sample tests for circular data based on spacing-frequencies

Figure 6.1 GvM densities over with , and (solid line), (dashed line), (dashed-dotted line).

Chapter 7: Barycentres and hurricane trajectories

Figure 7.1 Trails of the 233 hurricanes recorded over the period 2000–2012 in the HURDAT2 data set (darker trajectories indicate high maximum sustained wind speed). The viewpoint of this and similar following images is placed km above the centre of the image, which is therefore distorted at normal viewing distance for all but the most extremely short-sighted. We will consider the longer period 1950–2012 and will restrict attention to hurricanes crossing N and N (drawn as continuous lines in figure), and register hurricanes on their first crossing of N.

Figure 7.2 Sampling points (measured every 6 hours) of the Best Track of hurricane Isaac in 2012. Typical separation between measurement points is about 100 km.

Figure 7.3 Numbers of hurricanes per year in the HURDAT2 data set over 1851–2012 which cross latitudes N and N, together with fitted lowess curve.

Figure 7.4 Numbers of hurricanes per year in the HURDAT2 data set over 1851–2012 which cross latitudes N and N, grouped according to whether they first cross latitude N at far-west, near-west, near-east, or far-east locations as indicated using k-means clustering (with together with fitted lowess curves (except in the far-east case, for which the large majority of years record no hurricanes).

Figure 7.5 Plot of barycentre trajectories arising from the k-means algorithm with , applied to the 1950–2012 data set of hurricanes crossing N and N. Barycentre trajectories are denoted by thick paths (the two outline paths correspond to clusters of just one or two hurricanes). The apparently anomalous trajectory running nearly horizontally near the more eastern end of the collection arises from a single rather long trajectory, whose initial behaviour (including its upcrossing of N) is cropped as part of the process of cropping all hurricane trajectories to the same maximal time-interval.

Figure 7.6 Quantile–quantile plot assessing approximate normality of the distribution of the test statistic (with constructed using k-means clustering with ) based on 1950–2012 hurricanes crossing latitudes N, N.

Figure 7.7 Boxplots of root-mean-square (RMS) average distances of hurricanes from associated barycentres (in kilometres). Boxplot widths are proportional to square roots of sample sizes.

Chapter 8: Beyond Procrustes: a proposal to save morphometrics for biology

Figure 8.1 Deformations of a template (lower left) on an isotropic offset Gaussian model can be sorted into shells that are spherical in Procrustes distance .

Figure 8.2 In an approach far more likely to be useful to the organismal biologist, deformations may be constructed serially from a parcellation of the form into cells involving one new landmark each with isotropic variance that is linearly scaled to the size of its compartment in some sense.

Figure 8.3 Further instances of multiscale models like that in Figure 8.2, over a range of parcellation-scaled variances that increase from left to right.

Figure 8.4 Nontrivial eigenvectors of a quincunx of landmarks, here drawn as parallel displacements in the horizontal direction. They can be drawn in either the Procrustes norm (a) or the bending energy norm (b).

Figure 8.5 The same for a grid of landmarks. The attenuation by the square root of specific bending energy steadily increases from left to right.

Figure 8.6 Ordinary Procrustes shape coordinate scatters for the isotropic offset Gaussian distribution on the indicated grid of 13 landmarks (b) versus the self-similar version (c). The inset (a) indicates the numbering scheme for the landmarks in the next two figures. See text.

Figure 8.7 Ordinary Procrustes nonaffine shape scatters for square subconfigurations from the distributions in Figure 8.6. Upper row, to the isotropic Gaussian offset distribution; lower row, to the bending energy modification recommended in this paper. (a) For a small square in a corner of the full configuration. (b) For the square at the center. (c) For the four corners of the configuration as a whole. Under each panel is printed the variance of the -coordinate of any one of the landmarks plotted.

Figure 8.8 The same for squares rotated to a diamond orientation, at small size (a) or large (b).

Figure 8.9 Squared Procrustes length versus bending energy for an isotropic sample of perturbations of a square grid. The short segments indicate the slab extracted for closer examination in the next figure.

Figure 8.10 Vertical expansion of the slab from Figure 8.9, including an icon for each of the grid perturbations in this region.

Figure 8.11 Grid transformations of extremely low (a) or high (b) bending for the given Procrustes distance . These are the three leftmost and rightmost grids, respectively, in the preceding figure.

Figure 8.12 Toward the large-scale end of the callosal data set, the scaling of variation by eigenvectors of bending energy falls exactly inversely to bending energy, resulting in a spherical distribution after the standardization. Variation of these extended neural structures thus appears to be self-similar in the sense of the text. (a) The original Procrustes scatter, 40 points by 45 cases. Lower row: least-squares estimates of scaling dimension, all partial warps (b) or the 10 of the largest scale only (c).

Figure 8.13 The five partial warp score two-vectors, plus the uniform component, for the 18 Vilmann rat skulls imaged at eight ages each. Each partial warp is shown along the principal component of its own partial warp scores, except that the uniform term (far left column) is displayed separately for its 7-to-30-day (upper) and 30-to-150-day (lower) orientations. Below right: the octagon of landmarks for a typical specimen cut midsagittally (up the middle of the skull, Source: from Bookstein (1991).

Figure 8.14 Summary of the single dimension dominating this covariance structure. (a) The large-scale (quadratic) growth gradient. (b) The first principal component of nonaffine growth, combining this gradient with a spatially focused additional feature at IPP (upper left margin of the configuration).

Figure 8.15 More detailed model of the Vilmann rodent skull. After the nugget correction, the diminution of variance with bending scales with dimension much faster than self-scaling . The local term (fifth partial warp; Figure 8.13, rightmost column) is an entirely separate feature.

Chapter 9: Nonparametric data analysis methods in medical imaging

Figure 9.1 Change in the ONH topography from normal (a) to glaucomatous (b) (Source: Derado et al. 2004, Figure 9.3, p. 1243. Reproduced by permission of Taylor and Francis http://www.tandfonline.com).

Figure 9.3 Optic nerve head region data (Source: Crane and Patrangenaru (2011), Figure 9.2, p. 232. Reproduced by permission of Elsevier).

Figure 9.2 Landmarks on the ONH for HRT outputs (Source: Derado et al. 2004, Figure 9.1, p. 1242. Reproduced by permission of Taylor and Francis http://www.tandfonline.com).

Figure 9.4 Nine anatomical landmarks of the ONH for one stereo image (Source: Crane and Patrangenaru (2011), Figure 9.3, p. 235. Reproduced by permission of Elsevier).

Figure 9.5 Histograms for the bootstrap distributions of for 20,000 resamples (Source: Crane and Patrangenaru (2011), Figure 9.4, p. 236. Reproduced by permission of Elsevier).

Figure 9.6 CT scan of an individual.

Figure 9.7 Select 3D reconstruction results via segmentation.

Figure 9.8 Two groups of landmarks around the eye: (a) and (b) .

Figure 9.9 Bootstrap distribution for the Schoenberg sample mean configurations based on 500 resamples.

Figure 9.10 Bootstrap distribution for the Schoenberg sample mean configurations based on 500 resamples.

Figure 9.11 DTI slice images of a control subject (a) and of a dyslexia subject (b) (Source: Osborne et al. (2013), Figure 9.1, p. 171. Reproduced by permission of Elsevier).

Figure 9.12 Marginals of the bootstrap distribution for the generalized Frobenius sample means for , , , , , and ; clinically normal (light gray) versus dyslexia (dark gray).

Figure 9.13 Bootstrap distribution of our test statistics : The images (1–3) in the first row correspond to the diagonal entries of the matrices : , , and images (4–6) in the second row correspond to the lower triangular off-diagonal entries of the matrices : , , (Source: Osborne et al. (2013), Figure 9.2, p. 172. Reproduced by permission of Elsevier).

Figure 9.14 Bootstrap distribution of our test statistics : The images (1–3) in the first row correspond to the diagonal entries of the matrices : , , and images (4–6) in the second row correspond to the lower off-diagonal entries of the matrices : , , .

Figure 9.15 Right hemisphere of Einstein's brain including CC midsagittal section (a) and its contour (b).

Figure 9.16 Corpus callosum midsagittal sections shape data, in subjects aged 64–83.

Figure 9.17 Matched sampling points on midsagittal sections in for CC data (Einstein's is the upper left CC).

Figure 9.18 Registered icons for 2D direct similarity shapes of CC midsections : sample mean (light gray) versus Albert Einstein's (dark gray).

Figure 9.19 bootstrap confidence region for the extrinsic mean CC contour by 1000 resamples.

Chapter 11: Elastic registration and shape analysis of functional objects

Figure 11.1 Alignment of two functions: align to . The middle panel shows the aligned result.

Figure 11.2 Multiple functions alignment. (a) A set of functions which have different height and peak locations. (b) The aligned result. (c) The optimal warping function 's.

Figure 11.3 Analysis of growth data. (a) The growth data for female subjects. (b) The growth data for male subjects.

Figure 11.4 An illustration of re-parameterization curve in domain .

Figure 11.5 Each row shows an example of geodesic path between the starting and ending shapes under the elastic framework.

Figure 11.6 Mean shapes of four different classes of shapes. Each mean shape (shown in bottom right) is calculated from shapes on its left.

Figure 11.7 Modes of variations: for each class of shapes in mean shape examples (Figure 11.6), we show the variation along the first and second principal modes. Shape in the center is the mean shape.

Figure 11.8 Random samples from the Gaussian shape distribution of different classes of shapes.

Figure 11.9 Each row shows an example of geodesic between a pair of objects (the starting and ending shapes) (Source: Xie et al. 2013, Figure 11.4, p. 870. Reproduced by permission of IEEE).

Figure 11.10 Comparing geodesic to linear interpolation (Source: Xie et al. 2013, Figure 11.5, p. 871. Reproduced by permission of IEEE).

Figure 11.11 Computing mean shape, PC analysis and random samples under a Gaussian model. (a) Some observations of chess piece. (b) The three main principal components. (c) Several randomly sampled chess pieces using a Gaussian model are shown (Source: Xie et al. 2013, Figure 11.9, p. 872. Reproduced by permission of IEEE).

Figure 11.12 Registering synthetic smooth grayscale images. and . , and (Source: adapted from Xie et al. 2014b, Figure 11.3, p. 246. Reproduced by permission of Springer).

Figure 11.13 Two examples of brain MR image registration (each row as an example). First column shows overlapped original images and ; second column shows overlapped images and deformed ; third column shows and deformed (Source: adapted from Xie et al. 2014b, Figure 11.4, p. 247. Reproduced by permission of Springer).

Figure 11.14 Two examples of brain image registration with landmarks. In each experiment, the top row shows the original images and , and in the bottom row, the first column shows the deformed images using only landmarks; the second column shows the final deformed images with as the initial condition; and the last column shows the registered images without involving landmarks (Source: adapted from Xie et al. 2014b, Figure 11.5, p. 248. Reproduced by permission of Springer).

Chapter 12: Evaluation of diagnostics for hierarchical spatial statistical models

Figure 12.1 Map of the 100 counties in North Carolina, showing edges between the counties whose seats are within 30 miles of each other. The counties are numbered according to the alphabetical order of their county name (adapted from Bivand 2014).

Figure 12.2 Local Moran I statistic for the residuals of the null model fitted using the transformed SIDS rates: Positive (i.e., unusually large) values are shaded.

Figure 12.3 Local Getis–Ord statistic for the residuals of the null model fitted using the transformed SIDS rates: Positive (i.e., unusually large) values are shaded.

Figure 12.4 Cross-validation for the null model fitted to the transformed SIDS rates: Positive (i.e., unusually large) values are shaded.

Figure 12.5 DSC curves for the SIDS data, for in (12.17).

Chapter 13: Bayesian forecasting using spatiotemporal models with applications to ozone concentration levels in the Eastern United States

Figure 13.1 A plot of the 639 (62 validation and 577 model fitting) ozone monitoring sites in the Eastern United States.

Figure 13.2 Side-by-side boxplots of the observed daily maximum ozone concentration levels and Eta CMAQ output for 21 days from all 639 sites in the eastern United States.

Figure 13.3 Map of the four states, Ohio, Indiana, Illinois, and Kentucky. A total of 147 ozone monitoring locations are superimposed.

Figure 13.4 Plots of RMSE and MAE based on modeling 7 days data (a and b) and 14 days data (c and d).

Figure 13.5 Sharpness diagram using: (a) 7 days data (b) 14 days data.

Figure 13.6 PIT diagrams for (a) GP, (b) AR, and (c) GPP models using 14 days data for modeling.

Figure 13.7 Marginal calibration plots for all the models using (a) 7 days data (b) 14 days data for modeling.

Figure 13.8 Maps showing the forecasts and their standard deviations for July 8, 9 and 10 in 2010. Observed ozone levels are also superimposed on the forecast maps from a selected number of sites only, to avoid clutter.

Chapter 15: Fingerprint image analysis: role of orientation patch and ridge structure dictionaries

Figure 15.1 Some major milestones in fingerprint recognition.

Figure 15.2 Illustration of fingerprint features at three different levels. (a) A gray-scale fingerprint image (NIST SD30, A002_01), (b) level 1 features: orientation field and singular points (core point shown as a circle and delta point shown as a triangle), (c) level 2 features: ridge ending minutiae (squares) and ridge bifurcation minutiae (circles), and (d) level 3 features: pores and dots.

Figure 15.3 Three types of fingerprint images. (a) Rolled fingerprint (from NIST Special Database 4 2014), (b) plain fingerprint (from FVC2002 2002), and (c) latent fingerprint (Source: adapted from NIST Special Database 27 2014).

Figure 15.4 Examples of (a) rolled-to-rolled fingerprint matching and (b) latent-to-rolled fingerprint matching. Features in the rolled fingerprints shown here are extracted automatically by an AFIS, but features (minutiae, region of interest, and singular points) in the latent were manually marked.

Figure 15.5 Examples of orientation patches in the dictionary; an orientation patch contains orientation elements, and each orientation element corresponds to a block of pixels.

Figure 15.6 Orientation fields extracted from two latent fingerprint impressions ((a) and (b)) estimated using different patch sizes (increasing from left to right: , , , and ). Only the nearest dictionary element of each initial orientation patch is considered here (Source: Feng et al. (2013), Figure 15.8, p. 931. Reproduced by permission of IEEE).

Figure 15.8 Flowchart of the orientation field estimation algorithm, which consists of an off-line dictionary construction stage and an online orientation field estimation stage (Source: Feng et al. (2013), Figure 15.5, p. 929. Reproduced by permission of IEEE).

Figure 15.7 Examples of coarse and fine-level dictionaries. (a) A subset of elements in the coarse-level dictionary, and (b) a subset of elements in the 16 orientation-specific dictionaries. The th row in (b) corresponds to the orientation range , .

Figure 15.9 Examples of latents of different qualities. (a) Good, (b) bad, and (c) ugly.

Figure 15.10 CMC curves of three orientation field estimation algorithms and the manual markup of orientation field on the NIST SD27 latent database: (a) all (258 latents), (b) good quality (88 latents), (c) bad quality (85 latents), and (d) ugly quality (85 latents) (Source: Feng et al. (2013), Figure 15.13, p. 933. Reproduced by permission of IEEE).

Figure 15.11 Enhanced images of three latent fingerprints in (a) using orientation fields estimated by (a) FOMFE, (b) STFT, and (c) the orientation patch dictionary-based algorithm (Source: Feng et al. (2013), Figure 15.14, p. 934. Reproduced by permission of IEEE).

Figure 15.13 Patch reconstruction results (overlaid on orientation field) with different number of dictionary entries, . (a) Texture component of a high-quality fingerprint patch (top), low-quality fingerprint patch (the middle), and non-fingerprint patch (the bottom), (b), (c), (d), and (e) are the reconstruction results when , respectively. The SSIM indices between the given patch (column (a)) and the reconstructed patch with different value of are shown above the reconstructed patches.

Figure 15.12 Overview of latent segmentation and enhancement algorithm based on ridge structure dictionary. The off-line dictionary learning (a and c) and online latent segmentation and enhancement stage (b) are shown (Source: Cao et al. (2014), Figure 15.3, p. 1850. Reproduced by permission of IEEE).

Figure 15.14 Illustration of latent fingerprint segmentation. (a) Gray-scale latent images, (b) texture component images, (c) coarse-level quality maps, (d) fine-level quality maps, and (e) segmentation results shown overlaid on the gray-scale latent images. The top, middle, and bottom latent fingerprints in column (a) are of good, bad, and ugly quality in NIST SD27. The contrast of the middle and bottom latent fingerprints has been adjusted for better visual quality.

Figure 15.15 An example of latent segmentation and enhancement by the proposed algorithm. (a) A latent fingerprint (U286 from NIST SD27); (b) fully automatic segmentation of (a) by the proposed algorithm; (c) the true mate (rolled print) of (a) with the segmentation boundary in (c) outlined on the mate. By feeding the original latent in (a) and the segmented and enhanced latent in (b) into a commercial off-the-shelf (COTS) latent matcher (with a background database of 31,997 reference prints), the mated print is retrieved at ranks 4,152 and 2, respectively.

Figure 15.16 CMC curves of latent fingerprint identification with the COTS latent matcher on (a) NIST SD27 and (b) WVU DB (Source: adapted from Cao et al. (2014), Figure 15.12, p. 1857. Reproduced by permission of IEEE).

Chapter 16: Do protein structures evolve around ‘anchor’ residues?

Figure 16.1 A two-dimensional ball-and-stick model of peptide bond formation between two amino acids. Atoms are represented by circles and bonds are lines between them, where double bonds are indicated by two parallel lines. Nitrogen, Carbon, Oxygen and Hydrogen are represented by ‘N’, ‘C’, ‘O’ and ‘H’, respectively. The unique side chains or ‘R’ groups of the two amino acids are represented by a square. Peptide bonds are formed when the carboxyl group of one amino acid reacts with the amino group of another resulting in the loss of a water molecule, as shown in the lower panel.

Figure 16.2 Ribbon representation of a trypsin molecule (Protein Data Bank (PDB) accession code: 1S5S) displayed with the molecular visualisation software, Jmol. The secondary structures are coloured; dark grey indicates an alpha helix, light grey indicates a beta sheet and the black helix is a 3₁₀ helix, that is, a helix with three residues per turn rather than 3.6.

Figure 16.3 An overview of the MUSTANG algorithm (Source: adapted from Konagurthu et al. 2006, Figure 2, p. 562. Reproduced with permission of John Wiley and Sons).

Figure 16.4 Plot of median aligned residue–residue distance against the divergence between the distances for each pair of residues, for the MUSTANG structural alignment of the trypsin sample.

Figure 16.5 Plots of the rows of the median and divergence matrices calculated from structurally aligned distance matrices of the trypsin sample. The bars appear as a result of many points plotted close together. (a) Median of the structurally aligned distances plotted against position in the alignment. (b) Divergence of the structurally aligned distances plotted against position in the alignment.

Figure 16.6 Median matrix heatmap. The median residue–residue distances are plotted in greyscale; small distances are white and large distances are dark grey.

Figure 16.7 Multidimensional scaling structure of the median distance matrix, displayed in black. The Cα atoms of each position in the alignment are given by a black circle. Cα atoms corresponding to adjacent alignment positions are connected by black lines to represent the backbone of the median structure. The trypsin structure in Figure 16.2 is superimposed with the consensus structure and displayed in grey. The structures were superimposed using the TM-align pairwise structural alignment algorithm (Source: Zhang and Skolnick 2005).

Figure 16.8 Divergence matrix heatmaps for different colour scales. The divergence between the residue–residue distances is plotted in greyscale; small divergences are white and large divergences are dark grey. (a) Divergence matrix heatmap based on the original scale. The information in white is diluted by a small amount of grey that is pulling up the scale. (b) Divergence matrix heatmap recalculated for all of the divergences that are less than , larger divergences are blacked out.

Figure 16.9 (a) Ribbon representation of a trypsin structure (PDB ID: 1JIR) identifying the location of the anchor residues, displayed in blocks of black, and the three disulphide bonds, indicated by black lines and labelled cysteine (C) residues. (b) The same structure identifying the location of functional residues, including the catalytic triad of residues and the oxyanion hole, displayed in blocks of black, and the three disulphide bonds, indicated by black lines and labelled cysteine (C) residues.

Figure 16.10 Plots of the rows of the median and divergence matrices calculated from structurally aligned distance matrices of the short-chain dehydrogenase sample. The bars appear as a result of many points plotted close together. (a) Median of the structurally aligned distances plotted against position in the alignment. (b) Divergence of the structurally aligned distances plotted against position in the alignment.

Figure 16.11 Example structure consisting only of atoms, represented by dots; adjacent residues are connected by lines to form the backbone of the structure. The atoms are labelled in accordance with the method for closing gaps in structure; is a vector containing the -coordinates of the residue that will be removed to form the gap, and and are the coordinates of the sequence of residues reading away from the gap on either side. See text for further explanation.

Figure 16.12 Plots of the rows of the median and divergence matrices calculated from structurally aligned distance matrices of the artificial trypsin sample. The bars appear as a result of many points plotted close together. (a) Median of the structurally aligned distances plotted against position in the alignment. (b) Divergence of the structurally aligned distances plotted against position in the alignment.

Figure 16.13 Divergence matrix heatmap for the artificial trypsin sample, recalculated for all of the divergences that are less than . Larger divergences are blacked out. The divergence between the residue–residue distances is plotted in greyscale; small distances are white and large distances are dark grey.

Figure 16.14 Plots of the rows of the median and divergence matrices calculated from structurally aligned distance matrices of the trypsin sample with only Cα atoms. The bars appear as a result of many points plotted close together. (a) Median of the structurally aligned distances plotted against position in the alignment. (b) Divergence of the structurally aligned distances plotted against position in the alignment.

Figure 16.15 Plots of the rows of the median and divergence matrices calculated from structurally aligned distance matrices of the trypsin sample with the anchor residues removed. The bars appear as a result of many points plotted close together. (a) Median of the structurally aligned distances plotted against position in the alignment. (b) Divergence of the structurally aligned distances plotted against position in the alignment.

Figure 16.16 Plots displaying the effect of the gap-closing method on a zigzag structure. (a) Zigzag structure before a gap is closed. (b) Zigzag structure after closing a gap of size one that is introduced in the middle of the structure. (c) Zigzag structure after closing a gap of size 16 that is introduced in the middle of the structure.

Figure 16.17 Plots displaying the effect of the gap-closing method on a helix structure. (a) Helix structure before a gap is closed. (b) Helix structure after closing a gap of size one that is introduced in the middle of the structure. (c) Helix structure after closing a gap of size 16 that is introduced in the middle of the structure.

Figure 16.18 Plots of the rows of the median and divergence matrices calculated from the aligned distance matrices of the Clustal W multiple-sequence alignment of the trypsin sample. The bars appear as a result of many points plotted close together. (a) Median of the aligned distances plotted against position in the alignment. (b) Divergence of the aligned distances plotted against position in the alignment.

Figure 16.19 Plots of the rows of the median and divergence matrices calculated from the aligned distance matrices of the MUSCLE multiple-sequence alignment of the trypsin sample. The bars appear as a result of many points plotted close together. (a) Median of the aligned distances plotted against position in the alignment. (b) Divergence of the aligned distances plotted against position in the alignment.

Chapter 17: Individualised divergences

Figure 17.1 Eigen decomposition of evidence (‘toy’ illustrative genetic example from text). Upper biplot showing cases (dark circles), controls (open circles) and loadings (pale dots). Note good separation of cases from controls and in particular labelled loadings for IL1F proteins to the right. Almost horizontal case-control axis suggests robust well-informed study. Second row: to the left—heat-map for hsr cases; to the right—heat-map for sjs cases; Third row: to the left—heat-map for serotype B17 carriage; to the right—heat-map for HLA-A*68 carriage. Lower graphs (next page) show the shape of average case evidence profile over loci (grey mean shape) plotted above average control evidence profile (pale grey mean shape), with mean shape difference (case-control) plotted at foot in black. Shape of the comparative profile is impenetrable unless framed as an ordination of individuals in correlation space within the context of the biology.

Chapter 18: Proteins, physics and probability kinematics: a Bayesian formulation of the protein folding problem

Figure 18.1 When bond angles and bond lengths are considered as fixed to their ideal values, a vector of dihedral angles is the remaining degree of freedom describing a three-dimensional protein structure. The dihedral angles can be subdivided into backbone and side chain angles, respectively involving triplets (φ, ψ, ω) and vectors of χ angles. All angles are illustrated in the figure, with the exception of ω, which is typically close to 180°. The number of χ angles varies between zero and four for the twenty standard amino acids. The figure shows a ball-and-stick representation of a single amino acid, glutamate, which has three χ angles, within a protein. The fading conformations in the background illustrate a rotation around . The figure was made using PyMOL (http://www.pymol.org, DeLano Scientific LLC) (adapted from Harder et al. (2010) http://www.biomedcentral.com/1471-2105/11/306. Used under CC-BY-SA 2.0 http://creativecommons.org/licenses/by/2.0/).

Figure 18.2 Three views of the same protein (protein G; Protein Data Bank code 2GB1). (a) A ball-and-stick representation of the protein, showing all bonds between atoms as sticks. Apart from the dynamics, this view includes essentially all relevant details. (b) Same view, but only showing the linear polymer part of the protein, the so-called main chain. The side chains are not shown in this view. (c) A schematic representation of the protein, called a “cartoon”, which shows an α-helix in the back, a β-sheet consisting of four β-strands (shown as arrows) and the interconnecting coils. The dotted lines show hydrogen bonds, which are some of the features that stabilize the folded conformation. The helices, strands, and coils can be considered “local” features, while the hydrogen bonds shown between the β-strands in the β-sheet can be considered “non-local” features, as they involve amino acids close in space, but relatively distant in sequence. This distinction between local and non-local is somewhat artificial, but can be used to great advantage in the formulation of probabilistic models of protein structure, as discussed in the chapter.

List of Tables

Chapter 4: Some advances in constrained inference for ordered circular parameters in oscillatory systems

Table 4.1 Cyclebase and CIRE phase angles estimates for the two Species

Table 4.2 MSCE and Fp-values for the 34 core set genes considered

Table 4.3 Partial orders for S. cerevisiae genes

Table 4.4 Partial orders for S. pombe genes

Chapter 5: Parametric circular–circular regression and diagnostic analysis

Table 5.1 The parameters , and the gradient at and for each of the models

Table 5.2 Type I errors for various values of and . Nominal level of test

Table 5.3 The power of the test for various values of and

Table 5.4 The estimates of parameters a and ν

Table 5.5 Maximum likelihood estimates (and SEs), the maximized log-likelihood, AIC and BIC for Taylor's full model (1) and constrained () model (2)

Table 5.6 Maximum likelihood estimates (and SEs), the maximized log-likelihood, AIC and BIC for model proposed by Jammalamadaka and SenGupta (2001)

Table 5.7 Maximum likelihood estimates of parameters, the maximized log-likelihood, AIC and BIC for model proposed by Kato and Jones (2010) and two of its sub-models (Source: Kato and Jones 2010)

Table 5.8 The estimates, log-likelihood and value of D_i for selected observations

Chapter 6: On two-sample tests for circular data based on spacing-frequencies

Table 6.1 Power comparison between Wheeler–Watson, Dixon's and Rao's tests

Chapter 7: Barycentres and hurricane trajectories

Table 7.1 179 hurricanes over 37 years, classified by year and by 20 groups using the k-means algorithm with k = 20. Groups are ordered according to how westerly is the upcrossing by the corresponding barycentre trajectory of latitude 35°N

Table 7.2 Numbers of hurricanes in each of the 20 groups determined by the k-means algorithm with k = 20. Groups are ordered according to how westerly is the upcrossing by the corresponding barycentre trajectory of latitude 35°N

Chapter 9: Nonparametric data analysis methods in medical imaging

Table 9.1 90% Lower confidence limit for the bootstrap distribution of the 3D sample mean size-and-reflection shape configuration

Table 9.2 90% Upper confidence limit for the bootstrap distribution of the 3D sample mean size-and-reflection shape configuration

Chapter 12: Evaluation of diagnostics for hierarchical spatial statistical models

Table 12.1 A Table resulting from our diagnostic evaluation based on a precise follow-up reanalysis

Table 12.2 The Table given by Table 12.1, for the Local Moran I diagnostic applied to the transformed SIDS residuals after fitting the null model; cross-validation is abbreviated as CV

Table 12.3 The Table given by Table 12.1, for the Local Getis–Ord diagnostic applied to the transformed SIDS residuals after fitting the null model; cross-validation is abbreviated as CV

Chapter 13: Bayesian forecasting using spatiotemporal models with applications to ozone concentration levels in the Eastern United States

Table 13.1 Summaries of the daily maximum ozone concentration levels and Eta CMAQ output for the test data set described in Section 13.2

Table 13.2 CRPS values from modeling data from four states during July 8 (denoted as 7/8) to 14

Table 13.3 Empirical coverages of the 50% and 95% forecast intervals for the one-step ahead forecasts at the 20 randomly chosen validation sites

Table 13.4 Average width of the forecast intervals for the four states data set

Table 13.5 False alarm and hit rates for ozone threshold values of 65 and 75 for the four states data set

Table 13.6 Parameter estimates (mean and SD) for the models based on GPP approximation fitted with 14 days observations for the period June 24 (denoted as 6/24) to July 13, 2010 from the 577 modeling sites in the whole Eastern United States

Table 13.7 Values of the RMSE of the forecasts at the hold-out sites for the simple linear model and the GPP model based on modeling 7 and 14 days data for the whole of Eastern United States. The corresponding RMSE values for the Eta CMAQ output are also shown

Table 13.8 Empirical coverage of the 95% forecast intervals using the linear and GPP models and the CRPS values for the hold-out data for the GPP model for the whole Eastern US data set

Chapter 15: Fingerprint image analysis: role of orientation patch and ridge structure dictionaries

Table 15.1 Average estimation error (in degrees) of the orientation field estimation algorithm based on orientation patch dictionary and two competing algorithms on the latent fingerprints in the NIST SD27 Database

Chapter 16: Do protein structures evolve around ‘anchor’ residues?

Table 16.1 Conservation in percentage of each amino-acid residue and gaps in the positions of the structural alignment corresponding to the anchor residues

Chapter 17: Individualised divergences

Table 17.1 Various ‘f-divergences’ (Ali–Silvey distances) between two discrete probability measures in Euclidean space; see Nguyen et al. (2005)

Table 17.2 Seven data columns from a small unpublished case-control genetic study on cutaneous adverse reactions to drug treatment covering a total of 78 patients (28 cases and 50 controls) assayed for 241 single nucleotide polymorphisms and 8 HLA loci

Table 17.3 Transformed data columns from Table 17.2 now containing surrogate numeric values for the individualised case-ness evidence of group distinction

Table 17.4 Proposed framework for contrasts of three groups (row blocks) leading to new divergences for profile comparisons

Table 17.5 New Expected divergences using the contrasts from Table 17.4

Geometry Driven Statistics

Edited by

Ian L. Dryden

University of Nottingham, UK

 

John T. Kent

University of Leeds, UK

 

 

 

Preface

Kanti Mardia celebrates his 80th birthday on 3 April 2015. Kanti has been a dynamic force in statistics for over 50 years and shows no signs of slowing down yet. He has made major contributions to many areas of statistics including multivariate analysis, directional data analysis, frequentist inference, Bayesian inference, spatial and spatial-temporal modelling, shape analysis and more specific contributions to application areas such as geophysics, medicine, biology and more recently bioinformatics. A distinctive feature of Kanti's activities has been the annual series of LASR (Leeds Annual Statistical Research) workshops which he established and organized. These have helped to foster interdisciplinary advances in these research areas and have given rise to a long-standing series of proceedings containing short state-of-the-art papers published by Leeds University Press.

A common theme that unifies much of his work is the importance of geometry in statistics, hence the name of this volume, “Geometry Driven Statistics.”

The research areas in which Kanti has worked continue to evolve and attract great interest and activity. It is, therefore, timely to provide a collection of papers from high-profile researchers summarizing the state of the art, giving some new developments and providing a vision for the future. Many of the authors have collaborated with Kanti at some stage in his career or know him personally.

To set the context for the later chapters, the book starts with some historical information on Kanti's life and work, together with a list of his main publications.

The papers have been split into four main topics, though of course there is considerable overlap and cross-fertilization between them:

directional data analysis

shape analysis

spatial, image and multivariate analysis

bioinformatics

The unifying theme throughout the book is geometry, with the first two topics specifically about statistics on manifolds. Directional data analysis involves the analysis of points on a circle (e.g., wind directions) or points on a sphere (e.g., location on the earth's surface), which are particularly simple non-linear manifolds. Kanti's 1972 book Statistics of Directional Data gave great visibility to the topic area and contained many novel developments, with a second edition Directional Statistics published in 2000 with Peter Jupp. Shape analysis involves the study of much more complicated manifolds, where the shape of an object involves removing information about location, rotation and scale. The topic has numerous applications including the study of organisms in biology or molecules in chemistry. Kanti's 1998 book Statistical Shape Analysis, jointly written with Ian Dryden, summarizes the statistical aspects of the field.

The third topic is particularly broad, involving data collected over geographic regions, image data or other high-dimensional multivariate data. An important classic book that is very relevant here is Kanti's 1979 book Multivariate Analysis, jointly written with John Kent and John Bibby. The final topic has been a particular focus for Kanti in the past decade, especially geometric topics such as Bayesian approaches to structural bioinformatics, where the shapes of proteins are key for determining function. Kanti's work in the area has been highlighted by his 2012 edited volume Bayesian Methods in Structural Bioinformatics with Jesper Ferkinghoff-Borg and Thomas Hamelryck. All four of the main themes are highly connected. Indeed several of the papers could easily have been placed within a different theme, which emphasizes an underlying unity behind the main ideas of this volume.

Ian L. Dryden and John T. Kent

List of Contributors

Norhashidah Awang

School of Mathematical Sciences, Universiti Sains Malaysia, Penang, Malaysia

 

Khandoker Shuvo Bakar

Department of Statistics, Yale University, New Haven, CT, USA

 

Stuart Barber

Department of Statistics, School of Mathematics, University of Leeds, Leeds, UK

 

Sandra Barragán

Department of Statistics and O.R., Universidad de Valladolid, Valladolid, Spain

 

Fred L. Bookstein

Department of Statistics, University of Washington, Seattle, WA, USA

Department of Anthropology, University of Vienna, Vienna, Austria

 

Wouter Boomsma

Department of Biology, University of Copenhagen, Copenhagen, Denmark

 

Clive E. Bowman

Mathematical Institute, University of Oxford, Oxford, UK

 

Sandy Burden

National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Wollongong, New South Wales, Australia

 

Kai Cao

Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA

 

Yasuko Chikuse

Faculty of Engineering, Kagawa University, Takamatsu, Kagawa, Japan

 

Noel Cressie

National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Wollongong, New South Wales, Australia

 

Jesper Ferkinghoff-Borg

Biotech Research and Innovation Center, University of Copenhagen, Copenhagen, Denmark

 

Miguel A. Fernández

Department of Statistics and O.R., Universidad de Valladolid, Valladolid, Spain

 

Jesper Foldager

Department of Biology, University of Copenhagen, Copenhagen, Denmark

 

Jes Frellsen

Department of Engineering, University of Cambridge, Cambridge, UK

 

Riccardo Gatto

Institute of Mathematical Statistics and Actuarial Science, University of Bern, Bern, Switzerland

 

Walter R. Gilks

Department of Statistics, School of Mathematics, University of Leeds, Leeds, UK

 

John C. Gower

Department of Mathematics and Statistics, The Open University, Milton Keynes, UK

 

Peter J. Green

School of Mathematics, University of Bristol, Bristol, UK

University of Technology, Sydney, New South Wales, Australia

 

Arief Gusnanto

Department of Statistics, School of Mathematics, University of Leeds, Leeds, UK

 

Thomas Hamelryck

Department of Biology, University of Copenhagen, Copenhagen, Denmark

 

John Haslett

School of Computer Science and Statistics, Trinity College, Dublin, Ireland

 

Anil K. Jain

Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA

 

S. Rao Jammalamadaka

Department of Statistics and Applied Probability, University of California, Santa Barbara, CA, USA

 

Peter E. Jupp

School of Mathematics and Statistics, University of St Andrews, St Andrews, UK

 

Wilfrid S. Kendall

Department of Statistics, University of Warwick, Coventry, UK

 

Nitis Mukhopadhyay

Department of Statistics, University of Connecticut, Storrs, CT, USA

 

Colleen Nooney

Department of Statistics, School of Mathematics, University of Leeds, Leeds, UK

 

Daniel E. Osborne

Department of Mathematics, Florida Agricultural and Mechanical University, Tallahassee, FL, USA

 

Vic Patrangenaru

Department of Statistics, Florida State University, Tallahassee, FL, USA

 

Shyamal D. Peddada

National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA

 

Orathai Polsen

Department of Applied Statistics, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand

 

Mingfei Qiu

Department of Statistics, Florida State University, Tallahassee, FL, USA

 

Cristina Rueda

Department of Statistics and O.R., Universidad de Valladolid, Valladolid, Spain

 

Sujit Kumar Sahu

Mathematical Sciences and S3RI, University of Southampton, Southampton, UK

 

Anuj Srivastava

Department of Statistics, Florida State University, Tallahassee, FL, USA

 

Charles C. Taylor

Department of Statistics, University of Leeds, Leeds, UK

 

Douglas Theobald

Biochemistry Department, Brandeis University, Waltham, MA, USA

 

Hilary W. Thompson

School of Medicine, Division of Biostatistics, Louisiana State University, New Orleans, LA, USA

 

Qian Xie

Department of Statistics, Florida State University, Tallahassee, FL, USA

 

Zhengwu Zhang

Department of Statistics, Florida State University, Tallahassee, FL, USA

Part I: Kanti Mardia

Chapter 1: A Conversation with Kanti Mardia

Nitis Mukhopadhyay

Department of Statistics, University of Connecticut, Storrs, CT, USA

This paper originally appeared in Statistical Science 2002, Vol. 17, No. 1, 113–148.

Kantilal Vardichand Mardia was born on April 3, 1935, in Sirohi, Rajasthan, India. He earned his B.Sc. degree in mathematics from Ismail Yusuf College—University of Bombay, in 1955, M.Sc. degrees in statistics and in pure mathematics from University of Bombay in 1957 and University of Poona in 1961, respectively, and Ph.D. degrees in statistics from the University of Rajasthan and the University of Newcastle, respectively, in 1965 and 1967. For significant contributions in statistics, he was awarded a D.Sc. degree from the University of Newcastle in 1973. He started his career as an Assistant Lecturer in the Institute of Science, Bombay and went to Newcastle as a Commonwealth Scholar. After receiving the Ph.D. degree from Newcastle, he joined the University of Hull as a lecturer in statistics in 1967, later becoming a reader in statistics in 1971. He was appointed a Chair Professor in Applied Statistics at the University of Leeds in 1973 and was the Head of the Department of Statistics during 1976–1993, and again from 1997 to the present. Professor Mardia has made pioneering contributions in many areas of statistics including multivariate analysis, directional data analysis, shape analysis, and spatial statistics. He has been credited for path-breaking contributions in geostatistics, imaging, machine vision, tracking, and spatio-temporal modeling, to name a few. He was instrumental in the founding of the Center of Medical Imaging Research in Leeds and he holds the position of a joint director of this internationally eminent center. He has pushed hard in creating exchange programs between Leeds and other scholarly centers such as the University of Granada, Spain, and the Indian Statistical Institute, Calcutta. He has written several scholarly books and edited conference proceedings and other special volumes. 
But perhaps he is best known for his books: Multivariate Analysis (coauthored with John Kent and John Bibby, 1979, Academic Press), Statistics of Directional Data (second edition with Peter Jupp, 1999, Wiley) and Statistical Shape Analysis (coauthored with Ian Dryden, 1998, Wiley). The conferences and workshops he has been organizing in Leeds for a number of years have had significant impacts on statistics and its interface with IT (information technology). He is dynamic and his sense of humor is unmistakable. He is a world traveler. Among other places, he has visited Princeton University, the University of Michigan, Harvard University, the University of Granada, Penn State and the University of Connecticut. He has given keynote addresses and invited lectures in international conferences on numerous occasions. He has been on the editorial board of statistical, as well as image related, journals including the IEEE Transactions on Pattern Analysis and Machine Intelligence, Journal of Environmental and Ecological Statistics, Journal of Statistical Planning and Inference and Journal of Applied Statistics. He has been elected a Fellow of the American Statistical Association, a Fellow of the Institute of Mathematical Statistics, and a Fellow of the American Dermatoglyphic Association. He is also an elected member of the International Statistical Institute and a Senior Member of IEEE. Professor Mardia retired on September 30, 2000 to take a full-time post as Senior Research Professor at Leeds—a new position especially created for him.

In April, 1999, Professor Kanti V. Mardia was invited to the University of Connecticut as a short-term guest professor for four weeks. This conversation began on Monday, April 19, 1999 in Nitis Mukhopadhyay's office in the Department of Statistics, University of Connecticut, Storrs.

1.1 Family background

Mukhopadhyay: Kanti, shall we start at the origin, so to speak? Where were you born?

Mardia: I was born in Sirohi on April 3, 1935. Sirohi was the capital of Sirohi State, about ten thousand square miles in area, in Rajasthan, before India's independence. Subsequently, Sirohi State became the Sirohi district. Sirohi is situated about four hundred miles east of Bombay. One of the greatest wonders near my place of birth has been the hill station, Mount Abu. It has one of the finest Jain temples, Delwara, with gorgeous Indian architecture from the eleventh century. The exquisite details are all meticulously hand-carved on marble, without parallels anywhere else in India. Those shapes and formations on the ceiling and columns with intricate details influenced me even when I was a small child. Much later in my life, some of those incredible shapes made deeper and more tangible impacts on my research career.

Mukhopadhyay: Please tell me about your parents.

Mardia: