Dynamics of Statistical Experiments - Dmitri Koroliouk - E-Book

Dmitri Koroliouk

Description

This book is devoted to the system analysis of statistical experiments, determined by the averaged sums of sampling random variables. The dynamics of statistical experiments are given by difference stochastic equations with a specified regression function of increments, linear or nonlinear. The statistical experiments are studied by the sample volume increasing (N → ∞), as well as in discrete-continuous time by the number of stages increasing (k → ∞) for different conditions imposed on the regression function of increments. The proofs of limit theorems employ modern methods for the operator and martingale characterization of Markov processes, including singular perturbation methods. Furthermore, they justify the representation of a stationary Gaussian statistical experiment with the Markov property, as a stochastic difference equation solution, applying the theorem of normal correlation. The statistical hypotheses verification problem is formulated in the classification of evolutionary processes, which determine the dynamics of the predictable component. The method of stochastic approximation is used for classifying statistical experiments.


Number of pages: 197

Year of publication: 2020




Table of Contents

Cover

Preface

List of Abbreviations

Introduction

1 Statistical Experiments

1.1. Statistical experiments with linear regression

1.2. Binary SEs with nonlinear regression

1.3. Multivariate statistical experiments

1.4. SEs with Wright–Fisher normalization

1.5. Exponential statistical experiments

2 Diffusion Approximation of Statistical Experiments in Discrete–Continuous Time

2.1. Binary DMPs

2.2. Multivariate DMPs in discrete–continuous time

2.3. A DMP in an MRE

2.4. The DMPs in a balanced MRE

2.5. Adapted SEs

2.6. DMPs in an asymptotical diffusion environment

2.7. A DMP with ASD

3 Statistics of Statistical Experiments

3.1. Parameter estimation of one-dimensional stationary SEs

3.2. Parameter estimators for multivariate stationary SEs

3.3. Estimates of continuous process parameters

3.4. Classification of EPs

3.5. Classification of SEs

3.6. Evolutionary model of ternary SEs

3.7. Equilibrium states in the dynamics of ternary SEs

4 Modeling and Numerical Analysis of Statistical Experiments

4.1. Numerical verification of generic model

4.2. Numerical verification of DMD

4.3. DMD and modeling of the dynamics of macromolecules in biophysics

References

Index

End User License Agreement

List of Tables

Chapter 3

Table 3.1. Table of models

Table 3.2. Classification of TSE as N, k → ∞

Chapter 4

Table 4.1. The numerical parameters for samples simulation using the Stokes–Eins...


Series Editor

Nikolaos Limnios

Dynamics of Statistical Experiments

Dmitri Koroliouk

First published 2020 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK

www.iste.co.uk

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.wiley.com

© ISTE Ltd 2020

The rights of Dmitri Koroliouk to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2019956286

British Library Cataloguing-in-Publication Data. A CIP record for this book is available from the British Library. ISBN 978-1-78630-598-5

Preface

This monograph is based on the Wright–Fisher model (Ethier and Kurtz 1986, Ch. 10) in the mathematical theory of population genetics, considered as a dynamical flow of experimental data and expressed in terms of fluctuations, that is, deviations from a certain equilibrium point (steady state).

Statistical experiments (SEs) are a stochastic point of view of the collective behavior of a finite number of interacting agents taking a finite number of decisions M ≥ 1 (M = 1 corresponds to the basic binary model with two alternatives).

The dynamics, in discrete time, are described by stochastic difference equations (SDEs) that consist of two components: evolutionary processes (EPs) (the predictable component) and martingale differences (the stochastic component).

An essential feature of SEs is that their characterization by the regression function of increments (RFI) defines the EP (the drift of the SEs). The basic assumption for binary SEs is given by linear fluctuations with respect to an equilibrium point (steady state). The RFI can be interpreted as a fundamental principle of “stimulation and deterrence”: at each stage, the EP increments decrease proportionally to the current fluctuation value.
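The “stimulation and deterrence” principle can be illustrated with a minimal sketch. It assumes a linear EP of the form P(k+1) = P(k) − V0·(P(k) − ρ), where ρ is the equilibrium and V0 the drift factor; the function name `linear_ep` and the numerical values are hypothetical, chosen only for illustration and not taken from the book.

```python
def linear_ep(p0, rho, v0, steps):
    """Iterate the assumed linear EP: P(k+1) = P(k) - v0 * (P(k) - rho)."""
    path = [p0]
    for _ in range(steps):
        p = path[-1]
        # The increment is proportional to the current fluctuation (p - rho)
        # and has the opposite sign when v0 > 0.
        path.append(p - v0 * (p - rho))
    return path


# Illustrative values only (assumptions, not from the source).
path = linear_ep(p0=0.9, rho=0.4, v0=0.25, steps=60)
fluctuations = [p - 0.4 for p in path]
```

In this sketch each stage multiplies the fluctuation by (1 − V0), so for 0 < V0 < 1 the trajectory approaches the equilibrium monotonically.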

Linear fluctuations of the RFI are basic models in the diffusion approximation of SEs in discrete-continuous time (Chapter 2).

The dynamics of SEs, in discrete time, provide an effective statistical estimation of a linear drift factor V0 under the assumption of stationarity (section 3.1).

Stationary Gaussian Markov SEs, given by the solution of an SDE of increments, are characterized by two-dimensional covariance matrices. The expression of their dynamics in terms of the linear fluctuations with respect to an equilibrium value gives two equivalent determinations of the SEs: as the solution of the SDEs and by two-dimensional covariance matrices (Theorem 3.2.3).

The collective behavior of a finite number N of agents is given by the increment of SEs in an SDE containing two components: an evolutionary process (a predictable component) that determines the fluctuations of a given state with respect to the equilibrium, and martingale differences (a stochastic component) approximated by a normal Brownian motion as N → ∞.

The problem of statistical hypothesis verification is formulated in the classification of EPs, which determine the dynamics of the predictable component. The main classification of EPs reflects the behavior of trajectories in the neighborhood of the equilibrium point and is subdivided into two types of behavior: attractive and repulsive.

Statistical experiment models are a valuable tool for advanced graduate students and practicing professionals in statistical modeling and dynamics of complex systems consisting of a large number of interacting elements.

Dmitri KOROLIOUK

December 2019

List of Abbreviations

AF: action functional
ASD: asymptotically small diffusion
DEE: difference evolutionary equation
DMD: discrete Markov diffusion
DMP: discrete Markov process
FCS: fluorescence correlation spectroscopy
EG: exponential generator
EP: evolutionary process
ESE: exponential statistical experiment
MC: Markov chain
MP: Markov process
MRE: Markov random environment
MSE: multivariate statistical experiment
OEF: optimal estimating function
QSF: quasi-score functional
ROI: region of interest
RF: regression function
RFI: regression function of increments
SA: stochastic approximation
SDE: stochastic difference equation
SE: statistical experiment
TSE: ternary statistical experiment

Introduction

The main object of consideration is a statistical experiment (SE), defined as the averaged sum S_N(·) of independent and identically distributed random samples that take a finite number of values, in particular the binary values ±1.

The basic assumption 3 (Proposition 1.3.3) is derived by using the regression function of increments (RFI), given by the predictable component of SEs [1.3.10].

Representing an SE as an averaged sum of independent and identically distributed random samples (which take a finite number of values) means that the SE is defined by two components: evolutionary processes (EPs) (predictable components) and martingale differences (stochastic components). SEs can therefore be considered as special semimartingales (Jacod and Shiryaev 1987).

An essential feature of specifying SEs is their characterization by stochastic difference equations (SDEs) consisting of two parts: a predictable component defined by RFI and a stochastic component characterized by its first two moments.
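The two-part structure can be made concrete with a small simulation. The following is a minimal sketch, assuming a binary SE whose N samples take the values ±1 and whose conditional mean follows a linear RFI, c − V0·(c − ρ); the function `simulate_binary_se` and all parameter values are hypothetical and are not the book's equations [1.2.34] or [1.2.35].

```python
import numpy as np


def simulate_binary_se(n_agents, stages, rho, v0, s0, seed=0):
    """Simulate averaged sums of +/-1 samples with an assumed linear RFI."""
    rng = np.random.default_rng(seed)
    s = np.empty(stages + 1)
    s[0] = s0
    for k in range(stages):
        # Predictable component: conditional mean of the next averaged sum.
        mean_next = s[k] - v0 * (s[k] - rho)
        # Each sample equals +1 with probability (1 + mean_next) / 2,
        # so that its conditional expectation is mean_next.
        p_plus = np.clip((1.0 + mean_next) / 2.0, 0.0, 1.0)
        samples = np.where(rng.random(n_agents) < p_plus, 1.0, -1.0)
        # The averaged sum is mean_next plus a martingale difference
        # (the stochastic component), whose variance is of order 1 / N.
        s[k + 1] = samples.mean()
    return s


# Illustrative values only (assumptions, not from the source).
trajectory = simulate_binary_se(n_agents=20000, stages=200, rho=0.2, v0=0.5, s0=-0.6)
```

With a large number of agents the trajectory settles near the equilibrium ρ, while the residual scatter reflects the stochastic component.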

Proposition 1.2.3 (Basic assumption 3). The SEs given by the averaged sums of the sample values are determined by the solutions of the SDEs:

The stochastic components of the SDEs [1.2.34] and [1.2.35] are characterized by their quadratic characteristics.

The existence of equilibria ρ± of the predictable component provides the convergence with probability 1 (Theorem 1.2.2).

However, the stochastic component defined by the martingale differences generates, as N → ∞, a series (by k) of normally distributed random variables with certain quadratic variations which depend on the state of SEs (Theorem 1.2.3). Asymptotic representation of the normalized martingale differences provides a basic stochastic approximation by SDEs, with the stochastic part represented by the normally distributed martingale differences (Propositions 1.2.4 and 1.2.5).

The multivariate statistical experiments (MSEs) in section 1.3 are considered under the assumption of a finite number of possible values E = {e0, e1, …, eM}, M ≥ 1. The linear RFI is determined as follows.

Proposition 1.3.1 (Basic assumption 1)

That is, the linear RFI is given by the fluctuation of increments with respect to the equilibrium value.

The essential result is given in Proposition 1.3.2, where the multivariate frequencies with Wright–Fisher normalization are considered for models in population genetics (Ethier and Kurtz 1986). The nonlinear RFIs are given in the canonical representation [1.3.9], with the fluctuation of increments taken with respect to the ratio of the values of EPs and the corresponding equilibria.

Proposition 1.3.3. (Basic assumption 3.) The frequencies of multivariate EPs are determined by difference evolutionary equations (DEEs) with nonlinear RFIs having the fluctuation representation [1.3.10].

The presence of the equilibrium state ensures the convergence of EPs, as k → ∞ (Theorem 1.3.1).

The SDE for MSEs is given with martingale difference [1.3.16] as the stochastic component with quadratic characteristics (Lemma 1.3.1; also see Proposition 1.3.4). A representation of MSEs as normalized sums of independent and identically distributed random variables [1.3.28] provides the convergence with probability 1 (Theorem 1.3.2).

In sections 1.3.1 and 1.3.2, the martingale differences are approximated by normally distributed random variables (Theorem 1.3.3). The MSEs are determined by the solutions of the SDEs with normally distributed stochastic components and the “constant” quadratic characteristics defined by equilibrium points (Proposition 1.3.5).

The SEs with Wright–Fisher normalization are developed with RFIs given by the fluctuations with respect to equilibrium introduced in section 1.4. The binary EPs are transformed in the increment probabilities with RFIs in terms of the fluctuations [1.4.12]. In this connection, the RFIs are postulated as basic assumption 2 (Proposition 1.4.1).

Section 1.5 is dedicated to the exponential SE (ESE) defined as a product of random variables (Definition 1.5.1; also see [1.5.7]).

The ESE is investigated in two normalized schemes: Theorem 1.5.1—the convergence, in probability, of ESEs with normalization λN = λ/N, N → ∞, is realized using Le Cam’s approximation (Borovskikh and Korolyuk 1997) and Theorem 1.5.2—the convergence, in distribution, of ESEs with normalization , to geometric Brownian motion.

Theorems 1.5.1 and 1.5.2 make it possible to represent ESEs in the exponential approximation scheme [1.5.41] using only three parameters for the normal process of autoregression. Note that the ESE has an important interpretation in financial mathematics (Shiryaev 1999).

Chapter 2 is dedicated to the diffusion approximation of SEs in discrete–continuous time.

Discrete Markov processes (DMPs), determined by the solutions of the SDEs in discrete–continuous time, are approximated, as N → ∞, by diffusion processes with evolution, given by differential stochastic equations.

The discrete–continuous time is determined by the connection of discrete instants of time k ≥ 0 with continuous time t ≥ 0, t ∈ R+ = [0, +∞), by the formula k = [Nt], t ≥ 0. The integer part [Nt] = k defines a discrete sequence of time instants k/N, k ≥ 0; adjacent instants are 1/N apart, and k → ∞ as N → ∞ for any fixed t > 0.
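The relation k = [Nt] can be sketched in a few lines. The helper names `stage_index` and `grid_time` are hypothetical; the only fact used is that [·] denotes the integer part.

```python
import math


def stage_index(t, n):
    """Return the stage k = [N t], the integer part of N * t."""
    return math.floor(n * t)


def grid_time(k, n):
    """Return the continuous-time instant k / N attached to stage k."""
    return k / n


# Illustrative values only: N = 1024 and t = 0.7321.
n = 1024
t = 0.7321
k = stage_index(t, n)
gap = t - grid_time(k, n)
```

The gap between t and its grid instant is always smaller than 1/N, which is why the discrete stages fill continuous time more and more densely as N grows.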

DMPs in discrete–continuous time are considered with the normalized fluctuation [2.1.12]. Basic assumption 2.1.2 supposes that the DMP, given by the solution of the SDE [2.1.15] with nonlinear RFIs, can be approximated by a linear SDE [2.1.20] (see basic assumption 2.1.3). The finite-dimensional distributions of DMPs converge, as N → ∞, to a diffusion process with evolution of Ornstein–Uhlenbeck type given by the solution of the SDE [2.1.24] (Theorem 2.2.1, Conclusion 2.2.1).
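As a rough illustration of an Ornstein–Uhlenbeck type limit, the sketch below runs an Euler scheme on the grid k/N for the equation dζ = −V ζ dt + σ dW. The drift V, the diffusion σ and the function name `simulate_ou_on_grid` are assumptions made for this example; the scheme is not the book's SDE [2.1.20] or [2.1.24].

```python
import numpy as np


def simulate_ou_on_grid(v, sigma, n, t_max, x0=0.0, seed=0):
    """Euler scheme with step 1 / n for d(zeta) = -v * zeta dt + sigma dW."""
    rng = np.random.default_rng(seed)
    steps = int(n * t_max)
    h = 1.0 / n
    noise = rng.standard_normal(steps)
    zeta = np.empty(steps + 1)
    zeta[0] = x0
    for k in range(steps):
        # Drift scales like 1 / n per stage, noise like 1 / sqrt(n).
        zeta[k + 1] = zeta[k] - v * zeta[k] * h + sigma * np.sqrt(h) * noise[k]
    return zeta


# Illustrative values only (assumptions, not from the source).
path = simulate_ou_on_grid(v=1.0, sigma=1.0, n=100, t_max=2000.0)
empirical_var = path[len(path) // 10:].var()  # discard the initial transient
theoretical_var = 1.0 ** 2 / (2 * 1.0)        # sigma^2 / (2 V) for the OU process
```

For a long trajectory the empirical variance should be close to σ²/(2V), the stationary variance of the continuous-time Ornstein–Uhlenbeck process.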

The diffusion approximation of DMPs is based on the operator characterization of Markov processes (Korolyuk and Limnios 2005).

Propositions 2.2.1 and 2.2.2 determine the SDEs with predictable components. The convergence, as N → ∞, of the quadratic characteristics of the stochastic component to equilibrium values (Lemma 2.2.1) provides the linear character of the SDEs for the limit processes, which are characterized only by the local fluctuations (Proposition 2.2.3).

The DMPs in discrete and continuous Markov random environments (MREs) are considered in section 2.3. Using the method of singular perturbation (Korolyuk and Limnios 2005), Theorem 2.3.1 states the convergence of the finite-dimensional distributions to those of the limit diffusion process with evolution, defined by the averaged drift and diffusion parameters [2.3.10]. The averages are taken over the stationary distribution of the embedded Markov chain. Theorem 2.3.2 describes the limit diffusion process with evolution whose parameters are averaged over the stationary distribution of an ergodic Markov environment.

The technique of solving the singular perturbation problem for a reducible-invertible operator acting on a perturbed test function φ1(c, x), which provides the solvability condition of a certain operator equation (see [2.3.38]–[2.3.39]), has essential applications.

One specific model, considered in section 2.4, is a DMP in a balanced Markov random environment. The approximation of the DMP in discrete–continuous time with the balance condition [2.4.9] is given in Theorem 2.4.1. The limit Ornstein–Uhlenbeck diffusion process [2.4.20] is determined by parameters calculated by the explicit formulas [2.4.15]–[2.4.16]. Here the solution of a singular perturbation problem for the truncated operator [2.4.35] is realized on the perturbed test function [2.4.36], which consists of three differently scaled components.

In section 2.5, the adapted SEs are combined with a random time change, which transforms a discrete stochastic basis into a continuous one.

The adapted SEs with a continuous basis are studied in the series scheme with a small series parameter ε → 0 (ε > 0). The limit diffusion process with evolution is determined in Theorem 2.5.1 by its predictable characteristics.

In section 2.6, the DMPs are considered in an asymptotic diffusion environment, generated by the DEE [2.6.6] with the balance condition [2.6.7]. Theorem 2.6.1 states the convergence, as ε → 0, to an Ornstein–Uhlenbeck diffusion process.

In section 2.7, the DMPs with asymptotically small diffusion are considered. Their exponential generator is determined in Theorem 2.7.2 by relation [2.7.2] on the test functions φ(c) ∈ C³(R). The action functional for DMPs is introduced, following the method developed in the monograph (Freidlin and Ventzell 2012).

Chapter 3 is devoted to statistically estimating the drift parameter V0 and verifying the hypothesis for the SE dynamics as k → ∞.

The case of one-dimensional stationary SEs is considered in section 3.1. The predictable component is supposed to have a linear form with drift parameter V0 > 0.

Theorem 3.1.1 establishes the wide-sense stationarity conditions of SEs (αt, t ≥ 0), given by the solution of the SDEs

[0.0.1]

as a relation between the dispersion of stationary SE and the quadratic characteristic of the stochastic component:

[0.0.2]

A wide-sense stationary SE αt, t ≥ 0, given by the solution of the SDE [0.0.1], is characterized by the covariance matrix

[0.0.3]

of the two-component vector (αt, Δαt), t ≥ 0, given by the following relations (Theorem 3.1.2):

[0.0.4]

The additional condition on a stationary SE can be supposed to be a Gaussian distribution of the stochastic component. Supposing the normal distribution of the initial value α0: Eα0 = 0, , the matrix of the quadratic form R−1 which generates two-dimensional normal distribution and can be represented by elements of the covariance matrix [3.1.9] follows the relation . In Corollaries 3.1.3 and 3.1.4, formulae [3.1.21] and [3.1.22] determine the drift parameter V0 and the correlation coefficient r2.

The trajectories of a stationary SE generate the covariance statistics

which serve, by virtue of relations [0.0.4], as the basis of consistent estimates of the drift parameter V0:

[0.0.5]

each of which estimates the drift parameter V0.
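The idea of estimating the drift from covariance statistics can be sketched numerically. The example assumes, purely for illustration, a stationary SE of the form Δα(t+1) = −V0·α(t) + σ·ξ(t+1) with standard Gaussian ξ. Under that assumption E[α·Δα] = −V0·E[α²] and E[(Δα)²] = 2·V0·E[α²], which suggests two estimators. The formulas, function names and parameter values are assumptions and may differ from the book's relations [0.0.4] and [0.0.5].

```python
import numpy as np


def simulate_stationary_se(v0, sigma, length, seed=0):
    """Simulate the assumed SDE: alpha(t+1) = alpha(t) - v0 * alpha(t) + sigma * xi."""
    rng = np.random.default_rng(seed)
    # Stationary variance of the assumed model, valid for 0 < v0 < 2.
    stationary_var = sigma ** 2 / (v0 * (2.0 - v0))
    alpha = np.empty(length + 1)
    alpha[0] = rng.normal(0.0, np.sqrt(stationary_var))
    noise = rng.standard_normal(length)
    for t in range(length):
        alpha[t + 1] = alpha[t] - v0 * alpha[t] + sigma * noise[t]
    return alpha


def drift_estimates(alpha):
    """Return two drift estimates built from empirical covariance statistics."""
    a = alpha[:-1]
    d = np.diff(alpha)           # increments alpha(t+1) - alpha(t)
    r = np.mean(a * a)           # estimates E[alpha^2]
    r0 = np.mean(a * d)          # estimates E[alpha * delta alpha]
    r_delta = np.mean(d * d)     # estimates E[(delta alpha)^2]
    return -r0 / r, r_delta / (2.0 * r)


# Illustrative values only (assumptions, not from the source).
alpha = simulate_stationary_se(v0=0.3, sigma=1.0, length=200000)
v_from_cross, v_from_increments = drift_estimates(alpha)
```

Both estimates should be close to the drift used in the simulation when the trajectory is long, which is the behavior expected of consistent estimators.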

Theorem 3.1.3 states the convergence with probability 1:

The main statistical problem in analysis of stationary SEs is to construct the optimal estimating function (OEF) for justifying the statistical estimation of the drift parameter V0. The OEF is given by the quadratic variation (Heyde 1997) of the martingale differences:

The statistics of one-dimensional stationary SEs are also generalized to the multivariate stationary case (section 3.2).

Under some natural conditions, wide-sense stationarity [3.2.3] is established in Theorem 3.2.1. The positive-definite covariance matrix is given by the solution of the matrix equation

[0.0.6]

Representation of the OEF in Theorem 3.2.1 gives the following a priori statistic:

The residual term in the representation

gives the estimator of the stationary factor (see [0.0.2]).

In section 3.2.3, using the normal correlation theorem (Liptser and Shiryaev 2001, Ch. 13), any stationary Gaussian Markov SE, defined by its two-component covariances, is determined by the solution of the SDE [3.2.27] with Gaussian martingale differences [3.2.28].

Section 3.4 is dedicated to asymptotic analysis of EP dynamics determined by the solution of the DEE [3.4.2] or [3.4.8]. The classification of EPs is based on their trajectories, taking into account the equilibrium values ρ± as well as the values of the drift parameter V0.

The classification of frequency EPs is given in Propositions 3.4.1 and 3.4.2. Its justification is based on the limit behavior of frequency EPs as k → ∞. In Theorem 3.4.1, the probabilities of alternatives P±(k), k ≥ 0, given by the solution of the DEE [3.4.2] or [3.4.8], have four possible asymptotic behavioral trends: attractive, lim_{k→∞} P±(k) = ρ±; repulsive, lim_{k→∞} P±(k) = 1 and lim_{k→∞} P∓(k) = 0 under the initial condition P±(0) > ρ± or P±(0) < ρ±; and domination ±, lim_{k→∞} P±(k) = 1 and lim_{k→∞} P∓(k) = 0.
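The attractive and repulsive trends can be illustrated with a toy recursion. The sketch assumes a hypothetical nonlinear RFI, P(k+1) = P(k) − V0·P(k)·(1 − P(k))·(P(k) − ρ), chosen only because it keeps the probability inside [0, 1] for |V0| ≤ 1; it is not the book's DEE [3.4.2] or [3.4.8], and the names and values are illustrative.

```python
def ep_limit(p0, rho, v0, steps=500):
    """Iterate the assumed RFI and return the probability after `steps` stages."""
    p = p0
    for _ in range(steps):
        # Hypothetical increment: -v0 * P * (1 - P) * (P - rho).
        p = p - v0 * p * (1.0 - p) * (p - rho)
    return p


# Illustrative values only (assumptions, not from the source).
rho = 0.4
attractive = ep_limit(p0=0.9, rho=rho, v0=1.0)        # v0 > 0: tends to rho
repulsive_up = ep_limit(p0=0.5, rho=rho, v0=-1.0)     # v0 < 0, P(0) > rho: tends to 1
repulsive_down = ep_limit(p0=0.3, rho=rho, v0=-1.0)   # v0 < 0, P(0) < rho: tends to 0
```

In this toy model the sign of V0 decides whether the equilibrium attracts or repels, and in the repulsive case the initial condition decides which alternative wins.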

The interpretation of the limit behavior of SEs in applications such as population genetics (Crow and Kimura 2009; see also Fisher 1930, Schoen 2006, Wright 1969), economics and finance (Shiryaev 1999), and others is of practical interest. Another interpretation of SE asymptotic trends is given for the models of collective behavior or collective teaching (section 3.4.6).

The convergence of the SDE solutions in a stochastic approximation scheme (Nevelson and Has’minskii 1976) is proved using the classifiers defined by truncated RFIs (Definition 3.5.1).

Classification of SEs is based on limit theorems for the stochastic approximation of SEs (Theorem 3.5.2).

Sections 3.6 and 3.7 give an important illustrative example of MSEs with three possible decisional alternatives. The classification theorems are considered for EPs as well as for the ternary SEs.

In Chapter 4, we use the mathematical model of discrete Markov diffusion (DMD) to describe the mechanisms of interaction of biological macromolecules. This model is determined by the solution of the SDEs with predictable and stochastic components.

The basic statistical data is derived from the dynamic monitoring of macromolecules (Rigler and Elson 2001) in fluorescence correlation spectroscopy (FCS), which calculates the rate of fluctuations of fluorescence-labeled molecules in a confocal space of observations.

For this purpose, we use the estimates, by trajectories, of the DMD models’ statistical parameters V and σ² for the mathematical description of mechanisms in collective biological interactions.

The mathematical model and its verification on the simulated data set are obtained on the basis of the well-known Stokes–Einstein model. In particular, we used numerically generated data of a mixture of particles with two values of the diffusion coefficient: D1 = 10 and D2 = 100 μm². Such simulated data, considered as trajectories of the new mathematical model of the DMD, have shown good discriminatory properties for detecting a mixture of two Brownian movements: “fast” and “slow.”