Extremes in Random Fields - Benjamin Yakir - E-Book

Extremes in Random Fields E-Book

Benjamin Yakir

Description

Presents a useful new technique for analyzing the extreme-value behaviour of random fields

Modern science typically involves the analysis of increasingly complex data. The extreme values that emerge in the statistical analysis of complex data are often of particular interest. This book focuses on the analytical approximations of the statistical significance of extreme values. Several relatively complex applications of the technique to problems that emerge in practical situations are presented.  All the examples are difficult to analyze using classical methods, and as a result, the author presents a novel technique, designed to be more accessible to the user.

Extreme value analysis is widely applied in areas such as operational research, bioinformatics, computer science, finance and many other disciplines. This book will be useful for scientists, engineers and advanced graduate students who need to develop their own statistical tools for the analysis of their data. Whilst this book may not provide the reader with the specific answer, it will inspire them to rethink their problem in the context of random fields, apply the method, and produce a solution.


Number of pages: 398

Year of publication: 2013




Table of Contents

Series Page

Title Page

Copyright

Dedication

Preface

Acknowledgments

Part I: Theory

Chapter 1: Introduction

1.1 Distribution of extremes in random fields

1.2 Outline of the method

1.3 Gaussian and asymptotically Gaussian random fields

1.4 Applications

Chapter 2: Basic examples

2.1 Introduction

2.2 A power-one sequential test

2.3 A kernel-based scanning statistic

2.4 Other methods

Chapter 3: Approximation of the local rate

3.1 Introduction

3.2 Preliminary localization and approximation

3.3 Measure transformation

3.4 Application of the localization theorem

3.5 Integration

Chapter 4: From the local to the global

4.1 Introduction

4.2 Poisson approximation of probabilities

4.3 Average run length to false alarm

Chapter 5: The localization theorem

5.1 Introduction

5.2 A simplified version of the localization theorem

5.3 The localization theorem

5.4 A local limit theorem

5.5 Edge effects and higher order approximations

Part II: Applications

Chapter 6: Nonparametric tests: Kolmogorov–Smirnov and Peacock

6.1 Introduction

6.2 Analysis of the one-dimensional case

6.3 Peacock's test

6.4 Relations to scanning statistics

Chapter 7: Copy number variations

7.1 Introduction

7.2 The statistical model

7.3 Analysis of statistical properties

7.4 The false discovery rate

Chapter 8: Sequential monitoring of an image

8.1 Introduction

8.2 The statistical model

8.3 Analysis of statistical properties

8.4 Optimal change-point detection

Chapter 9: Buffer overflow

9.1 Introduction

9.2 The statistical model

9.3 Analysis of statistical properties

9.4 Heavy tail distribution, long-range dependence, and self-similarity

Chapter 10: Computing Pickands' constants

10.1 Introduction

10.2 Representations of constants

10.3 Analysis of statistical error

10.4 Enumerating the effect of local fluctuations

Appendix: Mathematical background

A.1 Transforms

A.2 Approximations of sum of independent random elements

A.3 Concentration inequalities

A.4 Random walks

A.5 Renewal theory

A.6 The Gaussian distribution

A.7 Large sample inference

A.8 Integration

A.9 Poisson approximation

A.10 Convexity

References

Index

Wiley Series in Probability and Statistics

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by WALTER A. SHEWHART and SAMUEL S. WILKS

Editors

David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Harvey Goldstein, Iain M. Johnstone, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg

Editors Emeriti

Vic Barnett, J. Stuart Hunter, Joseph B. Kadane, Jozef L. Teugels

This edition first published 2013

© 2013 by Higher Education Press

All rights reserved

Registered office

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data

Yakir, Benjamin, author.

Extremes in random fields : a theory and its applications/Benjamin Yakir.

pages cm

Includes bibliographical references and index.

ISBN 978-1-118-62020-5 (hardback)

1. Random fields. I. Title.

QA274.45.Y35 2013

519.2′3– dc23

2013018539

A catalogue record for this book is available from the British Library.

ISBN: 978-1-118-62020-5

Preface

This text started as class notes for a course that I gave in the Mathematical Sciences Center (MSC) in Tsinghua University, Beijing, that got overblown and became a book. I was enjoying a sabbatical leave in the Department of Statistics and Applied Probability (DSAP) of the National University of Singapore when I was given an offer to teach a summer course in China. Of course I accepted. How could I resist the opportunity to fulfil a childhood dream of visiting China?

After accepting the proposal I had to decide what to teach. I decided to fulfil yet another dream, the dream of summarizing and unifying a subject I was writing about all my career, even before I knew what the subject was. The subject is the distribution of extremes in random fields and the analysis of statistical problems that can be formulated in relation to such extremes. Immediately after obtaining my PhD, and as a continuation of my PhD thesis, I was interested in the investigation of the average run length of the Shiryaev–Roberts change-point detection rule. Therefore, I found it natural to try to address a challenge that was presented to me by David Siegmund during a barbecue meal that he prepared for me in his yard. The challenge was to develop a simpler method for analyzing this average run length. In an attempt to attack this problem I began experimenting with the likelihood ratio identity, one of David's favorite techniques, and followed the road that eventually led me to writing this book.

The original problem was the investigation of average run length in a sequential change-point detection problem.1 However, the basic technique that was developed turned out to be useful for the investigation of a relatively wide array of different statistical problems that involve the distribution of maxima.2 Among other things, David and I used the method in order to investigate the significance level of sequence alignment, for the computation of the false detection rate in scanning statistics, for producing more efficient ways of simulation, etc. Each application required this modification or that trick in order to apply the basic principle. However, after 20 years of repeating the same argument even I was able to identify the pattern. The thrust of this book is a description of the pattern and the demonstration of its usefulness in the analysis of nontrivial statistical problems.

The basic argument relies on a likelihood ratio identity that uses a sum of likelihood ratios. This identity translates the original problem that involves the approximation of a vanishingly small probability to a problem that calls for the summation of approximations of expectations. The expectations are with respect to alternative distributions in which the event in question is much more likely to occur. Moreover, by carefully selecting the alternative distributions one may separate the leading term in the probability from the expectations that form the sum, enabling the investigation to concentrate on finer effects.
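One way to write such an identity is sketched below. The notation is an assumption made here for illustration and is not taken from the preface: $T$ is a finite index set, $A$ is the rare event of interest, $P_t$ are alternative probability measures that are mutually absolutely continuous with respect to $P$, $\ell_t = \log(dP_t/dP)$ is the log-likelihood ratio, and $E_t$ denotes expectation under $P_t$.

```latex
\[
P(A)
  = E\!\left[\frac{\sum_{t \in T} e^{\ell_t}}{\sum_{s \in T} e^{\ell_s}}\,\mathbf{1}_A\right]
  = \sum_{t \in T} E_t\!\left[\frac{1}{\sum_{s \in T} e^{\ell_s}}\,\mathbf{1}_A\right].
\]
```

The first equality multiplies the indicator by a ratio equal to one; the second uses $E[e^{\ell_t} Y] = E_t[Y]$ term by term. Each summand is an expectation under a distribution $P_t$, which can be chosen so that $A$ is no longer a rare event.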

The method is useful since it does not rely on the ordering of the parameter set and it does not require the normal distribution. In many applications, some of which are presented in the book, a natural formulation of the model calls for the use of collections of random variables that are parameterized not by subsets of the real line. Frequently, the normal assumption may fit the limit in a central limit formulation but may not fit as a description of the extreme tail. In all such cases an alternative to the methods that are usually advocated in the literature is required. The method we present is such an alternative, and we felt that others may benefit from knowing about it.

This is why we wrote the book. But who is the target audience? This is a tough call. Even if I may state otherwise, the book requires a relatively advanced knowledge in probability as background, perhaps at the level of Durrett's book.3 Prior knowledge in statistics is an advantage. Indeed, there is an appendix that lists theorems and results and can be used as reference for the statements that are made in the book. Still, I guess that this book is not an easy read even for experts, and much less so for students.

With this warning in mind, I hope that the effort that is required in reading the book will be rewarding, certainly for an expert who wants to add yet another method to the toolbox, but also for a student who wants to become an expert. For such students, the book can be used as a basis for an advanced seminar. Reading chapters of the book can be used as a primer for a student who is then required to analyze a new problem that was not digested for him/her in the book. This is how I intend to use this book with my students.

The teacher can start such a course by discussing Chapters 1–4 that give the basic background and demonstrate the technique. Chapter 5 is more technical and can be skipped, unless the main interest is in the mathematical details. From the second part of the book it is probably recommended to go over Chapter 6, which is of an intermediate level of difficulty, and then read some or all of Chapters 7–10 depending on the interests of the teacher and the students and on the time constraints.

1 Yakir B., Pollak M. A new representation for the renewal-theoretic constant appearing in asymptotic approximations of large deviations. Ann. Appl. Probab. 8, 749-774 (1998).

2 Grossman S., Yakir B. Large deviations for global maxima of independent superadditive processes with negative drift and an application to optimal sequence alignment. Bernoulli 5, 829–845 (2004).
Siegmund D.O., Yakir B. Approximate p-values for local sequence alignments. Ann. Statist. 28, 657–680 (2000).
Siegmund D.O., Yakir B. Statistical analysis of direct identity by descent mapping. Ann. Hum. Genet. 67, 464–470 (2003).
Siegmund D.O., Yakir B. Correction note: Approximate p-values for local sequence alignments. Ann. Statist. 31, 1027–1031 (2003).
Siegmund D.O., Yakir B. Significance level in interval mapping. In Development of Modern Statistics and Related Topics, Series in Biostatistics, Volume 1. World Scientific Publishing, River Edge, NJ, 10–19 (2003).
Shi J., Siegmund D.O., Yakir B. Importance sampling for estimating p-values in linkage analysis. JASA 102, 929–937 (2007).
Yakir B. On the average run length to false alarm in surveillance problems which possess an invariance structure. Ann. Statist. 26, 1198–1214 (1998).
Yakir B. Approximation of the p-value in a multipoint linkage analysis using grandparent grandchild pairs and partially informative markers. Nonlinear Anal. 47, 1973–1984 (2001).
Yakir B. Discussion on “Is average run length to false alarm always an informative criterion?” by Yajun Mei. Sequential Analysis 27, 406–410 (2008).

3 Durrett R. Probability: Theory and Examples (2nd Edition). Duxbury Press, Belmont, CA (1995).

Acknowledgments

My first acknowledgments are to environments, and especially the people who enabled these environments. The first half of the book was written mainly in DSAP. I know of no better place to do this type of scientific work. I will always be grateful. The second place is MSC. Without them I do not know when, if at all, this book would have been written. Next, I would like to recognize the financial support that I got from the Israel Science Foundation (Grant No. 325/09) and from the US–Israel Binational Science Foundation (Grant No. 2006101). This support was instrumental for the development of the original work that led to the applications that are presented in the second part of the book.

Some of the people that gave me a helping hand I would like to mention by name. Unfortunately, I cannot give the names of the anonymous reviewers who made very useful suggestions on the first draft of the book and helped me improve it. But I can give the name of the editor from Higher Education Press, Liping Wang. Thanks to her this is a book and not just class notes. Also I would like to thank Yuval Nardi, Moshe Pollak, Ton Dieker and Nancy Zhang who coauthored with David Siegmund and myself some of the works that are related directly to the content of the book.

And finally there is David Siegmund. The work presented in this book is basically our joint work. The only reason that we do not share authorship is the fact that I wanted to dedicate this book to him as my modest contribution to the celebration of his career and his accomplishments and as an appreciation for what he gave me. It is not appropriate for a book to be dedicated to one of its authors. So here it is: this is for you, David.

Benjamin Yakir, Jerusalem, Israel

February 2013

Part I

Theory

Chapter 1

Introduction

1.1 Distribution of extremes in random fields

The aim of this book is to present a method for analyzing the tail distribution of extreme values in random fields. A random field can be considered as a collection of random variables $\{X_t : t \in T\}$, indexed by a set of parameters $T$. The index set $T$ may be quite complex. However, in the applications that we will analyze in this book it will typically turn out that $T$ is a ‘nice’ subset of $\mathbb{R}^d$, the $d$-dimensional space of real numbers.

In some statistical applications one is interested in probabilities such as:

$$P\Bigl(\max_{t \in T} X_t \ge x\Bigr),$$

the probability that the maximum of the random field exceeds a threshold $x$, for large values of $x$. There are only a few special cases in which the problem of computing such probabilities has an exact solution. In all other cases one is forced to use numerical methods, such as simulations, or to apply asymptotic approximations in order to evaluate the probability. This book concentrates on the application of the proposed method for producing asymptotic analytical expansions of the probability. Nonetheless, some elements of the method may be, and have been, applied in order to make simulation-based numerical evaluation more efficient. An application that illustrates the usefulness of the method in the context of simulations is presented in the second part of the book.
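To make the contrast between exact solutions and simulation concrete, the following minimal sketch treats one of the few cases with an exact answer: a field of n independent standard normal variables, for which the probability that the maximum is at least x equals 1 − Φ(x)^n. The independence assumption and the function names `exact_tail_iid` and `simulated_tail_iid` are chosen here for illustration only; they are not taken from the book.

```python
import random
from statistics import NormalDist


def exact_tail_iid(n, x):
    """Exact P(max of n i.i.d. N(0,1) variables >= x) = 1 - Phi(x)**n."""
    return 1.0 - NormalDist().cdf(x) ** n


def simulated_tail_iid(n, x, reps=20000, seed=1):
    """Plain Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one realization of the field and check whether its maximum crosses x.
        if max(rng.gauss(0.0, 1.0) for _ in range(n)) >= x:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    n, x = 20, 2.0
    print("exact:    ", exact_tail_iid(n, x))
    print("simulated:", simulated_tail_iid(n, x))
```

For a moderate threshold the two values agree closely. For a large threshold plain simulation breaks down: with n = 20 and x = 5 the exact probability is about 6 × 10⁻⁶, so most simulation runs of practical size record no crossings at all. This is the regime in which asymptotic approximations, or more efficient simulation schemes, are needed.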

As a motivating example consider scanning statistics. Scanning statistics are used in order to detect rare signals in an environment contaminated by random noise. For example, let us assume measurements that are taken in a one-dimensional environment. Each measurement is associated with a point in the environment and the points are equally spaced. For the most part, the expected values of the observations are fixed at some baseline level throughout the environment. However, at some unknown locations the expected value is different from the baseline. Such a shift of the expectation extends over an interval of unknown length. An interval of shifted expectations is the signal we seek to identify. Such a signal is parameterized by the location of the interval, by the length of the interval, and perhaps also by the magnitude of the shift.

The expectations of the observations correspond to signals (or lack thereof). A complication in fulfilling the task at hand is the fact that the observations are subject also to random noise, which may be parameterized by the variance of the observations. Frequently, this random noise is taken to be normally distributed and independent among observations. In such a case, the expectation structure and the variance specify completely the distribution of the observations.

Say that our goal is to decide whether or not there is any signal in the environment. A reasonable approach, one with statistical merit, is to associate with each potential signal a statistic that summarizes the information in the data regarding that signal. For example, if signals are all of the form of an interval with a fixed level of the expectation above the baseline then an appropriate statistic is the standardized sample average of the observations that belong to the interval, with standardization conducted with respect to the baseline expectation and variance. The presence of a signal in the environment is announced if there exists a statistic with a value above a previously determined threshold. False detection occurs when all observations share the same background expectation level but, due to random fluctuations, the threshold is crossed. The preliminary task of the statistician, in order to limit the probability of false alarms, is to determine the value of the threshold.
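The scanning procedure described above can be sketched in a few lines. The sketch assumes independent normal noise with known baseline and standard deviation, as in the text, and it sets the threshold by brute-force simulation under the no-signal model rather than by the analytical approximations that are the subject of this book. The function names `scan_statistic` and `null_threshold`, and the restriction to interval lengths between `min_len` and `max_len`, are assumptions made for illustration.

```python
import math
import random


def scan_statistic(obs, min_len, max_len, baseline=0.0, sigma=1.0):
    """Maximum standardized interval sum over all candidate intervals.

    Returns (value, (start, length)) for the maximizing interval.
    """
    n = len(obs)
    # Prefix sums of the centered observations give each interval sum in O(1).
    prefix = [0.0]
    for y in obs:
        prefix.append(prefix[-1] + (y - baseline))
    best, best_arg = -math.inf, None
    for length in range(min_len, max_len + 1):
        scale = sigma * math.sqrt(length)  # std. dev. of a sum of `length` terms
        for start in range(n - length + 1):
            z = (prefix[start + length] - prefix[start]) / scale
            if z > best:
                best, best_arg = z, (start, length)
    return best, best_arg


def null_threshold(n, min_len, max_len, alpha=0.05, reps=2000, seed=0):
    """Empirical (1 - alpha) quantile of the scan statistic when no signal is present."""
    rng = random.Random(seed)
    maxima = sorted(
        scan_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], min_len, max_len)[0]
        for _ in range(reps)
    )
    index = min(reps - 1, math.ceil((1 - alpha) * reps) - 1)
    return maxima[index]


if __name__ == "__main__":
    threshold = null_threshold(n=50, min_len=1, max_len=10, reps=500)
    data = [0.0, 0.0, 3.0, 3.0, 3.0, 0.0, 0.0]
    value, (start, length) = scan_statistic(data, min_len=1, max_len=4)
    print("threshold:", threshold)
    print("statistic:", value, "interval start:", start, "length:", length)
```

A signal is announced when the returned statistic exceeds the threshold. Calibrating by simulation in this way becomes expensive when the false alarm probability is very small, which is the motivation for an analytical approximation of the tail probability of the maximum.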
