Provides an introduction to basic structures of probability with a view towards applications in information technology
A First Course in Probability and Markov Chains presents an introduction to the basic elements of probability and focuses on two main areas. The first part explores notions and structures in probability, including combinatorics, probability measures, probability distributions, conditional probability, inclusion-exclusion formulas, random variables, dispersion indexes, independent random variables, the weak and strong laws of large numbers and the central limit theorem. The second part of the book focuses on discrete time discrete Markov chains, addressed together with an introduction to Poisson processes and continuous time discrete Markov chains. The book makes use of measure-theoretic notation that unifies the presentation, in particular avoiding the separate treatment of continuous and discrete distributions.
A First Course in Probability and Markov Chains:
The authors present a unified and comprehensive overview of probability and Markov Chains aimed at educating engineers working with probability and statistics as well as advanced undergraduate students in sciences and engineering with a basic background in mathematical analysis and linear algebra.
Page count: 443
Year of publication: 2012
Table of Contents
Title Page
Copyright
Preface
Chapter 1: Combinatorics
1.1 Binomial Coefficients
1.2 Sets, Permutations and Functions
1.3 Drawings
1.4 Grouping
Chapter 2: Probability measures
2.1 Elementary Probability
2.2 Basic Facts
2.3 Conditional Probability
2.4 Inclusion–Exclusion Principle
Chapter 3: Random variables
3.1 Random Variables
3.2 A Few Discrete Distributions
3.3 Some Absolutely Continuous Distributions
Chapter 4: Vector valued random variables
4.1 Joint Distribution
4.2 Covariance
4.3 Independent Random Variables
4.4 Sequences of Independent Random Variables
Chapter 5: Discrete time Markov chains
5.1 Stochastic Matrices
5.2 Markov Chains
5.3 Some Characteristic Parameters
5.4 Finite Stochastic Matrices
5.5 Regular Stochastic Matrices
5.6 Ergodic Property
5.7 Renewal Theorem
Chapter 6: An introduction to continuous time Markov chains
6.1 Poisson Process
6.2 Continuous Time Markov Chains
Appendix A: Power series
A.1 Basic Properties
A.2 Product of Series
A.3 Banach Space Valued Power Series
Appendix B: Measure and integration
B.1 Measures
B.2 Measurable Functions and Integration
B.3 Product Measures and Iterated Integrals
B.4 Convergence Theorems
Appendix C: Systems of linear ordinary differential equations
C.1 Cauchy Problem
C.2 Efficient Computation of $e^{Qt}$
C.3 Continuous Semigroups
References
Index
This edition first published 2013
© 2013 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Modica, Giuseppe.
A first course in probability and Markov chains / Giuseppe Modica and Laura Poggiolini.
pages cm
Summary: “A first course in Probability and Markov Chains presents an introduction to the basic elements in statistics and focuses in two main areas” — Provided by publisher.
Includes bibliographical references and index.
ISBN 978-1-119-94487-4 (hardback)
1. Markov processes. I. Poggiolini, Laura. II. Title.
QA274.7.M63 2013
519.2′33— dc23
2012033463
A catalogue record for this book is available from the British Library.
ISBN: 978-1-119-94487-4
Preface
This book collects topics covered in introductory courses in probability delivered by the authors at the University of Florence. It aims to introduce the reader to typical structures of probability with a language appropriate for further advanced reading. The attention is mainly focused on basic structures.
There is a well established tradition of studies in probability due to the wide range of possible applications of related concepts and structures in science and technology. Therefore, an enormous amount of literature on the subject is available, including treatises, lecture notes, reports, journal papers and web pages. The list of references at the end of this book is obviously incomplete and includes only references used directly in writing the following pages. Throughout this book we adopt the language of measure theory (relevant notions are recalled in the appendices).
The first part of the book deals with basic notions of combinatorics and probability calculus: counting problems and uniform probability, probability measures, probability distributions, conditional probability, the inclusion–exclusion principle, random variables, dispersion indexes, independence, and the laws of large numbers. The central limit theorem is presented without proof. Only a basic knowledge of linear algebra and mathematical analysis is required.
In the second part we discuss, as a first example of stochastic processes, Markov chains with discrete time and discrete states, including the Markov chain Monte Carlo method, and we introduce the Poisson process and continuous time Markov chains with finite state space. For this part, further notions in mathematical analysis (summarized in the appendices) are required: the Banach fixed point theorem, systems of linear ordinary differential equations, powers and power series of matrices.
We wish to thank all the students who have attended our courses. We also wish to thank our colleagues Matteo Focardi, Mariano Giaquinta, Paolo Maria Mariano, Andrea Mennucci, Marco Romito and Enrico Vicario who helped us with suggestions and comments. Special gratitude goes to Enrico Vicario for many helpful discussions on the applications.
Our special thanks also go to the editorial staff of Wiley for the excellent quality of their work.
We have tried to avoid misprints and errors. However, we would be very grateful to be notified of any errors or misprints and would be glad to receive any criticism or comments. Our e-mail addresses are:
We will try to keep up an errata corrige at the following web pages:
Chapter 1
Combinatorics
Combinatorics deals with the cardinality of classes of objects. The first example that jumps to our minds is the computation of how many triplets can be drawn from 90 different balls. In this chapter and the next we are going to compute the cardinality of several classes of objects.
Binomial coefficients are defined as
$$\binom{n}{k} := \frac{n!}{k!\,(n-k)!} \quad \text{if } 0 \le k \le n, \qquad \binom{n}{k} := 0 \quad \text{if } k > n \ge 0.$$
Binomial coefficients are usually grouped in an infinite matrix
$$C = \big(C^i_j\big)_{i,j \ge 0}, \qquad C^i_j := \binom{i}{j},$$
called a Pascal triangle given the triangular arrangement of the nonzero entries, see Figure 1.1. Here and throughout the book we denote the entries of a matrix (finite or infinite) as $C^i_j$, where the superscript i and the subscript j denote the ith row and the jth column, respectively. Notice that the entries of each row of C are zero if the column index is large enough: $C^i_j = 0$ whenever $j > i \ge 0$. We also recall the Newton binomial formula,
$$(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}.$$
The formula can be proven by an induction argument on n or by means of the Taylor formula.
Figure 1.1 Pascal matrix of binomial coefficients $\binom{n}{k}$, k, n ≥ 0.
Many formulas are known on binomial coefficients. In the following proposition we collect some of the simplest and most useful ones.
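These identities lend themselves to a quick numerical check. The following Python sketch (an illustration of ours, not taken from the text) builds a finite upper-left corner of the Pascal matrix via the recurrence $C^i_j = C^{i-1}_{j-1} + C^{i-1}_j$ and verifies the Newton binomial formula in the special case a = b = 1.

```python
from math import comb

def pascal_matrix(size):
    """Build the (size x size) upper-left corner of the Pascal matrix C,
    with C[i][j] = binomial(i, j) and zero entries for j > i."""
    C = [[0] * size for _ in range(size)]
    for i in range(size):
        C[i][0] = 1
        for j in range(1, i + 1):
            # Pascal recurrence: C(i, j) = C(i-1, j-1) + C(i-1, j)
            C[i][j] = C[i - 1][j - 1] + C[i - 1][j]
    return C

C = pascal_matrix(8)
# Each row sums to 2^i: Newton's binomial formula with a = b = 1.
assert all(sum(row) == 2 ** i for i, row in enumerate(C))
# Agreement with math.comb, and symmetry C(i, j) = C(i, i-j).
assert all(C[i][j] == comb(i, j) == C[i][i - j]
           for i in range(8) for j in range(i + 1))
```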
Estimate (ix) in Proposition 1.1 can be made more precise. For instance, from the Stirling asymptotic estimate of the factorial,
$$n! \sim \sqrt{2\pi n}\,\Big(\frac{n}{e}\Big)^n \quad \text{as } n \to \infty,$$
one gets
$$\binom{2n}{n} = \frac{(2n)!}{(n!)^2} \sim \frac{4^n}{\sqrt{\pi n}},$$
so that
$$\frac{\binom{2n}{n}\,\sqrt{\pi n}}{4^n} \to 1 \quad \text{as } n \to \infty,$$
or, equivalently,
$$\binom{2n}{n} = \frac{4^n}{\sqrt{\pi n}}\,\big(1 + o(1)\big) \quad \text{as } n \to \infty. \tag{1.1}$$
For $\alpha \in \mathbb{R}$ we define the sequence of generalized binomial coefficients as
$$\binom{\alpha}{k} := \frac{\alpha(\alpha-1)\cdots(\alpha-k+1)}{k!} \quad \text{for } k \ge 1, \qquad \binom{\alpha}{0} := 1.$$
Notice that $\binom{\alpha}{k}$ agrees with the ordinary binomial coefficient if $\alpha \in \mathbb{N}$, and that $\binom{\alpha}{k} \neq 0$ for every k if $\alpha \notin \mathbb{N}$. The power series
$$\sum_{k=0}^{\infty} \binom{\alpha}{k} x^k \tag{1.2}$$
is called the binomial series.
Figure 1.2 The coefficients $\binom{\alpha}{k}$.
A few entries of the inverse of the matrix of binomial coefficients are shown in Figure 1.3. As a consequence of Theorem 1.4 the following inversion formulas hold.
Figure 1.3 The inverse of the matrix of binomial coefficients.
Similarly,
Figure 1.4 The sum of a few series related to the geometric series.
We recall that a finite set A is an unordered collection of pairwise different objects. For example, the collection A whose objects are 1, 2, 3 is a finite set which we denote as $\{1, 2, 3\}$; the collection 1, 2, 2, 3 is not a finite set, and $\{1, 2, 3\}$ and $\{3, 2, 1\}$ are the same set.
Let N be a finite set and let n be its cardinality. Without loss of generality, we can assume $N = \{1, \dots, n\}$. A permutation of N is an injective (and thus one-to-one) mapping π: N → N. Since composing bijective maps yields another bijective map, the family of permutations of a set N is a group with respect to the composition of maps; the unit element is the identity map. This group is called the group of permutations of N and is denoted as $S_n$. Notice that $S_n$ is not a commutative group if n ≥ 3.
A permutation π can be described by listing its images π(1), π(2), …, π(n); for instance, the permutation of $\{1, \dots, 6\}$ given by π(1) = 2, π(2) = 3, π(3) = 1, π(4) = 4, π(5) = 6, π(6) = 5 can be written, in brief, with its image-word 231465.
The set of permutations of $\{1, \dots, n\}$ has n! elements,
$$\# S_n = n!.$$
In fact, the image π(1) of 1 can be chosen among n possible values, then the image π(2) of 2 can be chosen among the n − 1 remaining values and so on. Hence
$$\# S_n = n\,(n-1) \cdots 2 \cdot 1 = n!.$$
We now compute the cardinality dn of the set of permutations without fixed points, also called derangements.
Figure 1.5 contains the first elements of the sequence {dn}.
Figure 1.5 From the left, the numbers d0, d1, d2, d3, … of derangements of 0, 1, 2, 3, … points.
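The sequence in Figure 1.5 can be reproduced with a short computation. The sketch below (illustrative Python, assuming the standard inclusion–exclusion formula $d_n = n!\sum_{k=0}^{n} (-1)^k/k!$, which is consistent with the tools of this chapter) lists the first derangement numbers.

```python
from math import factorial

def derangements(n):
    """Number d_n of permutations of n points with no fixed point,
    via inclusion-exclusion: d_n = n! * sum_{k=0}^{n} (-1)^k / k!."""
    # factorial(n) // factorial(k) is an exact integer for k <= n.
    return sum((-1) ** k * factorial(n) // factorial(k) for k in range(n + 1))

# First values d_0, d_1, d_2, ...
print([derangements(n) for n in range(7)])  # → [1, 0, 1, 2, 9, 44, 265]
```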
Another interesting structure is an unordered list of elements taken from a given set A. This structure is called a multiset on A. More formally, a multiset on a set A is a couple (A, a) where A is a given set and $a: A \to \mathbb{N}$ is the multiplicity function, which counts ‘how many times’ an element x ∈ A appears in the multiset. Clearly, each set is a multiset where each object has multiplicity 1. We denote as $a^2 b^2 c^5$ the multiset on $\{a, b, c\}$ where a and b have multiplicity 2 and c has multiplicity 5. The cardinality of a multiset (A, a) is $\sum_{x \in A} a(x)$ and is denoted by |(A, a)| or #(A, a). For instance, the cardinality of $a^2 b^2 c^5$ is 9.
If B is a subset of A, then B is also the multiset (A, a) on A where
$$a(x) = \begin{cases} 1 & \text{if } x \in B, \\ 0 & \text{if } x \in A \setminus B. \end{cases}$$
Given two multisets (B, b) and (A, a), we say that (B, b) is included in (A, a) if B ⊂ A and b(x) ≤ a(x) for every x ∈ B.
Given a set A, a list of k objects from the set A or a k-word with symbols in A is an ordered k-tuple of objects. For instance, if , then the 6-tuples (1,2,3,3,2,1) and (3,2,1,3,2,1) are two different 6-words of objects in A. In these lists, or words, repetitions are allowed and the order of the objects is taken into account. Since each element of the list can be chosen independently of the others, there are n possible choices for each object in the list. Hence, the following holds.
Figure 1.6 The cardinality of the set of injective maps for n, k ≥ 0.
For k ≤ n, consider the set of strictly monotone increasing functions from $\{1, \dots, k\}$ to $\{1, \dots, n\}$. The image-list of any such function is an ordered k-tuple of strictly increasing, hence pairwise different, elements of $\{1, \dots, n\}$. The k-tuple is thus identified by the subset of the elements of $\{1, \dots, n\}$ appearing in it, so that we have the following.
Consider now the class of monotone nondecreasing functions from $\{1, \dots, k\}$ to $\{1, \dots, n\}$. The image-list of any such function is a nondecreasing ordered k-tuple of elements of $\{1, \dots, n\}$, so that elements can be repeated. These functions are as many as the multisets with cardinality k included in a multiset (A, a), where $A = \{1, \dots, n\}$ and a(x) ≥ k for every x. Thus, see Proposition 1.12, we have the following.
The computation of the number of surjective functions is more delicate. Let $S_{k,n}$ denote the family of surjective functions from $\{1, \dots, k\}$ onto $\{1, \dots, n\}$ and let $s_{k,n} := \# S_{k,n}$.
Every function from $\{1, \dots, k\}$ to $\{1, \dots, n\}$ is surjective onto its image. There are exactly $\binom{n}{j}$ subsets of $\{1, \dots, n\}$ with cardinality j, and there are $s_{k,j}$ different surjective functions onto each of these sets. Thus, grouping the $n^k$ functions from $\{1, \dots, k\}$ to $\{1, \dots, n\}$ according to their image, and since $s_{k,0} = 0$, we get
$$n^k = \sum_{j=0}^{n} \binom{n}{j}\, s_{k,j}. \tag{1.5}$$
Therefore, applying the inversion formula in Corollary 1.5 we conclude the following.
We point out that the equality holds also if k < n, so that
$$\sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j}\, j^k = 0 \qquad \text{whenever } k < n.$$
Another useful formula for $s_{k,n}$ is an inductive one, obtained starting from $s_{k,k} = k!$ and $s_{k,n} = 0$ for any k and n with k < n:
$$s_{k+1,n} = n\,\big(s_{k,n} + s_{k,n-1}\big). \tag{1.6}$$
Some of these values are shown in Figure 1.7.
Figure 1.7 The cardinality of the set of surjective maps for n, k ≥ 0.
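The table in Figure 1.7 can be regenerated, and the inclusion–exclusion formula cross-checked against brute-force enumeration, with a short Python sketch (the function names are ours).

```python
from itertools import product
from math import comb

def surjections(k, n):
    """Number of surjective maps from {1,...,k} onto {1,...,n},
    via the inclusion-exclusion / inversion formula."""
    return sum((-1) ** (n - j) * comb(n, j) * j ** k for j in range(n + 1))

def surjections_brute(k, n):
    """Brute force: enumerate all maps {1..k} -> {1..n}, keep the onto ones."""
    return sum(1 for f in product(range(n), repeat=k) if len(set(f)) == n)

# The closed formula agrees with direct enumeration on small cases.
assert all(surjections(k, n) == surjections_brute(k, n)
           for k in range(6) for n in range(1, 5))
assert surjections(3, 3) == 6   # bijections of a 3-element set
assert surjections(2, 3) == 0   # cannot map 2 points onto 3
```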
A drawing or selection of k objects from a population of n is the choice of k elements among the n available ones. We want to compute how many such selections are possible. In order to make this computation, it is necessary to be more precise, both on the composition of the population and on the rules of the selection, for instance whether the order of selection is relevant or not. We consider a few cases:
- The population is made of pairwise different elements, as in a lottery: in other words, the population is a set.
- The population is a multiset (A, a). In this case, we say that we are dealing with a drawing from A with repetitions.
- The selected objects may be given an order. In this case we say that we consider an ordered selection. Unordered selections are also called simple selections.
Some drawing policies simply boil down to the previous cases:
- In the lottery game, numbers are drawn one after another, but the order of drawings is not taken into account: it is a simple selection of objects from a set.
- In ordered selections the k elements are selected one after another and the order is taken into account.
- A drawing with replacement, i.e. a drawing from a set where each selected object is put back into the population before the next drawing, is equivalent to a drawing with repetitions, i.e. to a drawing from a multiset where each element has multiplicity larger than the total number of selected objects.
Ordered drawings of k objects from a multiset (A, a) are k-words with symbols taken from A.
In particular, ordered drawings without replacement of k objects from a set of n elements are injective k-words: there are
$$n\,(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}$$
different k-words with pairwise different symbols.
In particular, the number of ordered drawings with replacement of k elements from A is $n^k$.
The population from which we make the selection is a set A. To draw k objects from A is equivalent to selecting a subset of k elements of A: we do not distinguish selections that contain the same objects with a different ordering.
A simple drawing with repetitions of k objects from a multiset (A, a) is characterized by the number of times each element of A is selected, i.e. by a multiset S of k elements included in (A, a) (cf. Proposition 1.17).
The previous results on drawings can also be obtained from the following combinatorics properties of drawings.
A similar result holds for ordered drawings.
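The four basic drawing counts can be cross-checked by direct enumeration with Python's itertools (an illustrative sketch; the population A is an arbitrary example of ours).

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm

A = ['a', 'b', 'c', 'd', 'e']   # a population that is a set
n, k = len(A), 3

# Ordered drawings with replacement: n^k k-words with symbols in A.
assert len(list(product(A, repeat=k))) == n ** k
# Ordered drawings without replacement: n!/(n-k)! injective k-words.
assert len(list(permutations(A, k))) == perm(n, k)
# Simple (unordered) drawings: C(n, k) subsets of cardinality k.
assert len(list(combinations(A, k))) == comb(n, k)
# Simple drawings with repetitions: C(n+k-1, k) multisets of cardinality k.
assert len(list(combinations_with_replacement(A, k))) == comb(n + k - 1, k)
```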
Many classical counting problems amount to a collocation or grouping problem: how many different arrangements of k objects in n boxes are there? Putting it another way, how many different ways of grouping k objects into n groups are there? Also in this case a definite answer cannot be given: we must be more precise on the population to be arranged, on the rules (or policy) of the procedure, and on the way the groups are evaluated. For example, one must say whether the objects to be arranged are pairwise different or not, whether the order of the objects in each box must be taken into account or not, whether the boxes are pairwise distinct or not, and whether further constraints are imposed. Here we deal with a few cases, all referring to collocation or grouping in pairwise different boxes. We consider the formed groups as a list instead of as a set: for instance, if we start with the objects a and b, then the two arrangements in two boxes ({a}, {b}) and ({b}, {a}) are considered to be different.
Arranging k distinct objects in n pairwise different boxes is the act of deciding the box in which each object is going to be located. Since both the objects and the boxes are pairwise distinct, we may identify the objects and the boxes with the sets $\{1, \dots, k\}$ and $\{1, \dots, n\}$, respectively. Each arrangement corresponds to a grouping map $f: \{1, \dots, k\} \to \{1, \dots, n\}$ that puts the object j into the box f(j).
In this case the set of possible locations is in a one-to-one correspondence with the set of all maps from $\{1, \dots, k\}$ to $\{1, \dots, n\}$. Therefore, there are $n^k$ different ways to locate k different objects in n boxes.
Suppose now that we prescribe the number of objects in each box: we want $i_1$ objects in box 1, $i_2$ objects in box 2, …, $i_n$ objects in box n, with $i_1 + i_2 + \cdots + i_n = k$. There are $\binom{k}{i_1}$ different choices for the elements in the first box, $\binom{k-i_1}{i_2}$ different choices for the elements in the second box, and so on, down to $\binom{k - i_1 - \cdots - i_{n-1}}{i_n}$ different choices for the elements in the nth box. Thus the different possible arrangements are
$$\binom{k}{i_1}\binom{k-i_1}{i_2}\cdots\binom{k-i_1-\cdots-i_{n-1}}{i_n} = \frac{k!}{i_1!\,i_2!\cdots i_n!}; \tag{1.7}$$
the ratio in (1.7) is called the multinomial coefficient and is denoted as
$$\binom{k}{i_1\ i_2\ \cdots\ i_n}.$$
From (1.7) we infer that the possible collocations of k pairwise different objects in n pairwise different boxes are
$$\sum_{i_1 + \cdots + i_n = k} \frac{k!}{i_1!\cdots i_n!} = n^k.$$
We now want to compute the number of different arrangements with at least one object in each box. Assuming we have k objects and n boxes, collocations of this type are in a one-to-one correspondence with the class of surjective maps from $\{1, \dots, k\}$ onto $\{1, \dots, n\}$; thus there are
$$\sum_{j=0}^{n} (-1)^{n-j} \binom{n}{j}\, j^k$$
collocations of k pairwise different objects into n pairwise different boxes that place at least one object in each box.
Alternatively, by (1.7) there are
$$\frac{k!}{i_1!\cdots i_n!} \tag{1.8}$$
ways to arrange k different objects in n boxes with $i_j$ objects in the box j. Thus the number of arrangements with no empty box is
$$\sum_{\substack{i_1 + \cdots + i_n = k \\ i_j \ge 1\ \forall j}} \frac{k!}{i_1!\cdots i_n!}. \tag{1.9}$$
We now impose a different constraint: each box may contain at most one object. Assuming we have k objects and n boxes, collocations of this type are in a one-to-one correspondence with the class of injective grouping maps from $\{1, \dots, k\}$ into $\{1, \dots, n\}$; thus there are
$$n\,(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}$$
collocations of this type.
Here, we want to compute the number of ways of grouping k pairwise different objects in n pairwise different boxes when the order of the objects in each box matters. In other words, we want to compute how many different ways exist to group k objects into a list of n lists of objects. We proceed as follows.
We want to compute the number of ways to arrange k identical objects in n pairwise different boxes. In this case each arrangement is characterized by the number of elements in each box, that is, by the map $a: \{1, \dots, n\} \to \mathbb{N}$ which counts how many objects are in each box. Obviously, $\sum_{j=1}^{n} a(j) = k$. If the k objects are copies of the number ‘0’, then each arrangement is identified by the binary sequence
$$\underbrace{0 \cdots 0}_{a(1)}\ 1\ \underbrace{0 \cdots 0}_{a(2)}\ 1\ \cdots\ 1\ \underbrace{0 \cdots 0}_{a(n)} \tag{1.10}$$
where the number ‘1’ denotes the fact that we are changing box.
Let us compute the number of such arrangements with no further constraint. There is a one-to-one correspondence between such arrangements and the set of all binary sequences of the type (1.10), i.e. sequences of k zeros and n − 1 ones. Therefore, see Proposition 1.12, the number of different collocations of k identical objects in n pairwise different boxes is
$$\binom{n+k-1}{k}. \tag{1.11}$$
We add now the constraint that each box must contain at least one object. If k < n no such arrangement is possible. If k ≥ n, we first place one object in each box so that the constraint is satisfied. The remaining k − n objects can now be collocated without constraints. Therefore, cf. (1.11), there are
$$\binom{n + (k-n) - 1}{k-n} = \binom{k-1}{k-n} = \binom{k-1}{n-1}$$
ways to arrange k identical objects in n boxes so that no box remains empty.
We consider arrangements of k identical objects in n pairwise different boxes that place at most one object into each box. In this case, each arrangement is completely characterized by the subset of filled boxes. Since we can choose the k filled boxes in $\binom{n}{k}$ different ways, we conclude that the number of collocations of k identical objects in n pairwise different boxes with at most one object per box is
$$\binom{n}{k}.$$
Combinatorial properties hold for collocations as well as for drawings.
In statistical physics, each ‘particle’ is allowed to be in a certain ‘state’; an ‘energy level’ is associated with each state. The total energy of a system of particles depends on how many particles are in each of the possible states; the mean value of the energy depends on the probabilities that particles stay in a certain state. Thus, the number of possible collocations of the particles in the available states must be evaluated.
This is the case of classical statistical physics: the particles are distinct and no constraint is imposed on their distribution in different states. The number of possible collocations of k particles in n states is thus the number of collocations of k pairwise different objects in n pairwise different boxes, i.e. $n^k$.
The particles are indistinguishable and no constraint is imposed on their distribution in different states. Particles with this behaviour are called bosons. The number of collocations of k particles in n states is then the number of collocations of k identical objects in n pairwise different boxes, i.e. $\binom{n+k-1}{k}$.
The particles are indistinguishable and each state can be occupied by at most one particle (Pauli exclusion principle). Particles following this behaviour are called fermions. Then the number of collocations of k particles in n states is the number of possible choices for the states to be occupied, i.e. $\binom{n}{k}$. Obviously, the Pauli exclusion principle implies n ≥ k.
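The three occupancy statistics can be summarized, and checked by enumeration, in a short Python sketch (the function names are ours, chosen after the physical models).

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb

def maxwell_boltzmann(k, n):
    """Distinct particles, no constraint: all maps particles -> states."""
    return n ** k

def bose_einstein(k, n):
    """Indistinguishable particles, no constraint:
    k identical objects in n boxes."""
    return comb(n + k - 1, k)

def fermi_dirac(k, n):
    """Indistinguishable particles, at most one per state
    (Pauli exclusion principle)."""
    return comb(n, k)

k, n = 3, 4
assert maxwell_boltzmann(k, n) == len(list(product(range(n), repeat=k)))
assert bose_einstein(k, n) == len(list(combinations_with_replacement(range(n), k)))
assert fermi_dirac(k, n) == len(list(combinations(range(n), k)))
```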
Chapter 2
Probability measures
The correspondence between Blaise Pascal (1623–1662) and Pierre de Fermat (1601–1665) on some matters related to card games suggested by the nobleman and gambler Chevalier de Méré is considered to be the birth of probability. The first treatise, De Ratiociniis in ludo aleae, by Christiaan Huygens (1629–1695) was published in 1657. Other publications followed in the eighteenth century: at the beginning of the century, the Ars conjectandi by Jacob Bernoulli (1654–1705) and the Doctrine of Chances by Abraham de Moivre (1667–1754) were published in 1713 and 1718, respectively. Then the Théorie analytique des probabilités by Pierre-Simon Laplace (1749–1827), which appeared in 1812, summarized the knowledge on probability of the whole century.
The definition of probability given by Pascal for games of chance is the so-called classical interpretation: the probability of an event E is the ratio between the number of successes (the cases in which the event E happens) and the number of all possible cases:
$$P(E) = \frac{\text{number of successes}}{\text{number of possible cases}}. \tag{2.1}$$
Clearly, the definition can be criticized: it cannot define the probability of those events for which we do not know a priori the number of successes and it does not make sense when there are an infinite number of possible cases. Generally speaking, the definition does not apply to a multitude of cases which we would like to speak of in probability terms.
Another definition, given by Richard von Mises (1883–1953), is the frequentist interpretation: the probability of an event is the limit of the frequencies of successes as the number of trials goes to infinity:
$$P(E) = \lim_{n \to \infty} \frac{\text{number of successes in } n \text{ trials}}{n}. \tag{2.2}$$
This definition can also be easily criticized: how can the limit of a sequence be computed if the sequence itself is not known? Moreover, the limit may not exist and in many situations, such as in economics, experiments cannot be repeated.
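The frequentist interpretation can at least be illustrated by simulation. The sketch below (illustrative Python; the success probability p is an assumed parameter) shows empirical frequencies of a Bernoulli event stabilizing as the number of trials grows, which of course does not resolve the conceptual criticism above.

```python
import random

random.seed(0)  # fixed seed so the experiment is reproducible

def frequency(n_trials, p=0.5):
    """Empirical frequency of successes in n_trials Bernoulli trials
    with success probability p."""
    successes = sum(random.random() < p for _ in range(n_trials))
    return successes / n_trials

# The frequency settles near p as the number of trials grows,
# but for any finite sample it remains an approximation.
for n_trials in (10, 1000, 100000):
    print(n_trials, frequency(n_trials))
```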
A further definition, first given by Jacob Bernoulli (1654–1705), is the subjectivist interpretation: the probability of an event E is a number that measures the degree of belief that E may happen, expressed by a coherent individual. The ‘coherence’ is expressed by the fact that if the individual measures as p ∈ [0, 1] the degree of belief that E happens, then 1 − p is the degree of belief that E does not happen.
Another way to describe the subjectivist interpretation is the bet that an individual, according to the information in his possession, regards as a fair cost to pay (to get paid) in order to get (pay) one if the event E happens and in order to get (pay) nothing if E does not happen. The coherence is in a fair evaluation of the probability that E happens: the individual may be either the gambler or the bookmaker but the evaluation remains the same. The following example is from [1].
In the classic definition by Pascal, the probability
