Probability, Random Variables, Statistics, and Random Processes - Ali Grami - E-Book

Description

Probability, Random Variables, Statistics, and Random Processes: Fundamentals & Applications is a comprehensive undergraduate-level textbook. With its excellent topical coverage, the focus of this book is on the basic principles and practical applications of the fundamental concepts that are extensively used in various Engineering disciplines as well as in a variety of programs in Life and Social Sciences. The text provides students with the requisite building blocks of knowledge they require to understand and progress in their areas of interest. Written in a simple, clear-cut style, the book's hallmarks are its intuitive explanations, insightful examples, and practical applications.

The text consists of twelve chapters divided into four parts. Part-I, Probability (Chapters 1 – 3), lays a solid groundwork for probability theory, and introduces applications in counting, gambling, reliability, and security. Part-II, Random Variables (Chapters 4 – 7), discusses in detail multiple random variables, along with a multitude of frequently-encountered probability distributions. Part-III, Statistics (Chapters 8 – 10), highlights estimation and hypothesis testing. Part-IV, Random Processes (Chapters 11 – 12), delves into the characterization and processing of random processes. Other notable features include:

  • Most of the text assumes no knowledge of subject matter past first year calculus and linear algebra
  • With its independent chapter structure and rich choice of topics, a variety of syllabi for different courses at the junior, senior, and graduate levels can be supported
  • A supplemental website includes solutions to about 250 practice problems, lecture slides, and figures and tables from the text 

Given its engaging tone, grounded approach, methodically-paced flow, thorough coverage, and flexible structure, Probability, Random Variables, Statistics, and Random Processes: Fundamentals & Applications clearly serves as a must-have textbook for courses not only in Electrical Engineering, but also in Computer Engineering, Software Engineering, and Computer Science.


Page count: 561

Year of publication: 2019




Table of Contents

Cover

Preface

Acknowledgments

About the Companion Website

Part I: Probability

1 Basic Concepts of Probability Theory

1.1 Statistical Regularity and Relative Frequency

1.2 Set Theory and Its Applications to Probability

1.3 The Axioms and Corollaries of Probability

1.4 Joint Probability and Conditional Probability

1.5 Statistically Independent Events and Mutually Exclusive Events

1.6 Law of Total Probability and Bayes' Theorem

1.7 Summary

Problems

2 Applications in Probability

2.1 Odds and Risk

2.2 Gambler's Ruin Problem

2.3 Systems Reliability

2.4 Medical Diagnostic Testing

2.5 Bayesian Spam Filtering

2.6 Monty Hall Problem

2.7 Digital Transmission Error

2.8 How to Make the Best Choice Problem

2.9 The Viterbi Algorithm

2.10 All Eggs in One Basket

2.11 Summary

Problems

3 Counting Methods and Applications

3.1 Basic Rules of Counting

3.2 Permutations and Combinations

3.3 Multinomial Counting

3.4 Special Arrangements and Selections

3.5 Applications

3.6 Summary

Problems

Part II: Random Variables

4 One Random Variable: Fundamentals

4.1 Types of Random Variables

4.2 The Cumulative Distribution Function

4.3 The Probability Mass Function

4.4 The Probability Density Function

4.5 Expected Values

4.6 Conditional Distributions

4.7 Functions of a Random Variable

4.8 Transform Methods

4.9 Upper Bounds on Probability

4.10 Summary

Problems

5 Special Probability Distributions and Applications

5.1 Special Discrete Random Variables

5.2 Special Continuous Random Variables

5.3 Applications

5.4 Summary

Problems

6 Multiple Random Variables

6.1 Pairs of Random Variables

6.2 The Joint Cumulative Distribution Function of Two Random Variables

6.3 The Joint Probability Mass Function of Two Random Variables

6.4 The Joint Probability Density Function of Two Random Variables

6.5 Expected Values of Functions of Two Random Variables

6.6 Independence of Two Random Variables

6.7 Correlation between Two Random Variables

6.8 Conditional Distributions

6.9 Distributions of Functions of Two Random Variables

6.10 Random Vectors

6.11 Summary

Problems

7 The Gaussian Distribution

7.1 The Gaussian Random Variable

7.2 The Standard Gaussian Distribution

7.3 Bivariate Gaussian Random Variables

7.4 Jointly Gaussian Random Vectors

7.5 Sums of Random Variables

7.6 The Sample Mean

7.7 Approximating Distributions with the Gaussian Distribution

7.8 Probability Distributions Related to the Gaussian Distribution

7.9 Summary

Part III: Statistics

8 Descriptive Statistics

8.1 Overview of Statistics

8.2 Data Displays

8.3 Measures of Location

8.4 Measures of Dispersion

8.5 Measures of Shape

8.6 Summary

9 Estimation

9.1 Parameter Estimation

9.2 Properties of Point Estimators

9.3 Maximum Likelihood Estimators

9.4 Bayesian Estimators

9.5 Confidence Intervals

9.6 Estimation of a Random Variable

9.7 Maximum a Posteriori Probability Estimation

9.8 Minimum Mean Square Error Estimation

9.9 Linear Minimum Mean Square Error Estimation

9.10 Linear MMSE Estimation Using a Vector of Observations

9.11 Summary

Problems

10 Hypothesis Testing

10.1 Significance Testing

10.2 Hypothesis Testing for Mean

10.3 Decision Tests

10.4 Bayesian Test

10.5 Neyman–Pearson Test

10.6 Summary

Problems

Part IV: Random Processes

11 Introduction to Random Processes

11.1 Classification of Random Processes

11.2 Characterization of Random Processes

11.3 Moments of Random Processes

11.4 Stationary Random Processes

11.5 Ergodic Random Processes

11.6 Gaussian Processes

11.7 Poisson Processes

11.8 Summary

Problems

12 Analysis and Processing of Random Processes

12.1 Stochastic Continuity, Differentiation, and Integration

12.2 Power Spectral Density

12.3 Noise

12.4 Sampling of Random Signals

12.5 Optimum Linear Systems

12.6 Summary

Bibliography

Books

Internet Websites

Answers

Chapter 1

Chapter 2

Chapter 3

Chapter 4

Chapter 5

Chapter 6

Chapter 7

Chapter 8

Chapter 9

Chapter 10

Chapter 11

Chapter 12

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Set identities.

Chapter 2

Table 2.1 Statistics about developing disease versus receiving treatment.

Table 2.2 Results of staying or switching in a Monty Hall problem.

Table 2.3 Optimal probability P(S) of finding the best candidate as a function of...

Chapter 3

Table 3.1 Permutations and combinations, with and without replacement.

Table 3.2 Hands in poker.

Table 3.3 Probabilities that at least two out of k people share a common birthday...

Table 3.4 Probability that m of M tested items are defective.

Table 3.5 Probabilities of winning in the best‐of‐seven championship series.

Table 3.6 Probabilities of winning six‐forty‐nine lottery.

Chapter 7

Table 7.1 Q(x) represents area to the right of x in the standard Gaussian distrib...

Chapter 8

Table 8.1 Various percentiles for the Gaussian distribution.

Chapter 10

Table 10.1 Decisions and errors in significance testing.

Table 10.2 Statistic for mean.

Table 10.3 Critical values of Z.

Table 10.4 Critical values of T.

Table 10.5 Decision rules.

Table 10.6 Significance of p‐value.

List of Illustrations

Chapter 1

Figure 1.1 Relative frequency in a tossing of a fair coin.

Figure 1.2 Sets and their relationships through geometric intuition: (a) univer...

Figure 1.3 Relationship among sample space, events, and probability.

Figure 1.4 Occurrence of event A, as probabilities represented by areas.

Figure 1.5 A partition of sample space S into n disjoint sets.

Chapter 2

Figure 2.1 Link configuration with k components: (a) series and (b) parallel.

Figure 2.2 (a) Link configuration. (b) Configuration with permanently failed li...

Figure 2.3 Transition probability diagram of binary symmetric channel.

Figure 2.4 Steps in the Viterbi algorithm. (a) The trellis diagram with all pos...

Figure 2.5 Probability of group success versus probability of individual succes...

Chapter 3

Figure 3.1 Venn diagram.

Figure 3.2 Tree diagram for bit strings of length 5 without two consecutive 0s.

Figure 3.3 List of all possible pairs for n = 6 and k = 2. ...

Figure 3.4 Tree diagram for championship series.

Figure 3.5 Probability of winning in championship series in terms of the probab...

Chapter 4

Figure 4.1 Illustration of the relationship among sample space, random variable...

Figure 4.2 Illustration of the relationship among cdf, pmf, and pdf.

Figure 4.3 cdf examples: (a) discrete random variable; (b) continuous random va...

Figure 4.4 pmf of the discrete random variable X and its cdf.

Figure 4.5 Probabilities of possible and impossible events for a continuous ran...

Figure 4.6 pdf of the continuous random variable X and its conditional pdf.

Figure 4.7 Discrete random variable: (a) marginal pmf; (b) conditional pmf.

Chapter 6

Figure 6.1 A function assigns a pair of real numbers in the real plane to each ...

Figure 6.2 cdfs defined by regions: (a) joint cdf defined by the semi‐infinite ...

Figure 6.3 Scatter diagrams for different pairs of random variables, each with ...

Figure 6.4 Relationship between uncorrelatedness and independence.

Chapter 7

Figure 7.1 pdfs of the Gaussian distribution with mean μX and variance σX².

Figure 7.2 Area interpretation of Q(x) and Φ(x) for the Gaussian distribut...

Figure 7.3 Using the standard Gaussian distribution to calculate the probabilit...

Figure 7.4 The critical points of the standard Gaussian distribution. (a) zα...

Figure 7.5 (a) pdf of one variable; (b) pdf of sum of two variables; (c) pdf of...

Chapter 8

Figure 8.1 Relationship between population parameter and sample statistic.

Figure 8.2 Class marks: (a) histogram; (b) cumulative relative frequency chart.

Figure 8.3 Class grades: (a) bar chart; (b) Pareto chart; (c) pie chart.

Figure 8.4 Boxplot of a data set.

Figure 8.5 Skewness of distribution.

Figure 8.6 Tailedness of distribution.

Chapter 9

Figure 9.1 A simple breakdown of estimation problems.

Figure 9.2 Relationship between a point estimate and an unknown parameter θ. ...

Chapter 10

Figure 10.1 Critical regions: (a) right‐tailed test; (b) left‐tailed test; and ...

Figure 10.2 (a, b) Probabilities of Types I and II errors for sample means with...

Figure 10.3 Simple binary hypothesis tests.

Figure 10.4 An ROC curve.

Chapter 11

Figure 11.1 Classification of random processes: (a) continuous‐time continuous‐...

Figure 11.2 Relationship between stationarity and ergodicity.

Figure 11.3 A sample function of the Poisson process.

Chapter 12

Figure 12.1 (a) Slowly‐fluctuating random processes. (b) Rapidly‐fluctuating ra...

Figure 12.2 Input–output relations for the power spectral and cross‐spectral de...

Figure 12.3 White noise: (a) power spectral density; (b) autocorrelation functi...

Figure 12.4 Notation for optimum filter: (a) maximum signal‐to‐noise‐ratio crit...

Probability, Random Variables, Statistics, and Random Processes

Fundamentals & Applications

Ali Grami

Copyright

This edition first published 2020

© 2020 John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Ali Grami to be identified as the author of this work has been asserted in accordance with law.

Registered Office

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office

111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication data applied for

Hardback: 9781119300816

Cover image: © Markus Gann/Shutterstock

Cover design: Wiley

Dedication

To my beloved mother,

the exceptional symbol of hard work, perseverance, and confidence

and

In loving memory of my father,

the epitome of integrity and generosity

Preface

I enjoy writing a book, but more importantly, I enjoy having written a book, as my hope is to make a difference by helping students study, learn, and apply subject matter. The purpose of this textbook is to present a modern introduction to fundamentals and applications of probability, statistics, and random processes for students in various disciplines of engineering and science.

The importance of the material discussed in this book lies in the fact that randomness and uncertainty, which always go hand in hand, exist in virtually every aspect of life, and their impacts are expanding exponentially. Think of the thousands of hourly weather forecasts, the millions of insurance policies issued each day, the billions of stock exchange transactions per week, and the trillions of wireless phone calls, text messages, e‐mails, and web searches each month; they are all based on probability models and statistical analyses.

This book provides a simple and intuitive approach, while maintaining a reasonable level of mathematical rigor and subtleties. The style of writing is less formal than that of most textbooks, so students can take more interest in reading the book and using it as a self‐study reference. Most of the text does not require any previous knowledge apart from first-year calculus and linear algebra; the exception is the analysis and processing of random processes.

I firmly believe that before statistics and random processes can be understood, students must have a good knowledge of probability and random variables. Because discrete random variables and discrete‐time random processes have rapidly grown in importance over time, the mathematics of discrete random variables is introduced separately from the mathematics of continuous random variables. An attempt has also been made to keep the chapters rather independent, so a couple of topics are visited more than once. This allows the instructor to follow any sequence of chapters.

The pedagogy behind this book and its choice of contents evolved over many years. Most of the material in the book has been class‐tested and proven very effective. There are about 200 examples to help understand the fundamental concepts and approximately 250 problems at the end of chapters to test understanding of the material. Moreover, there are also dozens of practical applications, along with the introduction of dozens of widely known, frequently encountered probability distributions.

This textbook contains 12 chapters. As the title of the book suggests, it consists of four parts: Probability (Chapters 1-3), Random Variables (Chapters 4-7), Statistics (Chapters 8-10), and Random Processes (Chapters 11 and 12). There is intentionally more in this book than can be taught in one semester (of roughly 36 lecture hours) so as to provide the instructor flexibility in the selection of topics. The coverage of the book, in terms of depth and breadth as well as tone and scope, supports a variety of syllabi for various courses. As a third‐year course, the coverage may include Chapters 1, 2 (select sections), 4, 5 (select sections), 6, 7, and 8. As a fourth‐year course, the coverage may include Chapters 1-8 inclusive as well as Chapter 11. As a graduate course taken by first-year master's students, the entire book should be covered, beginning with a complete review of Chapters 1-8, followed by an in‐depth coverage of Chapters 9-12.

A Solutions Manual can be obtained from the publisher, upon request, only by instructors who adopt the book for a course. No book is flawless, and this book is certainly no different from others. Your comments and suggestions for improvements are always welcome. I would greatly appreciate it if you could send your feedback to [email protected].

Ali Grami

January 2019

Acknowledgments

Acknowledgements are viewed by the reader as a total bore, but regarded by the author as a true source of incredible joy.

An expression of great respect and gratitude is due to all who have ever taught me. To my teachers at Shahrara Elementary School and Hadaf High School, as well as to my professors at the University of Manitoba, McGill University, and the University of Toronto, I owe an unrepayable debt of gratitude.

Writing a textbook is in some sense collaborative, as one is bound to lean on the bits and pieces of materials developed by others. I would therefore like to greatly thank the many authors whose writings helped me. I am heavily indebted to many individuals for their contributions to the development of this text, including Dr. I. Dincer, Dr. X. Fernando, Dr. Y. Jing, Dr. M. Nassehi, and Dr. H. Shahnasser, as well as the anonymous reviewers who provided helpful comments. I would also like to thank Nickrooz Grami who provided several examples and reviewed the book to improve coherence and clarity as well as Neeloufar Grami who prepared multiple figures and tables. My deep gratitude goes to Ahmad Manzar who carefully reviewed the entire manuscript, provided valuable comments, and helped improve many aspects of the book.

The financial support of Natural Sciences and Engineering Research Council (NSERC) of Canada was also crucial to this project. I am grateful to the staff of John Wiley & Sons for their support throughout various phases of this project, namely Tiina Wigley, Jon Gurstelle, Viniprammia Premkumar and all members of the production team.

Last, but certainly most important of all, no words can ever express my most heartfelt appreciation to my wife, Shirin, as well as my children, Nickrooz and Neeloufar, for their immeasurable love and support, without which this book would not have been possible.

About the Companion Website

This book is accompanied by a companion website:

www.wiley.com/go/grami/PRVSRP

The website includes: Solutions Manual and PowerPoint Slides with all Figures and Tables

Part I Probability

1 Basic Concepts of Probability Theory

Randomness and uncertainty, which always go hand in hand, exist in virtually every aspect of life. To this effect, almost everyone has a basic understanding of the term probability through intuition or experience. The study of probability stems from the analysis of certain games of chance. Probability is the measure of chance that an event will occur, and as such finds applications in disciplines that involve uncertainty. Probability theory is extensively used in a host of areas in science, engineering, medicine, and business, to name just a few. As claimed by Pierre‐Simon Laplace, a prominent French scholar, probability theory is nothing but common sense reduced to calculation. The basic concepts of probability theory are discussed in this chapter.

1.1 Statistical Regularity and Relative Frequency

An experiment is a measurement procedure or observation process. The outcome is the end result of an experiment, where if one outcome occurs, then no other outcome can occur at the same time. An event is a single outcome or a collection of outcomes of an experiment.

If the outcome of an experiment is certain, that is the outcome is always the same, it is then a deterministic experiment. In other words, a deterministic experiment always produces the same output from a given starting condition or initial state. The measurement of the temperature in a certain location at a given time using a thermometer is an example of a deterministic experiment.

In a random experiment, the outcome may unpredictably vary when the experiment is repeated, as the conditions under which it is performed cannot be predetermined with sufficient accuracy. In studying probability, we are concerned with experiments, real or conceptual, and their random outcomes. In a random experiment, the outcome is not uniquely determined by the causes and cannot be known in advance, because it is subject to chance. In a lottery game, as an example, the random experiment is the drawing, and the outcomes are the lottery number sequences. In such a game, the outcomes are uncertain, not because of any inaccuracy in how the experiment is performed, but because of how it has been designed to produce uncertain results. As another example, in a dice game, such as craps or backgammon, the random experiment is rolling a pair of dice, and the outcome of each die is a positive integer in the range of one to six inclusive. In such a game, the outcomes are uncertain both because the game has been designed to produce uncertain results and because there is some inaccuracy and inconsistency in how the experiment is carried out, i.e. how a pair of dice is rolled.

In random experiments, the making of each measurement or observation, i.e. each repetition of the experiment, is called a trial. In independent trials, the observable conditions are the same and the outcome of one trial has no bearing on another. In other words, the outcome of a trial is independent of the outcomes of the preceding and subsequent trials. For instance, it is reasonable to assume that in coin tossing and dice rolling repeated trials are independent. The conditions under which a random experiment is performed influence the probabilities of the outcomes of the experiment. To account for uncertainties in a random experiment, a probabilistic model is required. A probability model, as a simplified approximation to an actual random experiment, is detailed enough to include all major aspects of the random phenomenon. Probability models are generally based on the fact that averages obtained in long sequences of independent trials of random experiments almost always give rise to the same value. This property, known as statistical regularity, is an experimentally verifiable phenomenon in many cases.

The ratio of the number of times a particular event occurs to the number of times the trial has been repeated is defined as the relative frequency of the event. As the number of times the experiment is repeated approaches infinity, the relative frequency of the event approaches a limit because of statistical regularity; this limit is called the relative‐frequency definition of probability. Note that this limit, which is based on an a posteriori approach, can never truly be reached, as the number of times a physical experiment can be repeated may be very large but is always finite.

Figure 1.1 shows a sequence of trials in a coin‐tossing experiment, where the coin is fair (unbiased) and Nh represents the number of heads in a sequence of N independent trials. The relative frequency Nh/N fluctuates widely when the number of independent trials is small, but eventually settles down near 1/2 when the number of independent trials is increased. If an ideal (fair) coin is tossed infinitely many times, the probability of heads is then P(heads) = lim N→∞ (Nh/N) = 1/2.

Figure 1.1 Relative frequency in a tossing of a fair coin.

If outcomes are equally likely, then the probability of an event is equal to the number of outcomes that the event can have divided by the total number of possible outcomes in a random experiment. This probability, known as the classical definition of probability, is determined a priori without actually carrying out the random experiment. For instance, if an ideal (fair) die is rolled, the probability of getting a 4 is then 1/6.
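For readers who like to experiment, the two definitions above can be checked numerically. The following is a minimal Python sketch; the trial counts and the random seed are arbitrary choices.

```python
import random

random.seed(1)  # arbitrary seed, chosen only for reproducibility

# Relative-frequency view: toss a fair coin N times and track Nh/N.
for N in (10, 100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(N))
    print(f"N = {N:>9}: relative frequency of heads = {heads / N:.4f}")

# Classical view: equally likely outcomes of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
event = {4}  # the event "getting a 4"
print("P(getting a 4) =", len(event) / len(sample_space))  # 1/6 ≈ 0.1667
```

As N grows, the printed relative frequency settles near 0.5, which is the statistical regularity the text describes.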

1.2 Set Theory and Its Applications to Probability

In probability models associated with random experiments, simple events can be combined using set operations to obtain more complicated events. We thus briefly review set theory, as it is the mathematical basis for probability.

A set is a collection of objects or things, which are called elements or members. As shorthand, a set is generally represented by the symbol { }. It is customary to use capital letters to denote sets and lowercase letters to refer to set elements. If x is an element of the set A, we use the notation x ∈ A, and if x does not belong to the set A, we write x ∉ A. It is essential to have a clear definition of a set, either by using the set roster method, that is, listing all its elements between curly brackets, such as {3, 6, 9}, or by using the set builder notation, that is, describing some property held only by all members of the set, such as {x | x is a positive integer less than 10 that is a multiple of 3}. Noting that the order in which the elements of a set are presented is immaterial, the number of distinct elements in a set A is called the cardinality of A, written as ∣A∣. The cardinality of a set may be finite or infinite.
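These definitions map naturally onto a programming language's built-in set type; the short Python sketch below illustrates the roster method, the builder notation, cardinality, and membership, using the particular sets mentioned above.

```python
# Roster method: list the elements explicitly.
A = {3, 6, 9}

# Set-builder equivalent: positive integers less than 10 that are multiples of 3.
B = {x for x in range(1, 10) if x % 3 == 0}

print(A == B)          # True: the order of elements is immaterial
print(len(A))          # cardinality |A| = 3
print(6 in A, 7 in A)  # membership tests: 6 ∈ A is True, 7 ∈ A is False
```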

We use Venn diagrams, as shown in Figure 1.2, to pictorially illustrate an important collection of widely known sets and their logical relationships through geometric intuition.

Figure 1.2 Sets and their relationships through geometric intuition: (a) universal set; (b) subset; (c) equal sets; (d) union of sets; (e) intersection of sets; (f) complement of a set; (g) difference of sets; and (h) mutually exclusive sets.

The universal set U, also represented by the symbol Ω, is defined to include all possible elements in a given setting. The universal set U is usually represented pictorially as the set of all points within a rectangle, as shown in Figure 1.2 a. The set B is a subset of the set A if every member of B is also a member of A. We use the symbol ⊂ to denote subset; B ⊂ A thus implies that B is a subset of A, as shown in Figure 1.2 b. Every set is thus a subset of the universal set. The empty set or null set, denoted by ∅, is defined as the set with no elements. The empty set is thus a subset of every set.

The set of all subsets of a set A, which also includes the empty set ∅ and the set A itself, is called the power set of A. Given two sets A and B, the Cartesian product of A and B, denoted by A × B and read as A cross B, is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. The number of ordered pairs in the Cartesian product of A and B is equal to the product of the number of elements in the set A and the number of elements in the set B. In general, we have A × B ≠ B × A, unless A = B.
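The power set and the Cartesian product can likewise be enumerated directly. A minimal Python sketch, using two small, arbitrarily chosen sets:

```python
from itertools import chain, combinations, product

A = {1, 2}
B = {"a", "b", "c"}

# Power set of A: all subsets of A, including the empty set and A itself.
subsets = list(chain.from_iterable(combinations(sorted(A), r) for r in range(len(A) + 1)))
print(subsets)                                   # [(), (1,), (2,), (1, 2)] -> 2**|A| = 4 subsets

# Cartesian product A × B: all ordered pairs (a, b) with a ∈ A and b ∈ B.
print(len(list(product(A, B))))                  # |A| * |B| = 6 ordered pairs
print(set(product(A, B)) == set(product(B, A)))  # False: A × B ≠ B × A in general
```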

The sets A and B are considered to be equal sets if and only if they contain the same elements, as shown in Figure 1.2 c. In other words, we have

(1.1a)  A = B ⟺ A ⊆ B and B ⊆ A

The union of two sets A and B is the set of all elements that are in A or in B or in both, as shown in Figure 1.2 d. The operation union corresponds to the logical “or” operation:

(1.1b)  A ∪ B = {x | x ∈ A or x ∈ B}

The intersection of two sets A and B is the set of all elements that are in both A and B, as shown in Figure 1.2 e. The operation intersection corresponds to the logical “and” operation:

(1.1c)  A ∩ B = {x | x ∈ A and x ∈ B}

The union and intersection operations can be repeated for an arbitrary number of sets. Thus, the union of n sets is the set of all elements that are in at least one of the n sets and the intersection of n sets is the set of all elements that are shared by all n sets. The intersection of n sets is a subset of each of the n sets, and in turn, each of the n sets is a subset of the union of n sets.

The complement of a set A, with respect to the universal set U, denoted by Ac, is the set of all elements that are not in A, as shown in Figure 1.2 f. The operation complement corresponds to the logical “not” operation:

(1.1d)  Ac = {x | x ∈ U and x ∉ A}

Note that the complement of the universal set is the null set, and vice versa. The difference of sets A and B is the set of elements in A that are not in B, as shown in Figure 1.2 g. The difference operation can be represented as follows:

(1.1e)  A − B = {x | x ∈ A and x ∉ B}

Note that the set A − B is different from the set B − A. The sets A and B are mutually exclusive, also known as disjoint, if and only if the sets A and B have no common elements, as shown in Figure 1.2 h. The intersection of two mutually exclusive sets (disjoint sets) is thus an empty set, i.e. we have

(1.1f)  A ∩ B = ∅

It is important to note that a partition of a set A is a collection of n mutually exclusive subsets of A, i.e. A1, A2, …, An, whose union equals A, where n is a positive integer. The cardinality of the union of two finite sets A and B can be found using the principle of inclusion–exclusion, i.e. we have

(1.1g)  |A ∪ B| = |A| + |B| − ∣A ∩ B∣

Note that |A| + ∣B∣ counts each element that is in set A but not in set B once, each element that is in set B but not in set A once, and each element that is in both sets A and B exactly twice. The number of elements in both A and B, i.e. ∣A ∩ B∣, is subtracted from ∣A∣ + ∣B∣ so as to count the elements in the intersection A ∩ B only once. Note that if A and B are mutually exclusive, then we have |A ∪ B| = |A| + |B|.

The principle of inclusion–exclusion can be easily extended to more than two sets. In general, for n sets, where n is a positive integer, the principle of inclusion–exclusion has a maximum of 2n − 1 terms. However, some of these terms may be zero, as it is possible that some of the n sets are mutually exclusive.
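A quick numerical check of the two-set and three-set forms of the principle; the sets in this minimal Python sketch are arbitrary examples.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6, 7}

# Two sets: |A ∪ B| = |A| + |B| − |A ∩ B|
assert len(A | B) == len(A) + len(B) - len(A & B)

# Three sets: subtract the pairwise intersections, then add the triple intersection back.
lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
print(lhs, rhs)  # 7 7
```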

Some basic set operations can be combined to form other sets, as shown in Table 1.1. Due to the principle of duality in set theory, any identity involving sets is also true if we replace unions by intersections, intersections by unions, and sets by their complements. This principle is used when the complements of events are easier to define and specify than the events themselves. For instance, De Morgan's laws as defined below use the principle of duality:

(1.1h)  (A ∪ B)c = Ac ∩ Bc and (A ∩ B)c = Ac ∪ Bc

Table 1.1 Set identities.

A ∪ B = B ∪ A; A ∩ B = B ∩ A (Commutative laws)
(A ∪ B) ∪ C = A ∪ (B ∪ C) = A ∪ B ∪ C; (A ∩ B) ∩ C = A ∩ (B ∩ C) = A ∩ B ∩ C (Associative laws)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) (Distributive laws)
A ∪ ∅ = A; A ∩ U = A (Identity laws)
A ∪ U = U; A ∩ ∅ = ∅ (Domination laws)
A ∪ A = A; A ∩ A = A (Idempotent law)
(Ac)c = A (Complementation law)
A ∪ Ac = U; A ∩ Ac = ∅ (Complement laws)
A − B = A ∩ Bc (Relative complement law)
A ∪ (A ∩ B) = A; A ∩ (A ∪ B) = A (Absorption laws)
(A ∪ B)c = Ac ∩ Bc; (A ∩ B)c = Ac ∪ Bc (De Morgan's laws)
A ⊆ B iff A ∪ B = B; A ⊆ B iff A ∩ B = A (Consistency laws)
|A ∪ B| = |A| + |B| − ∣A ∩ B∣ (Inclusion–exclusion principle)

Example 1.1

Consider the universal set U consisting of all positive integers from 1 to 10 inclusive. We define the event A consisting of all positive integers from 1 to 10 inclusive that are multiples of 2 and the event B consisting of all positive integers from 1 to 10 inclusive that are multiples of 3. Verify the following set of identities: (A ∪ B)c = Ac ∩ Bc, (A ∩ B)c = Ac ∪ Bc, A − B = A ∩ Bc, and |A ∪ B| = |A| + |B| −  ∣ A ∩ B∣.

Solution

Since we have

A = {2, 4, 6, 8, 10} and B = {3, 6, 9}

we obtain the following:

A ∪ B = {2, 3, 4, 6, 8, 9, 10}, so (A ∪ B)c = {1, 5, 7}, and Ac ∩ Bc = {1, 3, 5, 7, 9} ∩ {1, 2, 4, 5, 7, 8, 10} = {1, 5, 7}; thus (A ∪ B)c = Ac ∩ Bc. Similarly, A − B = {2, 4, 8, 10} = A ∩ Bc, and |A ∪ B| = 7 = 5 + 3 − 1 = |A| + |B| − ∣A ∩ B∣.

Since we have

A ∩ B = {6}

we obtain the following:

(A ∩ B)c = {1, 2, 3, 4, 5, 7, 8, 9, 10} and Ac ∪ Bc = {1, 2, 3, 4, 5, 7, 8, 9, 10}; thus (A ∩ B)c = Ac ∪ Bc.
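The same identities can also be verified mechanically; the short Python sketch below uses the U, A, and B of this example.

```python
U = set(range(1, 11))                    # universal set {1, ..., 10}
A = {x for x in U if x % 2 == 0}         # multiples of 2
B = {x for x in U if x % 3 == 0}         # multiples of 3

def comp(S):                             # complement with respect to U
    return U - S

print(comp(A | B) == comp(A) & comp(B))  # De Morgan: (A ∪ B)c = Ac ∩ Bc -> True
print(comp(A & B) == comp(A) | comp(B))  # De Morgan: (A ∩ B)c = Ac ∪ Bc -> True
print(A - B == A & comp(B))              # relative complement law       -> True
print(len(A | B) == len(A) + len(B) - len(A & B))  # inclusion–exclusion -> True
```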

The sample space S of a random experiment is defined as the set of all possible outcomes of an experiment. In a random experiment, the outcomes, also known as sample points, are mutually exclusive, i.e. they cannot occur simultaneously. When no single outcome is any more likely than any other, we then have equally likely outcomes. An event is a subset of the sample space of an experiment, i.e. a set of one or more sample points, and the symbol { } is used as shorthand for representing an event.

Two mutually exclusive events, also known as disjoint events, have no common outcomes, i.e. the occurrence of one precludes the occurrence of the other. The union of two events is the set of all outcomes that are in either one or both of the two events. The intersection of two events, also known as the joint event, is the set of all outcomes that are in both events. A certain (sure) event consists of all outcomes, and thus always occurs. A null (impossible) event contains no outcomes, and thus never occurs. The complement of an event contains all outcomes not included in the event. A sure event and an impossible event are thus complements of one another.

Example 1.2

Consider a random experiment that consists of rolling a fair six‐sided cube‐shaped die that comes to rest on a flat surface, where the face of the die that is uppermost yields the outcome. Provide specific examples to highlight the above definitions associated with the sample space and events in a die‐rolling experiment.

Solution

The sample space S includes six sample points 1, 2, 3, 4, 5, and 6. Since the die is fair (unbiased), all six outcomes are equally likely. Some different events can be defined as follows: one with even outcomes (i.e. 2, 4, and 6), one with odd outcomes (i.e. 1, 3, and 5), and one whose outcomes are divisible by 3 (i.e. 3 and 6). Two mutually exclusive events may be an event with even outcomes and an event with odd outcomes. The union of two events, where one is with odd outcomes and the other is with outcomes that are divisible by 3, consists of 1, 3, 5, and 6, and their intersection consists of only 3. A certain (sure) event consists of outcomes that are integers between 1 and 6 inclusive. A null (impossible) event consists of outcomes that are less than 1 or greater than 6. The complement of the event whose outcomes are divisible by 3 is an event that contains 1, 2, 4, and 5.

When the sample space S is countable, it is known as a discrete sample space. In a discrete sample space, the probability law for a random experiment can be specified by giving the probabilities of all possible outcomes. With a finite nonempty sample space of equally likely outcomes, the probability of an event that is a subset of the sample space is the ratio of the number of outcomes in the event to the number of outcomes in the sample space.

Example 1.3

Suppose a pair of fair (ideal) dice is rolled. Determine the probability that the sum of the outcomes is three.

Solution

For mere clarity, it is assumed that one of the two dice is red and the other is green. There are six possible outcomes for the red die and six possible outcomes for the green die. The outcome of one die is independent of the outcome of the other die. In other words, for a specific outcome of the red die, there can be six different outcomes of the green die, and vice versa. We therefore have 36 (= 6 × 6) possible outcomes in rolling a pair of dice, and as the dice are fair, we have a total of 36 equally likely outcomes. It is thus a discrete sample space. To have a sum of three, the outcomes must be a 1 and a 2, and there are two different scenarios that produce this sum: the red die is a 1 and the green die is a 2, or the red die is a 2 and the green die is a 1. Hence, the probability that a sum of three comes up is 2/36 = 1/18.
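Enumerating the 36 equally likely outcomes confirms the result; a minimal Python sketch:

```python
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely (red, green) pairs
favorable = [o for o in outcomes if sum(o) == 3]  # pairs whose sum is three
print(favorable)                                  # [(1, 2), (2, 1)]
print(Fraction(len(favorable), len(outcomes)))    # 1/18
```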

When the sample space S is uncountably infinite, it is known as a continuous sample space. In a continuous sample space, the probability law for a random experiment specifies a rule for assigning numbers to intervals of the real line. In the case of a continuous sample space, an outcome that occurs only once in an infinite number of trials has a zero relative frequency; that is, the probability that the outcome takes on a specific value is zero. However, this does not imply that it cannot occur, but rather that its occurrence is extremely infrequent, hence the zero relative frequency.

Example 1.4

In a game of darts, the dartboard has a radius of 22.5 cm. Consider the sample space to be the entire dartboard; it is thus a continuous sample space. Assuming a dart is thrown at random and lands on the board, determine the probability that it lands in a region within 7.5 cm of the center of the board, and the probability that the dart lands exactly at the center of the board.

Solution

The total area of the dartboard is (22.5)²π cm². The area of the event of interest, the region within 7.5 cm of the center, is (7.5)²π cm². Hence, the probability of interest is (7.5)²π/(22.5)²π = 1/9. Although it is possible that the dart lands exactly at the center of the board, the probability of that occurrence is zero, as the center of the board is only one of the infinitely many points that the dart can hit.
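The first probability can also be estimated by simulation. In the Python sketch below, a large number of uniformly random throws is simulated; the number of throws and the seed are arbitrary choices.

```python
import random

random.seed(2)              # arbitrary seed, for reproducibility only
R, r = 22.5, 7.5            # board radius and inner-region radius, in cm
N = 1_000_000               # number of simulated throws (arbitrary)

hits = 0
for _ in range(N):
    # Draw a uniformly random point on the board by rejection sampling from the square.
    while True:
        x, y = random.uniform(-R, R), random.uniform(-R, R)
        if x * x + y * y <= R * R:
            break
    if x * x + y * y <= r * r:
        hits += 1

print(hits / N)             # close to (7.5 / 22.5)**2 = 1/9 ≈ 0.111
```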

1.3 The Axioms and Corollaries of Probability

Axioms are self‐evidently true statements that are unproven, but accepted. Axiomatic probability theory was developed by Andrey N. Kolmogorov, a prominent Russian mathematician, who was the founder of modern probability theory and believed the theory of probability as a mathematical discipline can and should be developed from axioms in exactly the same way as geometry and algebra. In the axiomatic definition of probability, the probability of the event A, denoted by P(A), in the sample space S of a random experiment is a real number assigned to A that satisfies the following axioms of probability:

(1.2)

Axiom I: P(A) ≥ 0

Axiom II: P(S) = 1

Axiom III: If A1, A2, … are mutually exclusive (disjoint) events, then P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + …

Note that these results do not indicate the method of assigning probabilities to the outcomes of an experiment; they merely restrict the way it can be done. These axioms satisfy the intuitive notion of probability. Axiom I of probability highlights that the probability of an event is nonnegative, i.e. chances are always at least zero. Axiom II of probability emphasizes that the probability of all possible outcomes is 1, i.e. the chance that something happens is always 100%. Axiom III of probability underscores that the total probability of a number of disjoint (mutually exclusive) events, i.e. nonoverlapping events, is the sum of the individual probabilities. Figure 1.3 shows the relationship among sample space, events, and probability.

Figure 1.3 Relationship among sample space, events, and probability.

Example 1.5

A box contains 20 balls, numbered from 1 to 20 inclusive. Determine the probability that the number on a ball chosen at random is either a prime number or a multiple of 4.

Solution

Let A1 be the event that includes prime numbers and A2 be the event that includes numbers that are multiples of 4, that is, we have

A1 = {2, 3, 5, 7, 11, 13, 17, 19} and A2 = {4, 8, 12, 16, 20}.

We also have the following:

P(A1) = 8/20 and P(A2) = 5/20.

As the events A1 and A2 are mutually exclusive, we can use Axiom III to determine the probability that the number on a ball chosen at random is either a prime number or a multiple of 4. We thus have

P(A1 ∪ A2) = P(A1) + P(A2) = 8/20 + 5/20 = 13/20 = 0.65.
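The same answer can be obtained programmatically; the Python sketch below confirms that A1 and A2 are mutually exclusive and that the probability is 13/20.

```python
from fractions import Fraction

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

balls = range(1, 21)
A1 = {n for n in balls if is_prime(n)}   # {2, 3, 5, 7, 11, 13, 17, 19}
A2 = {n for n in balls if n % 4 == 0}    # {4, 8, 12, 16, 20}

print(A1 & A2)                           # set(): the events are mutually exclusive
print(Fraction(len(A1 | A2), 20))        # 13/20, i.e. 0.65
```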

From the three axioms of probability, the following important corollaries of probability can be derived:

(1.3)

(i) P(∅) = 0

(ii) P(Ac) = 1 − P(A)

(iii) P(A) ≤ 1

(iv) P(A) = P(A ∩ B) + P(A ∩ Bc)

(v) If A ⊆ B, then P(A) ≤ P(B)

(vi) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

(vii) P(A ∪ B) ≤ P(A) + P(B)

The abovementioned corollaries provide the following insights:

(i) The impossible event has probability zero, and it provides a symmetry to Axiom II.

(ii) If the sample space is partitioned into two mutually exclusive events, the sum of the probabilities of these two events is then one. Quite often, one of these two probabilities is much easier than the other to compute.

(iii) The probability of an event is always less than or equal to one, and provides a sanity check on whether the results are correct, as the probabilities can never be negative or greater than one.

(iv) The probability of an event is the sum of the probabilities of two mutually exclusive events, and this divide‐and‐conquer approach can often help to easily compute the probability of interest.

(v) Probability is a nondecreasing function of the number of outcomes in an event; that is, with more outcomes, the probability may either increase or remain the same.

(vi) It provides the generalization of Axiom III, when the two events are not mutually exclusive.

(vii) The sum of two individual probabilities is lower bounded by the probability of the union of the two events. Clearly, equality holds if and only if the events are mutually exclusive. This property can be viewed as a special case of the union bound, as the union bound, also known as Boole's inequality, states that the probability of a finite union of events is upper bounded by the sum of the probabilities of the constituent events.

Example 1.6

A sequence of 20 bits is randomly generated. Noting that the probability of a bit being 0 is equal to that being 1, determine the probability that at least one of these bits is 0.

Solution

There are many possibilities where at least 1 of the 20 bits is a zero. For instance, there may be only one 0 in the sequence, and this constitutes 20 different possibilities, such as the first bit being a 0, the second bit being a 0, and so on. There may be only two zeros, three zeros, four zeros, or even more in the sequence, where the locations of the zeros could be anywhere in the sequence of 20 bits. In fact, there are a great many possibilities, in each of which there is at least one 0 in the sequence of 20 bits, and the sum of the probabilities of all these possibilities would then determine the probability that at least one of these bits is 0.

There is an alternative strategy for finding the probability of an event when a direct approach is long and difficult. Instead of determining the probability of an event A, the probability of its complement Ac can be found first. The complement of the event that has at least one bit being 0 is the event that all bits are 1s. There are 2²⁰ = 1 048 576 different sequences of 20 bits, i.e. there are 1 048 576 sample points in the sample space. The probability of a sequence with all 1s is thus 1/1 048 576. Using P(A) = 1 − P(Ac), the probability that at least one of these bits is 0 is then as follows: 1 − 1/1 048 576 = 1 048 575/1 048 576 ≅ 0.999999.
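The complement-based calculation can be reproduced with exact rational arithmetic; a minimal Python sketch:

```python
from fractions import Fraction

p_all_ones = Fraction(1, 2 ** 20)   # the single sequence 111...1 out of 2**20 sequences
p_at_least_one_zero = 1 - p_all_ones
print(p_at_least_one_zero)          # 1048575/1048576
print(float(p_at_least_one_zero))   # ≈ 0.999999
```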

Example 1.7

A box contains 10 balls, numbered from 2 to 11 inclusive. A ball is drawn at random. Determine the probability that the drawn ball has an odd number, a prime number, both a prime number and an odd number, a prime number or an odd number or both, and neither an odd number nor a prime number.

Solution

Let A be the event with balls having odd numbers and B be the event with balls having prime numbers; we thus have the following:

A = {3, 5, 7, 9, 11} and B = {2, 3, 5, 7, 11}, so P(A) = 5/10 = 0.5 and P(B) = 5/10 = 0.5. The event with balls having both a prime number and an odd number is A ∩ B = {3, 5, 7, 11}, so P(A ∩ B) = 4/10 = 0.4. The probability of a prime number or an odd number or both is P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.5 + 0.5 − 0.4 = 0.6. Finally, the probability of neither an odd number nor a prime number is P((A ∪ B)c) = 1 − P(A ∪ B) = 0.4.

Example 1.8

Of all bit sequences of length 8, an 8‐bit sequence is selected at random. Assuming that the probability of a bit being 0 is equal to that being 1, determine the probability that the selected bit sequence starts with a 1 or ends with the two bits 00.

Solution

The sample space S includes 256 equally likely outcomes, as we can construct a bit sequence of length eight in 2⁸ = 256 ways, i.e. |S| = 256. The event A includes bit sequences of length 8 that all begin with a 1; we can construct such sequences in 2⁸⁻¹ = 2⁷ = 128 ways, i.e. |A| = 128. The event B includes bit sequences of length eight that all end with the two bits 00; we can construct such sequences in 2⁸⁻² = 2⁶ = 64 ways, i.e. |B| = 64. Some of the ways to construct a bit sequence of length 8 starting with a 1 are the same as the ways to construct a bit sequence of length 8 ending with the two bits 00; there are 2⁸⁻³ = 2⁵ = 32 ways to construct such a sequence, i.e. |A ∩ B| = 32.

Using the principle of inclusion–exclusion, the number of bit sequences starting with the bit 1 or ending with the two bits 00 is thus as follows:

|A ∪ B| = |A| + |B| − ∣A ∩ B∣ = 128 + 64 − 32 = 160.

The probability of selecting the random sequence of interest is then 160/256 = 0.625.
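Enumerating all 256 sequences confirms the counts and the probability; a minimal Python sketch:

```python
from itertools import product

seqs = list(product("01", repeat=8))            # all 2**8 = 256 bit sequences of length 8
A = [s for s in seqs if s[0] == "1"]            # sequences starting with a 1
B = [s for s in seqs if s[-2:] == ("0", "0")]   # sequences ending with the two bits 00
AB = [s for s in A if s[-2:] == ("0", "0")]     # sequences satisfying both

print(len(A), len(B), len(AB))                  # 128 64 32
print((len(A) + len(B) - len(AB)) / len(seqs))  # 160/256 = 0.625
```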

Example 1.9

A pair of fair (ideal) cube‐shaped dice is rolled. Determine the probability that the sum on the dice is n, where n is an integer ranging between 2 and 12 inclusive, and also the probability that at least one of the two dice is m, where m is an integer between 1 and 6 inclusive.

Solution

Since the dice are fair, all 36 (= 6 × 6) outcomes are equally likely. In the following, we present the values of the first and second dice by an ordered pair. Let An denote the event that the sum on the dice is n; noting that the number of ordered pairs whose sum is n equals 6 − |n − 7|, we thus have the following:

P(An) = (6 − |n − 7|)/36, n = 2, 3, …, 12.

For instance, A3 = {(1, 2), (2, 1)} and P(A3) = 2/36, while A7 = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} and P(A7) = 6/36.

Suppose i is the number appearing on the first die, where i = 1, 2, …, 6, and j is the number appearing on the second die, where j = 1, 2, …, 6, and ai, j denotes such an outcome. Let Bm denote the event that at least one of the two dice is m, where m = 1, 2, …, 6. Noting that in rolling a pair of dice, the outcome of one die is independent of the outcome of the other die, we have

Bm = {am, 1, am, 2, …, am, 6} ∪ {a1, m, a2, m, …, a6, m}.

Noting we have P(first die is m) = 6/36, P(second die is m) = 6/36, and P(both dice are m) = 1/36, we therefore have

P(Bm) = 6/36 + 6/36 − 1/36 = 11/36.
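Both families of events can be checked by enumerating the 36 outcomes; a minimal Python sketch:

```python
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely ordered pairs (i, j)

# P(An): probability that the sum on the dice is n, for n = 2, ..., 12.
for n in range(2, 13):
    count = sum(1 for i, j in outcomes if i + j == n)
    print(f"P(A_{n}) = {Fraction(count, 36)}")

# P(Bm): probability that at least one of the two dice is m; it is 11/36 for every m.
for m in range(1, 7):
    count = sum(1 for i, j in outcomes if m in (i, j))
    assert Fraction(count, 36) == Fraction(11, 36)
print("P(B_m) = 11/36 for every m")
```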

Example 1.10

Suppose for the events A and B, we have P(A) = 0.8 and P(B) = 0.6. Show that P(A ∩ B) ≥ 0.4.

Solution

We have

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) ≤ 1,

which yields

P(A ∩ B) ≥ P(A) + P(B) − 1 = 0.8 + 0.6 − 1 = 0.4.

The probability of an impossible (null) event is zero; however, the converse is not true, that is, the probability of an event can be zero, but that event may not be an impossible event. For instance, the probability of randomly selecting an integer that is both odd and even is zero, simply because such an integer does not exist; it is an impossible event. However, the probability of randomly selecting a particular integer out of infinitely many possible integers is zero, as we have lim n→∞ (1/n) = 0, yet such an event can occur, for it is not an impossible event.

1.4 Joint Probability and Conditional Probability

The probability of the occurrence of a single event, such as P(A) or P(B), which takes a specific value irrespective of the values of other events, i.e. unconditioned on any other events, is called a marginal probability. For instance, in rolling a die, the probability of getting a two represents a marginal probability.

The probability that both events A and B simultaneously occur is known as the joint probability of events A and B and is denoted by P(A ∩ B) or P(A, B), and read as probability of A and B. From the joint probability of two events, all relevant probabilities involving either or both events can be obtained. For instance, in rolling a pair of dice, the probability of getting a two on one die and a five on the other die represents a joint probability.

The following case should provide some insight into the utility of conditional probability. Consider two individuals, a man and a woman, while noting that one individual has four children and the other has none. Suppose we need to guess which of the two individuals is a parent. If no information is given, we say that the probability that the man is the parent is 0.5. If, however, we have the information that the man is 60 years old and the woman is 20 years old, then we say that the man is most likely the parent. Of course, this could be wrong, but it is likely to be true.

If we assume the probability of event B is influenced by the outcome of event A and we also know that event A has occurred, then the probability that event B will occur may be different from P(B). The probability of event B when it is known that event A has occurred is defined as the conditional probability, denoted by P(B ∣ A) and read as probability of B given A. The conditional probability P(B ∣ A) captures the information that the occurrence of event A provides about event B.

In Figure 1.4, assuming the probabilities are represented by the areas of the events, the occurrence of the event A brings about two effects: (i) the original sample space S (the large square) becomes the event A (the smaller square) and (ii) the event B (the large rectangle) becomes the event A ∩ B (the smaller rectangle). In essence, due to the occurrence of the event A, rather than dealing with a subset B of a sample space S, we are now concerned with a subset A ∩ B of the restricted sample space A. The conditional probability P(B ∣ A) and the conditional probability P(A ∣ B) are, respectively, defined as follows:

1.4

Figure 1.4 Occurrence of event A, as probabilities represented by areas.

In conditional probabilities P(B ∣ A) and P(A ∣ B), we must have P(A) ≠ 0 and P(B) ≠ 0, respectively. The reason lies in the fact that if we have P(A) = 0 (i.e. A never occurs) or P(B) = 0 (i.e. B never occurs), then the corresponding conditional probability, i.e. the probability of B given that A occurs or the probability of A given that B occurs, becomes totally meaningless. Note that P(A) or P(B) in the denominator normalizes the probability of the joint event P(A ∩ B). The normalization process in the calculation of conditional probability, i.e. dividing the joint probability by the marginal probability, ensures that the sum of all conditional probabilities in a sample space is one. Note that all three axioms of probability and the resulting important properties also hold for conditional probabilities. In addition, the conditional probability does not commute, that is, we have P(B ∣ A) ≠ P(A ∣ B), unless P(A) = P(B).
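To exercise the definition on a small numerical example, the Python sketch below computes both conditional probabilities for a pair of fair dice; the particular events A and B are hypothetical choices used only for illustration.

```python
from itertools import product
from fractions import Fraction

outcomes = set(product(range(1, 7), repeat=2))  # fair pair of dice: 36 equally likely outcomes

A = {o for o in outcomes if sum(o) >= 9}        # hypothetical event A: the sum is at least 9
B = {o for o in outcomes if o[0] == 6}          # hypothetical event B: the first die shows 6

def P(E):                                       # probability under equally likely outcomes
    return Fraction(len(E), len(outcomes))

print(P(A & B) / P(A))   # P(B | A) = (4/36)/(10/36) = 2/5
print(P(A & B) / P(B))   # P(A | B) = (4/36)/(6/36)  = 2/3, so P(B | A) ≠ P(A | B)
```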

Example 1.11