Praise for the First Edition

"...[t]he book is great for readers who need to apply the methods and models presented but have little background in mathematics and statistics." -MAA Reviews

Thoroughly updated throughout, Introduction to Time Series Analysis and Forecasting, Second Edition presents the underlying theories of time series analysis that are needed to analyze time-oriented data and construct real-world short- to medium-term statistical forecasts. Authored by highly experienced academics and professionals in engineering statistics, the Second Edition features discussions of both popular and modern time series methodologies as well as an introduction to Bayesian methods in forecasting. Introduction to Time Series Analysis and Forecasting, Second Edition also includes:

* Over 300 exercises from diverse disciplines, including health care, environmental studies, engineering, and finance
* More than 50 programming algorithms using JMP®, SAS®, and R that illustrate the theory and practicality of forecasting techniques in the context of time-oriented data
* New material on frequency domain and spatial temporal data analysis
* Expanded coverage of the variogram and spectrum with applications, as well as transfer and intervention model functions
* A supplementary website featuring PowerPoint® slides, data sets, and select solutions to the problems

Introduction to Time Series Analysis and Forecasting, Second Edition is an ideal textbook for upper-undergraduate and graduate-level courses in forecasting and time series. The book is also an excellent reference for practitioners and researchers who need to model and analyze time series data to generate forecasts.
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay, Sanford Weisberg
Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels
A complete list of the titles in this series appears at the end of this volume.
Second Edition
DOUGLAS C. MONTGOMERY
Arizona State University Tempe, Arizona, USA
CHERYL L. JENNINGS
Arizona State University Tempe, Arizona, USA
MURAT KULAHCI
Technical University of Denmark Lyngby, Denmark and Luleå University of Technology Luleå, Sweden
Copyright © 2016 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data applied for.
Preface
Chapter 1: Introduction to Forecasting
1.1 The Nature and Uses of Forecasts
1.2 Some Examples of Time Series
1.3 The Forecasting Process
1.4 Data for Forecasting
1.5 Resources for Forecasting
Exercises
Chapter 2: Statistics Background for Forecasting
2.1 Introduction
2.2 Graphical Displays
2.3 Numerical Description of Time Series Data
2.4 Use of Data Transformations and Adjustments
2.5 General Approach to Time Series Modeling and Forecasting
2.6 Evaluating and Monitoring Forecasting Model Performance
2.7 R Commands for Chapter 2
Exercises
Chapter 3: Regression Analysis and Forecasting
3.1 Introduction
3.2 Least Squares Estimation in Linear Regression Models
3.3 Statistical Inference in Linear Regression
3.4 Prediction of New Observations
3.5 Model Adequacy Checking
3.6 Variable Selection Methods in Regression
3.7 Generalized and Weighted Least Squares
3.8 Regression Models for General Time Series Data
3.9 Econometric Models
3.10 R Commands for Chapter 3
Exercises
Chapter 4: Exponential Smoothing Methods
4.1 Introduction
4.2 First-Order Exponential Smoothing
4.3 Modeling Time Series Data
4.4 Second-Order Exponential Smoothing
4.5 Higher-Order Exponential Smoothing
4.6 Forecasting
4.7 Exponential Smoothing for Seasonal Data
4.8 Exponential Smoothing of Biosurveillance Data
4.9 Exponential Smoothers and ARIMA Models
4.10 R Commands for Chapter 4
Exercises
Note
Chapter 5: Autoregressive Integrated Moving Average (ARIMA) Models
5.1 Introduction
5.2 Linear Models for Stationary Time Series
5.3 Finite Order Moving Average Processes
5.4 Finite Order Autoregressive Processes
5.5 Mixed Autoregressive–Moving Average Processes
5.6 Nonstationary Processes
5.7 Time Series Model Building
5.8 Forecasting ARIMA Processes
5.9 Seasonal Processes
5.10 ARIMA Modeling of Biosurveillance Data
5.11 Final Comments
5.12 R Commands for Chapter 5
Exercises
Chapter 6: Transfer Functions and Intervention Models
6.1 Introduction
6.2 Transfer Function Models
6.3 Transfer Function–Noise Models
6.4 Cross-Correlation Function
6.5 Model Specification
6.6 Forecasting with Transfer Function–Noise Models
6.7 Intervention Analysis
6.8 R Commands for Chapter 6
Exercises
Chapter 7: Survey of Other Forecasting Methods
7.1 Multivariate Time Series Models and Forecasting
7.2 State Space Models
7.3 ARCH and GARCH Models
7.4 Direct Forecasting of Percentiles
7.5 Combining Forecasts to Improve Prediction Performance
7.6 Aggregation and Disaggregation of Forecasts
7.7 Neural Networks and Forecasting
7.8 Spectral Analysis
7.9 Bayesian Methods in Forecasting
7.10 Some Comments on Practical Implementation and Use of Statistical Forecasting Procedures
7.11 R Commands for Chapter 7
Exercises
Appendix A: Statistical Tables
Appendix B: Data Sets for Exercises
Appendix C: Introduction to R
Basic Concepts in R
Bibliography
Index
Wiley Series
EULA
Chapter 2
Table 2.1
Table 2.2
Table 2.3
Table 2.4
Table E2.1
Table E2.2
Chapter 3
Table 3.1
Table 3.2
Table 3.3
Table 3.4
Table 3.5
Table 3.6
Table 3.7
Table 3.8
Table 3.9
Table 3.10
Table 3.11
Table 3.12
Table 3.13
Table 3.14
Table 3.15
Table 3.16
Table 3.17
Table 3.18
Table 3.19
Table 3.20
Table 3.21
Table 3.22
Table 3.23
Table 3.24
Table E3.1
Table E3.2
Table E3.3
Table E3.4
Table E3.5
Table E3.6
Table E3.7
Chapter 4
Table 4.1
Table 4.2
Table 4.3
Table 4.4
Table 4.5
Table 4.6
Table 4.7
Table 4.8
Table 4.9
Table 4.10
Table 4.11
Table 4.12
Table 4.13
Table 4.14
Table 4.15
Table E4.1
Chapter 5
Table 5.1
Table 5.2
Table 5.3
Table 5.4
Table 5.5
Table 5.6
Table 5.7
Table 5.8
Table 5.9
Table 5.10
Table E5.1
Table E5.2
Chapter 6
Table 6.1
Table 6.2
Table 6.3
Table 6.4
Table 6.5
Table 6.6
Table 6.7
Table 6.8
Table 6.9
Table 6.10
Table 6.11
Table 6.12
Table 6.13
Table E6.1
Table E6.2
Chapter 7
Table 7.1
Table 7.2
Table 7.3
Table 7.4
Table 7.5
Table 7.6
Table 7.7
Table E7.1
Table E7.2
Table E7.3
Table E7.4
Table E7.5
Appendix A
Table A.1
Table A.2
Table A.3
Table A.4
Table A.5
Appendix B
Table B.1
Table B.2
Table B.3
Table B.4
Table B.5
Table B.6
Table B.7
Table B.8
Table B.9
Table B.10
Table B.11
Table B.12
Table B.13
Table B.14
Table B.15
Table B.16
Table B.17
Table B.18
Table B.19
Table B.20
Table B.21
Table B.22
Table B.23
Table B.24
Table B.25
Table B.26
Table B.27
Table B.28
Analyzing time-oriented data and forecasting future values of a time series are among the most important problems that analysts face in many fields, ranging from finance and economics to managing production operations, to the analysis of political and social policy sessions, to investigating the impact of humans and the policy decisions that they make on the environment. Consequently, there is a large group of people in a variety of fields, including finance, economics, science, engineering, statistics, and public policy, who need to understand some basic concepts of time series analysis and forecasting. Unfortunately, most basic statistics and operations management books give little if any attention to time-oriented data and little guidance on forecasting. There are some very good high-level books on time series analysis. These books are mostly written for technical specialists who are taking a doctoral-level course or doing research in the field. They tend to be very theoretical and often focus on a few specific topics or techniques. We have written this book to fill the gap between these two extremes.
We have made a number of changes in this revision of the book. New material has been added on data preparation for forecasting, including dealing with outliers and missing values, use of the variogram and sections on the spectrum, and an introduction to Bayesian methods in forecasting. We have added many new exercises and examples, including new data sets in Appendix B, and edited many sections of the text to improve the clarity of the presentation.
Like the first edition, this book is intended for practitioners who make real-world forecasts. We have attempted to keep the mathematical level modest so that the book is accessible to a variety of users. Our focus is on short- to medium-term forecasting, where statistical methods are useful. Since many organizations can improve their effectiveness and business results by making better short- to medium-term forecasts, this book should be useful to a wide variety of professionals. The book can also be used as a textbook for an applied forecasting and time series analysis course at the advanced undergraduate or first-year graduate level. Students in this course could come from engineering, business, statistics, operations research, mathematics, computer science, and any area of application where making forecasts is important. Readers need a background in basic statistics (previous exposure to linear regression would be helpful, but not essential) and some knowledge of matrix algebra, although matrices appear mostly in the chapter on regression, and if one is interested mainly in the results, the details involving matrix manipulation can be skipped. Integrals and derivatives appear in a few places in the book, but no detailed working knowledge of calculus is required.
Successful time series analysis and forecasting require that the analyst interact with computer software; the techniques and algorithms are simply not suited to manual calculation. We have chosen to demonstrate the techniques presented using three packages: Minitab®, JMP®, and R, with occasional use of SAS®. We have selected these packages because they are widely used in practice and because they have generally good capability for analyzing time series data and generating forecasts. Because R is increasingly popular in statistics courses, we have included a section in each chapter showing the R code necessary for working some of the examples in the chapter. We have also added a brief appendix on the use of R. The basic principles that underlie most of our presentation are not specific to any particular software package, so readers can use any software they like or have available that has basic statistical forecasting capability. While the text examples do utilize these particular software packages and illustrate some of their features and capability, these features or similar ones are found in many other software packages.
There are three basic approaches to generating forecasts: regression-based methods, heuristic smoothing methods, and general time series models. Because all three of these basic approaches are useful, we give an introduction to all of them. Chapter 1 introduces the basic forecasting problem, defines terminology, and illustrates many of the common features of time series data. Chapter 2 contains many of the basic statistical tools used in analyzing time series data. Topics include plots, numerical summaries of time series data including the autocovariance and autocorrelation functions, transformations, differencing, and decomposing a time series into trend and seasonal components. We also introduce metrics for evaluating forecast errors and methods for evaluating and tracking forecasting performance over time. Chapter 3 discusses regression analysis and its use in forecasting. We discuss both cross-section and time series regression data, least squares and maximum likelihood model fitting, model adequacy checking, prediction intervals, and weighted and generalized least squares. The first part of this chapter covers many of the topics typically seen in an introductory treatment of regression, either in a stand-alone course or as part of another applied statistics course. It should be a reasonable review for many readers. Chapter 4 presents exponential smoothing techniques, both for time series with polynomial components and for seasonal data. We discuss and illustrate methods for selecting the smoothing constant(s), forecasting, and constructing prediction intervals. The explicit time series modeling approach to forecasting that we have chosen to emphasize is the autoregressive integrated moving average (ARIMA) model approach. Chapter 5 introduces ARIMA models and illustrates how to identify and fit these models for both nonseasonal and seasonal time series. Forecasting and prediction interval construction are also discussed and illustrated.
Chapter 6 extends this discussion into transfer function models and intervention modeling and analysis. Chapter 7 surveys several other useful topics from time series analysis and forecasting, including multivariate time series problems, ARCH and GARCH models, and combinations of forecasts. We also give some practical advice for using statistical approaches to forecasting and provide some information about realistic expectations. The last two chapters of the book are somewhat higher in level than the first five.
Each chapter has a set of exercises. Some of these exercises involve analyzing the data sets given in Appendix B. These data sets represent an interesting cross section of real time series data, typical of those encountered in practical forecasting problems. Most of these data sets are used in exercises in two or more chapters, an indication that there are usually several approaches to analyzing, modeling, and forecasting a time series. There are other good sources of data for practicing the techniques given in this book. Some of the ones that we have found very interesting and useful include the U.S. Department of Labor—Bureau of Labor Statistics (http://www.bls.gov/data/home.htm), the U.S. Department of Agriculture—National Agricultural Statistics Service, Quick Stats Agricultural Statistics Data (http://www.nass.usda.gov/Data_and_Statistics/Quick_Stats/index.asp), the U.S. Census Bureau (http://www.census.gov), and the U.S. Department of the Treasury (http://www.treas.gov/offices/domestic-finance/debt-management/interest-rate/). The time series data library created by Rob Hyndman at Monash University (http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/index.htm) and the time series data library at the Mathematics Department of the University of York (http://www.york.ac.uk/depts/maths/data/ts/) also contain many excellent data sets. Some of these sources provide links to other data. Data sets and other materials related to this book can be found at ftp://ftp.wiley.com/public/scitechmed/timeseries.
We would like to thank the many individuals who provided feedback and suggestions for improvement to the first edition. We found these suggestions most helpful. We are indebted to Clifford Long who generously provided the R codes he used with his students when he taught from the book. We found his codes very helpful in putting the end-of-chapter R code sections together. We also have placed a premium in the book on bridging the gap between theory and practice. We have not emphasized proofs or technical details and have tried to give intuitive explanations of the material whenever possible. The result is a book that can be used with a wide variety of audiences, with different interests and technical backgrounds, whose common interests are understanding how to analyze time-oriented data and constructing good short-term statistically based forecasts.
We express our appreciation to the individuals and organizations who have given their permission to use copyrighted material. These materials are noted in the text. Portions of the output contained in this book are printed with permission of Minitab Inc. All material remains the exclusive property and copyright of Minitab Inc. All rights reserved.
DOUGLAS C. MONTGOMERY CHERYL L. JENNINGS MURAT KULAHCI
It is difficult to make predictions, especially about the future.
NIELS BOHR, Danish physicist
A forecast is a prediction of some future event or events. As suggested by Niels Bohr, making good predictions is not always easy. Famously “bad” forecasts include the following from the book Bad Predictions:
“The population is constant in size and will remain so right up to the end of mankind.” L'Encyclopedie, 1756.

“1930 will be a splendid employment year.” U.S. Department of Labor, New Year's Forecast in 1929, just before the market crash on October 29.

“Computers are multiplying at a rapid rate. By the turn of the century there will be 220,000 in the U.S.” Wall Street Journal, 1966.
Forecasting is an important problem that spans many fields including business and industry, government, economics, environmental sciences, medicine, social science, politics, and finance. Forecasting problems are often classified as short-term, medium-term, and long-term. Short-term forecasting problems involve predicting events only a few time periods (days, weeks, and months) into the future. Medium-term forecasts extend from 1 to 2 years into the future, and long-term forecasting problems can extend beyond that by many years. Short- and medium-term forecasts are required for activities that range from operations management to budgeting and selecting new research and development projects. Long-term forecasts impact issues such as strategic planning. Short- and medium-term forecasting is typically based on identifying, modeling, and extrapolating the patterns found in historical data. Because these historical data usually exhibit inertia and do not change dramatically very quickly, statistical methods are very useful for short- and medium-term forecasting. This book is about the use of these statistical methods.
Most forecasting problems involve the use of time series data. A time series is a time-oriented or chronological sequence of observations on a variable of interest. For example, Figure 1.1 shows the market yield on US Treasury Securities at 10-year constant maturity from April 1953 through December 2006 (data in Appendix B, Table B.1). This graph is called a time series plot. The rate variable is collected at equally spaced time periods, as is typical in most time series and forecasting applications. Many business applications of forecasting utilize daily, weekly, monthly, quarterly, or annual data, but any reporting interval may be used. Furthermore, the data may be instantaneous, such as the viscosity of a chemical product at the point in time where it is measured; it may be cumulative, such as the total sales of a product during the month; or it may be a statistic that in some way reflects the activity of the variable during the time period, such as the daily closing price of a specific stock on the New York Stock Exchange.
Figure 1.1 Time series plot of the market yield on US Treasury Securities at 10-year constant maturity. Source: US Treasury.
The reason that forecasting is so important is that prediction of future events is a critical input into many types of planning and decision-making processes, with application to areas such as the following:
Operations Management
. Business organizations routinely use forecasts of product sales or demand for services in order to schedule production, control inventories, manage the supply chain, determine staffing requirements, and plan capacity. Forecasts may also be used to determine the mix of products or services to be offered and the locations at which products are to be produced.
Marketing
. Forecasting is important in many marketing decisions. Forecasts of sales response to advertising expenditures, new promotions, or changes in pricing polices enable businesses to evaluate their effectiveness, determine whether goals are being met, and make adjustments.
Finance and Risk Management
. Investors in financial assets are interested in forecasting the returns from their investments. These assets include but are not limited to stocks, bonds, and commodities; other investment decisions can be made relative to forecasts of interest rates, options, and currency exchange rates. Financial risk management requires forecasts of the volatility of asset returns so that the risks associated with investment portfolios can be evaluated and insured, and so that financial derivatives can be properly priced.
Economics
. Governments, financial institutions, and policy organizations require forecasts of major economic variables, such as gross domestic product, population growth, unemployment, interest rates, inflation, job growth, production, and consumption. These forecasts are an integral part of the guidance behind monetary and fiscal policy, and budgeting plans and decisions made by governments. They are also instrumental in the strategic planning decisions made by business organizations and financial institutions.
Industrial Process Control
. Forecasts of the future values of critical quality characteristics of a production process can help determine when important controllable variables in the process should be changed, or if the process should be shut down and overhauled. Feedback and feedforward control schemes are widely used in monitoring and adjustment of industrial processes, and predictions of the process output are an integral part of these schemes.
Demography
. Forecasts of population by country and regions are made routinely, often stratified by variables such as gender, age, and race. Demographers also forecast births, deaths, and migration patterns of populations. Governments use these forecasts for planning policy and social service actions, such as spending on health care, retirement programs, and antipoverty programs. Many businesses use forecasts of populations by age groups to make strategic plans regarding developing new product lines or the types of services that will be offered.
These are only a few of the many different situations where forecasts are required in order to make good decisions. Despite the wide range of problem situations that require forecasts, there are only two broad types of forecasting techniques—qualitative methods and quantitative methods.
Qualitative forecasting techniques are often subjective in nature and require judgment on the part of experts. Qualitative forecasts are often used in situations where there is little or no historical data on which to base the forecast. An example would be the introduction of a new product, for which there is no relevant history. In this situation, the company might use the expert opinion of sales and marketing personnel to subjectively estimate product sales during the new product introduction phase of its life cycle. Sometimes qualitative forecasting methods make use of marketing tests, surveys of potential customers, and experience with the sales performance of other products (both their own and those of competitors). However, although some data analysis may be performed, the basis of the forecast is subjective judgment.
Perhaps the most formal and widely known qualitative forecasting technique is the Delphi Method. This technique was developed by the RAND Corporation (see Dalkey [1967]). It employs a panel of experts who are assumed to be knowledgeable about the problem. The panel members are physically separated to avoid their deliberations being impacted either by social pressures or by a single dominant individual. Each panel member responds to a questionnaire containing a series of questions and returns the information to a coordinator. Following the first questionnaire, subsequent questions are submitted to the panelists along with information about the opinions of the panel as a group. This allows panelists to review their predictions relative to the opinions of the entire group. After several rounds, it is hoped that the opinions of the panelists converge to a consensus, although achieving a consensus is not required and justified differences of opinion can be included in the outcome. Qualitative forecasting methods are not emphasized in this book.
Quantitative forecasting techniques make formal use of historical data and a forecasting model. The model formally summarizes patterns in the data and expresses a statistical relationship between previous and current values of the variable. Then the model is used to project the patterns in the data into the future. In other words, the forecasting model is used to extrapolate past and current behavior into the future. There are several types of forecasting models in general use. The three most widely used are regression models, smoothing models, and general time series models. Regression models make use of relationships between the variable of interest and one or more related predictor variables. Sometimes regression models are called causal forecasting models, because the predictor variables are assumed to describe the forces that cause or drive the observed values of the variable of interest. An example would be using data on house purchases as a predictor variable to forecast furniture sales. The method of least squares is the formal basis of most regression models. Smoothing models typically employ a simple function of previous observations to provide a forecast of the variable of interest. These methods may have a formal statistical basis, but they are often used and justified heuristically on the basis that they are easy to use and produce satisfactory results. General time series models employ the statistical properties of the historical data to specify a formal model and then estimate the unknown parameters of this model (usually) by least squares. In subsequent chapters, we will discuss all three types of quantitative forecasting models.
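As a language-neutral sketch of what a smoothing model does (the book's own examples use Minitab, JMP, and R), the idea can be illustrated with simple exponential smoothing in a few lines of Python; the data and smoothing constant below are hypothetical, chosen only for illustration:

```python
# Simple exponential smoothing: each smoothed value is a weighted
# average of the newest observation and the previous smoothed value.
# The final smoothed value serves as the one-step-ahead forecast.

def simple_exponential_smoothing(y, alpha):
    """Return the smoothed series for observations y with constant alpha."""
    smoothed = [y[0]]  # initialize with the first observation
    for obs in y[1:]:
        smoothed.append(alpha * obs + (1 - alpha) * smoothed[-1])
    return smoothed

y = [102.0, 99.0, 104.0, 101.0, 98.0, 103.0]   # hypothetical sales data
s = simple_exponential_smoothing(y, alpha=0.3)
forecast = s[-1]  # one-step-ahead forecast of the next period
```

A larger alpha weights recent observations more heavily; smoothing constant selection is treated in Chapter 4.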
The form of the forecast can be important. We typically think of a forecast as a single number that represents our best estimate of the future value of the variable of interest. Statisticians would call this a point estimate or point forecast. Now these forecasts are almost always wrong; that is, we experience forecast error. Consequently, it is usually a good practice to accompany a forecast with an estimate of how large a forecast error might be experienced. One way to do this is to provide a prediction interval (PI) to accompany the point forecast. The PI is a range of values for the future observation, and it is likely to prove far more useful in decision-making than a single number. We will show how to obtain PIs for most of the forecasting methods discussed in the book.
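As a rough sketch of this idea, a point forecast can be accompanied by an approximate 95% PI if one assumes past one-step forecast errors are roughly normal with constant variance; the numbers below are hypothetical, and the normal-theory multiplier 1.96 is only an approximation (exact PI construction for each method is covered later in the book):

```python
import statistics

# Turn a point forecast into an approximate 95% prediction interval
# using the sample standard deviation of past one-step forecast errors.

past_errors = [1.2, -0.8, 0.5, -1.5, 0.9, 0.3, -0.7, 1.1]  # hypothetical
point_forecast = 250.0                                      # hypothetical

sigma = statistics.stdev(past_errors)       # estimated forecast-error std dev
half_width = 1.96 * sigma                   # normal-theory 95% multiplier
pi = (point_forecast - half_width, point_forecast + half_width)
```

The interval, not the single number, conveys how far off the forecast could plausibly be.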
Other important features of the forecasting problem are the forecast horizon and the forecast interval. The forecast horizon is the number of future periods for which forecasts must be produced. The horizon is often dictated by the nature of the problem. For example, in production planning, forecasts of product demand may be made on a monthly basis. Because of the time required to change or modify a production schedule, ensure that sufficient raw material and component parts are available from the supply chain, and plan the delivery of completed goods to customers or inventory facilities, it would be necessary to forecast up to 3 months ahead. The forecast horizon is also often called the forecast lead time. The forecast interval is the frequency with which new forecasts are prepared. For example, in production planning, we might forecast demand on a monthly basis, for up to 3 months in the future (the lead time or horizon), and prepare a new forecast each month. Thus the forecast interval is 1 month, the same as the basic period of time for which each forecast is made. If the forecast lead time is always the same length, say, T periods, and the forecast is revised each time period, then we are employing a rolling or moving horizon forecasting approach. This system updates or revises the forecasts for T−1 of the periods in the horizon and computes a forecast for the newest period T. This rolling horizon approach to forecasting is widely used when the lead time is several periods long.
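The bookkeeping of a rolling horizon can be sketched as follows (a Python sketch with a hypothetical placeholder forecaster standing in for any real method):

```python
T = 3  # lead time: the horizon always covers T future periods

def roll(forecasts, make_forecast, newest_period):
    """Advance a rolling horizon by one period.

    forecasts maps each future period to its current forecast.  The
    oldest period has just been realized and is dropped; the remaining
    T-1 forecasts are revised; a forecast is added for the newest period.
    """
    oldest = min(forecasts)
    updated = {}
    for period in forecasts:
        if period != oldest:
            updated[period] = make_forecast(period)      # revise T-1 forecasts
    updated[newest_period] = make_forecast(newest_period)  # new period entering
    return updated

# toy usage with a hypothetical constant-level forecaster
forecasts = {4: 100.0, 5: 100.0, 6: 100.0}   # horizon as of period 3
forecasts = roll(forecasts, lambda p: 100.0, newest_period=7)
```

After the roll, the horizon covers periods 5 through 7, again T periods ahead.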
Time series plots can reveal patterns such as random, trends, level shifts, periods or cycles, unusual observations, or a combination of patterns. Patterns commonly found in time series data are discussed next with examples of situations that drive the patterns.
The sales of a mature pharmaceutical product may remain relatively flat in the absence of changes in marketing or manufacturing strategies. Weekly sales of a generic pharmaceutical product shown in Figure 1.2 appear to be constant over time, at about 10,400 × 10³ units, in a random sequence with no obvious patterns (data in Appendix B, Table B.2).
Figure 1.2 Pharmaceutical product sales.
To assure conformance with customer requirements and product specifications, the production of chemicals is monitored by many characteristics. These may be input variables such as temperature and flow rate, and output properties such as viscosity and purity.
Due to the continuous nature of chemical manufacturing processes, output properties often are positively autocorrelated; that is, a value above the long-run average tends to be followed by other values above the average, while a value below the average tends to be followed by other values below the average.
The viscosity readings plotted in Figure 1.3 exhibit autocorrelated behavior, tending to a long-run average of about 85 centipoises (cP), but with a structured, not completely random, appearance (data in Appendix B, Table B.3). Some methods for describing and analyzing autocorrelated data will be described in Chapter 2.
Figure 1.3 Chemical process viscosity readings.
The USDA National Agricultural Statistics Service publishes agricultural statistics for many commodities, including the annual production of dairy products such as butter, cheese, ice cream, milk, yogurt, and whey. These statistics are used for market analysis and intelligence, economic indicators, and identification of emerging issues.
Blue and gorgonzola cheese is one of 32 categories of cheese for which data are published. The annual US production of blue and gorgonzola cheeses (in 10³ lb) is shown in Figure 1.4 (data in Appendix B, Table B.4). Production quadrupled from 1950 to 1997, and the linear trend has a constant positive slope with random, year-to-year variation.
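A linear trend of the kind visible in Figure 1.4 is usually summarized by an ordinary least-squares fit of the model y_t = b0 + b1·t. The Python sketch below is illustrative (the production figures are invented, not the Table B.4 data):

```python
def fit_linear_trend(y):
    """Ordinary least-squares fit of y_t = b0 + b1*t (t = 0, 1, ...),
    returning the intercept b0 and slope b1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    b1 = (sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
          / sum((t - t_mean) ** 2 for t in range(n)))
    b0 = y_mean - b1 * t_mean
    return b0, b1

# Hypothetical annual production figures with an upward drift:
b0, b1 = fit_linear_trend([10, 12, 13, 15, 16, 18])
```

A positive estimated slope b1 corresponds to the steady year-over-year growth seen in the cheese production series.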
Figure 1.4 The US annual production of blue and gorgonzola cheeses. Source: USDA–NASS.
The US Census Bureau publishes historic statistics on manufacturers' shipments, inventories, and orders. The statistics are based on North American Industry Classification System (NAICS) code and are utilized for purposes such as measuring productivity and analyzing relationships between employment and manufacturing output.
The manufacture of beverage and tobacco products is reported as part of the nondurable subsector. The plot of monthly beverage product shipments (Figure 1.5) reveals an overall increasing trend, with a distinct cyclic pattern that is repeated within each year. January shipments appear to be the lowest, with highs in May and June (data in Appendix B, Table B.5). This monthly, or seasonal, variation may be attributable to some cause such as the impact of weather on the demand for beverages. Techniques for making seasonal adjustments to data in order to better understand general trends will be discussed in Chapter 2.
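The seasonal-adjustment idea mentioned above can be sketched in its simplest additive form: estimate the average for each season (e.g., each calendar month) and remove it, leaving the nonseasonal movement. This Python fragment is a crude illustration with invented numbers, not the Chapter 2 methods or the Table B.5 data:

```python
def seasonally_adjust(y, period=12):
    """Crude additive seasonal adjustment: subtract each season's
    average across the series, then restore the overall level."""
    seasonal_means = []
    for s in range(period):
        vals = y[s::period]
        seasonal_means.append(sum(vals) / len(vals))
    overall = sum(y) / len(y)
    return [y[t] - seasonal_means[t % period] + overall
            for t in range(len(y))]

# A toy series with a pure period-2 seasonal swing (low, high, ...):
adjusted = seasonally_adjust([5, 9, 5, 9, 5, 9], period=2)
```

In this toy case the seasonal swing is removed entirely, and the adjusted series is flat at the overall mean, making any underlying trend easier to see.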
Figure 1.5 The US beverage manufacturer monthly product shipments, unadjusted. Source: US Census Bureau.
To determine whether the Earth is warming or cooling, scientists look at annual mean temperatures. At a single station, the warmest and the coolest temperatures in a day are averaged. Averages are then calculated at stations all over the Earth, over an entire year. The change in global annual mean surface air temperature is calculated from a base established from 1951 to 1980, and the result is reported as an “anomaly.”
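The anomaly calculation just described amounts to subtracting the base-period average from each annual mean. The following Python sketch is illustrative (the temperatures and the two-year "base period" are invented for brevity; the actual series uses a 1951–1980 base):

```python
def temperature_anomalies(annual_means, years, base=(1951, 1980)):
    """Anomaly = annual mean minus the average over the base period
    (1951-1980 in the series described above)."""
    base_vals = [m for m, yr in zip(annual_means, years)
                 if base[0] <= yr <= base[1]]
    baseline = sum(base_vals) / len(base_vals)
    return [m - baseline for m in annual_means]

# Toy example with an artificially short base period:
anoms = temperature_anomalies([14.0, 14.2, 14.5],
                              [1951, 1952, 2000],
                              base=(1951, 1952))
```

Reporting anomalies rather than raw temperatures makes changes relative to the fixed base period immediately visible.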
The plot of the annual mean anomaly in global surface air temperature (Figure 1.6) shows an increasing trend since 1880; however, the slope, or rate of change, varies with time periods (data in Appendix B, Table B.6). While the slope in earlier time periods appears to be constant, slightly increasing, or slightly decreasing, the slope from about 1975 to the present appears much steeper than the rest of the plot.
Figure 1.6 Global mean surface air temperature annual anomaly. Source: NASA-GISS.
Business data such as stock prices and interest rates often exhibit nonstationary behavior; that is, the time series has no natural mean. The daily closing price adjusted for stock splits of Whole Foods Market (WFMI) stock in 2001 (Figure 1.7) exhibits a combination of patterns for both mean level and slope (data in Appendix B, Table B.7).
Figure 1.7 Whole Foods Market stock price, daily closing adjusted for splits.
While the price is constant in some short time periods, there is no consistent mean level over time. In other time periods, the price changes at different rates, including occasional abrupt shifts in level. This is an example of nonstationary behavior, which will be discussed in Chapter 2.
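A standard first step with nonstationary series like this, developed in Chapter 2, is differencing: working with the period-to-period changes rather than the levels. The sketch below uses invented prices, not the Table B.7 data:

```python
def first_difference(y):
    """First differences z_t = y_t - y_{t-1}. Differencing often
    turns a wandering level (e.g., a stock price) into a series
    that fluctuates around a stable mean."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

# Hypothetical daily closing prices:
changes = first_difference([40, 42, 41, 45, 44])
```

The differenced series has one fewer observation than the original and, for many price series, a roughly constant mean near zero.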
The Current Population Survey (CPS) or “household survey” prepared by the US Department of Labor, Bureau of Labor Statistics, contains national data on employment, unemployment, earnings, and other labor market topics by demographic characteristics. The data are used to report on the employment situation, for projections with impact on hiring and training, and for a multitude of other business planning activities. The data are reported unadjusted and with seasonal adjustment to remove the effect of regular patterns that occur each year.
The plot of monthly unadjusted unemployment rates (Figure 1.8) exhibits a mixture of patterns, similar to Figure 1.5 (data in Appendix B, Table B.8). There is a distinct cyclic pattern within a year; January, February, and March generally have the highest unemployment rates. The overall level is also changing, from a gradual decrease, to a steep increase, followed by a gradual decrease. The use of seasonal adjustments as described in Chapter 2 makes it easier to observe the nonseasonal movements in time series data.
Figure 1.8 Monthly unemployment rate—full-time labor force, unadjusted. Source: US Department of Labor-BLS.
Solar activity has long been recognized as a significant source of noise impacting consumer and military communications, including satellites, cell phone towers, and electric power grids. The ability to accurately forecast solar activity is critical to a variety of fields. The International Sunspot Number R is the oldest solar activity index. The number incorporates both the number of observed sunspots and the number of observed sunspot groups. In Figure 1.9, the plot of annual sunspot numbers reveals cyclic patterns of varying magnitudes (data in Appendix B, Table B.9).
Figure 1.9 The international sunspot number. Source: SIDC.
In addition to assisting in the identification of steady-state patterns, time series plots may also draw attention to the occurrence of atypical events. Weekly sales of a generic pharmaceutical product dropped due to limited availability resulting from a fire at one of the four production facilities. The 5-week reduction is apparent in the time series plot of weekly sales shown in Figure 1.10.
Figure 1.10 Pharmaceutical product sales.
Another type of unusual event may be the failure of the data measurement or collection system. After recording a vastly different viscosity reading at time period 70 (Figure 1.11), the measurement system was checked with a standard and determined to be out of calibration. The cause was determined to be a malfunctioning sensor.
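A simple numerical screen for atypical observations like the one at period 70 is to flag readings far from the sample mean. The Python sketch below is illustrative (the data are invented, and the cutoff k is a tuning choice; a gross outlier inflates the sample standard deviation, so a modest k is used here):

```python
def flag_outliers(y, k=2.0):
    """Flag readings more than k sample standard deviations from
    the mean -- a simple screen for events such as a sensor
    malfunction. Returns the indices of flagged readings."""
    n = len(y)
    mean = sum(y) / n
    sd = (sum((v - mean) ** 2 for v in y) / (n - 1)) ** 0.5
    return [t for t, v in enumerate(y) if abs(v - mean) > k * sd]

# Hypothetical viscosity readings with one wild value at the end:
flags = flag_outliers([85, 84, 86, 85, 84, 86, 85, 30], k=2.0)
```

Flagged points then prompt investigation of the measurement system, as in the sensor example above, before the data are used for model fitting.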
Figure 1.11 Chemical process viscosity readings, with sensor malfunction.
A process is a series of connected activities that transform one or more inputs into one or more outputs. All work activities are performed in processes, and forecasting is no exception. The activities in the forecasting process are:
1. Problem definition
2. Data collection
3. Data analysis
4. Model selection and fitting
5. Model validation
6. Forecasting model deployment
7. Monitoring forecasting model performance
These activities are shown in Figure 1.12.
Figure 1.12 The forecasting process.
Problem definition involves developing understanding of how the forecast will be used along with the expectations of the “customer” (the user of the