Forecasting Time Series Data with Facebook Prophet - Greg Rafferty - E-Book

Description

Prophet enables Python and R developers to build scalable time series forecasts. This book will help you to implement Prophet’s cutting-edge forecasting techniques to model future data with higher accuracy and with very few lines of code. You will begin by exploring the evolution of time series forecasting, from the basic early models to the advanced models of the present day. The book will demonstrate how to install and set up Prophet on your machine and build your first model with only a few lines of code. You'll then cover advanced features such as visualizing your forecasts, adding holidays, seasonality, and trend changepoints, handling outliers, and more, along with understanding why and how to modify each of the default parameters. Later chapters will show you how to optimize more complicated models with hyperparameter tuning and by adding additional regressors to the model. Finally, you'll learn how to run diagnostics to evaluate the performance of your models and see some useful features when running Prophet in production environments.
By the end of this Prophet book, you will be able to take a raw time series dataset and build advanced and accurate forecast models with concise, understandable, and repeatable code.




Forecasting Time Series Data with Facebook Prophet

Build, improve, and optimize time series forecasting models using the advanced forecasting tool

Greg Rafferty

BIRMINGHAM—MUMBAI

Forecasting Time Series Data with Facebook Prophet

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh

Publishing Product Manager: Sunith Shetty

Senior Editor: Roshan Kumar

Content Development Editor: Tazeen Shaikh

Technical Editor: Manikandan Kurup

Copy Editor: Safis Editing

Project Coordinator: Aishwarya Mohan

Proofreader: Safis Editing

Indexer: Priyanka Dhadke

Production Designer: Jyoti Chauhan

First published: March 2021

Production reference: 1100221

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80056-853-2

www.packt.com

Contributors

About the author

Greg Rafferty is a data scientist in San Francisco, California. With over a decade of experience, he has worked with many of the top firms in tech, including Google, Facebook, and IBM. Greg has been an instructor in business analytics on Coursera and has led face-to-face workshops with industry professionals in data science and analytics. With both an MBA and a degree in engineering, he is able to work across the spectrum of data science and communicate with both technical experts and non-technical consumers of data alike.

About the reviewers

Jose Angel Sanchez, born and raised in Oaxaca, Mexico, is a software developer at Pinterest. Previously, Jose worked at Bayer, Credijusto, and Connus International, and throughout his career has had the opportunity to work with different technologies, solving problems in different disciplines. A math lover and a crypto-believer, he knows that only through science and skeptical thinking will the human race achieve its true potential. Jose lives happily with his wife, Mariana, and their dog, Koly, in St Louis, Missouri.

Mert Sarıkaya graduated with a double major in industrial engineering and mathematics from Bogazici University and is currently pursuing his master's degree in industrial engineering at the same university, with a focus on time series forecasting. He has worked for a retail analytics company as a machine learning engineer, specializing in demand prediction using time series data. He is currently working as a data scientist at Algopoly, solving large-scale forecasting problems in the energy and finance sectors. He is a supporter of the open source community and has also made a small contribution to the fbprophet library. He is interested in cinema and loves to cook in his free time.

Table of Contents

Preface

Section 1: Getting Started

Chapter 1: The History and Development of Time Series Forecasting

Understanding time series forecasting

The problem with dependent data

Moving average and exponential smoothing

ARIMA

ARCH/GARCH

Neural networks

Prophet

Summary

Chapter 2: Getting Started with Facebook Prophet

Technical requirements

Installing Prophet

Installation on macOS

Installation on Windows

Installation on Linux

Building a simple model in Prophet

Interpreting the forecast DataFrame

Understanding components plots

Summary

Section 2: Seasonality, Tuning, and Advanced Features

Chapter 3: Non-Daily Data

Technical requirements

Using monthly data

Using sub-daily data

Using data with regular gaps

Summary

Chapter 4: Seasonality

Technical requirements

Understanding additive versus multiplicative seasonality

Controlling seasonality with Fourier order

Adding custom seasonalities

Adding conditional seasonalities

Regularizing seasonality

Global seasonality regularization

Local seasonality regularization

Summary

Chapter 5: Holidays

Technical requirements

Adding default country holidays

Adding default state/province holidays

Creating custom holidays

Creating multi-day holidays

Regularizing holidays

Global holiday regularization

Individual holiday regularization

Summary

Chapter 6: Growth Modes

Technical requirements

Applying linear growth

Understanding the logistic function

Saturating forecasts

Increasing logistic growth

Non-constant cap

Decreasing logistic growth

Applying flat growth

Summary

Chapter 7: Trend Changepoints

Technical requirements

Automatic trend changepoint detection

Default changepoint detection

Regularizing changepoints

Specifying custom changepoint locations

Summary

Chapter 8: Additional Regressors

Technical requirements

Adding binary regressors

Adding continuous regressors

Interpreting the regressor coefficients

Summary

Chapter 9: Outliers and Special Events

Technical requirements

Correcting outliers that cause seasonality swings

Correcting outliers that cause wide uncertainty intervals

Detecting outliers automatically

Winsorizing

Standard deviation

Moving average

Error standard deviation

Modeling outliers as special events

Summary

Chapter 10: Uncertainty Intervals

Technical requirements

Modeling uncertainty in trends

Modeling uncertainty in seasonality

Summary

Section 3: Diagnostics and Evaluation

Chapter 11: Cross-Validation

Technical requirements

Performing k-fold cross-validation

Performing forward-chaining cross-validation

Creating the Prophet cross-validation DataFrame

Parallelizing cross-validation

Summary

Chapter 12: Performance Metrics

Technical requirements

Understanding Prophet's metrics

Mean squared error

Root mean squared error

Mean absolute error

Mean absolute percent error

Median absolute percent error

Coverage

Choosing the best metric

Creating the Prophet performance metrics DataFrame

Handling irregular cut-offs

Tuning hyperparameters with grid search

Summary

Chapter 13: Productionalizing Prophet

Technical requirements

Saving a model

Updating a fitted model

Making interactive plots with Plotly

Plotly forecast plot

Plotly components plot

Plotly single component plot

Plotly seasonality plot

Summary

Why subscribe?

Other Books You May Enjoy

Section 1: Getting Started

The first part of this book will give you an understanding of the historical developments in time series forecasting techniques that led to the inception of Prophet and then guide you through the installation of the program. The section closes with a walk-through of a basic Prophet forecasting model and introduces the output that such a model produces.

This section comprises the following chapters:

Chapter 1, The History and Development of Time Series Forecasting

Chapter 2, Getting Started with Facebook Prophet

Chapter 1: The History and Development of Time Series Forecasting

Facebook Prophet is a powerful tool for creating, visualizing, and optimizing your forecasts! With Prophet, you'll be able to understand what factors drive your future results, enabling you to make more confident decisions. You'll accomplish these tasks and goals through an intuitive but very flexible programming interface that is designed for beginners and experts alike.

You don't need a deep knowledge of the math or statistics behind time series forecasting techniques to leverage the power of Prophet, although if you do possess this knowledge, Prophet includes a rich feature set that allows you to deploy your experience to great effect. You'll be working in a structured paradigm where each problem follows the same pattern, allowing you to spend less time figuring out how to optimize your forecast and more time discovering key insights to supercharge your decisions.

This chapter introduces the foundational ideas behind time series forecasting and discusses some of the key model iterations that eventually led to the development of Prophet. In this chapter, you'll learn what time series data is and why it must be handled differently than non-time series data, and then you'll discover the most powerful innovations, of which Prophet is the latest. Specifically, we will cover an overview of what time series forecasting is and then go into more detail on some specific approaches:

Understanding time series forecasting

Moving average and exponential smoothing

ARIMA

ARCH/GARCH

Neural networks

Prophet

Understanding time series forecasting

A time series is a set of data collected sequentially over time. For example, think of any chart where the x-axis is some measurement of time—anything from the number of stars in the Universe since the Big Bang until today or the amount of energy released each nanosecond of a nuclear reaction. The data behind both are time series. The chart in the weather app on your phone showing the expected temperature for the next 7 days? That's also the plot of a time series.

In this book, we are mostly concerned with events on the human scales of years, months, days, and hours, but all of this is time series data. Predicting future values is the act of forecasting.

Forecasting the weather has obviously been important to humans for millennia, particularly since the advent of agriculture. In fact, over 2,300 years ago, the Greek philosopher Aristotle wrote a treatise called Meteorology that contained a discussion of early weather forecasting. The very word forecast was coined in the 1850s by the English meteorologist Robert FitzRoy, who achieved fame as the captain of HMS Beagle during Charles Darwin's pioneering voyage.

But time series data is not unique to weather. The field of medicine adopted time series analysis techniques with the 1901 invention of the first practical electrocardiogram by the Dutch physician Willem Einthoven. The ECG, as it is commonly known, produces the familiar pattern of heartbeats we now see on the machine next to a patient's bed in every medical drama.

Today, one of the most discussed fields of forecasting is economics. There are entire television channels dedicated to analyzing trends of the stock market. Governments use economic forecasting to advise central bank policy, politicians use economic forecasting to develop their platforms, and business leaders use economic forecasting to guide their decisions.

In this book, we will be forecasting topics as varied as carbon dioxide levels high in the atmosphere, the number of riders on Chicago's public bike share program, the growth of the wolf population in Yellowstone, the solar sunspot cycles, local rainfall, and even Instagram likes on some popular accounts.

The problem with dependent data

So, why does time series forecasting require its own unique approach? From a statistical perspective, you might see a scatter plot of time series with a relatively clear trend and attempt to fit a line using standard regression—the technique for fitting a straight line through data. The problem is that this violates the assumption of independence that linear regression demands.

To illustrate time series dependence with an example, let's say that a gambler is rolling an unbiased die. I tell you that he just rolled a 2 and ask what the next value will be. This data is independent; previous rolls have no effect on future rolls, so knowing that the previous roll was a 2 does not provide any information about the next roll.

However, in a different situation, let's say that I call you from an undisclosed location somewhere on Earth and ask you to guess the temperature at my location. Your best bet would be to guess some average global temperature for that day. But now, imagine that I tell you that yesterday's temperature at my location was 90°F. That provides a great deal of information to you because you intuitively know that yesterday's temperature and today's temperature are linked in some way; they are not independent.

With time series data, you cannot randomly shuffle the order of the data without disturbing its trends. The order of the data matters; it is not independent. When data is dependent like this, a regression model can show statistical significance by random chance, even when there is no true correlation, far more often than your chosen confidence level would suggest.

Because high values tend to follow high values and low values tend to follow low values, a time series dataset is likely to show more clusters of high or low values than independent data would, and this in turn can lead to the appearance of more correlations than are truly present.

The website Spurious Correlations by Tyler Vigen specializes in pointing out examples of seemingly significant, but utterly ridiculous, time series correlations. Here is one example:

Figure 1.1 – A spurious time series correlation. https://www.tylervigen.com/spurious-correlations

Obviously, the number of people who drown in pools each year is completely independent of the number of films Nicolas Cage appears in. They simply have no effect on each other at all. However, by committing the fallacy of treating time series data as if it were independent, Vigen has shown that, by pure random chance, the two series do in fact correlate significantly. These coincidences are much more likely to occur when the dependence of time series data is ignored.
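This effect is easy to demonstrate with a short simulation: two random walks generated completely independently of one another will often appear strongly correlated if their points are naively treated as independent observations. The following is a minimal numpy sketch; the seed and series length are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Two completely independent random walks: each value depends on the
# previous one, so each series is dependent in time, but the two series
# have no effect on each other whatsoever.
walk_a = np.cumsum(rng.normal(size=500))
walk_b = np.cumsum(rng.normal(size=500))

# Treating the points as independent observations, a naive Pearson
# correlation will often look "significant" purely by chance.
r = np.corrcoef(walk_a, walk_b)[0, 1]
print(f"correlation between two unrelated random walks: {r:.2f}")
```

Rerunning this with different seeds, the measured correlation swings wildly, and large values appear far more often than they would for independent white noise.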

Now that you understand what exactly time series data is and what sets it apart from other datasets, let's look at a few milestones in the development of models, from the earliest models up to Prophet.

Moving average and exponential smoothing

Possibly the simplest form of forecasting is the moving average. Often, a moving average is used as a smoothing technique to find a straighter line through data with a lot of variation. Each data point is adjusted to the value of the average of n surrounding data points, with n being referred to as the window size. With a window size of 10, for example, we would adjust a data point to be the average of the 5 values before and the 5 values after. In a forecasting setting, the future values are calculated as the average of the n previous values, so again, with a window size of 10, this means the average of the 10 previous values.

The balancing act with a moving average is that you want a large window size in order to smooth out the noise and capture the actual trend, but with a larger window size, your forecasts are going to lag the trend significantly as you reach back further and further to calculate the average. The idea behind exponential smoothing is to apply exponentially decreasing weights to the values being averaged over time, giving recent values more weight and older values less. This allows the forecast to be more reactive to changes, while still ignoring a good deal of noise.
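The two smoothers described above can be sketched in a few lines of Python. This is a minimal illustration only, not any library's implementation; the function names and parameter values are ours:

```python
import numpy as np

def moving_average_forecast(series, window=10):
    """Forecast the next value as the mean of the last `window` values."""
    return np.mean(series[-window:])

def exponential_smoothing_forecast(series, alpha=0.3):
    """Simple exponential smoothing: recent values receive exponentially
    more weight than older ones. Returns the next-step forecast."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

# A noisy upward trend: both smoothed forecasts lag the latest values.
data = np.arange(50) + np.random.default_rng(1).normal(scale=2, size=50)
print(moving_average_forecast(data))         # mean of the last 10 points
print(exponential_smoothing_forecast(data))  # weighted toward recent points
```

Note how the smoothing parameter alpha plays the role the window size plays for the moving average: a larger alpha reacts faster to trend changes but lets more noise through.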

As you can see in the following plot of simulated data, the moving average line exhibits much rougher behavior than the exponential smoothing line, but both lines still adjust to trend changes at the same time:

Figure 1.2 – Moving average versus exponential smoothing

Exponential smoothing originated in the 1950s with simple exponential smoothing, which does not allow for a trend or seasonality. Charles Holt advanced the technique in 1957 to allow for a trend with what he called double exponential smoothing; and in collaboration with Peter Winters, Holt added seasonality support in 1960, in what is commonly called Holt-Winters exponential smoothing.

The downside to these methods of forecasting is that they can be slow to adjust to new trends, so forecasted values lag behind reality; they do not hold up well over longer forecasting timeframes; and there are many hyperparameters to tune, which can be a difficult and very time-consuming process.

ARIMA

In 1970, the statisticians George Box and Gwilym Jenkins published Time Series: Forecasting and Control, which described what is now known as the Box-Jenkins model. This methodology took the idea of the moving average further with the development of ARIMA. As a term, ARIMA is often used interchangeably with Box-Jenkins, although technically, Box-Jenkins refers to a method of parameter optimization for an ARIMA model.

ARIMA is an acronym of three concepts: Autoregressive (AR), Integrated (I), and Moving Average (MA). We already understand the moving average part. Autoregressive means that the model uses the dependent relationship between a data point and some number of lagged data points. That is, the model predicts upcoming values based upon previous values. This is similar to predicting that it will be warm tomorrow because it's been warm all week so far.

The integrated part means that instead of using any raw data point, the difference between that data point and some previous data point is used. Essentially, this means that we convert a series of values into a series of changes in values. Intuitively, this suggests that tomorrow will be more or less the same temperature as today because the temperature all week hasn't varied too much.

Each of the AR, I, and MA components of an ARIMA model are explicitly specified as a parameter in the model. Traditionally, p is used as the number of lag observations to use, also known as the lag order. The number of times that a raw observation is differenced, or the degree of differencing, is known as d, and q represents the size of the moving average window. Thus arises the standard notation for an ARIMA model of ARIMA(p, d, q), where p, d, and q are all non-negative integers.

A problem with ARIMA models is that they do not support seasonality, or data with repeating cycles, such as temperature rising in the day and falling at night or rising in summer and falling in winter. SARIMA, or Seasonal ARIMA, was developed to overcome this drawback. Similar to the ARIMA notation, the notation for a SARIMA model is SARIMA(p, d, q)(P, D, Q)m, with P being the seasonal autoregressive order, D the seasonal difference order, Q the seasonal moving average order, and m the number of time steps for a single seasonal period.

You may also come across other variations on ARIMA models, including VARIMA (Vector ARIMA, for cases with multiple time series as vectors); FARIMA (Fractional ARIMA) or ARFIMA (Fractionally Integrated ARMA), both of which include a fractional differencing degree allowing a long memory in the sense that observations far apart in time can have non-negligible dependencies; and SARIMAX, a seasonal ARIMA model where the X stands for exogenous or additional variables added to the model, such as adding a rain forecast to a temperature model.

ARIMA models do typically exhibit very good results, but the downside is complexity. Tuning and optimizing ARIMA models is often computationally expensive, and successful results can depend upon the skill and experience of the forecaster. It is not a scalable process; it is better suited to ad hoc analyses by skilled practitioners.

ARCH/GARCH

When the variance of a dataset is not constant over time, ARIMA models face problems with modeling it. In economics and finance, in particular, this can be common. In a financial time series, large returns tend to be followed by large returns and small returns tend to be followed by small returns. The former is called high volatility, and the latter low volatility.

Autoregressive Conditional Heteroscedasticity (ARCH) models were developed to solve this problem. Heteroscedasticity is a fancy way of saying that the variance or spread of the data is not constant throughout, with the opposite term being homoscedasticity. The difference is visualized here:

Figure 1.3 – Scedasticity

Robert Engle introduced the first ARCH model in 1982 by describing the conditional variance as a function of previous values. For example, there is a lot more uncertainty about daytime electricity usage than there is about nighttime usage. In a model of electricity usage, then, we might assume that the daytime hours have a particular variance, and usage during the night would have a lower variance.

Tim Bollerslev and Stephen Taylor, working independently, introduced a moving average component to the model in 1986 with the Generalized ARCH model, or GARCH. In the electricity example, the variance in usage was a function of the time of day. But perhaps the swings in volatility don't occur at specific times of the day; perhaps the swings are themselves random. This is when GARCH is useful.
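The volatility clustering that GARCH captures can be illustrated by simulating a GARCH(1,1) process directly: today's variance is a function of yesterday's squared shock (the ARCH term) and yesterday's variance (the GARCH term). This is a minimal numpy sketch with arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(7)

# GARCH(1,1): sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]
omega, alpha, beta = 0.1, 0.2, 0.7  # illustrative values; alpha + beta < 1
n = 1000
sigma2 = np.empty(n)
returns = np.empty(n)
sigma2[0] = omega / (1 - alpha - beta)  # unconditional (long-run) variance
returns[0] = rng.normal(scale=np.sqrt(sigma2[0]))

for t in range(1, n):
    # Yesterday's large shock or high variance raises today's variance.
    sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    returns[t] = rng.normal(scale=np.sqrt(sigma2[t]))

# Plotting `returns` would show volatility clustering: large swings
# tend to be followed by large swings, and calm stretches by calm ones.
```

The constraint alpha + beta < 1 keeps the process stationary, so the variance always pulls back toward its long-run level rather than exploding.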

Neither ARCH nor GARCH models can handle trend or seasonality, though, so in practice, an ARIMA model may first be built to extract the seasonal variation and trend of a time series, and an ARCH model may then be used to model the expected variance.

Neural networks

A relatively recent development in time series forecasting is the use of Recurrent Neural Networks (RNNs). This was made possible with the development of the Long Short-Term Memory unit, or LSTM, by Sepp Hochreiter and Jürgen Schmidhuber in 1997. Essentially, an LSTM unit allows a neural network to process a sequence of data, such as speech or video, instead of a single data point, such as an image.

A standard RNN is called recurrent