Modern Time Series Forecasting with Python - Manu Joseph - E-Book

Description

Predicting the future, whether it's market trends, energy demand, or website traffic, has never been more crucial. This practical, hands-on guide empowers you to build and deploy powerful time series forecasting models. Whether you’re working with traditional statistical methods or cutting-edge deep learning architectures, this book provides structured learning and best practices for both.
Starting with the basics, this data science book introduces fundamental time series methods, such as ARIMA and exponential smoothing, before gradually progressing to advanced topics, including machine learning for time series, deep neural networks, and transformers. As part of your fundamentals training, you’ll learn preprocessing, feature engineering, and model evaluation. As you progress, you’ll also explore global forecasting models, ensemble methods, and probabilistic forecasting techniques.
This new edition goes deeper into transformer architectures and probabilistic forecasting, including new content on the latest time series models, conformal prediction, and hierarchical forecasting. Whether you seek advanced deep learning insights or specialized architecture implementations, this edition provides practical strategies and new content to elevate your forecasting skills.

You can read this e-book in Legimi apps or in any other app that supports the following formats:

EPUB
MOBI

Page count: 982

Year of publication: 2024




Modern Time Series Forecasting with Python

Second Edition

Industry-ready machine learning and deep learning time series analysis with PyTorch and pandas

Manu Joseph

Jeffrey Tackes

Packt and this book are not officially connected with Python. This book is an effort from the Python community of experts to help more developers.

Modern Time Series Forecasting with Python

Second Edition

Copyright © 2024 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Senior Publishing Product Manager: Bhavesh Amin

Acquisition Editor – Peer Reviews: Jane D’Souza

Project Editor: Parvathy Nair

Content Development Editor: Deepayan Bhattacharjee

Copy Editor: Safis Editor

Technical Editor: Karan Sonawane

Proofreader: Safis Editor

Indexer: Hemangini Bari

Presentation Designer: Pranit Padwal

Developer Relations Marketing Executive: Anamika Singh

First published: November 2022

Second edition: October 2024

Production reference: 2300625

Published by Packt Publishing Ltd.

Grosvenor House

11 St Paul’s Square

Birmingham

B3 1RB, UK.

ISBN 978-1-83588-318-1

www.packt.com

Foreword

Forecasting as a discipline has evolved significantly. For decades, the field was dominated by simple models that often outperformed more complex ones. Machine learning methods, in various competitions, were repeatedly shown to be uncompetitive or, at best, to add little value. This period, during which I began my work in forecasting as a PhD student, has been termed the forecasting winter by some.

Since then, much has changed, and forecasting now inhabits a very different world. With developments like the global modeling paradigm and the availability of more data, at higher frequencies, machine learning methods have become highly competitive in many forecasting situations, and forecasting research is now driven by these approaches. Similarly, on the practitioner side, forecasting is often carried out by data scientists with a machine learning background but limited specialized training in forecasting. Their preferred programming tool is usually Python.

Manu’s book is the first and most comprehensive resource I am aware of that reflects this profound shift. It brings together many concepts and ideas into a coherent, accessible format that data scientists would otherwise struggle to acquire on their own. It should be the go-to resource for any data science practitioner working in forecasting.

This second edition acknowledges the rapid developments in the field. It offers an update using the most established software frameworks, such as the NIXTLAverse, introduces some of the latest models, and also covers methods for probabilistic forecasting, most notably conformal prediction.

Manu and his co-author for this edition, Jeffrey, both have extensive practical experience and clearly know their subject. Even I learned a few new things while reading the book.

Christoph Bergmeir

María Zambrano (Senior) Fellow

Department of Computer Science and AI

University of Granada

Spain

Contributors

About the authors

Manu Joseph is a self-made data scientist with 15 years of experience working with many Fortune 500 companies enabling digital and AI transformations, specifically in machine-learning-based Demand Forecasting. He is considered an expert, thought leader, and strong voice in the world of time series forecasting. Currently, Manu is a Staff Data Scientist at Walmart and the developer of an open-source library—PyTorch Tabular. Originally from Trivandrum, India, Manu currently resides in Bangalore, India, with his wife and son.

I extend my heartfelt gratitude to my lovely wife for her unwavering support through the years, and to Siddharth Roy for taking a chance on a newbie to the world of data science and enabling me to climb the ladder of knowledge. Even if I have not named you, thanks to everyone who has supported me and helped me on my journey.

Jeff Tackes is a seasoned data scientist specializing in Demand Forecasting with over a decade of industry experience. Currently, he is at Kraft Heinz, where he leads the research team in charge of demand forecasting. He has pioneered the development of best-in-class forecasting systems utilized by leading Fortune 500 companies. Jeff’s approach combines a robust data-driven methodology with innovative strategies, enhancing forecasting models and business outcomes significantly. Leading cross-functional teams, Jeff has designed and implemented Demand Forecasting systems that have markedly improved forecast accuracy, inventory optimization, and customer satisfaction. Jeff actively contributes to open-source communities, notably to PyTimeTK, where he develops tools that enhance time series analysis capabilities. He currently resides in Chicago, IL with his wife and son.

I would like to express my deepest gratitude to my wife, whose unwavering support, patience, and encouragement have been my constant source of strength throughout this journey. Without her belief in me, this work would not have been possible. I also extend my heartfelt thanks to the many mentors, colleagues, and friends who have shared their knowledge and provided invaluable opportunities that have shaped my path. To everyone who has contributed to my growth, both personal and professional, I am deeply grateful. This achievement is as much yours as it is mine.

About the reviewers

Greg Rafferty is the author of Forecasting Time Series Data with Prophet. He is a data scientist in the San Francisco Bay Area with over a decade of experience working with many of the top firms in tech, including Google, Meta, and IBM. Greg has also worked as a consultant for companies in the retail sector, such as The Gap and Albertsons/Safeway, taught business analytics on Coursera, and led face-to-face workshops with industry professionals in data science and analytics.

Raghurami Etukuru, PhD, is the originator of the concept of Complexity-Conscious Prediction, a novel approach that quantifies the inherent complexity of input data and designs AI models tailored to that complexity.

He is an AI scientist with over 25 years of industry experience, excelling in data science and AI, with a track record of impactful AI research and model development. He is the author of several books, including AI-Driven Time Series Forecasting: Complexity-Conscious Prediction and Decision-Making.

Join our community on Discord

Join our community’s Discord space for discussions with authors and other readers:

https://packt.link/mts

Contents

Preface

Who this book is for

What this book covers

To get the most out of this book

Setting up an environment

Using Anaconda/Miniconda/Mamba

Using Python virtual environments and pip

What to do when environment creation throws an error?

Download the data

Get in touch

Making the Most Out of This Book – Get to Know Your Free Benefits

Part 1: Getting Familiar with Time Series

Introducing Time Series

Technical requirements

What is a time series?

Types of time series

Main areas of application for time series analysis

Data-generating process (DGP)

Generating synthetic time series

White and red noise

Cyclical or seasonal signals

Autoregressive signals

Mix and match

Stationary and non-stationary time series

Change in mean over time

Change in variance over time

What can we forecast?

Forecasting terminology

Summary

Further reading

Acquiring and Processing Time Series Data

Technical requirements

Understanding the time series dataset

Preparing a data model

pandas datetime operations, indexing, and slicing—a refresher

Converting the date columns into pd.Timestamp/DatetimeIndex

Using the .dt accessor and datetime properties

Indexing and slicing

Creating date sequences and managing date offsets

Handling missing data

Converting the half-hourly block-level data (hhblock) into time series data

Compact, expanded, and wide forms of data

Enforcing regular intervals in time series

Converting the London Smart Meters dataset into a time series format

Expanded form

Compact form

Mapping additional information

Saving and loading files to disk

Handling longer periods of missing data

Imputing with the previous day

Hourly average profile

The hourly average for each weekday

Seasonal interpolation

Summary

Analyzing and Visualizing Time Series Data

Technical requirements

Components of a time series

The trend component

The seasonal component

The cyclical component

The irregular component

Visualizing time series data

Line charts

Seasonal plots

Seasonal box plots

Calendar heatmaps

Autocorrelation plot

Decomposing a time series

Detrending

Moving averages

LOESS

Deseasonalizing

Period adjusted averages

Fourier series

Implementations

seasonal_decompose from statsmodels

Seasonality and trend decomposition using LOESS (STL)

Fourier decomposition

Multiple seasonality decomposition using LOESS (MSTL)

Detecting and treating outliers

Standard deviation

IQR

Isolation Forest

Extreme studentized deviate (ESD) and seasonal ESD (S-ESD)

Treating outliers

Summary

References

Further reading

Setting a Strong Baseline Forecast

Technical requirements

Setting up a test harness

Creating holdout (test) and validation datasets

Choosing an evaluation metric

Generating strong baseline forecasts

Naïve forecast

Moving average forecast

Seasonal naïve forecast

Exponential smoothing

AutoRegressive Integrated Moving Average (ARIMA)

Theta forecast

TBATS

Box-Cox transformation

Exponentially smoothed trend

Seasonal decomposition using Fourier series (trigonometric seasonality)

ARMA

Parameter optimization

Multiple Seasonal-Trend decomposition using LOESS (MSTL)

Evaluating the baseline forecasts

Assessing the forecastability of a time series

Coefficient of variation

Residual variability

Entropy-based measures

Spectral entropy

Kaboudan metric

Summary

References

Further reading

Part 2: Machine Learning for Time Series

Time Series Forecasting as Regression

Understanding the basics of machine learning

Supervised machine learning tasks

Overfitting and underfitting

Hyperparameters and validation sets

Time series forecasting as regression

Time delay embedding

Temporal embedding

Global forecasting models—a paradigm shift

Summary

References

Further reading

Feature Engineering for Time Series Forecasting

Technical requirements

Understanding feature engineering

Avoiding data leakage

Setting a forecast horizon

Time delay embedding

Lags or backshift

Rolling window aggregations

Seasonal rolling window aggregations

Exponentially weighted moving average (EWMA)

Temporal embedding

Calendar features

Time elapsed

Fourier terms

Summary

Target Transformations for Time Series Forecasting

Technical requirements

Detecting non-stationarity in time series

Detecting and correcting for unit roots

Unit roots

The Augmented Dickey-Fuller (ADF) test

Differencing transform

Detecting and correcting for trends

Deterministic and stochastic trends

Kendall’s Tau

Mann-Kendall test (M-K test)

Detrending transform

Detecting and correcting for seasonality

Detecting seasonality

Deseasonalizing transform

Detecting and correcting for heteroscedasticity

Detecting heteroscedasticity

Log transform

Box-Cox transformation

AutoML approach to target transformation

Summary

References

Further reading

Forecasting Time Series with Machine Learning Models

Technical requirements

Training and predicting with machine learning models

Generating single-step forecast baselines

Standardized code to train and evaluate machine learning models

FeatureConfig

MissingValueConfig

ModelConfig

MLForecast

The fit function

The predict function

The feature_importance function

Helper functions for evaluating models

Linear regression

Regularized linear regression

Regularization–a geometric perspective

Decision trees

Random forest

Gradient boosting decision trees

Training and predicting for multiple households

Using AutoStationaryTransformer

Summary

References

Further reading

Ensembling and Stacking

Technical requirements

Combining forecasts

Best fit

Measures of central tendency

Simple hill climbing

Stochastic hill climbing

Simulated annealing

Optimal weighted ensemble

Stacking and blending

Summary

References

Further reading

Global Forecasting Models

Technical requirements

Why Global Forecasting Models?

Sample size

Cross-learning

Multi-task learning

Engineering complexity

Creating GFMs

Strategies to improve GFMs

Increasing memory

Adding more lag features

Adding rolling features

Adding EWMA features

Using time series meta-features

Ordinal encoding and one-hot encoding

Frequency encoding

Target mean encoding

LightGBM’s native handling of categorical features

Tuning hyperparameters

Grid search

Random search

Bayesian optimization

Partitioning

Random partition

Judgmental partitioning

Algorithmic partitioning

Interpretability

Summary

References

Further reading

Part 3: Deep Learning for Time Series

Introduction to Deep Learning

Technical requirements

What is deep learning and why now?

Why now?

Increase in compute availability

Increase in data availability

What is deep learning?

Perceptron, the first neural network

Components of a deep learning system

Representation learning

Linear transformation

Activation functions

Sigmoid

Hyperbolic tangent (tanh)

Rectified linear units and variants

Output activation functions

Softmax

Loss function

Forward and backward propagation

Gradient descent

Summary

References

Further reading

Building Blocks of Deep Learning for Time Series

Technical requirements

Understanding the encoder-decoder paradigm

Feed-forward networks

Recurrent neural networks

RNN architecture

RNN in PyTorch

Long short-term memory (LSTM) networks

LSTM architecture

LSTM in PyTorch

Gated recurrent unit (GRU)

GRU architecture

GRU in PyTorch

Convolution networks

Convolution

Padding, stride, and dilations

Convolution in PyTorch

Summary

References

Further reading

Common Modeling Patterns for Time Series

Technical requirements

Tabular regression

Single-step-ahead recurrent neural networks

Sequence-to-sequence (Seq2Seq) models

RNN-to-fully connected network

RNN-to-RNN

Summary

Reference

Further reading

Attention and Transformers for Time Series

Technical requirements

What is attention?

The generalized attention model

Alignment functions

Dot product

Scaled dot product attention

General attention

Additive/concat attention

The distribution function

Forecasting with sequence-to-sequence models and attention

Transformers—Attention is all you need

Attention is all you need

Self-attention

Multi-headed attention

Positional encoding

Position-wise feed-forward layer

Encoder

Decoder

Transformers in time series

Forecasting with Transformers

Summary

References

Further reading

Strategies for Global Deep Learning Forecasting Models

Technical requirements

Creating global deep learning forecasting models

Preprocessing the data

Understanding TimeSeriesDataset from PyTorch Forecasting

Initializing TimeSeriesDataset

Creating the dataloader

Visualizing how the dataloader works

Building the first global deep learning forecasting model

Defining our first RNN model

Initializing the RNN model

Training the RNN model

Forecasting with the trained model

Using time-varying information

Using static/meta information

One-hot encoding and why it is not ideal

Embedding vectors and dense representations

Defining a model with categorical features

Using the scale of the time series

Balancing the sampling procedure

Visualizing the data distribution

Tweaking the sampling procedure

Using and visualizing the dataloader with WeightedRandomSampler

Summary

Further reading

Specialized Deep Learning Architectures for Forecasting

Technical requirements

The need for specialized architectures

Introduction to NeuralForecast

Common parameters and configurations

“Auto” models

Exogenous features

Neural Basis Expansion Analysis for Interpretable Time Series Forecasting (N-BEATS)

The architecture of N-BEATS

Blocks

Stacks

The overall architecture

Basis functions and interpretability

Forecasting with N-BEATS

Interpreting N-BEATS forecasting

Neural Basis Expansion Analysis for Interpretable Time Series Forecasting with Exogenous Variables (N-BEATSx)

Handling exogenous variables

Exogenous blocks

Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS)

The Architecture of N-HiTS

Multi-rate data sampling

Hierarchical interpolation

Synchronizing the input sampling and output interpolation

Forecasting with N-HiTS

Autoformer

The architecture of the Autoformer model

Uniform Input Representation

Generative-style decoder

Decomposition architecture

Auto-correlation mechanism

Forecasting with Autoformer

LTSF-Linear family of models

Linear

D-Linear

N-Linear

Forecasting with the LTSF-Linear family

Patch Time Series Transformer (PatchTST)

The architecture of the PatchTST model

Patching

Channel independence

Forecasting with PatchTST

iTransformer

The architecture of iTransformer

Forecasting with iTransformer

Temporal Fusion Transformer (TFT)

The architecture of TFT

Locality Enhancement Seq2Seq layer

Temporal fusion decoder

Gated residual networks

Variable selection networks

Forecasting with TFT

Interpreting TFT

TSMixer

The architecture of the TSMixer model

Mixer Layer

Temporal Projection Layer

TSMixerx—TSMixer with auxiliary information

Forecasting with TSMixer and TSMixerx

Time Series Dense Encoder (TiDE)

The architecture of the TiDE model

Residual block

Encoder

Decoder

Forecasting with TiDE

Summary

References

Further reading

Probabilistic Forecasting and More

Probabilistic forecasting

Types of Predictive Uncertainty

What are probabilistic forecasts and Prediction Intervals?

Confidence levels, error rates, and quantiles

Measuring the goodness of prediction intervals

Probability Density Function (PDF)

Forecasting with PDF—machine learning models

Forecasting with PDF—deep learning models

Quantile function

Forecasting with quantile loss (machine learning)

Forecasting with quantile loss (deep learning)

Monte Carlo Dropout

Creating a custom model in neuralforecast

Forecasting with MC Dropout (neuralforecast)

Conformal Prediction

Conformal Prediction for classification

Conformal Prediction for regression

Conformalized Quantile Regression

Conformalizing uncertainty estimates

Exchangeability in Conformal Prediction and time series forecasting

Forecasting with Conformal Prediction

Road less traveled in time series forecasting

Intermittent or sparse demand forecasting

Interpretability

Cold-start forecasting

Hierarchical forecasting

Summary

References

Further reading

Part 4: Mechanics of Forecasting

Multi-Step Forecasting

Why multi-step forecasting?

Standard notation

Recursive strategy

Training regime

Forecasting regime

Direct strategy

Training regime

Forecasting regime

The Joint strategy

Training regime

Forecasting regime

Hybrid strategies

DirRec strategy

Training regime

Forecasting regime

Iterative block-wise direct strategy

Training regime

Forecasting regime

Rectify strategy

Training regime

Forecasting regime

RecJoint

Training regime

Forecasting regime

How to choose a multi-step forecasting strategy

Summary

References

Evaluating Forecast Errors—A Survey of Forecast Metrics

Technical requirements

Taxonomy of forecast error measures

Intrinsic metrics

Absolute error

Squared error

Percent error

Symmetric error

Other intrinsic metrics

Extrinsic metrics

Relative error

Scaled error

Other extrinsic metrics

Investigating the error measures

Loss curves and complementarity

Absolute error

Squared error

Percent error

Symmetric error

Extrinsic errors

Bias toward over- or under-forecasting

Experimental study of the error measures

Using Spearman’s rank correlation

Guidelines for choosing a metric

Summary

References

Further reading

Evaluating Forecasts—Validation Strategies

Technical requirements

Model validation

Holdout strategies

Window strategy

Calibration strategy

Sampling strategy

Cross-validation strategies

Choosing a validation strategy

Validation strategies for datasets with multiple time series

Summary

References

Further reading

Other Books You May Enjoy

Index


Share your thoughts

Once you’ve read Modern Time Series Forecasting with Python, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Making the Most Out of This Book – Get to Know Your Free Benefits

Unlock exclusive free benefits that come with your purchase, thoughtfully crafted to supercharge your learning journey and help you learn without limits.

https://www.packtpub.com/unlock/9781835883181

Note: Have your purchase invoice ready before you begin.

Figure 1.1: Next-Gen Reader, AI Assistant (Beta), and Free PDF access

Enhanced reading experience with our Next-gen Reader:

Multi-device progress sync: Learn from any device with seamless progress sync.

Highlighting and Notetaking: Turn your reading into lasting knowledge.

Bookmarking: Revisit your most important learnings anytime.

Dark mode: Focus with minimal eye strain by switching to dark or sepia modes.

Learn smarter using our AI assistant (Beta):

Summarize it: Summarize key sections or an entire chapter.

AI code explainers: In Packt Reader, click the “Explain” button above each code block for AI-powered code explanations.

Note: AI Assistant is part of next-gen Packt Reader and is still in beta.

Learn anytime, anywhere:

Access your content offline with DRM-free PDF and ePub versions—compatible with your favorite e-readers.

Unlock Your Book’s Exclusive Benefits

Your copy of this book comes with the following exclusive benefits:

Next-gen Packt Reader

AI assistant (beta)

DRM-free PDF/ePub downloads

Use the following guide to unlock them if you haven’t already. The process takes just a few minutes and needs to be done only once.

How to unlock these benefits in three easy steps

Step 1

Have your purchase invoice for this book ready, as you’ll need it in Step 3. If you received a physical invoice, scan it on your phone and have it ready as either a PDF, JPG, or PNG.

For more help on finding your invoice, visit https://www.packtpub.com/unlock-benefits/help.

Note: Bought this book directly from Packt? You don’t need an invoice. After completing Step 2, you can jump straight to your exclusive content.

Step 2

Scan the following QR code or visit https://www.packtpub.com/unlock/9781835883181:

Step 3

Sign in to your Packt account or create a new one for free. Once you’re logged in, upload your invoice. It can be in PDF, PNG, or JPG format and must be no larger than 10 MB. Follow the rest of the instructions on the screen to complete the process.

Need help?

If you get stuck and need help, visit https://www.packtpub.com/unlock-benefits/help for a detailed FAQ on how to find your invoices and more. The following QR code will take you to the help page directly:

Note: If you are still facing issues, reach out to [email protected].

Part 1

Getting Familiar with Time Series

We dip our toes into time series forecasting by understanding what a time series is, how to process and manipulate time series data, and how to analyze and visualize time series data. This part also covers classical time series forecasting methods, such as ARIMA, to serve as strong baselines.

This part comprises the following chapters:

Chapter 1, Introducing Time SeriesChapter 2, Acquiring and Processing Time Series DataChapter 3, Analyzing and Visualizing Time Series DataChapter 4, Setting a Strong Baseline Forecast