Mathematics of Machine Learning

Tivadar Danka
Description

Mathematics of Machine Learning provides a rigorous yet accessible introduction to the mathematical underpinnings of machine learning, designed for engineers, developers, and data scientists ready to elevate their technical expertise. With this book, you’ll explore the core disciplines of linear algebra, calculus, and probability theory essential for mastering advanced machine learning concepts.
PhD mathematician turned ML engineer Tivadar Danka—known for his intuitive teaching style that has attracted 100k+ followers—guides you through complex concepts with clarity, providing the structured guidance you need to deepen your theoretical knowledge and enhance your ability to solve complex machine learning problems. Balancing theory with application, this book offers clear explanations of mathematical constructs and their direct relevance to machine learning tasks. Through practical Python examples, you’ll learn to implement and use these ideas in real-world scenarios, such as training machine learning models with gradient descent or working with vectors, matrices, and tensors.
By the end of this book, you’ll have gained the confidence to engage with advanced machine learning literature and tailor algorithms to meet specific project requirements.

You can read this e-book in Legimi apps or any other app that supports the following format:

EPUB

Page count: 817

Publication year: 2025




Mathematics of Machine Learning

Master linear algebra, calculus, and probability for machine learning

Tivadar Danka

Mathematics of Machine Learning

Copyright © 2025 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Portfolio Director: Sunith Shetty

Relationship Lead: Tushar Gupta

Project Manager: Amit Ramadas

Content Engineer: Deepayan Bhattacharjee

Technical Editor: Kushal Sharma

Copy Editor: Safis Editing

Indexer: Hemangini Bari

Proofreader: Deepayan Bhattacharjee

Production Designer: Ganesh Bhadwalkar

Growth Leads: Merlyn M Shelley & Bhavesh Amin

Marketing Owner: Ankur Mulasi

First published: May 2025

Production reference: 1210525

Published by Packt Publishing Ltd., Grosvenor House, 11 St Paul’s Square, Birmingham, B3 1RB, UK.

ISBN 978-1-83702-787-3

www.packtpub.com

This book is dedicated to my mother, whom I lost while making this book.

Thanks, Mom! You are inside every line I write.

– Tivadar Danka

Foreword

I met Tivadar during Covid. We were all stuck at home, unsure what to do with all the extra time, so we started talking about building something together.

I wanted to teach people Machine Learning. I had this idea about building a website that would ask random questions for people to answer. I wanted the site to do a hundred different things, but one thing was non-negotiable: I wanted people to leave feeling they had learned something different.

Tivadar was the answer to that.

Machine Learning is tough, and unfortunately, most educational content you find online suffers from chronic handwaving syndrome: overused buzzwords, skipped intuition, and more confusion than when you started.

At the time, Tivadar was already writing online about math. He wasn’t the only one, but he was different. He was taking seemingly mundane topics and telling stories around them that were surprisingly effective.

There wasn’t any handwaving or burying people under a mountain of theoretical ideas. The writing was different, sharp, and fresh.

I had never been excited about math before. I read every single one of Tivadar’s posts. I wasn’t just learning the rules; I was learning how to think. And, shockingly, I was entertained.

I had never seen that combination before.

I asked Tivadar to help me with the site, and he did – for a while – until he decided to move on to start writing this book. I remember telling him I understood, but I was secretly sad – really sad.

Today, I’m thrilled this happened the way it did.

Mathematics of Machine Learning is the inevitable consequence of those short posts that excited me about math for the first time. It’s not just the best book I’ve read on the subject; it’s the one I wish had existed when I started.

This book does something rare: it teaches you the math behind machine learning without boring you with vague concepts—or making you forget why you showed up in the first place.

The book is laser-focused on what you need and says nothing about what you don’t. The explanations are vintage Tivadar: sharp, detailed, and entertaining. You can’t just read or memorize them; you’ll understand them.

I’ve been reading this book since it was an idea and a bunch of notes and sketches. I’ve watched it grow from online posts to something polished and powerful. And I’ve learned a lot – not just about math, but about how to explain math.

I’ll leave you to it. You’re in for a treat. Enjoy the journey – I know I did.

Santiago Valdarrama, Founder of ml.school

Contributors

About the author

Tivadar Danka is an independent thinker, who believes that the truth value of any proposition is independent of the titles, awards, qualifications, and affiliations of the one asserting it. If you are looking for confirmation that you made a good purchase with this book, start at Chapter 1.

Yes, that’s really a reference to the first chapter in the author bio; that’s where the important part begins.

About the reviewers

Matthew Kehoe earned a PhD in Computational Mathematics from the University of Illinois at Chicago, where he specialized in numerical partial differential equations and inverse electromagnetic scattering theory. Following two graduate internships with the National Science Foundation and several years in software development and technical consulting, he now serves as a senior researcher leading projects in radar signal processing, scientific machine learning, and natural language processing. In addition to his applied research, he maintains a strong interest in analytic number theory and has begun writing a book on computing zeros of the Riemann zeta function.

Shravan Patankar, PhD, is a researcher and software engineer with deep expertise in artificial intelligence, machine learning, and data science. He earned his PhD in mathematics from the University of Illinois at Chicago, where his research led to three peer-reviewed publications, including two in top mathematical journals and one in Scientific Reports (Nature) on COVID-19 death estimates. With a strong foundation in mathematical reasoning and statistical modeling, Shravan has applied these principles across both academic and industry contexts.

Professionally, he is a Software Engineer in AI/ML at KPIT Technologies, where he works on software-defined vehicles and contributes to the safety of autonomous driving systems. Shravan has also served as an instructor and teaching assistant at UIC, teaching subjects from introductory calculus and statistics to applied Python programming. His collaborative and interdisciplinary work has been showcased at national conferences and international seminars.

I would like to thank my mentors and peers for shaping my journey. I am especially grateful to my family and friends for their unwavering encouragement and support during the review process.

Join our community on Discord

Read this book alongside other users, deep learning experts, and the author himself. Ask questions, provide solutions to other readers, chat with the author via Ask Me Anything sessions, and much more. Scan the QR code or visit the link to join the community: https://packt.link/math

Contents

Introduction

What is this book about?

How to read this book

Conventions used

What this book covers

To get the most out of this book

Part 1: Linear Algebra

1 Vectors and Vector Spaces

1.1 What is a vector space?

1.1.1 Examples of vector spaces

1.2 The basis

1.2.1 Linear combinations and independence

1.2.2 Spans of vector sets

1.2.3 Bases, the minimal generating sets

1.2.4 Finite dimensional vector spaces

1.2.5 Why are bases so important?

1.2.6 The existence of bases

1.2.7 Subspaces

1.3 Vectors in practice

1.3.1 Tuples

1.3.2 Lists

1.3.3 NumPy arrays

1.3.4 NumPy arrays as vectors

1.3.5 Is NumPy really faster than Python?

1.4 Summary

1.5 Problems

2 The Geometric Structure of Vector Spaces

2.1 Norms and distances

2.1.1 Defining distances from norms

2.2 Inner products, angles, and lots of reasons to care about them

2.2.1 The generated norm

2.2.2 Orthogonality

2.2.3 The geometric interpretation of inner products

2.2.4 Orthogonal and orthonormal bases

2.2.5 The Gram-Schmidt orthogonalization process

2.2.6 The orthogonal complement

2.3 Summary

2.4 Problems

3 Linear Algebra in Practice

3.1 Vectors in NumPy

3.1.1 Norms, distances, and dot products

3.1.2 The Gram-Schmidt orthogonalization process

3.2 Matrices, the workhorses of linear algebra

3.2.1 Manipulating matrices

3.2.2 Matrices as arrays

3.2.3 Matrices in NumPy

3.2.4 Matrix multiplication, revisited

3.2.5 Matrices and data

3.3 Summary

3.4 Problems

4 Linear Transformations

4.1 What is a linear transformation?

4.1.1 Linear transformations and matrices

4.1.2 Matrix operations revisited

4.1.3 Inverting linear transformations

4.1.4 The kernel and the image

4.2 Change of basis

4.2.1 The transformation matrix

4.3 Linear transformations in the Euclidean plane

4.3.1 Stretching

4.3.2 Rotations

4.3.3 Shearing

4.3.4 Reflection

4.3.5 Orthogonal projection

4.4 Determinants, or how linear transformations affect volume

4.4.1 How linear transformations scale the area

4.4.2 The multi-linearity of determinants

4.4.3 Fundamental properties of the determinants

4.5 Summary

4.6 Problems

5 Matrices and Equations

5.1 Linear equations

5.1.1 Gaussian elimination

5.1.2 Gaussian elimination by hand

5.1.3 When can we perform Gaussian elimination?

5.1.4 The time complexity of Gaussian elimination

5.1.5 When can a system of linear equations be solved?

5.1.6 Inverting matrices

5.2 The LU decomposition

5.2.1 Implementing the LU decomposition

5.2.2 Inverting a matrix, for real

5.2.3 How to actually invert matrices

5.3 Determinants in practice

5.3.1 The lesser of two evils

5.3.2 The recursive way

5.3.3 How to actually compute determinants

5.4 Summary

5.5 Problems

6 Eigenvalues and Eigenvectors

6.1 Eigenvalues of matrices

6.2 Finding eigenvalue-eigenvector pairs

6.2.1 The characteristic polynomial

6.2.2 Finding eigenvectors

6.3 Eigenvectors, eigenspaces, and their bases

6.4 Summary

6.5 Problems

7 Matrix Factorizations

7.1 Special transformations

7.1.1 The adjoint transformation

7.1.2 Orthogonal transformations

7.2 Self-adjoint transformations and the spectral decomposition theorem

7.3 The singular value decomposition

7.4 Orthogonal projections

7.4.1 Properties of orthogonal projections

7.4.2 Orthogonal projections are the optimal projections

7.5 Computing eigenvalues

7.5.1 Power iteration for calculating the eigenvectors of real symmetric matrices

7.5.2 Power iteration in practice

7.5.3 Power iteration for the rest of the eigenvectors

7.6 The QR algorithm

7.6.1 The QR decomposition

7.6.2 Iterating the QR decomposition

7.7 Summary

7.8 Problems

8 Matrices and Graphs

8.1 The directed graph of a nonnegative matrix

8.2 Benefits of the graph representation

8.2.1 The connectivity of graphs

8.3 The Frobenius normal form

8.3.1 Permutation matrices

8.3.2 Directed graphs and their strongly connected components

8.3.3 Putting graphs and permutation matrices together

8.4 Summary

8.5 Problems

References

Part 2: Calculus

9 Functions

9.1 Functions in theory

9.1.1 The mathematical definition of a function

9.1.2 Domain and image

9.1.3 Operations with functions

9.1.4 Mental models of functions

9.2 Functions in practice

9.2.1 Operations on functions

9.2.2 Functions as callable objects

9.2.3 Function base class

9.2.4 Composition in the object-oriented way

9.3 Summary

9.4 Problems

10 Numbers, Sequences, and Series

10.1 Numbers

10.1.1 Natural numbers and integers

10.1.2 Rational numbers

10.1.3 Real numbers

10.2 Sequences

10.2.1 Convergence

10.2.2 Properties of convergence

10.2.3 Famous convergent sequences

10.2.4 The role of convergence in machine learning

10.2.5 Divergent sequences

10.2.6 The big and small O notation

10.2.7 Real numbers are sequences

10.3 Series

10.3.1 Convergent and divergent series

10.3.2 Properties of series

10.3.3 Conditional and absolute convergence

10.3.4 Revisiting rearrangements

10.3.5 Convergence tests for series

10.3.6 The Cauchy product of series

10.4 Summary

10.5 Problems

11 Topology, Limits, and Continuity

11.1 Topology

11.1.1 Open and closed sets

11.1.2 Distance and topology

11.1.3 Sets and sequences

11.1.4 Bounded sets

11.1.5 Compact sets

11.2 Limits

11.2.1 Equivalent definitions of limits

11.3 Continuity

11.3.1 Properties of continuous functions

11.4 Summary

11.5 Problems

12 Differentiation

12.1 Differentiation in theory

12.1.1 Equivalent forms of differentiation

12.1.2 Differentiation and continuity

12.2 Differentiation in practice

12.2.1 Rules of differentiation

12.2.2 Derivatives of elementary functions

12.2.3 Higher-order derivatives

12.2.4 Extending the Function base class

12.2.5 The derivative of compositions

12.2.6 Numerical differentiation

12.3 Summary

12.4 Problems

13 Optimization

13.1 Minima, maxima, and derivatives

13.1.1 Local minima and maxima

13.1.2 Characterization of optima with higher order derivatives

13.1.3 Mean value theorems

13.2 The basics of gradient descent

13.2.1 Derivatives, revisited

13.2.2 The gradient descent algorithm

13.2.3 Implementing gradient descent

13.2.4 Drawbacks and caveats

13.3 Why does gradient descent work?

13.3.1 Differential equations 101

13.3.2 The (slightly more) general form of ODEs

13.3.3 A geometric interpretation of differential equations

13.3.4 A continuous version of gradient ascent

13.3.5 Gradient ascent as a discretized differential equation

13.3.6 Gradient ascent in action

13.4 Summary

13.5 Problems

14 Integration

14.1 Integration in theory

14.1.1 Partitions and their refinements

14.1.2 The Riemann integral

14.1.3 Integration as the inverse of differentiation

14.2 Integration in practice

14.2.1 Integrals and operations

14.2.2 Integration by parts

14.2.3 Integration by substitution

14.2.4 Numerical integration

14.2.5 Implementing the trapezoidal rule

14.3 Summary

14.4 Problems

Join our community on Discord

References

Part 3: Multivariable Calculus

15 Multivariable Functions

15.1 What is a multivariable function?

15.2 Linear functions in multiple variables

15.3 The curse of dimensionality

15.4 Summary

16 Derivatives and Gradients

16.1 Partial and total derivatives

16.1.1 The gradient

16.1.2 Higher order partial derivatives

16.1.3 The total derivative

16.1.4 Directional derivatives

16.1.5 Properties of the gradient

16.2 Derivatives of vector-valued functions

16.2.1 The derivatives of curves

16.2.2 The Jacobian and Hessian matrices

16.2.3 The total derivative for vector-vector functions

16.2.4 Derivatives and function operations

16.3 Summary

16.4 Problems

17 Optimization in Multiple Variables

17.1 Multivariable functions in code

17.2 Minima and maxima, revisited

17.3 Gradient descent in its full form

17.4 Summary

17.5 Problems

References

Part 4: Probability Theory

18 What is Probability?

18.1 The language of thinking

18.1.1 Thinking in absolutes

18.1.2 Thinking in probabilities

18.2 The axioms of probability

18.2.1 Event spaces and σ-algebras

18.2.2 Describing σ-algebras

18.2.3 σ-algebras over real numbers

18.2.4 Probability measures

18.2.5 Fundamental properties of probability

18.2.6 Probability spaces on ℝⁿ

18.2.7 How to interpret probability

18.3 Conditional probability

18.3.1 Independence

18.3.2 The law of total probability revisited

18.3.3 The Bayes theorem

18.3.4 The Bayesian interpretation of probability

18.3.5 The probabilistic inference process

18.3.6 The Monty Hall paradox

18.4 Summary

18.5 Problems

19 Random Variables and Distributions

19.1 Random variables

19.1.1 Discrete random variables

19.1.2 Real-valued random variables

19.1.3 Random variables in general

19.1.4 Behind the definition of random variables

19.1.5 Independence of random variables

19.2 Discrete distributions

19.2.1 The Bernoulli distribution

19.2.2 The binomial distribution

19.2.3 The geometric distribution

19.2.4 The uniform distribution

19.2.5 The single-point distribution

19.2.6 Law of total probability, revisited once more

19.2.7 Sums of discrete random variables

19.3 Real-valued distributions

19.3.1 The cumulative distribution function

19.3.2 Properties of the distribution function

19.3.3 Cumulative distribution functions for discrete random variables

19.3.4 The uniform distribution

19.3.5 The exponential distribution

19.3.6 The normal distribution

19.4 Density functions

19.4.1 Density functions in practice

19.4.2 Classification of real-valued random variables

19.5 Summary

19.6 Problems

20 The Expected Value

20.1 Discrete random variables

20.1.1 The expected value in poker

20.2 Continuous random variables

20.3 Properties of the expected value

20.4 Variance

20.4.1 Covariance and correlation

20.5 The law of large numbers

20.5.1 Tossing coins…

20.5.2 …rolling dice…

20.5.3 …and all the rest

20.5.4 The weak law of large numbers

20.5.5 The strong law of large numbers

20.6 Information theory

20.6.1 Guess the number

20.6.2 Guess the number 2: Electric Boogaloo

20.6.3 Information and entropy

20.6.4 Differential entropy

20.7 The Maximum Likelihood Estimation

20.7.1 Probabilistic modeling 101

20.7.2 Modeling heights

20.7.3 The general method

20.7.4 The German tank problem

20.8 Summary

20.9 Problems

References

Part 5: Appendix

Appendix A It’s Just Logic

A.1 Mathematical logic 101

A.2 Logical connectives

A.3 The propositional calculus

A.4 Variables and predicates

A.5 Existential and universal quantification

A.6 Problems

Appendix B The Structure of Mathematics

B.1 What is a definition?

B.2 What is a theorem?

B.3 What is a proof?

B.4 Equivalences

B.5 Proof techniques

B.5.1 Proof by induction

B.5.2 Proof by contradiction

B.5.3 Contraposition

Appendix C Basics of Set Theory

C.1 What is a set?

C.2 Operations on sets

C.2.1 Union, intersection, difference

C.2.2 De Morgan’s laws

C.3 The Cartesian product

C.4 The cardinality of sets

C.5 The Russell paradox (optional)

Appendix D Complex Numbers

D.1 The definition of complex numbers

D.2 The geometric representation

D.3 The fundamental theorem of algebra

D.4 Why are complex numbers important?

Other Books You May Enjoy

Index


Part 1

Linear Algebra

This part comprises the following chapters:

Chapter 1, Vectors and Vector Spaces

Chapter 2, The Geometric Structure of Vector Spaces

Chapter 3, Linear Algebra in Practice

Chapter 4, Linear Transformations

Chapter 5, Matrices and Equations

Chapter 6, Eigenvalues and Eigenvectors

Chapter 7, Matrix Factorizations

Chapter 8, Matrices and Graphs