The ability to use algorithms to solve real-world problems is a must-have skill for any developer or programmer. This book will help you not only to develop the skills to select and use an algorithm to tackle problems in the real world but also to understand how it works.
You'll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, with the help of practical examples. As you advance, you'll learn about linear programming, page ranking, and graphs, and will then work with machine learning algorithms to understand the math and logic behind them.
Case studies will show you how to apply these algorithms optimally before you focus on deep learning algorithms and learn about different types of deep learning models along with their practical use.
You will also learn about modern sequential models and their variants, algorithms, methodologies, and architectures that are used to implement Large Language Models (LLMs) such as ChatGPT.
Finally, you'll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks.
By the end of this programming book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.
Page count: 692
Year of publication: 2023
50 Algorithms Every Programmer Should Know
Second Edition
Tackle computer science challenges with classic to modern algorithms in machine learning, software design, data systems, and cryptography
Imran Ahmad, PhD
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Senior Publishing Product Manager: Denim Pinto
Acquisition Editor – Peer Reviews: Tejas Mhasvekar
Project Editor: Rianna Rodrigues
Content Development Editors: Rebecca Robinson and Matthew Davies
Copy Editor: Safis Editing
Technical Editor: Karan Sonawane
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Presentation Designer: Rajesh Shirsath
Developer Relations Marketing Executive: Vipanshu Parashar
First published: June 2020
Second edition: September 2023
Production reference: 2191023
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80324-776-2
www.packt.com
In 2014, I enthusiastically embraced my new role as a data scientist, despite having a Ph.D. in economics. Some might see this as a stark shift, but to me, it was a natural progression. However, traditional views of economics might suggest that econometricians and data scientists are on separate tracks.
At the outset of my data science adventure, I waded through a sea of online materials. The sheer volume made pinpointing the right resources akin to finding a diamond in the rough. Too often, content lacked practical insights relevant to my position, causing occasional bouts of disillusionment.
One beacon of clarity in my journey was my senior colleague, Imran. His consistent guidance and mentorship were transformative. He pointed me to resources that elevated my understanding, always generously sharing his deep knowledge. He had a gift for making complex topics understandable.
Beyond his expertise as a data scientist, Imran stands out as a visionary, leader, and adept engineer. He thrives on identifying innovative solutions, especially when faced with adversity. Challenges seem to invigorate him. With natural leadership ability, he navigates intricate projects with ease. His remarkable contributions to AI and machine learning are commendable. What’s more, his talent for connecting with audiences, often laced with humor, sets him apart.
This expertise shines brightly in 50 Algorithms Every Programmer Should Know. The book goes beyond listing algorithms; it reflects Imran’s ability to make intricate subjects relatable. Real-life applications range from predicting the weather to building movie recommendation engines.
The book stands out for its holistic approach to algorithms—not just the methodology but the reasoning behind them. It’s a treasure trove for those who champion responsible AI, emphasizing the importance of data transparency and bias awareness.
50 Algorithms Every Programmer Should Know is a must-have in a data scientist’s arsenal. If you’re venturing into data science or aiming to enhance your skill set, this book is a solid stepping stone.
Somaieh Nikpoor, PhD
Lead – Data Science and AI, Government of Canada.
Adjunct Professor, Sprott School of Business, Carleton University
Imran Ahmad, PhD currently lends his expertise as a data scientist for the Advanced Analytics Solution Center (A2SC) within the Canadian Federal Government, where he harnesses machine learning algorithms for mission-critical applications.
In his 2010 doctoral thesis, he introduced a linear programming-based algorithm tailored for optimal resource assignment in expansive cloud computing landscapes. Later, in 2017, Dr. Ahmad pioneered the development of a real-time analytics framework, StreamSensing. This tool has become the cornerstone of several of his research papers, leveraging it to process multimedia data within various machine learning paradigms.
Outside of his governmental role, Dr. Ahmad holds a visiting professorship at Carleton University in Ottawa. Over the past several years, he has also been recognized as an authorized instructor for both Google Cloud and AWS.
I’m deeply grateful to my wife, Naheed, my son, Omar, and my daughter, Anum, for their unwavering support. A special nod to my parents, notably my father, Inayatuallah, for his relentless encouragement to continue learning. Further appreciation goes to Karan Sonawane, Rianna Rodrigues, and Denim from Packt for their invaluable contributions.
Aishwarya Srinivasan previously worked as a data scientist on the Google Cloud AI Services team where she worked to build machine learning solutions for customer use cases. She holds a post-graduate degree in data science from Columbia University and has over 450,000 followers on LinkedIn. She was spotlighted as a LinkedIn Top Voice for data science influencers (2020) and has been recognized as a Women in AI Trailblazer of the Year.
Tarek Ziadé is a programmer based in Burgundy, France. He has worked at several major software companies, including Mozilla and Elastic, where he has built web services and tools for developers. Tarek founded the French Python user group, Afpy, and has written several best-selling books about Python and web services.
I would like to thank my family: Freya, Suki, Milo, Amina, and Martine. They have always supported me.
Brian Spiering started his coding career in his elementary school computer lab, hacking BASIC to make programs that entertained his peers and annoyed authority figures. Much later, Brian earned a PhD in cognitive psychology from the University of California, Santa Barbara. Brian currently teaches programming and artificial intelligence.
To join the Discord community for this book, where you can share feedback, ask the author questions, and learn about new releases, use the link below:
https://packt.link/WHLel
Preface
Who this book is for
What this book covers
Get in touch
Section 1: Fundamentals and Core Algorithms
Overview of Algorithms
What is an algorithm?
The phases of an algorithm
Development environment
Python packages
The SciPy ecosystem
Using Jupyter Notebook
Algorithm design techniques
The data dimension
The compute dimension
Performance analysis
Space complexity analysis
Time complexity analysis
Estimating the performance
The best case
The worst case
The average case
Big O notation
Constant time (O(1)) complexity
Linear time (O(n)) complexity
Quadratic time (O(n²)) complexity
Logarithmic time (O(log n)) complexity
Selecting an algorithm
Validating an algorithm
Exact, approximate, and randomized algorithms
Explainability
Summary
Data Structures Used in Algorithms
Exploring Python built-in data types
Lists
Using lists
Modifying lists: append and pop operations
The range() function
The time complexity of lists
Tuples
The time complexity of tuples
Dictionaries and sets
Dictionaries
Sets
Time complexity analysis for sets
When to use a dictionary and when to use a set
Using Series and DataFrames
Series
DataFrame
Creating a subset of a DataFrame
Matrices
Matrix operations
Big O notation and matrices
Exploring abstract data types
Vector
Time complexity of vectors
Stacks
Time complexity of stack operations
Practical example
Queues
Time complexity analysis for queues
The basic idea behind the use of stacks and queues
Tree
Terminology
Types of trees
Practical examples
Summary
Sorting and Searching Algorithms
Introducing sorting algorithms
Swapping variables in Python
Bubble sort
Understanding the logic behind bubble sort
Optimizing bubble sort
Performance analysis of the bubble sort algorithm
Insertion sort
Performance analysis of the insertion sort algorithm
Merge sort
Shell sort
Performance analysis of the Shell sort algorithm
Selection sort
Performance analysis of the selection sort algorithm
Choosing a sorting algorithm
Introduction to searching algorithms
Linear search
Performance analysis of the linear search algorithm
Binary search
Performance analysis of the binary search algorithm
Interpolation search
Performance analysis of the interpolation search algorithm
Practical applications
Summary
Designing Algorithms
Introducing the basic concepts of designing an algorithm
Concern 1 – correctness: will the designed algorithm produce the result we expect?
Concern 2 – performance: is this the optimal way to get these results?
Characterizing the complexity of the problem
Exploring the relationship between P and NP
Introducing NP-complete and NP-hard
Concern 3 – scalability: how is the algorithm going to perform on larger datasets?
The elasticity of the cloud and algorithmic scalability
Understanding algorithmic strategies
Understanding the divide-and-conquer strategy
A practical example – divide-and-conquer applied to Apache Spark
Understanding the dynamic programming strategy
Components of dynamic programming
Conditions for using dynamic programming
Understanding greedy algorithms
Conditions for using greedy programming
A practical application – solving the TSP
Using a brute-force strategy
Using a greedy algorithm
Comparison of three strategies
Presenting the PageRank algorithm
Problem definition
Implementing the PageRank algorithm
Understanding linear programming
Formulating a linear programming problem
Defining the objective function
Specifying constraints
A practical application – capacity planning with linear programming
Summary
Graph Algorithms
Understanding graphs: a brief introduction
Graphs: the backbone of modern data networks
Real-world applications
The basics of a graph: vertices (or nodes)
Graph theory and network analysis
Representations of graphs
Graph mechanics and types
Ego-centered networks
Basics of egonets
One-hop, two-hop, and beyond
Applications of egonets
Introducing network analysis theory
Understanding the shortest path
Creating a neighborhood
Triangles
Density
Understanding centrality measures
Degree
Betweenness
Fairness and closeness
Eigenvector centrality
Calculating centrality metrics using Python
1. Setting the foundation: libraries and data
2. Crafting the graph
3. Painting a picture: visualizing the graph
Social network analysis
Understanding graph traversals
BFS
Constructing the adjacency list
BFS algorithm implementation
Using BFS for specific searches
DFS
Case study: fraud detection using SNA
Introduction
What is fraud in this context?
Conducting simple fraud analytics
Presenting the watchtower fraud analytics methodology
Scoring negative outcomes
Degree of suspicion
Summary
Section 2: Machine Learning Algorithms
Unsupervised Machine Learning Algorithms
Introducing unsupervised learning
Unsupervised learning in the data-mining lifecycle
Phase 1: Business understanding
Phase 2: Data understanding
Phase 3: Data preparation
Phase 4: Modeling
Phase 5: Evaluation
Phase 6: Deployment
Current research trends in unsupervised learning
Practical examples
Marketing segmentation using unsupervised learning
Understanding clustering algorithms
Quantifying similarities
Euclidean distance
Manhattan distance
Cosine distance
k-means clustering algorithm
The logic of k-means clustering
Initialization
The steps of the k-means algorithm
Stop condition
Coding the k-means algorithm
Limitation of k-means clustering
Hierarchical clustering
Steps of hierarchical clustering
Coding a hierarchical clustering algorithm
Understanding DBSCAN
Creating clusters using DBSCAN in Python
Evaluating the clusters
Application of clustering
Dimensionality reduction
Principal component analysis
Limitations of PCA
Association rules mining
Examples of use
Market basket analysis
Association rules mining
Types of rules
Trivial rules
Inexplicable rules
Actionable rules
Ranking rules
Support
Confidence
Lift
Algorithms for association analysis
Apriori algorithm
Limitations of the apriori algorithm
FP-growth algorithm
Populating the FP-tree
Mining frequent patterns
Code for using FP-growth
Summary
Traditional Supervised Learning Algorithms
Understanding supervised machine learning
Formulating supervised machine learning problems
Understanding enabling conditions
Differentiating between classifiers and regressors
Understanding classification algorithms
Presenting the classifiers challenge
The problem statement
Feature engineering using a data processing pipeline
Scaling the features
Evaluating the classifiers
Confusion matrices
Understanding recall and precision
Understanding the recall and precision trade-off
Understanding overfitting
Specifying the phases of classifiers
Decision tree classification algorithm
Understanding the decision tree classification algorithm
The strengths and weaknesses of decision tree classifiers
Use cases
Understanding the ensemble methods
Implementing gradient boosting with the XGBoost algorithm
Differentiating the Random Forest algorithm from ensemble boosting
Using the Random Forest algorithm for the classifiers challenge
Logistic regression
Assumptions
Establishing the relationship
The loss and cost functions
When to use logistic regression
Using the logistic regression algorithm for the classifiers challenge
The SVM algorithm
Using the SVM algorithm for the classifiers challenge
Understanding the Naive Bayes algorithm
Bayes’ theorem
Calculating probabilities
Multiplication rules for AND events
The general multiplication rule
Addition rules for OR events
Using the Naive Bayes algorithm for the classifiers challenge
For classification algorithms, the winner is...
Understanding regression algorithms
Presenting the regressors challenge
The problem statement of the regressors challenge
Exploring the historical dataset
Feature engineering using a data processing pipeline
Linear regression
Simple linear regression
Evaluating the regressors
Multiple regression
Using the linear regression algorithm for the regressors challenge
When is linear regression used?
The weaknesses of linear regression
The regression tree algorithm
Using the regression tree algorithm for the regressors challenge
The gradient boost regression algorithm
Using the gradient boost regression algorithm for the regressors challenge
For regression algorithms, the winner is...
Practical example – how to predict the weather
Summary
Neural Network Algorithms
The evolution of neural networks
Historical background
AI winter and the dawn of AI spring
Understanding neural networks
Understanding perceptrons
Understanding the intuition behind neural networks
Understanding layered deep learning architectures
Developing an intuition for hidden layers
How many hidden layers should be used?
Mathematical basis of neural network
Training a neural network
Understanding the anatomy of a neural network
Defining gradient descent
Activation functions
Step function
Sigmoid function
ReLU
Leaky ReLU
Hyperbolic tangent (tanh)
Softmax
Tools and frameworks
Keras
Backend engines of Keras
Low-level layers of the deep learning stack
Defining hyperparameters
Defining a Keras model
Choosing a sequential or functional model
Understanding TensorFlow
Presenting TensorFlow’s basic concepts
Understanding Tensor mathematics
Understanding the types of neural networks
Convolutional neural networks
Convolution
Pooling
Generative Adversarial Networks
Using transfer learning
Case study – using deep learning for fraud detection
Methodology
Summary
Algorithms for Natural Language Processing
Introducing NLP
Understanding NLP terminology
Text preprocessing in NLP
Tokenization
Cleaning data
Cleaning data using Python
Understanding the Term Document Matrix
Using TF-IDF
Summary and discussion of results
Introduction to word embedding
Implementing word embedding with Word2Vec
Interpreting similarity scores
Advantages and disadvantages of Word2Vec
Case study: Restaurant review sentiment analysis
Importing required libraries and loading the dataset
Building a clean corpus: Preprocessing text data
Converting text data into numerical features
Analyzing the results
Applications of NLP
Summary
Understanding Sequential Models
Understanding sequential data
Types of sequence models
One-to-many
Many-to-one
Many-to-many
Data representation for sequential models
Introducing RNNs
Understanding the architecture of RNNs
Understanding the memory cell and hidden state
Understanding the characteristics of the input variable
Training the RNN at the first timestep
The activation function in action
Training the RNN for a whole sequence
Calculating the output for each timestep
Backpropagation through time
Predicting with RNNs
Limitations of basic RNNs
Vanishing gradient problem
Inability to look ahead in the sequence
GRU
Introducing the update gate
Implementing the update gate
Updating the hidden cell
Running GRUs for multiple timesteps
Introducing LSTM
Introducing the forget gate
The candidate cell state
The update gate
Calculating memory state
The output gate
Putting everything together
Coding sequential models
Loading the dataset
Preparing the data
Creating the model
Training the model
Viewing some incorrect predictions
Summary
Advanced Sequential Modeling Algorithms
The evolution of advanced sequential modeling techniques
Exploring autoencoders
Coding an autoencoder
Setting up the environment
Data preparation
Model architecture
Compilation
Training
Prediction
Visualization
Understanding the Seq2Seq model
Encoder
Thought vector
Decoder or writer
Special tokens in Seq2Seq
The information bottleneck dilemma
Understanding the attention mechanism
What is attention in neural networks?
Basic idea
Example
Three key aspects of attention mechanisms
A deeper dive into attention mechanisms
The challenges of attention mechanisms
Delving into self-attention
Attention weights
Encoder: bidirectional RNNs
Thought vector
Decoder: regular RNNs
Training versus inference
Transformers: the evolution in neural networks after self-attention
Why transformers shine
A Python code breakdown
Understanding the output
LLMs
Understanding attention in LLMs
Exploring the powerhouses of NLP: GPT and BERT
2018’s LLM pioneers: GPT and BERT
Using deep and wide models to create powerful LLMs
Summary
Section 3: Advanced Topics
Recommendation Engines
Introducing recommendation systems
Types of recommendation engines
Content-based recommendation engines
Determining similarities in unstructured documents
Collaborative filtering recommendation engines
Issues related to collaborative filtering
Hybrid recommendation engines
Generating a similarity matrix of the items
Generating reference vectors of the users
Generating recommendations
Evolving the recommendation system
Understanding the limitations of recommendation systems
The cold start problem
Metadata requirements
The data sparsity problem
The double-edged sword of social influence in recommendation systems
Areas of practical applications
Netflix’s mastery of data-driven recommendations
The evolution of Amazon’s recommendation system
Practical example – creating a recommendation engine
1. Setting up the framework
2. Data loading: ingesting reviews and titles
3. Merging data: crafting a comprehensive view
4. Descriptive analysis: gleaning insights from ratings
5. Structuring for recommendations: crafting the matrix
6. Putting the engine to test: recommending movies
Finding movies correlating with Avatar (2009)
Understanding correlation
Evaluating the model
Retraining over time: incorporating user feedback
Summary
Algorithmic Strategies for Data Handling
Introduction to data algorithms
Significance of CAP theorem in context of data algorithms
Storage in distributed environments
Connecting CAP theorem and data compression
Presenting the CAP theorem
CA systems
AP systems
CP systems
Decoding data compression algorithms
Lossless compression techniques
Huffman coding: Implementing variable-length coding
Understanding dictionary-based compression LZ77
Advanced lossless compression formats
Practical example: Data management in AWS: A focus on CAP theorem and compression algorithms
1. Applying the CAP theorem
2. Using compression algorithms
3. Quantifying the benefits
Summary
Cryptography
Introduction to cryptography
Understanding the importance of the weakest link
The basic terminology
Understanding the security requirements
Step 1: Identifying the entities
Step 2: Establishing the security goals
Step 3: Understanding the sensitivity of the data
Understanding the basic design of ciphers
Presenting substitution ciphers
Cryptanalysis of substitution ciphers
Understanding transposition ciphers
Understanding the types of cryptographic techniques
Using the cryptographic hash function
Implementing cryptographic hash functions
An application of the cryptographic hash function
Choosing between MD5 and SHA
Using symmetric encryption
Coding symmetric encryption
The advantages of symmetric encryption
The problems with symmetric encryption
Asymmetric encryption
The SSL/TLS handshaking algorithm
Public key infrastructure
Blockchain and cryptography
Example: security concerns when deploying a machine learning model
MITM attacks
How to prevent MITM attacks
Avoiding masquerading
Data and model encryption
Summary
Large-Scale Algorithms
Introduction to large-scale algorithms
Characterizing performant infrastructure for large-scale algorithms
Elasticity
Characterizing a well-designed, large-scale algorithm
Load balancing
ELB: Combining elasticity and load balancing
Strategizing multi-resource processing
Understanding theoretical limitations of parallel computing
Amdahl’s law
Deriving Amdahl’s law
CUDA: Unleashing the potential of GPU architectures in parallel computing
Parallel processing in LLMs: A case study in Amdahl’s law and diminishing returns
Rethinking data locality
Benefiting from cluster computing using Apache Spark
How Apache Spark empowers large-scale algorithm processing
Distributed computing
In-memory processing
Using large-scale algorithms in cloud computing
Example
Summary
Practical Considerations
Challenges facing algorithmic solutions
Expecting the unexpected
Failure of Tay, the Twitter AI bot
The explainability of an algorithm
Machine learning algorithms and explainability
Presenting strategies for explainability
Understanding ethics and algorithms
Problems with learning algorithms
Understanding ethical considerations
Factors affecting algorithmic solutions
Considering inconclusive evidence
Traceability
Misguided evidence
Unfair outcomes
Reducing bias in models
When to use algorithms
Understanding black swan events and their implications on algorithms
Summary
Other Books You May Enjoy
Index
In the realm of computing, from foundational theories to hands-on applications, algorithms are the driving force. In this updated edition, we delve even further into the dynamic world of algorithms, broadening our scope to tackle pressing, real-world issues. Starting with the rudiments of algorithms, we journey through a myriad of design techniques, leading to intricate areas like linear programming, page ranking, graphs, and a more profound exploration of machine learning. To ensure we’re at the forefront of technological advancements, we’ve incorporated substantial discussions on sequential networks, LLMs, LSTM, GRUs, and now, cryptography and the deployment of large-scale algorithms in cloud computing environments.
The significance of algorithms in recommendation systems, a pivotal element in today’s digital age, is also meticulously detailed. To effectively wield these algorithms, understanding their underlying math and logic is paramount. Our hands-on case studies, ranging from weather forecasts and tweet analyses to film recommendations and an exploration of the nuances of LLMs, exemplify their practical applications.
Equipped with the insights from this book, our goal is to bolster your confidence in deploying algorithms to tackle modern computational challenges. Step into this expanded journey of deciphering and leveraging algorithms in today’s evolving digital landscape.
If you’re a programmer or developer keen on harnessing algorithms to solve problems and craft efficient code, this book is for you. From classic, widely-used algorithms to the latest in data science, machine learning, and cryptography, this guide covers a comprehensive spectrum. While familiarity with Python programming is beneficial, it’s not mandatory.
A foundation in any programming language will serve you well. Moreover, even if you’re not a programmer but have some technical inclination, you’ll gain insights into the expansive world of problem-solving algorithms from this book.
Chapter 1, Overview of Algorithms, provides insight into the fundamentals of algorithms. It starts with the basic concepts of algorithms, how people started using algorithms to formulate problems, and the limitations of different algorithms. As Python is used in this book to write the algorithms, how to set up a Python environment to run the examples is explained. We will then look at how an algorithm’s performance can be quantified and compared against other algorithms.
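To make the idea of quantifying and comparing performance concrete, here is a minimal sketch that counts the comparisons made by a linear scan and by binary search on the same sorted list. The function names and the test data are our own illustrative assumptions, not code taken from the chapter.

```python
def linear_search_steps(items, target):
    """Count the comparisons a left-to-right scan makes before it stops."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count the probes binary search makes on an already sorted list."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return steps


# Hypothetical data: one million sorted integers, searching for the last one,
# which is the worst case for the linear scan.
data = list(range(1_000_000))
target = data[-1]
linear_steps = linear_search_steps(data, target)
binary_steps = binary_search_steps(data, target)
print(linear_steps, binary_steps)
```

Counting steps rather than measuring wall-clock time keeps the comparison independent of the machine: the scan grows linearly with the input size, while binary search needs at most about log2(n) + 1 probes.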
Chapter 2, Data Structures Used in Algorithms, discusses data structures in the context of algorithms. As we are using Python in this book, this chapter focuses on Python data structures, but the concepts presented can be used in other languages such as Java and C++. This chapter will show you how Python handles complex data structures and which structures should be used for certain types of data.
Chapter 3, Sorting and Searching Algorithms, starts by presenting different types of sorting algorithms and various approaches for designing them. Then, following practical examples, searching algorithms are also discussed.
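As a taste of the kind of algorithm this chapter covers, the following is a small sketch of bubble sort with an early exit when a pass makes no swaps. It is one common way to write the algorithm and is not necessarily the version used in the chapter; the function name is our own.

```python
def bubble_sort(values):
    """Return a sorted copy of values using bubble sort with an early exit."""
    items = list(values)  # work on a copy so the caller's list is untouched
    for last in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(last):
            if items[i] > items[i + 1]:
                # Swap neighbours that are out of order.
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:
            break  # a pass without swaps means the list is already sorted
    return items


print(bubble_sort([25, 21, 22, 24, 23, 27, 26]))
```

The early exit does not change the worst case, which remains quadratic, but it lets an already sorted input finish after a single pass.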
Chapter 4, Designing Algorithms, covers the choices available to us when designing algorithms, discussing the importance of characterizing the problem that we are trying to solve. Next, it uses the famous Traveling Salesperson Problem (TSP) as a use case and applies the design techniques that we will be presenting. It also introduces linear programming and discusses its applications.
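The sketch below illustrates two of the design strategies applied to the TSP: an exhaustive brute-force search and a greedy nearest-neighbor heuristic. The four city coordinates and all function names are hypothetical, chosen only to keep the example small.

```python
from itertools import permutations
from math import dist

# Hypothetical cities placed on the corners of a 3-by-4 rectangle.
cities = {"A": (0, 0), "B": (0, 3), "C": (4, 3), "D": (4, 0)}


def tour_length(tour):
    """Length of the closed tour that returns to its starting city."""
    return sum(dist(cities[a], cities[b]) for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_tsp(start="A"):
    """Try every ordering of the remaining cities and keep the shortest tour."""
    others = [c for c in cities if c != start]
    return min(((start,) + p for p in permutations(others)), key=tour_length)


def greedy_tsp(start="A"):
    """Always travel to the nearest city that has not been visited yet."""
    tour, unvisited = [start], set(cities) - {start}
    while unvisited:
        nearest = min(unvisited, key=lambda c: dist(cities[tour[-1]], cities[c]))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tuple(tour)


print(tour_length(brute_force_tsp()), tour_length(greedy_tsp()))
```

Brute force is guaranteed to find the shortest tour but examines (n - 1)! orderings, so it is only practical for a handful of cities. The greedy heuristic runs quickly but can return a longer tour on less regular inputs.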
Chapter 5, Graph Algorithms, covers the ways we can capture graphs to represent data structures. It covers some foundational theories, techniques, and methods relating to graph algorithms, such as network analysis theory and graph traversals. We will investigate a case study using graph algorithms to delve into fraud analytics.
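A graph traversal can be sketched in a few lines. The example below runs a breadth-first search (BFS) over a small adjacency list; the graph and the function name are illustrative assumptions rather than the chapter's own data.

```python
from collections import deque

# Hypothetical undirected graph stored as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def bfs(graph, start):
    """Return the vertices reachable from start in breadth-first order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()  # take the oldest discovered vertex first
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


print(bfs(graph, "A"))
```

Using a queue is what makes the search breadth-first: every vertex one hop away is visited before any vertex two hops away.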
Chapter 6, Unsupervised Machine Learning Algorithms, explains how unsupervised learning can be applied to real-world problems. We will learn about its basic algorithms and methodologies, such as clustering algorithms, dimensionality reduction, and association rule mining.
Chapter 7, Traditional Supervised Learning Algorithms, delves into the essentials of supervised machine learning, featuring classifiers and regressors. We will explore their capabilities using real-world problems as case studies. Six distinct classification algorithms are presented, followed by three regression techniques. Lastly, we’ll compare their results to encapsulate the key takeaways from this discussion.
Chapter 8, Neural Network Algorithms, introduces the main concepts and components of a typical neural network. It then presents the various types of neural networks and the activation functions used in them. The backpropagation algorithm, the most widely used algorithm for training a neural network, is discussed in detail. Finally, we will learn how to use deep learning to flag fraudulent documents by way of a real-world example application.
Chapter 9, Algorithms for Natural Language Processing, introduces algorithms for natural language processing (NLP). It introduces the fundamentals of NLP and how to prepare data for NLP tasks. After that, it explains the concepts of vectorizing textual data and word embeddings. Finally, we present a detailed use case.
Chapter 10, Understanding Sequential Models, looks into training neural networks for sequential data. It covers the core principles of sequential models, providing an introductory overview of their techniques and methodologies. It will then consider how deep learning can improve NLP techniques.
Chapter 11, Advanced Sequential Modeling Algorithms, considers the limitations of sequential models and how sequential modeling has evolved to overcome these limitations. It delves deeper into the advanced aspects of sequential models to understand the creation of complex configurations. It starts by breaking down key elements, such as autoencoders and Sequence-to-Sequence (Seq2Seq) models. Next, it looks into the attention mechanism and transformers, both pivotal in the development of Large Language Models (LLMs), which we then study.
Chapter 12, Recommendation Engines, covers the main types of recommendation engines and the inner workings of each. These systems are adept at suggesting tailored items or products to users, but they’re not without their challenges. We’ll discuss both their strengths and the limitations they present. Finally, we will learn how to use recommendation engines to solve a real-world problem.
Chapter 13, Algorithmic Strategies for Data Handling, introduces data algorithms and the basic concepts behind the classification of data. We will look at the data storage and data compression algorithms used to efficiently manage data, helping us to understand the trade-offs involved in designing and implementing data-centric algorithms.
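To illustrate what lossless compression means in practice, here is a brief sketch using Python's standard `zlib` module, whose DEFLATE format combines LZ77-style dictionary matching with Huffman coding. The sample text is a made-up, highly repetitive string, so the ratio it achieves should not be read as typical.

```python
import zlib

# Hypothetical payload: repetitive data compresses far better than random data.
text = ("ABABABAB" * 500).encode("utf-8")

compressed = zlib.compress(text)
restored = zlib.decompress(compressed)

print(len(text), len(compressed))
print(restored == text)  # lossless: the round trip returns the exact original bytes
```

The trade-off is the usual one for data-centric algorithms: compression spends CPU time to save storage and network bandwidth.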
Chapter 14, Cryptography, introduces you to algorithms related to cryptography. We will start by presenting the background of cryptography before discussing symmetric encryption algorithms. We will learn about the Message-Digest 5 (MD5) algorithm and the Secure Hash Algorithm (SHA), presenting the limitations and weaknesses of each. Then, we will discuss asymmetric encryption algorithms and how they are used to create digital certificates. Finally, we will present a practical example that summarizes all of these techniques.
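The hash functions named above are available in Python's standard `hashlib` module. The following sketch computes an MD5 and a SHA-256 digest of the same message; the message itself is an arbitrary example, and the helper function name is our own.

```python
import hashlib


def digests(message: bytes):
    """Return the (MD5, SHA-256) hexadecimal digests of message."""
    return hashlib.md5(message).hexdigest(), hashlib.sha256(message).hexdigest()


message = b"50 Algorithms Every Programmer Should Know"
md5_digest, sha256_digest = digests(message)

print(md5_digest)     # 128-bit digest shown as 32 hexadecimal characters
print(sha256_digest)  # 256-bit digest shown as 64 hexadecimal characters

# Changing a single character produces a completely different digest.
altered_md5, altered_sha256 = digests(b"50 Algorithms Every Programmer Should Know!")
print(altered_sha256 != sha256_digest)
```

MD5 is shown here only for comparison. Practical collision attacks against it are well documented, so a member of the SHA-2 family such as SHA-256 is the safer default for new code.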
Chapter 15, Large-Scale Algorithms, starts by introducing large-scale algorithms and the efficient infrastructure required to support them. We will explore various strategies for managing multi-resource processing. We will examine the limitations of parallel processing, as outlined by Amdahl’s law, and investigate the use of Graphics Processing Units (GPUs). Upon completing this chapter, you will have gained a solid foundation in the fundamental strategies essential for designing large-scale algorithms.
Chapter 16, Practical Considerations, presents the issues around the explainability of an algorithm, which is the degree to which the internal mechanics of an algorithm can be explained in understandable terms. Then, we will present the ethics of using an algorithm and the possibility of introducing bias when implementing one. Next, the techniques for handling NP-hard problems will be discussed. Finally, we will investigate factors that should be considered before choosing an algorithm.
The code bundle for the book is also hosted on GitHub at https://github.com/cloudanum/50Algorithms. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out! You can also find the same code bundle on Google Drive at http://code.50algo.com.
We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/UBw6g.
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, and dummy URLs. Here is an example: “Let’s try to create a simple graph using the networkx package in Python.”
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, new terms appear in text like this: “Python is also one of the languages that you can use in various cloud computing infrastructures, such as Amazon Web Services (AWS) and Google Cloud Platform (GCP).”
Warnings or important notes appear like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome.
General feedback: Email [email protected] and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packtpub.com/submit-errata, click Submit Errata, and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.
Once you’ve read 50 Algorithms Every Programmer Should Know - Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?
Don’t worry: with every Packt book, you now get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below:
https://packt.link/free-ebook/9781803247762
Submit your proof of purchase.
That’s it! We’ll send your free PDF and other benefits to your email directly.

This section introduces the core aspects of algorithms. We will explore what an algorithm is and how to design one. We will also learn about the data structures used in algorithms. This section also introduces sorting and searching algorithms, along with algorithms for solving graph problems. The chapters included in this section are:
Chapter 1, Overview of Algorithms
Chapter 2, Data Structures Used in Algorithms
Chapter 3, Sorting and Searching Algorithms
Chapter 4, Designing Algorithms
Chapter 5, Graph Algorithms