50 Algorithms Every Programmer Should Know - Imran Ahmad - E-Book

Imran Ahmad

Description

The ability to use algorithms to solve real-world problems is a must-have skill for any developer or programmer. This book will help you not only to develop the skills to select and use an algorithm to tackle problems in the real world but also to understand how it works.

You'll start with an introduction to algorithms and discover various algorithm design techniques, before exploring how to implement different types of algorithms, with the help of practical examples. As you advance, you'll learn about linear programming, page ranking, and graphs, and will then work with machine learning algorithms to understand the math and logic behind them.

Case studies will show you how to apply these algorithms optimally before you focus on deep learning algorithms and learn about different types of deep learning models along with their practical use.

You will also learn about modern sequential models and their variants, algorithms, methodologies, and architectures that are used to implement Large Language Models (LLMs) such as ChatGPT.

Finally, you'll become well versed in techniques that enable parallel processing, giving you the ability to use these algorithms for compute-intensive tasks.

By the end of this programming book, you'll have become adept at solving real-world computational problems by using a wide range of algorithms.

You can read this e-book in Legimi apps or in any app that supports the following format:

EPUB

Page count: 692

Publication year: 2023


50 Algorithms Every Programmer Should Know

Second Edition

Tackle computer science challenges with classic to modern algorithms in machine learning, software design, data systems, and cryptography

Imran Ahmad, PhD

BIRMINGHAM—MUMBAI

Copyright © 2023 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Senior Publishing Product Manager: Denim Pinto

Acquisition Editor – Peer Reviews: Tejas Mhasvekar

Project Editor: Rianna Rodrigues

Content Development Editors: Rebecca Robinson and Matthew Davies

Copy Editor: Safis Editing

Technical Editor: Karan Sonawane

Proofreader: Safis Editing

Indexer: Pratik Shirodkar

Presentation Designer: Rajesh Shirsath

Developer Relations Marketing Executive: Vipanshu Parashar

First published: June 2020

Second edition: September 2023

Production reference: 2191023

Published by Packt Publishing Ltd.

Grosvenor House

11 St Paul’s Square

Birmingham

B3 1RB, UK.

ISBN 978-1-80324-776-2

www.packt.com

Foreword

In 2014, I enthusiastically embraced my new role as a data scientist, despite having a PhD in economics. Some might see this as a stark shift, but to me, it was a natural progression. However, traditional views of economics might suggest that econometricians and data scientists are on separate tracks.

At the outset of my data science adventure, I waded through a sea of online materials. The sheer volume made pinpointing the right resources akin to finding a diamond in the rough. Too often, content lacked practical insights relevant to my position, causing occasional bouts of disillusionment.

One beacon of clarity in my journey was my senior colleague, Imran. His consistent guidance and mentorship were transformative. He pointed me to resources that elevated my understanding, always generously sharing his deep knowledge. He had a gift for making complex topics understandable.

Beyond his expertise as a data scientist, Imran stands out as a visionary, leader, and adept engineer. He thrives on identifying innovative solutions, especially when faced with adversity. Challenges seem to invigorate him. With natural leadership ability, he navigates intricate projects with ease. His remarkable contributions to AI and machine learning are commendable. What’s more, his talent for connecting with audiences, often laced with humor, sets him apart.

This expertise shines brightly in 50 Algorithms Every Programmer Should Know. The book goes beyond listing algorithms; it reflects Imran’s ability to make intricate subjects relatable. Real-life applications range from predicting the weather to building movie recommendation engines.

The book stands out for its holistic approach to algorithms—not just the methodology but the reasoning behind them. It’s a treasure trove for those who champion responsible AI, emphasizing the importance of data transparency and bias awareness.

50 Algorithms Every Programmer Should Know is a must-have in a data scientist’s arsenal. If you’re venturing into data science or aiming to enhance your skill set, this book is a solid stepping stone.

Somaieh Nikpoor, PhD

Lead – Data Science and AI, Government of Canada.

Adjunct Professor, Sprott School of Business, Carleton University

Contributors

About the author

Imran Ahmad, PhD currently lends his expertise as a data scientist for the Advanced Analytics Solution Center (A2SC) within the Canadian Federal Government, where he harnesses machine learning algorithms for mission-critical applications.

In his 2010 doctoral thesis, he introduced a linear programming-based algorithm tailored for optimal resource assignment in expansive cloud computing landscapes. Later, in 2017, Dr. Ahmad pioneered the development of a real-time analytics framework, StreamSensing. This tool has become the cornerstone of several of his research papers, leveraging it to process multimedia data within various machine learning paradigms.

Outside of his governmental role, Dr. Ahmad holds a visiting professorship at Carleton University in Ottawa. Over the past several years, he has also been recognized as an authorized instructor for both Google Cloud and AWS.

I’m deeply grateful to my wife, Naheed, my son, Omar, and my daughter, Anum, for their unwavering support. A special nod to my parents, notably my father, Inayatuallah, for his relentless encouragement to continue learning. Further appreciation goes to Karan Sonawane, Rianna Rodrigues, and Denim from Packt for their invaluable contributions.

About the reviewers

Aishwarya Srinivasan previously worked as a data scientist on the Google Cloud AI Services team where she worked to build machine learning solutions for customer use cases. She holds a post-graduate degree in data science from Columbia University and has over 450,000 followers on LinkedIn. She was spotlighted as a LinkedIn Top Voice for data science influencers (2020) and has been recognized as a Women in AI Trailblazer of the Year.

Tarek Ziadé is a programmer based in Burgundy, France. He has worked at several major software companies, including Mozilla and Elastic, where he has built web services and tools for developers. Tarek founded the French Python user group, Afpy, and has written several best-selling books about Python and web services.

I would like to thank my family: Freya, Suki, Milo, Amina, and Martine. They have always supported me.

Brian Spiering started his coding career in his elementary school computer lab, hacking BASIC to make programs that entertained his peers and annoyed authority figures. Much later, Brian earned a PhD in cognitive psychology from the University of California, Santa Barbara. Brian currently teaches programming and artificial intelligence.

Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask questions to the author, and learn about new releases – follow the QR code below:

https://packt.link/WHLel

Contents

Preface

Who this book is for

What this book covers

Get in touch

Section 1: Fundamentals and Core Algorithms

Overview of Algorithms

What is an algorithm?

The phases of an algorithm

Development environment

Python packages

The SciPy ecosystem

Using Jupyter Notebook

Algorithm design techniques

The data dimension

The compute dimension

Performance analysis

Space complexity analysis

Time complexity analysis

Estimating the performance

The best case

The worst case

The average case

Big O notation

Constant time (O(1)) complexity

Linear time (O(n)) complexity

Quadratic time (O(n²)) complexity

Logarithmic time (O(log n)) complexity

Selecting an algorithm

Validating an algorithm

Exact, approximate, and randomized algorithms

Explainability

Summary

Data Structures Used in Algorithms

Exploring Python built-in data types

Lists

Using lists

Modifying lists: append and pop operations

The range() function

The time complexity of lists

Tuples

The time complexity of tuples

Dictionaries and sets

Dictionaries

Sets

Time complexity analysis for sets

When to use a dictionary and when to use a set

Using Series and DataFrames

Series

DataFrame

Creating a subset of a DataFrame

Matrices

Matrix operations

Big O notation and matrices

Exploring abstract data types

Vector

Time complexity of vectors

Stacks

Time complexity of stack operations

Practical example

Queues

Time complexity analysis for queues

The basic idea behind the use of stacks and queues

Tree

Terminology

Types of trees

Practical examples

Summary

Sorting and Searching Algorithms

Introducing sorting algorithms

Swapping variables in Python

Bubble sort

Understanding the logic behind bubble sort

Optimizing bubble sort

Performance analysis of the bubble sort algorithm

Insertion sort

Performance analysis of the insertion sort algorithm

Merge sort

Shell sort

Performance analysis of the Shell sort algorithm

Selection sort

Performance analysis of the selection sort algorithm

Choosing a sorting algorithm

Introduction to searching algorithms

Linear search

Performance analysis of the linear search algorithm

Binary search

Performance analysis of the binary search algorithm

Interpolation search

Performance analysis of the interpolation search algorithm

Practical applications

Summary

Designing Algorithms

Introducing the basic concepts of designing an algorithm

Concern 1 – correctness: will the designed algorithm produce the result we expect?

Concern 2 – performance: is this the optimal way to get these results?

Characterizing the complexity of the problem

Exploring the relationship between P and NP

Introducing NP-complete and NP-hard

Concern 3 – scalability: how is the algorithm going to perform on larger datasets?

The elasticity of the cloud and algorithmic scalability

Understanding algorithmic strategies

Understanding the divide-and-conquer strategy

A practical example – divide-and-conquer applied to Apache Spark

Understanding the dynamic programming strategy

Components of dynamic programming

Conditions for using dynamic programming

Understanding greedy algorithms

Conditions for using greedy programming

A practical application – solving the TSP

Using a brute-force strategy

Using a greedy algorithm

Comparison of three strategies

Presenting the PageRank algorithm

Problem definition

Implementing the PageRank algorithm

Understanding linear programming

Formulating a linear programming problem

Defining the objective function

Specifying constraints

A practical application – capacity planning with linear programming

Summary

Graph Algorithms

Understanding graphs: a brief introduction

Graphs: the backbone of modern data networks

Real-world applications

The basics of a graph: vertices (or nodes)

Graph theory and network analysis

Representations of graphs

Graph mechanics and types

Ego-centered networks

Basics of egonets

One-hop, two-hop, and beyond

Applications of egonets

Introducing network analysis theory

Understanding the shortest path

Creating a neighborhood

Triangles

Density

Understanding centrality measures

Degree

Betweenness

Fairness and closeness

Eigenvector centrality

Calculating centrality metrics using Python

1. Setting the foundation: libraries and data

2. Crafting the graph

3. Painting a picture: visualizing the graph

Social network analysis

Understanding graph traversals

BFS

Constructing the adjacency list

BFS algorithm implementation

Using BFS for specific searches

DFS

Case study: fraud detection using SNA

Introduction

What is fraud in this context?

Conducting simple fraud analytics

Presenting the watchtower fraud analytics methodology

Scoring negative outcomes

Degree of suspicion

Summary

Section 2: Machine Learning Algorithms

Unsupervised Machine Learning Algorithms

Introducing unsupervised learning

Unsupervised learning in the data-mining lifecycle

Phase 1: Business understanding

Phase 2: Data understanding

Phase 3: Data preparation

Phase 4: Modeling

Phase 5: Evaluation

Phase 6: Deployment

Current research trends in unsupervised learning

Practical examples

Marketing segmentation using unsupervised learning

Understanding clustering algorithms

Quantifying similarities

Euclidean distance

Manhattan distance

Cosine distance

k-means clustering algorithm

The logic of k-means clustering

Initialization

The steps of the k-means algorithm

Stop condition

Coding the k-means algorithm

Limitation of k-means clustering

Hierarchical clustering

Steps of hierarchical clustering

Coding a hierarchical clustering algorithm

Understanding DBSCAN

Creating clusters using DBSCAN in Python

Evaluating the clusters

Application of clustering

Dimensionality reduction

Principal component analysis

Limitations of PCA

Association rules mining

Examples of use

Market basket analysis

Association rules mining

Types of rules

Trivial rules

Inexplicable rules

Actionable rules

Ranking rules

Support

Confidence

Lift

Algorithms for association analysis

Apriori algorithm

Limitations of the apriori algorithm

FP-growth algorithm

Populating the FP-tree

Mining frequent patterns

Code for using FP-growth

Summary

Traditional Supervised Learning Algorithms

Understanding supervised machine learning

Formulating supervised machine learning problems

Understanding enabling conditions

Differentiating between classifiers and regressors

Understanding classification algorithms

Presenting the classifiers challenge

The problem statement

Feature engineering using a data processing pipeline

Scaling the features

Evaluating the classifiers

Confusion matrices

Understanding recall and precision

Understanding the recall and precision trade-off

Understanding overfitting

Specifying the phases of classifiers

Decision tree classification algorithm

Understanding the decision tree classification algorithm

The strengths and weaknesses of decision tree classifiers

Use cases

Understanding the ensemble methods

Implementing gradient boosting with the XGBoost algorithm

Differentiating the Random Forest algorithm from ensemble boosting

Using the Random Forest algorithm for the classifiers challenge

Logistic regression

Assumptions

Establishing the relationship

The loss and cost functions

When to use logistic regression

Using the logistic regression algorithm for the classifiers challenge

The SVM algorithm

Using the SVM algorithm for the classifiers challenge

Understanding the Naive Bayes algorithm

Bayes’ theorem

Calculating probabilities

Multiplication rules for AND events

The general multiplication rule

Addition rules for OR events

Using the Naive Bayes algorithm for the classifiers challenge

For classification algorithms, the winner is...

Understanding regression algorithms

Presenting the regressors challenge

The problem statement of the regressors challenge

Exploring the historical dataset

Feature engineering using a data processing pipeline

Linear regression

Simple linear regression

Evaluating the regressors

Multiple regression

Using the linear regression algorithm for the regressors challenge

When is linear regression used?

The weaknesses of linear regression

The regression tree algorithm

Using the regression tree algorithm for the regressors challenge

The gradient boost regression algorithm

Using the gradient boost regression algorithm for the regressors challenge

For regression algorithms, the winner is...

Practical example – how to predict the weather

Summary

Neural Network Algorithms

The evolution of neural networks

Historical background

AI winter and the dawn of AI spring

Understanding neural networks

Understanding perceptrons

Understanding the intuition behind neural networks

Understanding layered deep learning architectures

Developing an intuition for hidden layers

How many hidden layers should be used?

Mathematical basis of neural network

Training a neural network

Understanding the anatomy of a neural network

Defining gradient descent

Activation functions

Step function

Sigmoid function

ReLU

Leaky ReLU

Hyperbolic tangent (tanh)

Softmax

Tools and frameworks

Keras

Backend engines of Keras

Low-level layers of the deep learning stack

Defining hyperparameters

Defining a Keras model

Choosing a sequential or functional model

Understanding TensorFlow

Presenting TensorFlow’s basic concepts

Understanding Tensor mathematics

Understanding the types of neural networks

Convolutional neural networks

Convolution

Pooling

Generative Adversarial Networks

Using transfer learning

Case study – using deep learning for fraud detection

Methodology

Summary

Algorithms for Natural Language Processing

Introducing NLP

Understanding NLP terminology

Text preprocessing in NLP

Tokenization

Cleaning data

Cleaning data using Python

Understanding the Term Document Matrix

Using TF-IDF

Summary and discussion of results

Introduction to word embedding

Implementing word embedding with Word2Vec

Interpreting similarity scores

Advantages and disadvantages of Word2Vec

Case study: Restaurant review sentiment analysis

Importing required libraries and loading the dataset

Building a clean corpus: Preprocessing text data

Converting text data into numerical features

Analyzing the results

Applications of NLP

Summary

Understanding Sequential Models

Understanding sequential data

Types of sequence models

One-to-many

Many-to-one

Many-to-many

Data representation for sequential models

Introducing RNNs

Understanding the architecture of RNNs

Understanding the memory cell and hidden state

Understanding the characteristics of the input variable

Training the RNN at the first timestep

The activation function in action

Training the RNN for a whole sequence

Calculating the output for each timestep

Backpropagation through time

Predicting with RNNs

Limitations of basic RNNs

Vanishing gradient problem

Inability to look ahead in the sequence

GRU

Introducing the update gate

Implementing the update gate

Updating the hidden cell

Running GRUs for multiple timesteps

Introducing LSTM

Introducing the forget gate

The candidate cell state

The update gate

Calculating memory state

The output gate

Putting everything together

Coding sequential models

Loading the dataset

Preparing the data

Creating the model

Training the model

Viewing some incorrect predictions

Summary

Advanced Sequential Modeling Algorithms

The evolution of advanced sequential modeling techniques

Exploring autoencoders

Coding an autoencoder

Setting up the environment

Data preparation

Model architecture

Compilation

Training

Prediction

Visualization

Understanding the Seq2Seq model

Encoder

Thought vector

Decoder or writer

Special tokens in Seq2Seq

The information bottleneck dilemma

Understanding the attention mechanism

What is attention in neural networks?

Basic idea

Example

Three key aspects of attention mechanisms

A deeper dive into attention mechanisms

The challenges of attention mechanisms

Delving into self-attention

Attention weights

Encoder: bidirectional RNNs

Thought vector

Decoder: regular RNNs

Training versus inference

Transformers: the evolution in neural networks after self-attention

Why transformers shine

A Python code breakdown

Understanding the output

LLMs

Understanding attention in LLMs

Exploring the powerhouses of NLP: GPT and BERT

2018’s LLM pioneers: GPT and BERT

Using deep and wide models to create powerful LLMs

Summary

Section 3: Advanced Topics

Recommendation Engines

Introducing recommendation systems

Types of recommendation engines

Content-based recommendation engines

Determining similarities in unstructured documents

Collaborative filtering recommendation engines

Issues related to collaborative filtering

Hybrid recommendation engines

Generating a similarity matrix of the items

Generating reference vectors of the users

Generating recommendations

Evolving the recommendation system

Understanding the limitations of recommendation systems

The cold start problem

Metadata requirements

The data sparsity problem

The double-edged sword of social influence in recommendation systems

Areas of practical applications

Netflix’s mastery of data-driven recommendations

The evolution of Amazon’s recommendation system

Practical example – creating a recommendation engine

1. Setting up the framework

2. Data loading: ingesting reviews and titles

3. Merging data: crafting a comprehensive view

4. Descriptive analysis: gleaning insights from ratings

5. Structuring for recommendations: crafting the matrix

6. Putting the engine to test: recommending movies

Finding movies correlating with Avatar (2009)

Understanding correlation

Evaluating the model

Retraining over time: incorporating user feedback

Summary

Algorithmic Strategies for Data Handling

Introduction to data algorithms

Significance of CAP theorem in context of data algorithms

Storage in distributed environments

Connecting CAP theorem and data compression

Presenting the CAP theorem

CA systems

AP systems

CP systems

Decoding data compression algorithms

Lossless compression techniques

Huffman coding: Implementing variable-length coding

Understanding dictionary-based compression LZ77

Advanced lossless compression formats

Practical example: Data management in AWS: A focus on CAP theorem and compression algorithms

1. Applying the CAP theorem

2. Using compression algorithms

3. Quantifying the benefits

Summary

Cryptography

Introduction to cryptography

Understanding the importance of the weakest link

The basic terminology

Understanding the security requirements

Step 1: Identifying the entities

Step 2: Establishing the security goals

Step 3: Understanding the sensitivity of the data

Understanding the basic design of ciphers

Presenting substitution ciphers

Cryptanalysis of substitution ciphers

Understanding transposition ciphers

Understanding the types of cryptographic techniques

Using the cryptographic hash function

Implementing cryptographic hash functions

An application of the cryptographic hash function

Choosing between MD5 and SHA

Using symmetric encryption

Coding symmetric encryption

The advantages of symmetric encryption

The problems with symmetric encryption

Asymmetric encryption

The SSL/TLS handshaking algorithm

Public key infrastructure

Blockchain and cryptography

Example: security concerns when deploying a machine learning model

MITM attacks

How to prevent MITM attacks

Avoiding masquerading

Data and model encryption

Summary

Large-Scale Algorithms

Introduction to large-scale algorithms

Characterizing performant infrastructure for large-scale algorithms

Elasticity

Characterizing a well-designed, large-scale algorithm

Load balancing

ELB: Combining elasticity and load balancing

Strategizing multi-resource processing

Understanding theoretical limitations of parallel computing

Amdahl’s law

Deriving Amdahl’s law

CUDA: Unleashing the potential of GPU architectures in parallel computing

Parallel processing in LLMs: A case study in Amdahl’s law and diminishing returns

Rethinking data locality

Benefiting from cluster computing using Apache Spark

How Apache Spark empowers large-scale algorithm processing

Distributed computing

In-memory processing

Using large-scale algorithms in cloud computing

Example

Summary

Practical Considerations

Challenges facing algorithmic solutions

Expecting the unexpected

Failure of Tay, the Twitter AI bot

The explainability of an algorithm

Machine learning algorithms and explainability

Presenting strategies for explainability

Understanding ethics and algorithms

Problems with learning algorithms

Understanding ethical considerations

Factors affecting algorithmic solutions

Considering inconclusive evidence

Traceability

Misguided evidence

Unfair outcomes

Reducing bias in models

When to use algorithms

Understanding black swan events and their implications on algorithms

Summary

Other Books You May Enjoy

Index

Preface

In the realm of computing, from foundational theories to hands-on applications, algorithms are the driving force. In this updated edition, we delve even further into the dynamic world of algorithms, broadening our scope to tackle pressing, real-world issues. Starting with the rudiments of algorithms, we journey through a myriad of design techniques, leading to intricate areas like linear programming, page ranking, graphs, and a more profound exploration of machine learning. To ensure we’re at the forefront of technological advancements, we’ve incorporated substantial discussions on sequential networks, LLMs, LSTM, GRUs, and now, cryptography and the deployment of large-scale algorithms in cloud computing environments.

The significance of algorithms in recommendation systems, a pivotal element in today’s digital age, is also meticulously detailed. To effectively wield these algorithms, understanding their underlying math and logic is paramount. Our hands-on case studies, ranging from weather forecasting and tweet analysis to film recommendations and the nuances of LLMs, exemplify their practical applications.

Equipped with the insights from this book, our goal is to bolster your confidence in deploying algorithms to tackle modern computational challenges. Step into this expanded journey of deciphering and leveraging algorithms in today’s evolving digital landscape.

Who this book is for

If you’re a programmer or developer keen on harnessing algorithms to solve problems and craft efficient code, this book is for you. From classic, widely used algorithms to the latest in data science, machine learning, and cryptography, this guide covers a comprehensive spectrum. While familiarity with Python programming is beneficial, it’s not mandatory.

A foundation in any programming language will serve you well. Moreover, even if you’re not a programmer but have some technical inclination, you’ll gain insights into the expansive world of problem-solving algorithms from this book.

What this book covers

Section 1: Fundamentals and Core Algorithms

Chapter 1, Overview of Algorithms, provides insight into the fundamentals of algorithms. It starts with the basic concepts of algorithms, how people started using algorithms to formulate problems, and the limitations of different algorithms. As Python is used in this book to write the algorithms, how to set up a Python environment to run the examples is explained. We will then look at how an algorithm’s performance can be quantified and compared against other algorithms.

Chapter 2, Data Structures Used in Algorithms, discusses data structures in the context of algorithms. As we are using Python in this book, this chapter focuses on Python data structures, but the concepts presented can be used in other languages such as Java and C++. This chapter will show you how Python handles complex data structures and which structures should be used for certain types of data.

Chapter 3, Sorting and Searching Algorithms, starts by presenting different types of sorting algorithms and various approaches for designing them. Then, following practical examples, searching algorithms are also discussed.

Chapter 4, Designing Algorithms, covers the choices available to us when designing algorithms, discussing the importance of characterizing the problem that we are trying to solve. Next, it uses the famous Traveling Salesperson Problem (TSP) as a use case and applies the design techniques that we will be presenting. It also introduces linear programming and discusses its applications.

Chapter 5, Graph Algorithms, covers the ways we can use graphs to represent data. It covers some foundational theories, techniques, and methods relating to graph algorithms, such as network analysis theory and graph traversals. We will investigate a case study that uses graph algorithms to delve into fraud analytics.

Section 2: Machine Learning Algorithms

Chapter 6, Unsupervised Machine Learning Algorithms, explains how unsupervised learning can be applied to real-world problems. We will learn about its basic algorithms and methodologies, such as clustering algorithms, dimensionality reduction, and association rule mining.

Chapter 7, Traditional Supervised Learning Algorithms, delves into the essentials of supervised machine learning, featuring classifiers and regressors. We will explore their capabilities using real-world problems as case studies. Six distinct classification algorithms are presented, followed by three regression techniques. Lastly, we’ll compare their results to encapsulate the key takeaways from this discussion.

Chapter 8, Neural Network Algorithms, introduces the main concepts and components of a typical neural network. It then presents the various types of neural networks and the activation functions used in them. The backpropagation algorithm, the most widely used algorithm for training a neural network, is discussed in detail. Finally, we will learn how to use deep learning to flag fraudulent documents by way of a real-world example application.

Chapter 9, Algorithms for Natural Language Processing, introduces algorithms for natural language processing (NLP). It introduces the fundamentals of NLP and how to prepare data for NLP tasks. After that, it explains the concepts of vectorizing textual data and word embeddings. Finally, we present a detailed use case.

Chapter 10, Understanding Sequential Models, looks into training neural networks for sequential data. It covers the core principles of sequential models, providing an introductory overview of their techniques and methodologies. It will then consider how deep learning can improve NLP techniques.

Chapter 11, Advanced Sequential Modeling Algorithms, considers the limitations of sequential models and how sequential modeling has evolved to overcome these limitations. It delves deeper into the advanced aspects of sequential models to understand the creation of complex configurations. It starts by breaking down key elements, such as autoencoders and Sequence-to-Sequence (Seq2Seq) models. Next, it looks into the attention mechanism and transformers, which are pivotal in the development of Large Language Models (LLMs). Finally, it studies LLMs themselves.

Section 3: Advanced Topics

Chapter 12, Recommendation Engines, covers the main types of recommendation engines and the inner workings of each. These systems are adept at suggesting tailored items or products to users, but they’re not without their challenges. We’ll discuss both their strengths and the limitations they present. Finally, we will learn how to use recommendation engines to solve a real-world problem.

Chapter 13, Algorithmic Strategies for Data Handling, introduces data algorithms and the basic concepts behind the classification of data. We will look at the data storage and data compression algorithms used to efficiently manage data, helping us to understand the trade-offs involved in designing and implementing data-centric algorithms.

Chapter 14, Cryptography, introduces you to algorithms related to cryptography. We will start by presenting the background of cryptography before discussing symmetric encryption algorithms. We will learn about the Message-Digest 5 (MD5) algorithm and the Secure Hash Algorithm (SHA), presenting the limitations and weaknesses of each. Then, we will discuss asymmetric encryption algorithms and how they are used to create digital certificates. Finally, we will present a practical example that summarizes all of these techniques.

Chapter 15, Large-Scale Algorithms, starts by introducing large-scale algorithms and the efficient infrastructure required to support them. We will explore various strategies for managing multi-resource processing. We will examine the limitations of parallel processing, as outlined by Amdahl’s law, and investigate the use of Graphics Processing Units (GPUs). Upon completing this chapter, you will have gained a solid foundation in the fundamental strategies essential for designing large-scale algorithms.

Chapter 16, Practical Considerations, presents the issues around the explainability of an algorithm, which is the degree to which the internal mechanics of an algorithm can be explained in understandable terms. Then, we will present the ethics of using an algorithm and the possibility of introducing bias when implementing one. Next, the techniques for handling NP-hard problems will be discussed. Finally, we will investigate factors that should be considered before choosing an algorithm.

Download the example code files

The code bundle for the book is also hosted on GitHub at https://github.com/cloudanum/50Algorithms. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out! You can also find the same code bundle on Google Drive at http://code.50algo.com.

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/UBw6g.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, and dummy URLs. Here is an example: “Let’s try to create a simple graph using the networkx package in Python.”

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, new terms appear in text like this: “Python is also one of the languages that you can use in various cloud computing infrastructures, such as Amazon Web Services (AWS) and Google Cloud Platform (GCP).”

Warnings or important notes appear like this.

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packtpub.com/submit-errata, click Submit Errata, and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.

Share your thoughts

Once you’ve read 50 Algorithms Every Programmer Should Know - Second Edition, we’d love to hear your thoughts! Please visit the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application. 

The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below

https://packt.link/free-ebook/9781803247762

Submit your proof of purchase

That’s it! We’ll send your free PDF and other benefits directly to your email

Section 1

Fundamentals and Core Algorithms

This section introduces the core aspects of algorithms. We will explore what an algorithm is and how to design one. We will also learn about the data structures used in algorithms. Finally, the section covers sorting and searching algorithms, along with algorithms for solving graph problems. The chapters included in this section are:

Chapter 1, Overview of Algorithms

Chapter 2, Data Structures Used in Algorithms

Chapter 3, Sorting and Searching Algorithms

Chapter 4, Designing Algorithms

Chapter 5, Graph Algorithms