20,99 €
Discover how algorithms shape and impact our digital world All data, big or small, starts with algorithms. Algorithms are mathematical equations that determine what we see--based on our likes, dislikes, queries, views, interests, relationships, and more--online. They are, in a sense, the electronic gatekeepers to our digital, as well as our physical, world. This book demystifies the subject of algorithms so you can understand how important they are business and scientific decision making. Algorithms for Dummies is a clear and concise primer for everyday people who are interested in algorithms and how they impact our digital lives. Based on the fact that we already live in a world where algorithms are behind most of the technology we use, this book offers eye-opening information on the pervasiveness and importance of this mathematical science--how it plays out in our everyday digestion of news and entertainment, as well as in its influence on our social interactions and consumerism. Readers even learn how to program an algorithm using Python! * Become well-versed in the major areas comprising algorithms * Examine the incredible history behind algorithms * Get familiar with real-world applications of problem-solving procedures * Experience hands-on development of an algorithm from start to finish with Python If you have a nagging curiosity about why an ad for that hammock you checked out on Amazon is appearing on your Facebook page, you'll find Algorithm for Dummies to be an enlightening introduction to this integral realm of math, science, and business.
Sie lesen das E-Book in den Legimi-Apps auf:
Seitenzahl: 636
Veröffentlichungsjahr: 2017
Algorithms For Dummies®
Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com
Copyright © 2017 by John Wiley & Sons, Inc., Hoboken, New Jersey
Media and software compilation copyright © 2017 by John Wiley & Sons, Inc. All rights reserved.
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit https://hub.wiley.com/community/support/dummies.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Control Number: 2017936606
ISBN 978-1-119-33049-3 (pbk); ISBN 978-1-119-33053-0 (ebk); ISBN 978-1-119-33052-3 (ebk)
Table of Contents
Cover
Introduction
About This Book
Foolish Assumptions
Icons Used in This Book
Beyond the Book
Where to Go from Here
Part 1: Getting Started
Chapter 1: Introducing Algorithms
Describing Algorithms
Using Computers to Solve Problems
Distinguishing between Issues and Solutions
Structuring Data to Obtain a Solution
Chapter 2: Considering Algorithm Design
Starting to Solve a Problem
Dividing and Conquering
Learning that Greed Can Be Good
Computing Costs and Following Heuristics
Evaluating Algorithms
Chapter 3: Using Python to Work with Algorithms
Considering the Benefits of Python
Looking at the Python Distributions
Installing Python on Linux
Installing Python on MacOS
Installing Python on Windows
Downloading the Datasets and Example Code
Chapter 4: Introducing Python for Algorithm Programming
Working with Numbers and Logic
Creating and Using Strings
Interacting with Dates
Creating and Using Functions
Using Conditional and Loop Statements
Storing Data Using Sets, Lists, and Tuples
Defining Useful Iterators
Indexing Data Using Dictionaries
Chapter 5: Performing Essential Data Manipulations Using Python
Performing Calculations Using Vectors and Matrixes
Creating Combinations the Right Way
Getting the Desired Results Using Recursion
Performing Tasks More Quickly
Part 2: Understanding the Need to Sort and Search
Chapter 6: Structuring Data
Determining the Need for Structure
Stacking and Piling Data in Order
Working with Trees
Representing Relations in a Graph
Chapter 7: Arranging and Searching Data
Sorting Data Using Mergesort and Quicksort
Using Search Trees and the Heap
Relying on Hashing
Part 3: Exploring the World of Graphs
Chapter 8: Understanding Graph Basics
Explaining the Importance of Networks
Defining How to Draw a Graph
Measuring Graph Functionality
Putting a Graph in Numeric Format
Chapter 9: Reconnecting the Dots
Traversing a Graph Efficiently
Sorting the Graph Elements
Reducing to a Minimum Spanning Tree
Finding the Shortest Route
Chapter 10: Discovering Graph Secrets
Envisioning Social Networks as Graphs
Navigating a Graph
Chapter 11: Getting the Right Web page
Finding the World in a Search Engine
Explaining the PageRank Algorithm
Implementing PageRank
Going Beyond the PageRank Paradigm
Part 4: Struggling with Big Data
Chapter 12: Managing Big Data
Transforming Power into Data
Streaming Flows of Data
Sketching an Answer from Stream Data
Chapter 13: Parallelizing Operations
Managing Immense Amounts of Data
Working Out Algorithms for MapReduce
Chapter 14: Compressing Data
Making Data Smaller
Part 5: Challenging Difficult Problems
Chapter 15: Working with Greedy Algorithms
Deciding When It Is Better to Be Greedy
Finding Out How Greedy Can Be Useful
Chapter 16: Relying on Dynamic Programming
Explaining Dynamic Programming
Discovering the Best Dynamic Recipes
Chapter 17: Using Randomized Algorithms
Defining How Randomization Works
Putting Randomness into your Logic
Chapter 18: Performing Local Search
Understanding Local Search
Presenting local search tricks
Solving satisfiability of Boolean circuits
Chapter 19: Employing Linear Programming
Using Linear Functions as a Tool
Using Linear Programming in Practice
Chapter 20: Considering Heuristics
Differentiating Heuristics
Routing Robots Using Heuristics
Explaining Path Finding Algorithms
Part 6: The Part of Tens
Chapter 21: Ten Algorithms That Are Changing the World
Using Sort Routines
Looking for Things with Search Routines
Shaking Things Up with Random Numbers
Performing Data Compression
Keeping Data Secret
Changing the Data Domain
Analyzing Links
Spotting Data Patterns
Dealing with Automation and Automatic Responses
Creating Unique Identifiers
Chapter 22: Ten Algorithmic Problems Yet to Solve
Dealing with Text Searches
Differentiating Words
Determining Whether an Application Will End
Creating and Using One-Way Functions
Multiplying Really Large Numbers
Dividing a Resource Equally
Reducing Edit Distance Calculation Time
Solving Problems Quickly
Playing the Parity Game
Understanding Spatial Issues
About the Authors
Connect with Dummies
End User License Agreement
Cover
Table of Contents
Begin Reading
i
ii
v
vi
vii
viii
ix
x
xi
xii
1
2
3
4
5
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
357
358
359
360
361
362
363
364
365
366
367
368
369
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
389
390
391
392
393
394
395
396
397
399
400
401
402
403
404
419
420
You need to learn about algorithms for school or work. Yet, all the books you’ve tried on the subject end up being more along the lines of really good sleep-inducing aids rather than texts to teach you something. Assuming that you can get past the arcane symbols obviously written by a demented two-year-old with a penchant for squiggles, you end up having no idea of why you’d even want to know anything about them. Most math texts are boring! However, Algorithms For Dummies is different. The first thing you’ll note is that this book has a definite lack of odd symbols (especially of the squiggly sort) floating about. Yes, you see a few (it is a math book, after all), but what you find instead are clear instructions for using algorithms that actually have names and a history behind them to perform useful tasks. You’ll encounter simple coding techniques that perform amazing things that will intrigue your friends and certainly make them jealous as you perform amazing feats of math that they can’t begin to understand. You get all this without having to strain your brain, even a little, and you won’t even fall asleep (well, unless you really want to do so).
Algorithms For Dummies is the math book that you wanted in college but didn’t get. You discover, for example, that algorithms aren’t new. After all, the Babylonians used algorithms to perform simple tasks as early as 1,600 BC. If the Babylonians could figure this stuff out, certainly you can, too! This book actually has three things that you won’t find in most math books:
Algorithms that have actual names and a historical basis so that you can remember the algorithm and know why someone took time to create it
Simple explanations of how the algorithm performs amazing feats of data manipulation, data analysis, or probability prediction
Code that shows how to use the algorithm without actually dealing with arcane symbols that no one without a math degree can understand
Part of the emphasis of this book is on using the right tools. This book uses Python to perform various tasks. Python has special features that make working with algorithms significantly easier. For example, Python provides access to a huge array of packages that let you do just about anything you can imagine, and more than a few that you can’t. However, unlike many texts that use Python, this one doesn’t bury you in packages. We use a select group of packages that provide great flexibility with a lot of functionality, but don’t require you to pay anything. You can go through this entire book without forking over a cent of your hard-earned money.
You also discover some interesting techniques in this book. The most important is that you don’t just see the algorithms used to perform tasks; you also get an explanation of how the algorithms work. Unlike many other books, Algorithms For Dummies enables you to fully understand what you’re doing, but without requiring you to have a PhD in math. Every one of the examples shows the expected output and tells you why that output is important. You aren’t left with the feeling that something is missing.
Of course, you might still be worried about the whole programming environment issue, and this book doesn’t leave you in the dark there, either. At the beginning, you find complete installation instructions for Anaconda, which is the Python language Integrated Development Environment (IDE) used for this book. In addition, quick primers (with references) help you understand the basic Python programming that you need to perform. The emphasis is on getting you up and running as quickly as possible, and to make examples straightforward and simple so that the code doesn’t become a stumbling block to learning.
To help you absorb the concepts, this book uses the following conventions:
Text that you’re meant to type just as it appears in the book is in
bold
. The exception is when you’re working through a step list: Because each step is bold, the text to type is not bold.
Words that we want you to type in that are also in
italics
are used as placeholders
,
which means that you need to replace them with something that works for you. For example, if you see “Type
Your Name
and press Enter,” you need to replace
Your Name
with your actual name.
We also use
italics
for terms we define. This means that you don’t have to rely on other sources to provide the definitions you need.
Web addresses and programming code appear in
monofont
. If you’re reading a digital version of this book on a device connected to the Internet, you can click the live link to visit that website, like this:
http://www.dummies.com
.
When you need to click command sequences, you see them separated by a special arrow, like this: File ⇒ New File, which tells you to click File and then New File.
You might find it difficult to believe that we’ve assumed anything about you — after all, we haven’t even met you yet! Although most assumptions are indeed foolish, we made certain assumptions to provide a starting point for the book.
The first assumption is that you’re familiar with the platform you want to use, because the book doesn’t provide any guidance in this regard. (Chapter 3 does, however, tell you how to install Anaconda; Chapter 4 provides a basic Python language overview; and Chapter 5 helps you understand how to perform the essential data manipulations using Python.) To give you the maximum information about Python with regard to algorithms, this book doesn’t discuss any platform-specific issues. You really do need to know how to install applications, use applications, and generally work with your chosen platform before you begin working with this book.
This book isn’t a math primer. Yes, you see lots of examples of complex math, but the emphasis is on helping you use Python to perform common tasks using algorithms rather than learning math theory. However, you do get explanations of many of the algorithms used in the book so that you can understand how the algorithms work. Chapters 1 and 2 guide you through a better understanding of precisely what you need to know in order to use this book successfully.
This book also assumes that you can access items on the Internet. Sprinkled throughout are numerous references to online material that will enhance your learning experience. However, these added sources are useful only if you actually find and use them.
As you read this book, you encounter icons in the margins that indicate material of interest (or not, as the case may be). Here’s what the icons mean:
Tips are nice because they help you save time or perform some task without a lot of extra work. The tips in this book are time-saving techniques or pointers to resources that you should try so that you can get the maximum benefit from Python, or in performing algorithm-related or data analysis–related tasks.
We don’t want to sound like angry parents or some kind of maniacs, but you should avoid doing anything that’s marked with a Warning icon. Otherwise, you might find that your application fails to work as expected, you get incorrect answers from seemingly bulletproof algorithms, or (in the worst-case scenario) you lose data.
Whenever you see this icon, think advanced tip or technique. You might find these tidbits of useful information just too boring for words, or they could contain the solution you need to get a program running. Skip these bits of information whenever you like.
If you don’t get anything else out of a particular chapter or section, remember the material marked by this icon. This text usually contains an essential process or a bit of information that you must know to work with Python, or to perform algorithm-related or data analysis–related tasks successfully.
This book isn’t the end of your Python or algorithm learning experience — it’s really just the beginning. We provide online content to make this book more flexible and better able to meet your needs. That way, as we receive email from you, we can address questions and tell you how updates to Python, or its associated add-ons affect book content. In fact, you gain access to all these cool additions:
Cheat sheet:
You remember using crib notes in school to make a better mark on a test, don’t you? You do? Well, a cheat sheet is sort of like that. It provides you with some special notes about tasks that you can do with Python, Anaconda, and algorithms that not every other person knows. To find the cheat sheet for this book, go to
www.dummies.com
and search for
Algorithms For Dummies Cheat Sheet.
It contains really neat information such as finding the algorithms that you commonly need to perform specific tasks.
Updates: Sometimes changes happen. For example, we might not have seen an upcoming change when we looked into our crystal ball during the writing of this book. In the past, this possibility simply meant that the book became outdated and less useful, but you can now find updates to the book at www.dummies.com/go/algorithmsfd.
In addition to these updates, check out the blog posts with answers to reader questions and demonstrations of useful book-related techniques at http://blog.johnmuellerbooks.com/.
Companion files:
Hey! Who really wants to type all the code in the book and reconstruct all those plots manually? Most readers prefer to spend their time actually working with Python, performing tasks using algorithms, and seeing the interesting things they can do, rather than typing. Fortunately for you, the examples used in the book are available for download, so all you need to do is read the book to learn algorithm usage techniques. You can find these files at
www.dummies.com/go/algorithmsfd
.
It’s time to start your algorithm learning adventure! If you’re completely new to algorithms, you should start with Chapter 1 and progress through the book at a pace that allows you to absorb as much of the material as possible. Make sure to read about Python because the book uses this language as needed for the examples.
If you’re a novice who’s in an absolute rush to get going with algorithms as quickly as possible, you can skip to Chapter 3 with the understanding that you may find some topics a bit confusing later. If you already have Anaconda installed, you can skim Chapter 3. To use this book, you must install Python version 3.4. The examples won’t work with the 2.x version of Python because this version doesn’t support some of the packages we use.
Readers who have some exposure to Python, and have the appropriate language versions installed, can save reading time by moving directly to Chapter 6. You can always go back to earlier chapters as necessary when you have questions. However, you do need to understand how each technique works before moving to the next one. Every technique, coding example, and procedure has important lessons for you, and you could miss vital content if you start skipping too much information.
Part 1
IN THIS PART …
Discover how you can use algorithms to perform practical tasks.
Understand how algorithms are put together.
Install and configure Python to work with algorithms.
Use Python to work with algorithms.
Perform basic algorithm manipulations using Python.
Chapter 1
IN THIS CHAPTER
Defining what is meant by algorithm
Relying on computers to use algorithms to provide solutions
Determining how issues differ from solutions
Performing data manipulation so that you can find a solution
If you’re in the majority of people, you’re likely confused as you open this book and begin your adventure with algorithms because most texts never tell you what an algorithm is, much less why you’d want to use one. Most texts assume that you already know something about algorithms and that you are reading about them to refine and elevate your knowledge. Interestingly enough, some books actually provide a confusing definition for algorithm that doesn’t really define it after all, and sometimes even equates it to some other form of abstract, numeric, or symbolic expression.
The first section of this chapter is dedicated to helping you understand precisely what the term algorithm means and why you benefit from knowing how to use algorithms. Far from being arcane, algorithms are actually used all over the place, and you have probably used or been helped by them for years without really knowing it. In truth, algorithms are becoming the spine that supports and regulates what is important in an increasingly complex and technological society like ours.
This chapter also discusses how you use computers to create solutions to problems using algorithms, how to distinguish between issues and solutions, and what you need to do to manipulate data to discover a solution. The goal of this chapter is to help you differentiate between algorithms and other tasks that people perform that they confuse with algorithms. In short, you discover why you really want to know about algorithms and how to apply them to data.
Even though people have solved algorithms manually for literally thousands of years, doing so can consume huge amounts of time and require many numeric computations, depending on the complexity of the problem you want to solve. Algorithms are all about finding solutions, and the speedier and easier, the better. A huge gap exists between mathematical algorithms historically created by geniuses of their time, such as Euclid, Newton, or Gauss, and modern algorithms created in universities as well as private research and development laboratories. The main reason for this gap is the use of computers. Using computers to solve problems by employing the appropriate algorithm speeds up the task significantly, which is the reason that the development of new algorithms has progressed so fast since the appearance of powerful computer systems. In fact, you may have noticed that more and more solutions to problems appear quickly today, in part, because computer power is both cheap and constantly increasing. Given their ability to solve problems using algorithms, computers (sometimes in the form of special hardware) are becoming ubiquitous.
When working with algorithms, you consider the inputs, desired outputs, and process (a sequence of actions) used to obtain a desired output from a given input. However, you can get the terminology wrong and view algorithms in the wrong way because you haven’t really considered how they work in a real-world setting. The third section of the chapter discusses algorithms in a real-world manner, that is, by viewing the terminologies used to understand algorithms and to present algorithms in a way that shows that the real-world is often less than perfect. Understanding how to describe an algorithm in a realistic manner also makes it possible to temper expectations to reflect the realities of what an algorithm can actually do.
This book views algorithms in many ways. However, because it provides an overview telling how algorithms are changing and enriching people’s lives, the focus is on algorithms used to manipulate data with a computer providing the required processing. With this in mind, the algorithms you work with in this book require data input in a specific form, which sometimes means changing the data to match the algorithm’s requirements. Data manipulation doesn’t change the content of the data. What it does do is change the presentation and form of the data so that an algorithm can help you see new patterns that weren’t apparent before (but were actually present in the data all along).
Sources of information about algorithms often present them in a way that proves confusing because they’re too sophisticated or downright incorrect. Although you may find other definitions, this book uses the following definitions for terms that people often confuse with algorithms (but aren’t):
Equation:
Numbers and symbols that, when taken as a whole, equate to a specific value. An equation always contains an equals sign so that you know that the numbers and symbols represent the specific value on the other side of the equals sign. Equations generally contain variable information presented as a symbol, but they’re not required to use variables.
Formula:
A combination of numbers and symbols used to express information or ideas. Formulas normally present mathematical or logical concepts, such as defining the Greatest Common Divisor (GCD) of two integers (the video at
https://www.khanacademy.org/math/in-sixth-grade-math/playing-numbers/highest-common-factor/v/greatest-common-divisor
tells how this works). Generally, they show the relationship between two or more variables. Most people see a formula as a special kind of equation.
Algorithm: A sequence of steps used to solve a problem. The sequence presents a unique method of addressing an issue by providing a particular solution. An algorithm need not represent mathematical or logical concepts, even though the presentations in this book often do fall into that category because people most commonly use algorithms in this manner. Some special formulas are also algorithms, such as the quadratic formula. In order for a process to represent an algorithm, it must be
Finite:
The algorithm must eventually solve the problem. This book discusses problems with a known solution so that you can evaluate whether an algorithm solves the problem correctly.
Well-defined:
The series of steps must be precise and present steps that are understandable. Especially because computers are involved in algorithm use, the computer must be able to understand the steps to create a usable algorithm.
Effective:
An algorithm must solve all cases of the problem for which someone defined it. An algorithm should always solve the problem it has to solve. Even though you should anticipate some failures, the incidence of failure is rare and occurs only in situations that are acceptable for the intended algorithm use.
With these definitions in mind, the following sections help to clarify the precise nature of algorithms. The goal isn’t to provide a precise definition for algorithms, but rather to help you understand how algorithms fit into the grand scheme of things so that you can develop your own understanding of what algorithms are and why they’re so important.
An algorithm always presents a series of steps and doesn’t necessarily perform these steps to solve a math formula. The scope of algorithms is incredibly large. You can find algorithms that solve problems in science, medicine, finance, industrial production and supply, and communication. Algorithms provide support for all parts of a person’s daily life. Any time a sequence of actions achieving something in our life is finite, well-defined, and effective, you can view it as an algorithm. For example, you can turn even something as trivial and simple as making toast into an algorithm. In fact, the making toast procedure often appears in computer science classes, as discussed at http://brianaspinall.com/now-thats-how-you-make-toast-using-computer-algorithms/.
Unfortunately, the algorithm on the site is flawed. The instructor never removes the toast from the wrapper and never plugs the toaster in, so the result is damaged plain bread still in its wrapper stuffed into a nonfunctional toaster (see the discussion at http://blog.johnmuellerbooks.com/2013/03/04/procedures-in-technical-writing/ for details). Even so, the idea is the correct one, yet it requires some slight, but essential, adjustments to make the algorithm finite and effective.
One of the most common uses of algorithms is as a means of solving formulas. For example, when working with the GCD of two integer values, you can perform the task manually by listing each of the factors for the two integers and then selecting the greatest factor that is common to both. For example, GCD(20, 25) is 5 because 5 is the largest number that divides into both 20 and 25. However, processing every GCD manually (which is actually a kind of algorithm) is time consuming and error prone, so the Greek mathematician Euclid (https://en.wikipedia.org/wiki/Euclid) created an algorithm to perform the task. You can see the Euclidean method demonstrated at https://www.khanacademy.org/computing/computer-science/cryptography/modarithmetic/a/the-euclidean-algorithm.
However, a single formula, which is a presentation of symbols and numbers used to express information or ideas, can have multiple solutions, each of which is an algorithm. In the case of GCD, another common algorithm is one created by Lehmer (see https://www.imsc.res.in/~kapil/crypto/notes/node11.html and https://en.wikipedia.org/wiki/Lehmer%27s_GCD_algorithm for details). Because you can solve any formula multiple ways, people often spend a great deal of time comparing algorithms to determine which one works best in a given situation. (See a comparison of Euclid to Lehmer at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.31.693&rep=rep1&type=pdf.)
Because our society and its accompanying technology are gaining momentum, running faster and faster, we need algorithms that can keep the pace. Scientific achievements such as sequencing the human genome were possible in our age because scientists found algorithms that run fast enough to complete the task. Measuring which algorithm is better in a given situation, or in an average usage situation, is really serious stuff and a topic of discussion among computer scientists.
When it comes to computer science, the same algorithm can see multiple presentations. For example, you can present the Euclidean algorithm in both a recursive and an iterative manner, as explained at http://cs.stackexchange.com/questions/1447/what-is-most-efficient-for-gcd. In short, algorithms present a method of solving formulas, but it would be a mistake to say that just one acceptable algorithm exists for any given formula or that there is only one acceptable presentation of an algorithm. Using algorithms to solve problems of various sorts has a long history — it isn’t something that has just happened.
Even if you limit your gaze to computer science, data science, artificial intelligence, and other technical areas, you find many kinds of algorithms — too many for a single book. For example, The Art of Computer Programming, by Donald E. Knuth (Addison-Wesley), spans 3,168 pages in four volumes (see http://www.amazon.com/exec/obidos/ASIN/0321751043/datacservip0f-20/) and still doesn’t manage to cover the topic (the author intended to write more volumes). However, here are some interesting uses for you to consider:
Searching:
Locating information or verifying that the information you see is the information you want is an essential task. Without this ability, it wouldn’t be possible to perform many tasks online, such as finding the website on the Internet selling the perfect coffee pot for your office.
Sorting:
Determining which order to use to present information is important because most people today suffer from information overload, and putting information in order is one way to reduce the onrush of data. You likely learned as a child that when you place your toys in order, it’s easier to find and play with a toy that interests you, compared to having toys scattered randomly everywhere. Imagine going to Amazon, finding that over a thousand coffee pots are for sale there, and yet not being able to sort them in order of price or most positive review. Moreover, many complex algorithms require data in the proper order to work dependably, therefore ordering is an important requisite for solving more problems.
Transforming:
Converting one sort of data to another sort of data is critical to understanding and using the data effectively. For example, you might understand imperial weights just fine, but all your sources use the metric system. Converting between the two systems helps you understand the data. Likewise, the Fast Fourier Transform (FFT) converts signals between the time domain and the frequency domain so that it becomes possible to make things like your Wi-Fi router work.
Scheduling:
Making the use of resources fair to all concerned is another way in which algorithms make their presence known in a big way. For example, timing lights at intersections are no longer simple devices that count down the seconds between light changes. Modern devices consider all sorts of issues, such as the time of day, weather conditions, and flow of traffic. Scheduling comes in many forms, however. For example, consider how your computer runs multiple tasks at the same time. Without a scheduling algorithm, the operating system might grab all the available resources and keep your application from doing any useful work.
Graph analysis:
Deciding on the shortest line between two points finds all sorts of uses. For example, in a routing problem, your GPS couldn’t function without this particular algorithm because it could never direct you along city streets using the shortest route from point A to point B.
Cryptography:
Keeping data safe is an ongoing battle with hackers constantly attacking data sources. Algorithms make it possible to analyze data, put it into some other form, and then return it to its original form later.
Pseudorandom number generation:
Imagine playing games that never varied. You start at the same place; perform the same steps, in the same manner, every time you play. Without the capability to generate seemingly random numbers, many computer tasks become impossible.
This list presents an incredibly short overview. People use algorithms for many different tasks and in many different ways, and constantly create new algorithms to solve both existing problems and new problems. The most important issue to consider when working with algorithms is that given a particular input, you should expect a specific output. Secondary issues include how many resources the algorithm requires to perform its task and how long it takes to complete the task. Depending on the kind of issue and the sort of algorithm used, you may also need to consider issues of accuracy and consistency.
The previous section mentions the toast algorithm for a specific reason. For some reason, making toast is probably the most popular algorithm ever created. Many grade-school children write their equivalent of the toast algorithm long before they can even solve the most basic math. It’s not hard to imagine how many variations of the toast algorithm exist and what the precise output is of each of them. The results likely vary by individual and the level of creativity employed. In short, algorithms appear in great variety and often in unexpected places.
Every task you perform on a computer involves algorithms. Some algorithms appear as part of the computer hardware. (They are embedded, thus you hear of embedded microprocessors.) The very act of booting a computer involves the use of an algorithm. You also find algorithms in operating systems, applications, and every other piece of software. Even users rely on algorithms. Scripts help direct users to perform tasks in a specific way, but those same steps could appear as written instructions or as part of an organizational policy statement.
Daily routines often devolve into algorithms. Think about how you spend your day. If you’re like most people, you perform essentially the same tasks every day in the same order, making your day an algorithm that solves the problem of how to live successfully while expending the least amount of energy possible. After all, that’s what a routine does; it makes us efficient.
Emergency procedures often rely on algorithms. You take the emergency card out of the packet in front of you in the plane. On it are a series of pictographs showing how to open the emergency door and extend the slide. In some cases, you might not even see words, but the pictures convey the procedure required to perform the task and solve the problem of getting out of the plane in a hurry. Throughout this book, you see the same three elements for every algorithm:
Describe the problem.
Create a series of steps to solve the problem (well defined).
Perform the steps to obtain a desired result (finite and effective).
The term computer sounds quite technical and possibly a bit overwhelming to some people, but people today are neck deep (possibly even deeper) in computers. You wear at least one computer, your smartphone, most of the time. If you have any sort of special device, such as a pacemaker, it also includes a computer. Your smart TV contains at least one computer, as does your smart appliance. A car can contain as many as 30 computers in the form of embedded microprocessors that regulate fuel consumption, engine combustion, transmission, steering, and stability (according to a New York Times article at http://www.nytimes.com/2010/02/05/technology/05electronics.html) and more lines of code than a jet fighter. The automated cars appearing in the car market will require even more embedded microprocessors and algorithms of greater complexity. A computer exists to solve problems quickly and with less effort than solving them manually. Consequently, it shouldn’t surprise you that this book uses still more computers to help you understand algorithms better.
Computers vary in a number of ways. The computer in your watch is quite small; the one on your desktop quite large. Supercomputers are immense and contain many smaller computers all tasked to work together to solve complex issues, such as tomorrow’s weather. The most complex algorithms rely on special computer functionality to obtain solutions to the issues people design them to solve. Yes, you could use lesser resources to perform the task, but the trade-off is waiting a lot longer for an answer or getting an answer that lacks sufficient accuracy to provide a useful solution. In some cases, you wait so long that the answer is no longer important. With the need for both speed and accuracy in mind, the following sections discuss some special computer features that can affect algorithms.
General-purpose processors, CPUs, started out as a means to solve problems using algorithms. However, their general-purpose nature also means that a CPU can perform a great many other tasks, such as moving data around or interacting with external devices. A general-purpose processor does many things well, which means that it can perform the steps required to complete an algorithm, but not necessarily fast. In fact, owners of early general-purpose processors could add math coprocessors (special math-specific chips) to their systems to gain a speed advantage (see http://www.computerhope.com/jargon/m/mathcopr.htm for details). Today, general-purpose processors have the math coprocessor embedded into them, so when you get an Intel i7 processor, you actually get multiple processors in a single package.
Interestingly enough, Intel still markets specialty processor add-ons, such as the Xeon Phi processor used with the Xeon chips (see http://www.intel.com/content/www/us/en/processors/xeon/xeon-phi-detail.html and https://en.wiki2.org/wiki/Intel_Xeon_Phi for details). You use the Xeon Phi chip alongside a Xeon chip when performing compute-intensive tasks such as machine learning (see Machine Learning For Dummies, by John Mueller and Luca Massaron, for details on how machine learning uses algorithms to determine how to perform various tasks that help you use data to predict the unknown and to organize information meaningfully).
You may wonder why this section mentions Graphics Processing Units (GPUs). After all, GPUs are supposed to take data, manipulate it in a special way, and then display a pretty picture onscreen. Any computer hardware can serve more than one purpose. It turns out that GPUs are particularly adept at performing data transformations, which is a key task for solving algorithms in many cases. A GPU is a special-purpose processor, but one with capabilities that lend themselves to faster algorithm execution. It shouldn’t surprise you to discover that people who create algorithms spend a lot of time thinking outside the box, which means that they often see methods of solving issues in nontraditional approaches.
The point is that CPUs and GPUs form the most commonly used chips for performing algorithm-related tasks. The first performs general-purpose tasks quite well, and the second specializes in providing support for math-intensive tasks, especially those that involve data transformations. Using multiple cores makes parallel processing (performing more than one algorithmic step at a time) possible. Adding multiple chips increases the number of cores available. Having more cores adds speed, but a number of factors keeps the speed gain to a minimum. Using two i7 chips won’t produce double the speed of just one i7 chip.
A math coprocessor and a GPU are two examples of common special-purpose chips in that you don’t see them used to perform tasks such as booting the system. However, algorithms often require the use of uncommon special-purpose chips to solve problems. This isn’t a hardware book, but spending a little time looking around can show you all sorts of interesting chips, such as the new artificial neurons that IBM is working on (see the story at http://www.computerworld.com/article/3103294/computer-processors/ibm-creates-artificial-neurons-from-phase-change-memory-for-cognitive-computing.html). Imagine performing algorithmic processing using memory that simulates the human brain. It would create an interesting environment for performing tasks that might not otherwise be possible today.
Neural networks, a technology that is used to simulate human thought and make deep learning techniques possible for machine learning scenarios, are now benefitting from the use of specialized chips, such as the Tesla P100 from NVidia (see the story at https://www.technologyreview.com/s/601195/a-2-billion-chip-to-accelerate-artificial-intelligence/ for details). These kinds of chips not only perform algorithmic processing extremely fast, but learn as they perform the tasks, making them faster still with each iteration. Learning computers will eventually power robots that can move (after a fashion) on their own, akin to the robots seen in the movie I Robot (see one such robot described at http://www.cbsnews.com/news/this-creepy-robot-is-powered-by-a-neural-network/). There are also special chips that perform tasks such as visual recognition (see https://www.technologyreview.com/s/537211/a-better-way-to-build-brain-inspired-chips/ for details).
No matter how they work, specialized processors will eventually power all sorts of algorithms that will have real-world consequences. You can already find many of these real-world applications in a relatively simple form. For example, imagine the tasks that a pizza-making robot would have to solve — the variables it would have to consider on a real-time basis. This sort of robot already exists (this is just one example of the many industrial robots used to produce material goods by employing algorithms), and you can bet that it relies on algorithms to describe what to do, as well as on special chips to ensure that the tasks are done quickly (see the story at http://www.bloomberg.com/news/articles/2016-06-24/inside-silicon-valley-s-robot-pizzeria).
Eventually, it might even be possible to use the human mind as a processor and output the information through a special interface. Some companies are now experimenting with putting processors directly into the human brain to enhance its ability to process information (see the story at https://www.washingtonpost.com/news/the-switch/wp/2016/08/15/putting-a-computer-in-your-brain-is-no-longer-science-fiction/ for details). Imagine a system in which humans can solve algorithms at the speed of computers, but with the creative “what if” potential of humans.
Unless you have unlimited funds, using some algorithms effectively may not be possible, even with specialized chips. In that case, you can network computers together. Using special software, one computer, a master, can use the processors of all slave computers running an agent (a kind of in-memory background application that makes the processor available). Using this approach, you can solve incredibly complex problems by offloading pieces of the problem to a number of slave computers. As each computer in the network solves its part of the problem, it sends the results back to the master, which puts the pieces together to create a consolidated answer, a technique called cluster computing.
Lest you think this is the stuff of science fiction, people are already using cluster computing techniques in all sorts of interesting ways. For example, the article at http://www.zdnet.com/article/build-your-own-supercomputer-out-of-raspberry-pi-boards/ details how you can build your own supercomputer by combining multiple Raspberry Pi (https://www.raspberrypi.org/) boards into a single cluster.
Distributed computing, another version of cluster computing (but with a looser organization) is also popular. In fact, you can find a list of distributed computing projects at http://www.distributedcomputing.info/projects.html. The list of projects includes some major endeavors, such as Search for Extraterrestrial Intelligence (SETI). You can also donate your computer’s extra processing power to work on a cure for cancer. The list of potential projects is amazing.
Networks also let you access other people’s processing power in an unattached form. For example, Amazon Web Services (AWS) and other vendors provide the means to use their computers to perform your work. A network connection can make the remote computers feel as if they’re part of your own network. The point is that you can use networking in all sorts of ways to create connections between computers to solve a variety of algorithms that would be too complicated to solve using just your system.
Part of solving an algorithm has nothing to do with processing power, creative thinking outside the box, or anything of a physical nature. To create a solution to most problems, you also need data on which to base a conclusion. For example, in the toast-making algorithm, you need to know about the availability of bread, a toaster, electricity to power the toaster, and so on before you can solve the problem of actually making toast. The data becomes important because you can’t finish the algorithm when missing even one element of the required solution. Of course, you may need additional input data as well. For example, the person wanting the toast may not like rye. If this is the case and all you have is rye bread to use, the presence of bread still won’t result in a successful result.
Data comes from all sorts of sources and in all kinds of forms. You can stream data from a source such as a real-time monitor, access a public data source, rely on private data in a database, scrape the data from websites, or get it in myriad other ways too numerous to mention here. The data may be static (unchanging) or dynamic (constantly changing). You may find that the data is complete or missing elements. The data may not appear in the right form (such as when you get imperial units and require metric units when solving a weight problem). The data may appear in a tabular format when you need it in some other form. It may reside in an unstructured way (for instance in a NoSQL database or just in a bunch of different data files) when you need the formal formatting of a relational database. In short, you need to know all sorts of things about the data used with your algorithm in order to solve problems with it.
Because data comes in so many forms and you need to work with it in so many ways, this book pays a lot of attention to data. Starting in Chapter 6, you discover just how data structure comes into play. Moving on to Chapter 7, you begin looking at how to search through data to find what you need. Chapters 12 through 14 help you work with big data. However, you can find some sort of data-specific information in just about every chapter of the book because without data, an algorithm can’t solve any problems.
Humans think about data in nonspecific ways and apply various rules to the same data to understand it in ways that computers never can. A computer’s view of data is structured, simple, uncompromising, and most definitely not creative. When humans prepare data for a computer to use, the data often interacts with the algorithms in unexpected ways and produces undesirable output. The problem is one in which the human fails to appreciate the limited view of data that a computer has. The following sections describe two aspects of data that you see illustrated in many of the chapters to follow.
A computer has a simple view of data, but it’s also a view that humans typically don’t understand. For one thing, everything is a number to a computer because computers aren’t designed to work with any other kind of data. Humans see characters on the computer display and assume that the computer interacts with the data in that manner, but the computer doesn’t understand the data or its implications. The letter A is simply the number 65 to the computer. In fact, it’s not truly even the number 65. The computer sees a series of electrical impulses that equate to a binary value of 0100 0001.
Computers also don’t understand the whole concept of uppercase and lowercase. To a human, the lowercase a is simply another form of the uppercase A, but to a computer they’re two different letters. A lowercase a appears as the number 97 to the computer (a binary value of 0110 0001).
If these simple sorts of single letter comparisons could cause such problems between humans and computers, it isn’t hard to imagine what happens when humans start assuming too much about other kinds of data. For example, a computer can’t hear or appreciate music. Yet, music comes out of the computer speakers. The same holds true for graphics. A computer sees a series of 0s and 1s, not a graphic containing a pretty scene of the countryside.
It’s important to consider data from the computer’s perspective when using algorithms. The computer sees only 0s and 1s, nothing else. Consequently, when you start working through the needs of the algorithm, you must view the data in that manner. You may actually find it beneficial to know that the computer’s view of data makes some solutions easier to find, not harder. You discover more about this oddity in viewing data as the book progresses.
Computers also have a strict idea about the form and structure of data. When you begin working with algorithms, you find that a large part of the job involves making the data appear in a form that the computer can use when using the algorithm to find a solution to an issue. Although a human can mentally see patterns in data that isn’t arranged precisely right, computers really do need the precision to find the same pattern. The benefit of this precision is that computers can often make new patterns visible. In fact, that’s one of the main reasons to use algorithms with computers — to help locate new patterns and then use those patterns to perform other tasks. For example, a computer may recognize a customer’s spending pattern so that you can use the information to generate more sales automatically.