The Python Book

Discover the power of one of the fastest growing programming languages in the world with this insightful new resource.

The Python Book delivers an essential introductory guide to learning Python for anyone who works with data but does not have experience in programming. The author, an experienced data scientist and Python programmer, shows readers how to use Python for data analysis, exploration, cleaning, and wrangling. Readers will learn what in the Python language is important for data analysis, and why.

The Python Book offers readers a thorough and comprehensive introduction to Python that is both simple enough to be ideal for a novice programmer, yet robust enough to be useful for those more experienced in the language. The book helps budding programmers gradually increase their skills as they move through it, always with an understanding of what they are covering and why it is useful. Used by major companies like Google, Facebook, Instagram, Spotify, and more, Python promises to remain central to the programming landscape for years to come.

Alongside a thorough discussion of Python programming topics like variables, equalities and comparisons, tuple and dictionary data types, while and for loops, and if statements, readers will also learn:
* How to use highly useful Python programming libraries, including Pandas and Matplotlib
* How to write Python functions and classes
* How to write and use Python scripts
* How to deal with different data types within Python

Perfect for statisticians, computer scientists, software programmers, and practitioners working in private industry and medicine, The Python Book will also be of interest to students in any of the aforementioned fields. As it assumes no programming experience or knowledge, the book is ideal for those who work with data and want to learn to use Python to enhance their work.
Page count: 221
Year of publication: 2022
Cover
Title Page
Copyright
1 Introduction
2 Getting Started
3 Packages and Builtin Functions
4 Data Types
5 Operators
6 Dates
7 Lists
8 Tuples
9 Dictionaries
10 Sets
11 Loops, if, Else, and While
12 Strings
13 Regular Expressions
14 Dealing with Files
14.1 Excel
14.2 JSON
14.3 XML
15 Functions and Classes
16 Pandas
16.1 Numpy Arrays
16.2 Series
16.3 DataFrames
16.4 Merge, Join, and Concatenation
16.5 DataFrame Methods
16.6 Missing Data
16.7 Grouping
16.8 Reading in Files with Pandas
17 Plotting
17.1 Pandas
17.2 Matplotlib
17.3 Seaborn
18 APIs in Python
19 Web Scraping in Python
19.1 An Introduction to HTML
19.2 Web Scraping
20 Conclusion
Index
End User License Agreement
Chapter 2
Figure 2.1 Anaconda navigator.
Figure 2.2 Jupyter Notebook.
Figure 2.3 Jupyter Notebook example.
Figure 2.4 Qt Console.
Figure 2.5 Qt Console example.
Figure 2.6 Command line example.
Chapter 15
Figure 15.1 Spyder IDE.
Figure 15.2 Run file in Spyder.
Chapter 17
Figure 17.1 Line plot of sepal length.
Figure 17.2 Histogram of sepal length.
Figure 17.3 Boxplot of sepal length.
Figure 17.4 Density plot of sepal length.
Figure 17.5 Area plot of sepal length.
Figure 17.6 Histogram of sepal length.
Figure 17.7 KDE of sepal length.
Figure 17.8 Line plot of sepal length.
Figure 17.9 Box plot of iris data.
Figure 17.10 Density plot on iris data.
Figure 17.11 Line plot on iris data.
Figure 17.12 Scatter plot on pandas data frame.
Figure 17.13 Area plot of iris DataFrame.
Figure 17.14 Histogram of iris DataFrame.
Figure 17.15 KDE of iris DataFrame.
Figure 17.16 Pie plot of tip size by day.
Figure 17.17 Barh plot of tips data by day.
Figure 17.18 Iris plot.
Figure 17.19 Panel plot example one.
Figure 17.20 Panel plot example two.
Figure 17.21 Plot with custom line colour.
Figure 17.22 Plot with custom linetype.
Figure 17.23 Plot with custom colour linetype.
Figure 17.24 Plot with limits altered.
Figure 17.25 Plot with reverse limits.
Figure 17.26 Plot with labels.
Figure 17.27 Plot with legend.
Figure 17.28 Scatter plots with different markers.
Figure 17.29 Scatter plot with different sizes.
Figure 17.30 Scatter using relplot in seaborn.
Figure 17.31 Scatter plot using relplot in seaborn with a third variable.
Figure 17.32 Scatter plot using relplot in seaborn with a third variable and ...
Figure 17.33 Scatter plot using relplot in seaborn with hue on size.
Figure 17.34 Scatter plot using relplot in seaborn with a third variable usin...
Figure 17.35 Scatter plot using relplot in seaborn with different size of poi...
Figure 17.36 Line plot in Seaborn using relplot.
Figure 17.37 Line plot in relplot with mean and confidence interval.
Figure 17.38 Line plot in relplot with mean and no confidence interval.
Figure 17.39 Line plot in relplot with mean and standard deviation.
Figure 17.40 Line plot in relplot with hue applied.
Figure 17.41 Line plot in relplot with hue and style applied.
Figure 17.42 Line plot in relplot with hue and style applied on the dots data...
Figure 17.43 Multi scatter plot on tips data.
Figure 17.44 Multi line plot with rows and columns using fmri dataset.
Figure 17.45 Multi line plot with rows and columns using reduced fmri datase...
Figure 17.46 Multiline plot using col wrap.
Figure 17.47 Catplot of day against total bill from the tips dataset.
Figure 17.48 Catplot of day against total bill from the tips dataset with ki...
Figure 17.49 Catplot of day against total bill from the tips dataset with ki...
Figure 17.50 Catplot of size against total bill.
Figure 17.51 Catplot of smoker against tip using order argument.
Figure 17.52 Catplot of total bill against day with swarm and hue of time.
Figure 17.53 Boxplot using catplot.
Figure 17.54 Boxplot using catplot with a hue.
Figure 17.55 Boxen plot using catplot.
Figure 17.56 Violin plot using catplot.
Figure 17.57 Violin plot using catplot using a split on the hue.
Figure 17.58 Bar plot using catplot.
Figure 17.59 Count plot using catplot.
Figure 17.60 Boxplot of iris data.
Figure 17.61 Multiple plots with col in catplot.
Figure 17.62 Histogram with KDE.
Figure 17.63 Histogram with rugplot.
Figure 17.64 Histogram with bins option set.
Figure 17.65 Joint plot.
Figure 17.66 Pairplot example using iris data.
Chapter 18
Figure 18.1 Example of flask‐restful download page.
Figure 18.2 Display of terminal window upon starting up API.
Figure 18.3 Display of API from browser.
Figure 18.4 Display of API from browser getting all films.
Figure 18.5 Display of API from browser getting film id 1.
Figure 18.6 Display of API from browser getting film id 3.
Chapter 19
Figure 19.1 Display of website from terminal.
Figure 19.2 Display of website from browser.
Figure 19.3 Display of website from with lowercase decorator applied.
Figure 19.4 Display of website from browser with html.
Figure 19.5 Display of website from browser with h1 hello world.
Figure 19.6 Display of website from browser showing a table.
Figure 19.7 Display of website from browser showing a table with customisati...
Figure 19.8 Display of website from browser showing a table with header and ...
Rob Mastrodomenico
Global Sports Statistics, Swindon, United Kingdom
This edition first published 2022
© 2022 John Wiley and Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Rob Mastrodomenico to be identified as the author of this work has been asserted in accordance with law.
Registered Office
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting scientific method, diagnosis, or treatment by physicians for any particular patient. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. 
Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data
Names: Mastrodomenico, Rob, author.
Title: The Python book / Rob Mastrodomenico.
Description: Hoboken, NJ : Wiley, 2022. | Includes bibliographical
references and index.
Identifiers: LCCN 2021040056 (print) | LCCN 2021040057 (ebook) | ISBN
9781119573319 (paperback) | ISBN 9781119573395 (adobe pdf) | ISBN
9781119573289 (epub)
Subjects: LCSH: Python (Computer program language)
Classification: LCC QA76.73.P98 M379 2022 (print) | LCC QA76.73.P98
(ebook) | DDC 005.13/3‐‐dc23
LC record available at https://lccn.loc.gov/2021040056
LC ebook record available at https://lccn.loc.gov/2021040057
Cover Design: Wiley
Cover Image: © shuoshu/Getty Images
1 Introduction
Welcome to The Python Book. Over the following pages you will be given an insight into the Python language. The genesis of this book is my experience of using and, more importantly, teaching Python over the last 10 years. With my background as a Data Scientist, I have used a number of different programming languages over the course of my career, and Python is the one that has stuck with me. Why Python? I enjoy Python because it's fast to develop with and covers many different applications, allowing me to use it for pretty much everything. For you, the reader, Python is a great choice of language to learn as it's easy to pick up and fast to get going with, which means that novice programmers can feel like they are making progress. This book is not just for complete novices: if you have some experience with Python, then this book is a great reference. The fact that you can pick up Python quickly means that many users skip the basics. This book looks to cover all the basics, giving you the building blocks to do great things with the language. What this book is not intended to do is overcomplicate anything. Python is beautiful in its simplicity and this book looks to stick to that approach. Concepts will be explained in simple terms, and examples will be used to show how to use the introduced concepts in practice.
Having discussed what this book is intended to do, what is Python? Simply put, Python is a programming language. It is general purpose, meaning that it can do lots of things. In this book we will specialise in applying Python to data‐driven applications; however, Python can be used for many other applications including AI, machine learning, and web development, to name just a few. The language itself is high level and interpreted, meaning that code need not be compiled before running. One of the big attractions of the language is the simplicity of its syntax, which makes it great to learn and even better to write code in. Aside from the clear, easy to understand syntax, the language uses indentation to distinguish different elements of the code. Python is an object‐orientated language, and we will demonstrate this in more detail throughout this book. However, you can write Python code how you prefer, be it object orientated, functional, or interactive. The best way to demonstrate Python is by doing, so let's get started; but to do so we need to get Python installed.
2 Getting Started
For the purposes of this book, we want you to install the Anaconda distribution of Python, which is available at https://www.anaconda.com. Here, you have distributions for Windows, Mac, and Linux, which can be easily installed on your computer. Once you have Anaconda installed, you will have access to the Anaconda Navigator as shown in Figure 2.1.
Here, you get the following included by default:
JupyterLab
Notebook
Qt Console
Spyder
To follow the examples within this book you can use the Notebook or the Qt Console. The Notebook is an interactive web‐based editor, as shown in Figure 2.2.
Here, you can type your code, run the command, and then see the result, which is a nice way to work and is very popular. As an example, we define a variable x, then type x and run the command with the run button to show the result (Figure 2.3).
However, for the purposes of the book we will use a console‐based view, which you can easily obtain through the Qt Console. An example is shown in Figure 2.4.
As with the Notebook, we show the same example using the Qt Console in Figure 2.5.
Within this book we will denote anything that is an input with >>>, and any output will have no arrows preceding it (Figure 2.6).
Another concept that the reader will need to be familiar with is navigating using the terminal (on Unix‐like systems, including Linux and Mac) or the command prompt (on Windows). These can be opened in various ways, but simply searching for the word terminal or command prompt will bring up the relevant screen. To navigate through the file system you can use the command cd to change directory. This is essentially like clicking on a folder to see what is in it. Unlike a file viewing interface, the terminal does not show what is in a given directory by default, so to do so you need to use the command ls (dir on the Windows command prompt). This command lists the files and directories within the current location. Let's demonstrate with an example of navigating to a directory and then running a Python file.
Aside from the Anaconda Navigator, over 250 open‐source data science and machine learning packages are automatically installed. You can also make use of the conda installer to easily install over 7500 packages into Python. A full list of packages that come with Anaconda is available for the relevant operating system from https://repo.anaconda.com/pkgs/. Details on using the conda installer are available from https://docs.anaconda.com/anaconda/user-guide/tasks/install-packages/; however, this is outside the scope of this book. The last concept we will raise but not cover in detail is that of virtual environments, where the user develops in an isolated Python environment and adds packages as needed. It is a very popular approach to development; however, as this book is aimed at beginners, we use the packages included in the Anaconda installation.
Figure 2.1 Anaconda navigator.
Figure 2.2 Jupyter Notebook.
Figure 2.3 Jupyter Notebook example.
Figure 2.4 Qt Console.
Figure 2.5 Qt Console example.
Figure 2.6 Command line example.
3 Packages and Builtin Functions
We have discussed packages without really describing what they are, so let's look at packages and how they sit within the general setup of Python. As mentioned previously, Python is object orientated, which means that everything is an object; you'll get to understand this in practice. However, there are a few important builtin functions that are called directly rather than as methods of an object, and they are worth mentioning here as they will be used throughout the book, so keep an eye out for them. Below we show some useful ones; for a full list refer to https://docs.python.org/3/library/functions.html.
dir(): Takes in an object and returns the result of that object's __dir__(), giving us the attributes of the object
float(): Returns a floating point number from an integer or string
int(): Returns an integer from a float or string
len(): Returns the length of an object
list(): Creates a list from the argument given
max(): Gives the maximum value from the argument provided
min(): Gives the minimum value from the argument provided
print(): Prints the object to the text stream
round(): Rounds the number to a specified precision
str(): Converts the object to type string
type(): Returns the type of an object
abs(): Returns the absolute value of a numeric value passed in
help(): Gives access to the Python help system
If you are unfamiliar with the Python concepts used above, they will be introduced throughout this book.
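To make the list above concrete, here is a minimal sketch that calls several of these builtin functions. The variable names and values are our own illustrative choices and are not taken from the book.

```python
# Illustrative values only; none of these come from the original text.
values = [3, -7, 10]

length = len(values)          # number of items in the list
largest = max(values)         # largest item
smallest = min(values)        # smallest item
magnitude = abs(smallest)     # absolute value of -7
as_float = float("2.5")       # string to float
as_int = int(2.9)             # float to int (truncates towards zero)
as_text = str(42)             # int to string
rounded = round(3.14159, 2)   # round to two decimal places

print(length, largest, smallest, magnitude)
print(as_float, as_int, as_text, rounded)
print(type(as_text))
```

Note that int() drops the fractional part rather than rounding, which is why round() is listed separately.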
Alongside these builtin functions, Python also comes with a number of packages. These packages perform specific tasks and are imported into our code. Python has a number of packages that come as default; however, there are lots of third‐party packages which we can also use. In using the Anaconda distribution we get all the default packages as well as the packages described previously. We will cover both default and third‐party packages throughout this book. To demonstrate this we will introduce how to import a package. The package we are going to introduce is datetime, which is part of the standard Python library. This means that it comes with Python and is not developed by a third party. To import the datetime package you just need to type the following:
In doing this we now have access to everything within datetime, and to see what datetime contains we can run the builtin function dir, which as we showed earlier gives us the attributes of the object.
To see what these attributes are, we use the dot syntax to access attributes of the object. So we can see what MINYEAR and MAXYEAR are as follows.
We can also import specific parts of a package by using the from syntax, as demonstrated below.
What this says is: from the package datetime, import the specific date attribute. This is then the only aspect of datetime that we have access to. It is good practice to only import what you need from a package. Now every time we want to use date we have to call date. In this case it's easy enough, but you can also give the import an alias, which can shorten your code.
Those are the basics of importing packages. Throughout this book we will import from various packages as well as show how this same syntax can be used to import our own code. Alongside builtin functions, these are key concepts that we will use extensively within this book.
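A minimal sketch of the import, assuming a standard Python 3 installation; the print line is our addition, included only to confirm that the name is now available:

```python
import datetime  # standard-library module, no installation needed

print(datetime.__name__)
```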
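As a sketch of that call (the variable name attributes is our own choice, and the exact list printed can vary between Python versions):

```python
import datetime

attributes = dir(datetime)  # list of attribute names as strings
print(attributes)
```

Among the names returned you should find MINYEAR, MAXYEAR, and date, which are used in what follows.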
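A short sketch of the dot syntax applied to these two attributes:

```python
import datetime

print(datetime.MINYEAR)  # smallest year the module supports
print(datetime.MAXYEAR)  # largest year the module supports
```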
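A minimal sketch of the from syntax. The date used and the variable name release_day are purely illustrative assumptions:

```python
from datetime import date

# The date chosen here is purely illustrative.
release_day = date(2022, 1, 31)
print(release_day)       # printed in ISO format, year-month-day
print(release_day.year)
```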
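An alias is given with the as keyword. In this sketch the alias d and the date are arbitrary choices of ours:

```python
from datetime import date as d  # "d" is an arbitrary alias

some_day = d(2022, 1, 31)
print(some_day.month)
```

Short aliases save typing, but a name that is too terse can make code harder to read, so it is a trade‐off.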
4 Data Types
The next concept of Python that we will introduce is data types. In this chapter we will introduce a number of these and show how they behave when applied to some basic operators. We start by introducing integers, which are numbers without a decimal point, written as follows:
A float is by definition a floating point number, so we can write the previous value as follows:
A string is simply a sequence of characters enclosed in either double or single quotes. So again we can rewrite what we have seen as follows:
Given that we know how to define these variables, how can we check what they are? Conveniently, Python has a builtin type function that allows us to determine the type of a variable. So we will rewrite what we have done, assign each instance to a variable, and then see what type Python thinks they are:
Now that we can define variables, the question is what we can do with them. Initially we will consider the following operations:
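For instance, with the value 1 chosen purely for illustration:

```python
x = 1  # an integer literal: no decimal point
print(x)
```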
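Taking 1 as the illustrative value again, the float version might look like this:

```python
x = 1.0  # the decimal point makes this a float
print(x)
```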
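A small sketch showing both quoting styles, again with an illustrative value of our own:

```python
x = "1"  # double quotes
y = '1'  # single quotes give the same string
print(x == y)
```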
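A minimal sketch, assuming the variable names x, y, and z for the integer, float, and string respectively:

```python
x = 1
y = 1.0
z = "1"

print(type(x))  # <class 'int'>
print(type(y))  # <class 'float'>
print(type(z))  # <class 'str'>
```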
+
-
*
/
These are commonly known as the mathematical operations: addition, subtraction, multiplication, and division.
Let's start with +. If we have two integers, applying + is mathematical addition, as we will show:
Similarly, if we do the same with two floats we get a similar result:
But what happens if we apply addition to a float and an integer? Let's see:
What we see is that addition works on a float and an integer but returns a float, so Python is converting the integer into a float.
What if we use addition on a string? This is the interesting part. Let's run the same example from before with x and y as string representations.
What has happened here? We have stuck x and y together. This is known as concatenation and is a very powerful tool in dealing with strings.
We considered the + operation with integers and floats, but what will happen if we apply the + operation to a string and, say, an integer?
What we see here is an error message saying we cannot concatenate a str and an int object. Python in this instance wants to use the + operation as concatenation, but because it doesn't have two strings it can't do that and hence throws an error. In Python you cannot add a string to an integer or a float, so we won't consider operations between these types for the rest of this section.
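A sketch with two integers. The values 2 and 3 are our own illustrative choice:

```python
x = 2
y = 3
total = x + y
print(total)  # 5
```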
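The same sketch with floats, again using illustrative values:

```python
x = 2.0
y = 3.0
total = x + y
print(total)  # 5.0
```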
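A sketch mixing the two types, with values chosen by us for illustration:

```python
x = 2.0  # float
y = 3    # integer
total = x + y
print(total, type(total))  # 5.0 <class 'float'>
```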
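A sketch assuming x and y hold the string forms of our illustrative values 2 and 3:

```python
x = "2"
y = "3"
joined = x + y
print(joined)  # 23
```

The result is the string "23" and not the number 5, because + has joined the two pieces of text.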
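To see the error without stopping the session, a sketch like the following catches it. The try/except wrapper is our addition, and the exact wording of the message depends on the Python version:

```python
x = "2"  # string
y = 3    # integer

message = None
try:
    x + y
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```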
Let us now look at the - operation. First considering two integers we get the following:
As you may have expected, the - operation with two integers acts as mathematical subtraction. If we apply it to two floats, or to a mix of floats and integers, it also acts as subtraction.
What about strings? Can we apply - to two strings?
Here we get another error, but this time it is because the - operation doesn't support strings. This means that when you try to operate on two strings using this operation, Python doesn't know what to do. The same is true for the * and / operations on two strings. So, if we are dealing with two strings, the only operation from this section that we can use is +, which is concatenation.
The next operation we will consider is *, which is mathematical multiplication. Considering its use on two integers we get the following:
As we can see, it's mathematical multiplication, and the same is true when we apply it to two floats. Let us see what happens when we mix floats and integers.
As we can see, it returns the product as a float, so as with addition and subtraction it converts integers to floats.
Next, we need to see how the / operation works on integers and floats. We first consider operands of the same type, so we will apply / to two integers:
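A sketch of subtraction with two integers. The values are illustrative assumptions of ours:

```python
x = 5
y = 3
difference = x - y
print(difference)  # 2
```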
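A sketch that attempts the subtraction on two strings and catches the resulting error. The try/except wrapper and the variable names are our own additions:

```python
x = "5"
y = "3"

message = None
try:
    x - y
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```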
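A sketch with two integers, using illustrative values:

```python
x = 2
y = 3
product = x * y
print(product)  # 6
```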
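A sketch mixing a float and an integer, with values chosen by us:

```python
x = 2.0  # float
y = 3    # integer
product = x * y
print(product, type(product))  # 6.0 <class 'float'>
```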
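A sketch assuming Python 3, where / performs true division and returns a float even when both operands are integers (Python 2 behaved differently). The values are illustrative:

```python
x = 6
y = 3
quotient = x / y
print(quotient, type(quotient))  # 2.0 <class 'float'>

print(7 / 2)  # 3.5
```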
There are other data types beyond these, and the first we consider is complex numbers, which can be defined as follows:
We can obtain the real and imaginary parts of our complex numbers as follows:
We can also use the built‐in function complex:
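A minimal sketch. Python writes the imaginary unit as j, and the value 2 + 3j is our own illustrative choice:

```python
z = 2 + 3j  # the j suffix marks the imaginary part
print(z)        # (2+3j)
print(type(z))  # <class 'complex'>
```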
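A sketch using the real and imag attributes, with an illustrative value:

```python
z = 2 + 3j
print(z.real)  # 2.0
print(z.imag)  # 3.0
```

Both parts come back as floats, even though they were written as whole numbers.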
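A sketch of the builtin, passing the real part first and the imaginary part second. The values are again illustrative:

```python
z = complex(2, 3)   # real part 2, imaginary part 3
print(z)            # (2+3j)
print(z == 2 + 3j)  # True
```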
