Streamlining Your Research Laboratory with Python - Mark F. Russo - E-Book

Streamlining Your Research Laboratory with Python E-Book

Mark F. Russo

Description

Enables scientists and researchers to efficiently use one of the most popular programming languages in their day-to-day work

Streamlining Your Research Laboratory with Python covers the Python programming language and its ecosystem of tools applied to tasks encountered by laboratory scientists and technicians working in the life sciences. After opening with the basics of Python, the chapters move through working with and analyzing data, generating reports, and automating the lab environment.

The book includes example processes within chapters and code listings on nearly every page along with schematics and plots that can clearly illustrate Python at work in the lab. The book also explores some real-world examples of Python’s application in research settings, demonstrating its potential to streamline processes, improve productivity, and foster innovation.

Streamlining Your Research Laboratory with Python includes information on:

  • Language basics including the interactive console, data types, variables and literals, strings, and expressions using operators
  • Custom functions and exceptions such as arguments and parameters, names and scope, and decorators
  • Conditional and repeated execution as methods to control the flow of a program
  • Tools such as JupyterLab, Matplotlib, NumPy, pandas DataFrame, and SciPy
  • Report generation in Microsoft Word and PowerPoint, PDF report generation, and serving results through HTTP and email automatically

Whether you are a biologist analyzing genetic data, a chemist scouting synthesis routes, an engineer optimizing machine parameters, or a social scientist studying human behavior, Streamlining Your Research Laboratory with Python serves as a logical and practical guide to add Python to your research toolkit.

Page count: 751

Year of publication: 2025

Table of Contents

Cover

Table of Contents

Title Page

Copyright

Dedication

Preface

Chapter 1: Introduction

1.1 Python Implementations

1.2 Installing the Python Toolkit

1.3 Python 3 vs. Python 2

1.4 Python Package Index

1.5 Programming Editors

1.6 Notebook Editors

1.7 Using the Jupyter Notebook Interface

1.8 JupyterLite

1.9 Things Change

1.10 Key Takeaways

Chapter 2: Language Basics

2.1 Python Interactive Console

2.2 Data Types

2.3 Variables and Literals

2.4 Strings

2.5 Expressions Using Operators

2.6 Functions and How to Use Them

2.7 Your First Python Program

2.8 Key Takeaways

Chapter 3: Data Structures

3.1 Lists

3.2 Tuples

3.3 Dictionaries

3.4 Sets

3.5 Destructuring Assignment

3.6 Key Takeaways

Chapter 4: Controlling the Flow of a Program

4.1 Conditional Execution

4.2 Repeated Execution

4.3 Key Takeaways

Chapter 5: Custom Functions and Exceptions

5.1 Defining Custom Functions

5.2 Arguments and Parameters

5.3 Names and Scope

5.4 Scope vs. Namespace

5.5 Organizing Your Code with Modules

5.6 Decorators

5.7 How Things Go Wrong

5.8 Python Exceptions

5.9 Handling Exceptions

5.10 Raising Your Own Exceptions

5.11 Key Takeaways

Chapter 6: Regular Expressions

6.1 Matching Literal Text

6.2 Alternation

6.3 Defining and Matching Character Classes

6.4 Metaclasses

6.5 Pattern Sequences

6.6 Repeating Patterns with Quantifiers

6.7 Anchors

6.8 Capturing Groups

6.9 Regular Expressions in Python

6.10 Project – A Formula Mass Calculator

6.11 Key Takeaways

Chapter 7: Working with Data

7.1 A File System Primer

7.2 Text Files

7.3 Reading and Writing Text Files

7.4 Working with Comma-Separated Values (CSV) Files

7.5 The csv Module

7.6 Reading and Writing Excel Spreadsheet

7.7 Project – Generate a Random Sample Layout in a Spreadsheet

7.8 Project – Forecast Monthly Sample Processing

7.9 Managing the File System

7.10 Walking a File System Tree

7.11 Project – Find Duplicate Files

7.12 Working with Zip Files

7.13 Working with Standard Data Formats

7.14 Key Takeaways

Chapter 8: Web Resources

8.1 TCP/IP Networks – What You Need to Know

8.2 Introduction to Hypertext Transfer Protocol

8.3 Web Services and the Python Requests Module

8.4 Project – Print Weather Forecast for a Location

8.5 Project – Scraping HTML Page Content

8.6 Key Takeaways

Chapter 9: Data Analysis and Visualization

9.1 JupyterLab

9.2 Scientific Plotting with Matplotlib

9.3 NumPy – Numerical Python

9.4 pandas DataFrame

9.5 SciPy – A Library for Mathematics, Science, and Engineering

9.6 Key Takeaways

Chapter 10: Report Generation

10.1 BytesIO Object

10.2 Generating Reports in Microsoft Word

10.3 Generating Microsoft PowerPoint Presentations

10.4 Generating PDF File Reports

10.5 Sending Email Programmatically

10.6 Serving Results with an HTTP Server

10.7 Key Takeaways

Chapter 11: Control and Automation

11.1 Concurrency in Python

11.2 Asynchronous Execution

11.3 Concurrent Programs with AsyncIO

11.4 Asynchronous Instrument Control and Coordination

11.5 Communicating over a Serial Port

11.6 Execute Remote Commands over HTTP

11.7 Persistent Network Connections using a WebSocket

11.8 Responding to File System Changes

11.9 Executing Tasks on a Schedule

11.10 Key Takeaways

Postface

References

Appendix A: ASCII American Standard Code for Information Interchange

Index

End User License Agreement

List of Illustrations

Chapter 1

Figure 1.1 The Python Interactive Console running in Windows PowerShell.

Figure 1.2 Jupyter notebook architecture.

Figure 1.3 JupyterLab in action.

Figure 1.4 JupyterLite demonstration.

Chapter 2

Figure 2.1 Python Interactive Console.

Chapter 3

Figure 3.1 Diagram of a list with positive and negative indexes.

Chapter 4

Figure 4.1 Schematic diagram of a basic if-statement.

Figure 4.2 Schematic of an if-else statement.

Figure 4.3 Schematic of an if-elif-else statement.

Figure 4.4 Schematic of a conditional expression.

Figure 4.5 Schematic of a while-statement.

Figure 4.6 Unit square with inscribed unit circle for the Monte Carlo dartboard...

Figure 4.7 Schematic of a for-statement.

Figure 4.8 Flow of a break and continue in a for-statement.

Figure 4.9 Schematic of a list comprehension.

Figure 4.10 Squares accumulation on the left, rewritten as a list comprehension...

Chapter 5

Figure 5.1 Schematic diagram of a Python function definition.

Figure 5.2 Schematic of a typical 96-well microtiter plate.

Figure 5.3 A basic try-statement.

Figure 5.4 A full try-statement with all options.

Chapter 7

Figure 7.1 Sample file system hierarchy.

Figure 7.4 Sample throughput recorded in a spreadsheet.

Figure 7.5 Results from the Monte Carlo simulation of sample throughput.

Chapter 8

Figure 8.1 Schematic of two computers connected over a TCP/IP network.

Figure 8.2 Image files returned from running fetch_and_save_image.py with the...

Chapter 9

Figure 9.1 A new notebook opened in JupyterLab with a Python 3 kernel.

Figure 9.6 Output plot generated by the updated weather_forecast.py program.

Figure 9.7 A matplotlib pcolormap visualization with default axis formatting.

Figure 9.9 Custom heatmap illustration.

Figure 9.19 Illustration of the parameters of the 4PL model.

Chapter 10

Figure 10.1 Generated Word document.

Figure 10.2 Generated Word document containing a paragraph with styled text runs.

Figure 10.3 Generated document with the “heat1.png” file inserted.

Figure 10.4 Generated document with an added Table.

Figure 10.5 Generated Presentation with a blank slide and a single picture shape.

Figure 10.6 Generated Presentation with an added Table.

Figure 10.7 One Figure slide from the generated presentation.

Figure 10.8 Generated PDF with drawn lines and text.

Figure 10.9 Output PDF document generated by Listing 10.16.

Figure 10.10 PDF file with an image.

Figure 10.11 First page of the generated PDF report file.

Figure 10.12 Example page returned by the http.server module’s default HTTP server.

Figure 10.13 Example page from the http.server bound to a local network address.

Chapter 11

Figure 11.1 An integrated laboratory automation system.

Figure 11.2 USB plug and port diagrams.

Figure 11.3 Common RS-232C physical port designs.

Figure 11.4 Three sample barcodes.

Figure 11.5 Listing returned from the HTTP server started using the program in L...

Figure 11.6 Response from invoking a URL with the path /count_files.

Figure 11.7 Simple browser client for sending messages to a server using a WebSocket.

Figure 11.8 First browser window connected to the broadcast server.

Figure 11.9 Second browser window connected to the broadcast server.

Figure 11.10 Browser client for testing the Python client.

Figure 11.11 Browser interface for interactively scheduling samples in the asynch...

Figure 11.12 Sample Scheduler user interface used to schedule four samples and re...

List of Tables

Chapter 2

Table 2.1 Fundamental Python data types.

Table 2.2 Python escape characters and their meanings.

Table 2.3 Examples of common f-string format specifiers.

Table 2.4 Common string methods.

Table 2.5 Common Python binary operators.

Table 2.6 Common assignment and augmented assignment operators.

Table 2.7 Common Python comparison operators.

Table 2.8 Python Boolean operators.

Table 2.9 Truth table for “and” operator (logical conjunction).

Table 2.10 Truth table for “or” operator (logical disjunction).

Table 2.11 Truth table for “not” operator (negation).

Table 2.12 Truth table for missing operator exclusive-or.

Table 2.13 Common Python global functions.

Table 2.14 Common mathematical functions and constants available from the math module.

Table 2.15 Useful random module functions.

Table 2.16 Useful functions and properties of the time module.

Table 2.17 datetime type functions to create new datetime objects.

Table 2.18 datetime object methods.

Table 2.19 Several useful datetime object format specifiers.

Table 2.20 Useful functions and properties of the Python sys module.

Chapter 3

Table 3.1 Several operators that accept lists as operands.

Table 3.2 Several list methods.

Table 3.3 Tuple methods.

Table 3.4 Operators that accept dictionaries as operands.

Table 3.5 Several dictionary methods.

Table 3.6 Operators that accept sets as operands.

Table 3.7 Several set methods.

Chapter 5

Table 5.1 Common Python exception types.

Chapter 6

Table 6.1 Non-glyph characters and their matching escape sequence.

Table 6.2 Regular expression character classes.

Table 6.3 Regular expression metaclasses.

Table 6.4 Sample regular expression patterns and descriptions, with matching a...

Table 6.5 Regular expression quantifiers.

Table 6.6 Regular expression quantifiers with examples.

Table 6.7 Greedy and non-greedy regular expression pattern matching.

Table 6.8 Several useful Pattern object methods.

Chapter 7

Table 7.1 A few sample code points and their UTF-8 encodings.

Table 7.2 File open mode characters.

Table 7.3 Common file object fields and methods.

Table 7.4 Common properties and methods of the openpyxl Workbook object.

Table 7.5 Common properties and methods of the openpyxl Worksheet object.

Table 7.6 Common properties and methods of the openpyxl Cell object.

Table 7.7 Methods and properties describing a path.

Table 7.8 Attributes of the object returned by invoking stat() on a Path.

Table 7.9 Methods and Properties to operate on a Path.

Table 7.10 Useful functions of the shutil module.

Table 7.11 Several methods and properties of the zipfile.ZipFile object.

Table 7.12 Several methods and properties of the zipfile Path object.

Table 7.13 JSON objects, descriptions, and examples.

Table 7.14 Common json module functions.

Table 7.15 Important objects in the BeautifulSoup module.

Table 7.16 Useful BeautifulSoup Tag object properties and methods.

Table 7.17 find(…) and find_all(…) filter method argument options.

Chapter 8

Table 8.1 Parts of the example PubChem URL.

Table 8.2 HTTP Response status code ranges and descriptions.

Table 8.3 Useful Request module Response object methods and properties.

Table 8.4 Useful keyword parameters of Request module functions that customize...

Table 8.5 Useful properties of the PreparedRequest object.

Chapter 9

Table 9.1 Core Libraries in the Python Scientific Stack.

Table 9.2 Common functions of the matplotlib.pyplot submodule to customize a plot.

Table 9.3 Several examples of plot(…) function format string parameters.

Table 9.4 Several pyplot submodule functions that generate different types of...

Table 9.5 Common NumPy functions, including ndarray object creation functions.

Table 9.6 Common properties, methods, and operators of the ndarray object.

Table 9.7 Several pandas functions to read and write a DataFrame to/from a for...

Table 9.8 Several DataFrame properties and methods.

Table 9.9 Several useful pandas module functions.

Table 9.10 Source rack map for all substances used in screening experiment.

Table 9.11 Interpretation of Z-factor QC values.

Table 9.12 Computed statistics returned from the scipy.stats.describe(…) function.

Table 9.13 Liquid handler syringe comparison data set lhdata_ttest.csv.

Table 9.14 Attributes of the LinregressResult object returned from the linregress(…) function.

Table 9.15 Common parameters of the scipy.optimize.curve_fit(…) function.

Table 9.16 Sample experimental data measuring Michaelis–Menten kinetics.

Chapter 10

Table 10.1 File-like methods of the io module BytesIO object.

Table 10.2 Common objects provided by the python-docx module.

Table 10.3 Common Document object methods and properties.

Table 10.4 Common Paragraph object methods and properties.

Table 10.5 Common Run object methods and properties.

Table 10.6 Common Table object properties and methods.

Table 10.7 Common _Cell object properties and methods.

Table 10.8 Common methods and properties of the python-pptx module Presentation...

Table 10.9 Important slides object collection methods.

Table 10.10 Important slide object properties.

Table 10.11 Common SlideShapes object properties and methods.

Table 10.12 Several length object types defined by the pptx.util submodule.

Table 10.13 Common Table object methods.

Table 10.14 Common Canvas methods for setting graphic styles prior to drawing.

Table 10.15 Common Canvas graphics drawing methods.

Table 10.16 Common Canvas text methods.

Table 10.17 Common PDFTextObject methods.

Table 10.18 Other common Canvas methods.

Table 10.19 A few of the Flowable object types in PLATYPUS.

Chapter 11

Table 11.1 Two tasks racing to increment a count in a single file.

Table 11.2 RS-232-C communication parameter settings.

Table 11.3 Serial constructor parameters and default values.

Table 11.4 Common Serial object methods and parameters with descriptions.

Table 11.5 Meaning of timeout parameter values.

Table 11.6 Common aiohttp Request object methods and properties.

Table 11.7 Common aiohttp Response object methods and properties.

Table 11.8 Keyword parameters that customize the behavior of the watch(…) metho...

Table 11.9 Common functions in the sched module for scheduling future task exec...

Appendix A

Table 1A ASCII decimal numbers (Dec) and standard character name or symbol (Chr).

Table 2A ASCII control characters having no glyph with description.



Streamlining Your Research Laboratory with Python

Mark F. Russo, PhD

The College of New Jersey

Ewing, NJ, USA

William Neil

Bristol Myers Squibb Company

Princeton, NJ, USA

Copyright © 2025 by Mark F. Russo. All rights reserved, including rights for text and data mining and training of artificial intelligence technologies or similar technologies.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

The manufacturer’s authorized representative according to the EU General Product Safety Regulation is Wiley-VCH GmbH, Boschstr. 12, 69469 Weinheim, Germany, e-mail: [email protected].

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Names: Russo, Mark F., author. | Neil, William (Automation specialist), author.

Title: Streamlining your research laboratory with Python / Mark F. Russo, William Neil.

Description: Hoboken, New Jersey : Wiley, [2025] | Includes bibliographical references and index.

Identifiers: LCCN 2024062194 | ISBN 9781394249886 (hardback) | ISBN 9781394249909 (epdf) | ISBN 9781394249893 (epub)

Subjects: LCSH: Laboratories—Data processing. | Python (Computer program language)

Classification: LCC Q183.A1 R885 2025 | DDC 502.85/5133—dc23/eng/20250127

LC record available at https://lccn.loc.gov/2024062194

Cover Design: Wiley

Cover Image: Photo by Min Fang, © Designer/Getty Images

For my girls, Jean, Emily, and Gabrielle. You three are my world.

—Mark

For my wife, Cindy, and our sons, Billy, Matthew, and Andrew. I am proud of you and love you.

—William

Preface

Python [1] is a powerful and versatile programming language that has found widespread adoption in numerous scientific disciplines, including the research laboratory. Its simplicity, readability, and extensive libraries make it an invaluable tool for scientists and researchers who need to process data, conduct data analysis, visualize results, and automate routine tasks. In the context of research laboratories, Python’s user-friendly nature, coupled with its vast ecosystem of scientific libraries and tools, can transform the way experiments are automated as well as the way data is collected and analyzed.

Python’s appeal lies in the fact that it is at once both accessible to novice programmers and sufficiently powerful for experienced programmers. It boasts a straightforward syntax that minimizes the learning curve, enabling researchers to quickly grasp the essentials of programming and apply them to their specific scientific endeavors. Furthermore, its open-source nature ensures that a vibrant community continuously enhances and extends its capabilities, offering an ever-evolving set of tools tailored to the needs of a broad and diverse community.

We do not assume that the reader has prior knowledge of Python programming. Early chapters of this book provide a thorough introduction to the aspects of Python programming that are critical to its application by researchers, especially in a research laboratory setting. That said, we do not intend to cover Python programming in its entirety. Our focus is on helping researchers streamline their laboratory operations, including experimental data collection, analysis, and reporting.

We survey the broad variety of ways in which Python may be used in a research laboratory. We delve into the various aspects of Python that make it an ideal choice for scientists and researchers. We examine its key features, its role in automation, data collection, data analysis, visualization, and scientific computing, and how it can be integrated seamlessly into laboratory workflows. Additionally, we explore some real-world examples of Python’s application in research settings, demonstrating its potential to streamline processes, improve productivity, and foster innovation.

Throughout the book, we develop numerous Python programs to solve real practical problems faced by research scientists. All source code may be downloaded from the book’s GitHub repository at https://github.com/russomf/syrlwp. All corrections and updates will be found there as well.

Whether you are a biologist analyzing genetic data, a chemist scouting synthesis routes, an engineer optimizing machine parameters, or a social scientist studying human behavior, Python has the flexibility and adaptability to be a powerful and indispensable tool in your research toolkit. Join us on a journey into the world of Python programming for research laboratories. You will gain insights into how this versatile language can empower you to unlock new possibilities, accelerate your research, and contribute to scientific advancements.

Chapter 1: Introduction

Python is one of the most popular programming languages, and for good reasons. Among the principles that guide the design of the language, called the Zen of Python, is the principle that Readability Counts. You will discover this repeatedly throughout your learning journey. Unlike source code that you may have encountered in the past written in other programming languages, Python will not appear to the untrained eye as an undiscovered form of hieroglyphics. If written well, Python source code can be relatively easy to read and understand, and it can be equally straightforward to write. Perhaps this is what has driven the popularity of Python.

Because Python is a general-purpose programming language, learning it gives you the power to solve virtually any computing problem that you may encounter. This includes data collection and processing, instrument control, scientific computations, publication-quality graphing, report generation, and much more. Much of this power comes from add-on packages, many of which are designed to solve the kinds of scientific problems encountered in a research laboratory. We will cover many of the most popular and widely used scientific Python packages. When you combine Python’s clean and simple syntax with the more than 600,000 packages that are freely available in the Python Package Index (PyPI), you have at your fingertips an incredibly powerful toolkit to solve authentic scientific and research laboratory problems.

Python is also one of the two most popular languages used for Data Science. As a research scientist, you likely learn about the world through the lens of experimental data. While you may never need the full range of advanced numerical, statistical, and modeling features required by a professional data scientist, there is no doubt that you will benefit from the power of Python to operate legitimately in the Data Science field.

Finally, it is worth noting that Python is an open-source language, which means it is free to use and has a large and active community of developers. This community continuously maintains and improves Python’s vast array of libraries and other tools, making it a robust platform for scientific research. Python is available for most operating systems, which ensures that you will be able to run it wherever you need it, even on microcontrollers.

1.1 Python Implementations

Although not formally standardized like other programming languages such as C, C++, and JavaScript, the Python language syntax is defined by The Python Language Reference [1] as well as its reference implementation in C called CPython, which is available for all major operating systems. Both the language reference documentation and CPython implementation are available from Python’s official website at https://www.python.org [2].

Python may be implemented by anyone on any platform. Consequently, and due to its significant popularity, you will find Python available on almost every computing platform. CPython is available for all major operating systems, including Windows, macOS, and Linux. In addition, IronPython has been implemented for the .NET runtime [3], Jython has been implemented for the Java Virtual Machine (JVM) [4], and at least two Python subsets run on microcontrollers: MicroPython [5] and CircuitPython [6]. A version of Python has even been implemented in Python itself, appropriately called PyPy [7].
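As a small illustration of the point above, a running program can ask which implementation it is executing on. The sketch below uses only the standard library's platform module. The helper name describe_interpreter is our own choice for illustration, and the platform module may not be present on the microcontroller subsets.

```python
import platform


def describe_interpreter():
    """Return the implementation name and version of the running interpreter."""
    # platform.python_implementation() returns a string such as
    # 'CPython', 'IronPython', 'Jython', or 'PyPy'.
    name = platform.python_implementation()
    version = platform.python_version()
    return name, version


name, version = describe_interpreter()
print(f"Running {name} {version}")
```

On a standard installation downloaded from python.org, the reported name should be CPython.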

More recently, CPython and many of its most important packages have been compiled to WebAssembly [8]. WebAssembly is a stack-based virtual machine that runs entirely in, and is confined by, a web browser. The WebAssembly port of Python is called Pyodide [9]. Pyodide allows us to run Python in a web browser without the need to install it first. This makes Pyodide an important option for laboratory scientists, because laboratory computers are often locked down for security reasons. Typically, the primary purpose of a lab computer is to operate an attached instrument or process data, not to perform general-purpose computing, and installing new software is often strictly forbidden. Fortunately, because the entire Pyodide Python environment may be loaded into a web browser directly, Pyodide provides a means to access the power of Python without the need to convince your IT team to grant you the elevated privileges required to install software.

No matter which implementation of Python you choose, your knowledge of the Python programming language will be instrumental in helping you streamline your laboratory operations.

1.2 Installing the Python Toolkit

To install CPython, visit the official Python home page at https://www.python.org/ and click the Downloads link [2]. An appropriate installer for your operating system will be offered. Download and run the installer. Python will be installed for you on your computer.

To test your installation, open a terminal program. Many terminal programs are available, and the options vary by operating system. On macOS and Linux, you should find Terminal as one of your program options. On Windows, you may use PowerShell, Command Prompt, or another option. No matter which terminal program you use, enter the python or python3 command and press Enter. This command runs the Python Interactive Console, which you may use to execute Python commands interactively.

To exit the Python console, enter the Python command exit(). See Figure 1.1 for an example.

Figure 1.1 The Python Interactive Console running in Windows PowerShell.

If the installation of Python is successful but you are still having problems starting the Python console, our experience suggests that the problem lies with your operating system’s ability to find the Python executable. Investigate where Python was installed and make sure that the path location is included in your PATH environment variable. Also check that you have the necessary permissions to access and execute the python program.
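If you can start Python some other way (for example, by typing the full path to the executable), a short script can help with this investigation. The following is a minimal sketch, not a complete diagnostic. It reports where the running interpreter lives and whether the python and python3 commands can be found on PATH. The helper name find_python_commands is hypothetical.

```python
import shutil
import sys


def find_python_commands(names=("python", "python3")):
    """Map each command name to the executable found on PATH, or None."""
    # shutil.which() searches the directories listed in PATH,
    # much as the terminal does when you type a command.
    return {name: shutil.which(name) for name in names}


print("This interpreter:", sys.executable)
for command, location in find_python_commands().items():
    print(f"{command}: {location or 'not found on PATH'}")
```

If a command is reported as not found, the directory shown for this interpreter is a likely candidate to add to PATH.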

1.3 Python 3 vs. Python 2

Python 3 was introduced to the community in 2008 as a “breaking change” version of Python. Programs written for the previous Python 2 would not run in Python 3 due to significant changes to syntax and other implementation details. This change was necessary because several of the design decisions made for Python 2 needed an upgrade to make the language more suitable for modern applications. Some changes made to Python 3 were fundamental, including the way binary data is stored and processed.
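One commonly cited example of this change is that Python 3 keeps text (str) and binary data (bytes) strictly separate. The short sketch below is our own illustration, with a made-up volume label as the sample text. It shows the two types and the explicit encode and decode steps that convert between them.

```python
# str holds Unicode text; "\u00b5" is the micro sign.
text = "5 \u00b5L"

# bytes holds raw binary data; encoding converts text to bytes.
data = text.encode("utf-8")

print(type(text).__name__, len(text))   # str 4
print(type(data).__name__, len(data))   # bytes 5 (the micro sign takes two bytes)

# Python 3 refuses to mix the two types implicitly.
try:
    text + data
except TypeError as error:
    print("Cannot mix str and bytes:", error)

# Decoding with the same encoding recovers the original text.
restored = data.decode("utf-8")
print(restored == text)
```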

A significant number of Python 2 programs were in production around the world when Python 3 was announced. It is no surprise that many Python 2 programmers were less than enthusiastic about porting their source code to Python 3. Python 2 was originally scheduled to be retired in 2015, but the resistance was so strong that this date had to be delayed. It wasn’t until January 1, 2020 that Python 2 was finally and fully retired.

Even though Python 2 has been retired and no longer receives security patches, you can still install and use it. If you have a version of Python 2 installed, please resist the urge to use it to write new programs and install Python 3 instead. For guidance porting Python 2 code to Python 3, refer to Python’s own porting guide [10]. If you need both versions of Python available, you may install Python 3 alongside Python 2 and use them both. With both versions of Python installed, run Python 3 from a terminal using the python3 command in place of the python command. In this book we use Python 3 exclusively. To make sure you have a recent version of Python 3 installed, you can enter the following command into a terminal.

python --version
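The same check can be made from inside a program, which is useful when a script should warn its user about an old interpreter. This is a minimal sketch; the function name and the default minimum of 3.8 are arbitrary assumptions, not requirements of this book.

```python
import sys


def describe_python_version(minimum=(3, 8)):
    """Report the running Python version and whether it meets a minimum."""
    current = sys.version_info[:3]          # (major, minor, micro)
    status = "ok" if current >= minimum else "older than expected"
    return f"Python {current[0]}.{current[1]}.{current[2]} ({status})"


print(describe_python_version())
```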

1.4 Python Package Index

One of Python’s mottos is “batteries included,” and for a good reason. A Python distribution comes with a huge library of prewritten modules for you to use and build upon. While it’s true that a Python distribution includes quite a few “batteries,” it is not possible to include them all.

If a module is not shipped with Python, there is a good chance someone in the Python community has contributed a module that will help you. Additional Python modules are distributed through a Python package repository, with the two most popular being the PyPI [11] and the Package Repository for Anaconda [12]. Anaconda is an exceptional platform that provides high quality Python installations, package distributions, and other open-source resources. Importantly, it also offers paid support plans, which may be critical for businesses that depend upon Python as part of their core operations. In the following, we describe how to use PyPI for installing additional Python packages.

The PyPI provides a way for package authors to post their open-source Python packages, and for package users to find and install Python packages that are not distributed with Python. The PyPI hosts over 600,000 freely available Python packages ready for you to install and use. If you need something specific, there is a good chance that the PyPI has a package for that.

Python packages are downloaded from the PyPI and installed in your computing environment using a program and module distributed with Python called pip (package installer for Python). If you have a Python distribution, you have pip.

As an example, consider the watchfiles Python package, which is a toolkit for monitoring and responding to file system events [13]. The watchfiles package can help if you want to execute a Python program automatically to parse data when a new file is written. Before you can use watchfiles, you must install it with the pip module. Python ships with a script named pip that is a convenient way to invoke the pip module. Our experience has been that the script can be troublesome in some environments. To avoid the trouble, we skip the script and invoke the module directly to perform an installation from the PyPI using a command like the following.

python -m pip install watchfiles

This installs the watchfiles package so that it can be used by anyone logged in to the computer. To limit the installation to yourself only, add the --user option to the command.

python -m pip install --user watchfiles
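Once watchfiles is installed, the scenario mentioned earlier, running code automatically when a new data file is written, might be sketched as follows. The helper names, the file suffixes, and the folder name are hypothetical; watch and Change are provided by the watchfiles package.

```python
from pathlib import Path


def is_data_file(path, suffixes=(".csv", ".txt")):
    """Return True if the file name ends with one of the expected suffixes."""
    return Path(path).suffix.lower() in suffixes


def watch_for_new_data(folder, handle):
    """Call handle(path) for each new data file that appears in folder.

    Runs until interrupted, for example with Ctrl-C.
    """
    # Imported here so that is_data_file works even without watchfiles installed.
    from watchfiles import Change, watch

    for changes in watch(folder):        # each item is a set of (Change, path) pairs
        for change, path in changes:
            if change == Change.added and is_data_file(path):
                handle(path)


# Example use with a hypothetical folder name:
# watch_for_new_data("incoming_data", print)
```

A file may be reported as added before an instrument has finished writing it, so a real handler would likely need to wait or retry before parsing.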

To install a particular version of a package, add the package version to the pip command. For example, if you want to install version 0.24 of watchfiles, execute the following terminal command, which ensures that the version is exactly equal to 0.24.

python -m pip install --upgrade watchfiles==0.24
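To confirm which version of a package ended up installed, the standard library module importlib.metadata can be queried from Python. A minimal sketch, in which the function name installed_version is our own:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


for name in ("pip", "watchfiles"):
    print(name, installed_version(name) or "not installed")
```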

The pip package is updated frequently. When executed, it checks if there is a more recent version available and lets you know if there is. To ensure that you have the latest version of pip installed (or any module), add the --upgrade option.

python -m pip install --upgrade pip

If you are working in a networked environment that is behind a proxy server, which is the case with most sizable organizations, the pip command may be unable to access the PyPI repository directly. Fortunately, if you provide proxy server parameters to pip, it will use the proxy server to access PyPI. Add the --proxy option to the pip command along with the proxy server domain name and port number, as well as username and password, if necessary, using the following syntax.

python -m pip install packageName --proxy [user:passwd@]proxy.server:port

For many more details describing the options for installing Python packages from the PyPI, refer to the Python Packaging User Guide [14].
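The pip options described in this section can also be assembled from within a Python program, which guarantees that the package is installed for the same interpreter that is running the program. The following is a sketch under stated assumptions: the function names are our own, and the proxy address in the test is a placeholder. Only the options shown earlier in this section are used.

```python
import subprocess
import sys


def pip_install_command(package, version=None, user=False, upgrade=False, proxy=None):
    """Build the command line for installing a package with the pip module."""
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")
    if upgrade:
        command.append("--upgrade")
    if proxy:
        command.extend(["--proxy", proxy])
    # Pin an exact version when one is requested.
    command.append(f"{package}=={version}" if version else package)
    return command


def pip_install(package, **options):
    """Run the command; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(package, **options), check=True)


# Show the command without running it.
print(" ".join(pip_install_command("watchfiles", version="0.24", user=True)))
```

Calling pip_install("watchfiles") would perform the actual installation, so it requires network access to the PyPI.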

1.5 Programming Editors

In the next chapter you will learn that Python depends heavily on indentation to properly parse and interpret its source code structure. Statements in the same code block must be indented uniformly, otherwise they are considered to be in different code blocks, which often results in an incorrect interpretation of what you intended your program to do. Leading whitespace like tabs and spaces can be extremely important to Python programs. You must get it right.

It is possible to write Python programs using a simple text editor, but it can be difficult to ensure that all indentations are correct. An extra leading space character on one line can completely change the meaning of your program or break it entirely. For example, a plain text editor might use a sequence of space characters or a tab character to indent a statement on a line. Two consecutive lines of code, one above the other but using different whitespace characters for indentation, can appear to be indented identically. Despite the way the lines look, Python will consider these two lines to be in different code blocks. To ensure you don’t end up with an incorrect Python program due to line indentation inconsistencies, it is strongly recommended that you write your programs using an editor designed for programming in Python. Fortunately, there are many good options.
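The effect of inconsistent indentation can be demonstrated by asking Python to compile small fragments of source code. In this sketch the function name and the sample fragments are our own. Note that Python 3 typically rejects a block that mixes tabs and spaces with a TabError rather than guessing at the intended structure.

```python
def indentation_problem(source):
    """Return the name of the indentation error Python reports, or None."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as error:   # TabError is a subclass of IndentationError
        return type(error).__name__
    return None


consistent = "if True:\n    x = 1\n    y = 2\n"
mixed = "if True:\n        x = 1\n\ty = 2\n"   # eight spaces, then a tab
extra_space = "x = 1\n y = 2\n"                # one stray leading space

print(indentation_problem(consistent))
print(indentation_problem(mixed))
print(indentation_problem(extra_space))
```

In many plain text editors, the two indented lines in the mixed fragment look identical on screen, which is exactly why the problem is hard to spot by eye.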

One good option that is freely available is the Visual Studio Code editor, often called VSCode [15]. Note that VSCode is not the same as the Visual Studio Integrated Development Environment, which is far more sophisticated than VSCode. Because VSCode is a general-purpose programming editor, after installation it is necessary to add extensions to the editor specifically for Python. A good extension for Python programming is the Python extension for Visual Studio Code distributed by Microsoft, which is freely available through the Visual Studio Code Marketplace [16]. This extension adds several functions to VSCode for Python programming, including syntax highlighting and checking, debugging, outlining and navigation, formatting, refactoring, variable explorer, and more.

To install VSCode, visit https://code.visualstudio.com/ and download an installer for your operating system. Run the installer to set up the editor on your computer. After installation is complete, start the program and find the icon bar docked to the left side of the window. Click the icon for Extensions, or enter Control-Shift-X. The Extensions icon is composed of four small boxes, one of which is being added to the others. Clicking this icon opens the Extensions Marketplace panel. Search for “Python” and find the “Python extension for Visual Studio Code” extension. Click the [Install] button. Once installed, this extension will configure your editor for advanced Python program editing, syntax highlighting, and much more. It is well worth the effort to seek out and install the VSCode extension pack for Python.

1.6 Notebook Editors

When working with experimental data, it helps to use a different style of editor called a Notebook. The idea was first described by Donald Knuth [17] when he introduced the concept of literate programming. In that paper, Knuth proposed that we regard computer programming not merely as a way to instruct a computer, but also as a way to explain to humans the larger concept that the program supports. In this way, we consider such an extended computer program a work of literature.

The modern version of literate programming is best embodied in the Notebook-style editor, which intermingles formatted text with code snippets that analyze data and generate illustrative output. For Python, Jupyter and JupyterLab are among the most popular Notebook editors [18]. JupyterLab itself is composed of a Jupyter notebook coupled with a file system browser and other utilities. Jupyter is the de facto standard Notebook editor for Data Scientists who use Python.

Jupyter and JupyterLab leverage a web browser for their user interface, with a computer language kernel running behind it. A kernel is the compute engine that executes code snippets. Once the user enters code into a code cell of a notebook and submits it, the notebook transmits the code to its running backend kernel. The kernel executes the submitted code and returns the results to the notebook. If appropriate, returned results are rendered in a new notebook cell. For example, the result of a computation might be a chart or interactive widget. This architecture is illustrated in Figure 1.2.

Figure 1.2 Jupyter notebook architecture.
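The role of a kernel can be imitated in a few lines of Python. The class below is a toy stand-in of our own invention, not Jupyter's actual implementation: a real kernel runs as a separate process and exchanges messages with the notebook. It does show the essential behavior, though: each submitted snippet is executed, its output is returned, and variables persist from one snippet to the next.

```python
import contextlib
import io


class ToyKernel:
    """A toy stand-in for a kernel: runs snippets and keeps state between them."""

    def __init__(self):
        self.namespace = {}     # variables survive from one snippet to the next

    def execute(self, code):
        """Run one snippet and return whatever it printed."""
        output = io.StringIO()
        with contextlib.redirect_stdout(output):
            exec(code, self.namespace)
        return output.getvalue()


kernel = ToyKernel()
kernel.execute("volume_ul = 25")                # first "cell" defines a variable
print(kernel.execute("print(volume_ul * 2)"))   # second "cell" uses it
```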

To install Jupyter and JupyterLab, open a terminal and use the pip module to install the package.

python -m pip install jupyterlab

Once installed, Jupyter may be started by executing the following command in a terminal.

jupyter-notebook

To start JupyterLab, replace “notebook” with “lab,” as follows.

jupyter-lab

In both cases, a local notebook server program with the Python language kernel is started, followed by the opening of a browser window, which is automatically navigated back to the notebook server to load the initial interface.

To shut down the notebook server, click the [Quit] button in a Jupyter Notebook, choose the “File | Shut Down” menu option in JupyterLab, or simply enter Ctrl-C or Cmd-C in the terminal window that is running the notebook server program.

The modular architecture of Jupyter allows one kernel of a notebook to be swapped out for another kernel. With a standard communication protocol in place for transmitting code snippets to a kernel and returning results, it is possible to develop kernels for other programming languages in addition to Python. Currently, notebook kernels are available for Perl, Ruby, JavaScript, Fortran, Java, and other languages. The landscape of available kernels is advancing all the time. If you have an interest in a kernel for another language, consult the Jupyter Project documentation [18].

Figure 1.3 demonstrates a simple example of JupyterLab with formatted descriptive text, a short Python program, and a chart resulting from the execution of the code in the code cell. This notebook illustrates how we may intermingle text, programs, and dynamically generated results, in a single document, thereby embodying Knuth’s concept of literate programming.

Figure 1.3 JupyterLab in action.

1.7 Using the Jupyter Notebook Interface

Notebook documents are composed of a linear stack of three kinds of cells: Markdown cells (formatted text), code cells (program snippets), and raw cells (unprocessed text). These cells may be added to a notebook, deleted, edited, and rearranged at will. Individual cells in a notebook may be executed in any order, or all cells may be executed from the top of the notebook to the bottom.
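Behind the interface, a notebook document is stored as a JSON file, so the stack of cells can be pictured as a simple data structure. The sketch below builds a three-cell notebook by hand. The field names are assumed to follow version 4 of the notebook file format and the cell contents are invented, so check the Jupyter documentation before relying on the details.

```python
import json


def make_cell(cell_type, text):
    """Build one notebook cell of type 'markdown', 'code', or 'raw'."""
    cell = {"cell_type": cell_type, "metadata": {}, "source": text}
    if cell_type == "code":
        # Code cells also record their run counter and any outputs.
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


notebook = {
    "cells": [
        make_cell("markdown", "# Buffer preparation\nNotes describing the experiment."),
        make_cell("code", "print('hello from a code cell')"),
        make_cell("raw", "Text that is passed through unprocessed."),
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Show the beginning of the JSON text that would be saved to a .ipynb file.
print(json.dumps(notebook, indent=1)[:200])
```

Because cells are just entries in a list, adding, deleting, and rearranging them amounts to editing that list.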

The content of any cell in a notebook is entered directly through the notebook interface. Executing a cell, whether it is a Markdown cell or a code cell, is performed by entering the key sequence Shift-Enter. This will either process and display the formatted text of a Markdown cell as HTML or it will transmit code to the language kernel for execution with results displayed upon return. To learn more about Markdown text formatting syntax implemented in a Jupyter Notebook, explore the Jupyter Notebook documentation site [19]. Python language syntax will be explored in great detail in the chapters that follow.

From the perspective of a laboratory scientist, it is easy to see how a notebook may be used to perform the kind of analysis, and record the kind of results, that are entered into a classic laboratory notebook. A Jupyter notebook used to document an experiment might open with formatted text describing the experiment performed, including detailed procedures, source materials, and an experimental design. Experimental data might be added to a cell in a tabular format suitable for processing. A code cell might contain the full source code for a short program used to process the experimental data. Capturing the actual source code for the program used to process raw experimental data has the side benefit of documenting the analysis in a way that allows it to be reproduced easily. After analysis, additional code cells might generate charts, tables, and other illustrations rendered right in the notebook. Once the notebook document is written and tested, it might be possible to reuse it by updating experimental data and rerunning the entire notebook.
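As a sketch of what such a code cell might contain, the following program summarizes a small table of readings using only the standard library. The data values are made up purely for illustration, and the column names and the function name summarize are our own.

```python
import csv
import io
import statistics

# Made-up absorbance readings, as they might be pasted into a notebook cell.
RAW_DATA = """sample,replicate,absorbance
control,1,0.10
control,2,0.12
treated,1,0.30
treated,2,0.34
"""


def summarize(raw_text):
    """Return {sample: (mean, standard deviation)} for the absorbance column."""
    readings = {}
    for row in csv.DictReader(io.StringIO(raw_text)):
        readings.setdefault(row["sample"], []).append(float(row["absorbance"]))
    return {
        sample: (statistics.mean(values), statistics.stdev(values))
        for sample, values in readings.items()
    }


for sample, (mean, spread) in summarize(RAW_DATA).items():
    print(f"{sample}: mean={mean:.3f}, stdev={spread:.3f}")
```

Rerunning the analysis on a new experiment would then only require replacing the text in RAW_DATA.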

Of course, notebook documents like those created with JupyterLab are not, and cannot be used as, laboratory notebooks because they lack the required tracking, signature capabilities, and other security features. But they can form the basis of a report to be entered into a laboratory notebook, complete with an ability to reproduce the analysis at a later time by rerunning a copy of the notebook document. It’s easy to see how notebook editors can be a valuable tool for a research scientist.

1.8 JupyterLite

Earlier we described one implementation of Python called Pyodide. The authors of Pyodide recognized that several of Python’s core data processing libraries could reach even more users if compiled to a WebAssembly target so that they were available to run in Pyodide. These additional packages are the tools used most often by Data Scientists. With these additional packages compiled to WebAssembly, and because JupyterLab uses the browser as its user interface, it was possible to combine JupyterLab with Pyodide and a set of additional libraries compiled to WebAssembly to create a fully in-browser data science environment. Add a lightweight browser-based file system, and the result is an environment named JupyterLite (Figure 1.4) [20].

Figure 1.4 JupyterLite demonstration.

JupyterLite is a complete notebook environment that runs entirely in a web browser – no installation necessary. This includes both the notebook front end and the Python language kernel back end. You may load this complete environment at the Jupyter website [21]. This is a tremendous benefit, especially when you are blocked from installing software, such as in a tightly controlled lab computing environment managed by a large organization.

Note that the JupyterLite environment is indeed a “lite” version of the toolkit. For example, the file system is simulated and exists entirely in browser storage. It is possible to upload files from a local disk to the browser-based simulated file system and download files from the browser to local disk, but accessing the local file system from the JupyterLite environment directly is impossible due to the tight security constraints implemented by web browsers. This limits the amount of data that can be analyzed. Also, while any pure Python package may be used in JupyterLite, more high-performance packages that depend on utilities written in another language, such as C or Rust, are not available unless they have been precompiled to WebAssembly.
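Because of these differences, a notebook meant to run both in JupyterLite and in a regular installation may want to know where it is running. One possible check, offered as an assumption rather than an official API, relies on the fact that WebAssembly builds of Python such as Pyodide report "emscripten" as their platform. The function name is our own.

```python
import sys


def running_in_browser():
    """Return True when Python appears to be a WebAssembly build such as Pyodide."""
    return sys.platform == "emscripten"


if running_in_browser():
    print("Browser-based Python: files live in browser storage, so upload data first.")
else:
    print("Regular Python installation: the local file system is directly accessible.")
```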

1.9 Things Change

Computing environments develop and change quickly. Unfortunately, this means that the information here may become out of date. It is worth taking some time to research the implementations of Python, programming editors, and other tools currently available. Better options may appear, and you will want to benefit from them.

1.10 Key Takeaways

Currently, Python is one of the most popular general-purpose programming languages.

Learning Python will provide you with the power to solve virtually any computing problem you may encounter.

The PyPI includes approximately 600,000 additional packages for you to download.

Python is one of the two most popular languages used for Data Science.

Python is an open-source language, which means it is free to use and has a large and active community of developers.

There are many implementations of Python for you to use. The CPython implementation is the reference standard by which other implementations are compared.

Install Python by visiting the language home page [2], clicking the Downloads link to download a suitable installer, and running the installer program.

Additional packages are installable from the PyPI using the pip module, provided with the standard Python distribution.

Because Python syntax depends heavily on uniform indentation, a good programming editor will be a great benefit. Visual Studio Code is one very good option. To install VSCode, download an installer [15].

Notebook editors are another style of editor that support literate programming, a style of programming in which documentation, code, and results are intermingled.

The JupyterLab notebook editor is a very popular notebook style of editor used for Python programming in the field of Data Science. Notebooks are useful for research scientists as well.

Things change quickly in technology fields. Always check for the most recent releases and the latest options available for solving computing-related problems.