Description

Taking an approach that uses the latest developments in the Python ecosystem, you’ll first be guided through the Jupyter ecosystem, key visualization libraries and powerful data sanitization techniques before you train your first predictive model. You’ll then explore a variety of approaches to classification such as support vector networks, random decision forests and k-nearest neighbors to build on your knowledge before moving on to advanced topics.

After covering classification, you’ll go on to discover ethical web scraping and interactive visualizations, which will help you professionally gather and present your analysis. Next, you’ll start building your keystone deep learning application, one that aims to predict the future price of Bitcoin based on historical public data. You’ll then be guided through a trained neural network, which will help you explore common deep learning network architectures (convolutional, recurrent, and generative adversarial networks) and deep reinforcement learning. Later, you’ll delve into model optimization and evaluation. You’ll do all this while working on a production-ready web application that combines TensorFlow and Keras to produce meaningful user-friendly results.

By the end of this book, you’ll be equipped with the skills you need to tackle and develop your own real-world deep learning projects confidently and effectively.



Applied Deep Learning with Python
Use scikit-learn, TensorFlow, and Keras to create intelligent systems and machine learning solutions
Alex Galea
Luis Capelo
BIRMINGHAM - MUMBAI

Applied Deep Learning with Python

Copyright © 2019 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Acquisitions Editors: Aditya Date, Koushik Sen
Content Development Editors: Tanmayee Patil, Rina Yadav
Production Coordinator: Ratan Pote

First published: August 2018

Production reference: 2260719

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78980-474-4

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.Packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the authors

Alex Galea has been professionally practicing data analytics since graduating with a Master's degree in Physics from the University of Guelph, Canada. He developed a keen interest in Python while researching quantum gases as part of his graduate studies. Alex is currently doing web data analytics, where Python continues to play a key role in his work. He is a frequent blogger about data-centric projects that involve Python and Jupyter Notebooks.

Luis Capelo is a Harvard-trained analyst and programmer who specializes in the design and development of data science products. He is based in New York City, USA.

He is the head of the Data Products team at Forbes, where the team both investigates new techniques for optimizing article performance and creates clever bots that help distribute content. Previously, he led a team of world-class scientists at the Flowminder Foundation, where he developed predictive models for assisting the humanitarian community. Prior to that, he worked for the United Nations as part of the Humanitarian Data Exchange team (founders of the Center for Humanitarian Data).

He is a native of Havana, Cuba, and the founder and owner of a small consultancy firm dedicated to supporting the nascent Cuban private sector.

About the reviewers

Elie Kawerk likes to solve problems using the analytical skills he has accumulated over the years. He uses the data science process, including statistical methods and machine learning, to extract insights from data and get value out of it.

His formal training is in computational physics. He used to simulate atomic and molecular physics phenomena with the help of supercomputers using the good old FORTRAN language; this involved a lot of linear algebra and quantum physics equations.

Manoj Pandey is a Python programmer and the founder and organizer of PyData Delhi. He works on research and development from time to time, and is currently working with RaRe Technologies on their incubator program for a computational linear algebra project. Prior to this, he has worked with Indian startups and small design/development agencies, and teaches Python/JavaScript to many on Codementor.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Applied Deep Learning with Python

Packt Upsell

Why subscribe?

Packt.com

Contributors

About the authors

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Get in touch

Reviews

Jupyter Fundamentals

Basic Functionality and Features

What is a Jupyter Notebook and Why is it Useful?

Navigating the Platform

Introducing Jupyter Notebooks

Jupyter Features

Exploring some of Jupyter's most useful features

Converting a Jupyter Notebook to a Python Script

Python Libraries

Import the external libraries and set up the plotting environment

Our First Analysis - The Boston Housing Dataset

Loading the Data into Jupyter Using a Pandas DataFrame

Load the Boston housing dataset

Data Exploration

Explore the Boston housing dataset

Introduction to Predictive Analytics with Jupyter Notebooks

Linear models with Seaborn and scikit-learn

Activity: Building a Third-Order Polynomial Model

Linear models with Seaborn and scikit-learn

Using Categorical Features for Segmentation Analysis

Create categorical fields from continuous variables and make segmented visualizations

Summary

Data Cleaning and Advanced Machine Learning

Preparing to Train a Predictive Model

Determining a Plan for Predictive Analytics

Preprocessing Data for Machine Learning

Exploring data preprocessing tools and methods

Activity: Preparing to Train a Predictive Model for the Employee-Retention Problem

Training Classification Models

Introduction to Classification Algorithms

Training two-feature classification models with scikit-learn

The plot_decision_regions Function

Training k-nearest neighbors for our model

Training a Random Forest

Assessing Models with k-Fold Cross-Validation and Validation Curves

Using k-fold cross-validation and validation curves in Python with scikit-learn

Dimensionality Reduction Techniques

Training a predictive model for the employee retention problem

Summary

Web Scraping and Interactive Visualizations

Scraping Web Page Data

Introduction to HTTP Requests

Making HTTP Requests in the Jupyter Notebook

Handling HTTP requests with Python in a Jupyter Notebook

Parsing HTML in the Jupyter Notebook

Parsing HTML with Python in a Jupyter Notebook

Activity: Web Scraping with Jupyter Notebooks

Interactive Visualizations

Building a DataFrame to Store and Organize Data

Building and merging Pandas DataFrames

Introduction to Bokeh

Introduction to interactive visualizations with Bokeh

Activity: Exploring Data with Interactive Visualizations

Summary

Introduction to Neural Networks and Deep Learning

What are Neural Networks?

Successful Applications

Why Do Neural Networks Work So Well?

Representation Learning

Function Approximation

Limitations of Deep Learning

Inherent Bias and Ethical Considerations

Common Components and Operations of Neural Networks

Configuring a Deep Learning Environment

Software Components for Deep Learning

Python 3

TensorFlow

Keras

TensorBoard

Jupyter Notebooks, Pandas, and NumPy

Activity: Verifying Software Components

Exploring a Trained Neural Network

MNIST Dataset

Training a Neural Network with TensorFlow

Training a Neural Network

Testing Network Performance with Unseen Data

Activity: Exploring a Trained Neural Network

Summary

Model Architecture

Choosing the Right Model Architecture

Common Architectures

Convolutional Neural Networks

Recurrent Neural Networks

Generative Adversarial Networks

Deep Reinforcement Learning

Data Normalization

Z-score

Point-Relative Normalization

Maximum and Minimum Normalization

Structuring Your Problem

Activity: Exploring the Bitcoin Dataset and Preparing Data for Model

Using Keras as a TensorFlow Interface

Model Components

Activity: Creating a TensorFlow Model Using Keras

From Data Preparation to Modeling

Training a Neural Network

Reshaping Time-Series Data

Making Predictions

Overfitting

Activity: Assembling a Deep Learning System

Summary

Model Evaluation and Optimization

Model Evaluation

Problem Categories

Loss Functions, Accuracy, and Error Rates

Different Loss Functions, Same Architecture

Using TensorBoard

Implementing Model Evaluation Metrics

Evaluating the Bitcoin Model

Overfitting

Model Predictions

Interpreting Predictions

Activity: Creating an Active Training Environment

Hyperparameter Optimization

Layers and Nodes - Adding More Layers

Adding More Nodes

Layers and Nodes - Implementation

Epochs

Epochs - Implementation

Activation Functions

Linear (Identity)

Hyperbolic Tangent (Tanh)

Rectified Linear Unit

Activation Functions - Implementation

Regularization Strategies

L2 Regularization

Dropout

Regularization Strategies – Implementation

Optimization Results

Activity: Optimizing a Deep Learning Model

Summary

Productization

Handling New Data

Separating Data and Model

Data Component

Model Component

Dealing with New Data

Re-Training an Old Model

Training a New Model

Activity: Dealing with New Data

Deploying a Model as a Web Application

Application Architecture and Technologies

Deploying and Using Cryptonic

Activity: Deploying a Deep Learning Application

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

This Learning Path takes a step-by-step approach to teach you how to get started with data science, machine learning, and deep learning. Each module is designed to build on the learning of the previous chapter. The book contains multiple demos that use real-life business scenarios for you to practice and apply your new skills in a highly relevant context.

In the first part of this Learning Path, you will learn entry-level data science. You'll learn about commonly used libraries that are part of the Anaconda distribution, and then explore machine learning models with real datasets to give you the skills and exposure you need for the real world.

In the second part, you'll be introduced to neural networks and deep learning. You will then learn how to train, evaluate, and deploy TensorFlow and Keras models as real-world web applications. By the time you are done reading, you will have the knowledge to build applications in the deep learning environment and create elaborate data visualizations and predictions.

Who this book is for

If you're a Python programmer stepping out into the world of data science, this is the right way to get started. It is also ideal for experienced developers, analysts, or data scientists who want to work with TensorFlow and Keras. We assume that you are familiar with Python, web application development, Docker commands, and concepts of linear algebra, probability, and statistics.

What this book covers

Chapter 1, Jupyter Fundamentals, covers the fundamentals of data analysis in Jupyter. We will start with usage instructions and features of Jupyter such as magic functions and tab completion. We will then transition to data science specific material. We will run an exploratory analysis in a live Jupyter Notebook. We will use visual assists such as scatter plots, histograms, and violin plots to deepen our understanding of the data. We will also perform simple predictive modeling.

Chapter 2, Data Cleaning and Advanced Machine Learning, shows how predictive models can be trained in Jupyter Notebooks. We will talk about how to plan a machine learning strategy. This chapter also explains machine learning terminology such as supervised learning, unsupervised learning, classification, and regression. We will discuss methods for preprocessing data using scikit-learn and pandas.

Chapter 3, Web Scraping and Interactive Visualizations, explains how to scrape web page tables and then use interactive visualizations to study the data. We will start by looking at how HTTP requests work, focusing on GET requests and their response status codes. Then, we will go into the Jupyter Notebook and make HTTP requests with Python using the Requests library. We will see how Jupyter can be used to render HTML in the notebook, along with actual web pages that can be interacted with. After making requests, we will see how Beautiful Soup can be used to parse text from the HTML, and use this library to scrape tabular data.

Chapter 4, Introduction to Neural Networks and Deep Learning, helps you set up and configure a deep learning environment and start looking at individual models and case studies. It also discusses neural networks, the ideas behind them and their origins, and explores their power.

Chapter 5, Model Architecture, shows how to predict Bitcoin prices using a deep learning model.

Chapter 6, Model Evaluation and Optimization, shows how to evaluate a neural network model. We will modify the network's hyperparameters to improve its performance.

Chapter 7, Productization, explains how to create a working application from a deep learning model. We will deploy our Bitcoin prediction model as an application that is capable of handling new data by creating new models.

To get the most out of this book

This book will be most applicable to professionals and students who are interested in data analysis and want to enhance their knowledge in the field of developing applications using TensorFlow and Keras. For the best experience, you should have knowledge of programming fundamentals and some experience with Python. In particular, having some familiarity with Python libraries such as Pandas, matplotlib, and scikit-learn will be useful.

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packtpub.com.

2. Select the SUPPORT tab.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/TrainingByPackt/Applied-Deep-Learning-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.

Jupyter Fundamentals

Jupyter Notebooks are one of the most important tools for data scientists using Python. This is because they're an ideal environment for developing reproducible data analysis pipelines. Data can be loaded, transformed, and modeled all inside a single Notebook, where it's quick and easy to test out code and explore ideas along the way. Furthermore, all of this can be documented "inline" using formatted text, so you can make notes for yourself or even produce a structured report. Other comparable platforms - for example, RStudio or Spyder - present the user with multiple windows, which promote arduous tasks such as copying and pasting code around and rerunning code that has already been executed. These tools also tend to involve Read-Eval-Print Loops (REPLs), where code is run in a terminal session that retains state in memory. This type of development environment is bad for reproducibility and not ideal for development either. Jupyter Notebooks solve all these issues by giving the user a single window where code snippets are executed and outputs are displayed inline. This lets users develop code efficiently and allows them to look back at previous work for reference, or even to make alterations.

We'll start the chapter by explaining exactly what Jupyter Notebooks are and continue to discuss why they are so popular among data scientists. Then, we'll open a Notebook together and go through some exercises to learn how the platform is used. Finally, we'll dive into our first analysis and perform an exploratory analysis in the section Basic Functionality and Features.

By the end of this chapter, you will be able to:

Learn what a Jupyter Notebook is and why it's useful for data analysis

Use Jupyter Notebook features

Study Python data science libraries

Perform simple exploratory data analysis

All the code from this book is available as chapter-specific IPython notebooks in the code bundle. All color plots from this book are also available in the code bundle.

Basic Functionality and Features

In this section, we first demonstrate the usefulness of Jupyter Notebooks with examples and through discussion. Then, in order to cover the fundamentals of Jupyter Notebooks for beginners, we'll look at basic usage in terms of launching and interacting with the platform. For those who have used Jupyter Notebooks before, this will be mostly a review; however, you will certainly see new things in this section as well.

What is a Jupyter Notebook and Why is it Useful?

Jupyter Notebooks are locally run web applications which contain live code, equations, figures, interactive apps, and Markdown text. The standard language is Python, and that's what we'll be using for this book; however, note that a variety of alternatives are supported. This includes the other dominant data science language, R:

Those familiar with R will know about R Markdown. Markdown documents allow for Markdown-formatted text to be combined with executable code. Markdown is a simple language used for styling text on the web. For example, most GitHub repositories have a README.md Markdown file. This format is useful for basic text formatting. It's comparable to HTML but allows for much less customization.

Commonly used symbols in Markdown include hashes (#) to make text into a heading, square and round brackets to insert hyperlinks, and stars to create italicized or bold text:

Having seen the basics of Markdown, let's come back to R Markdown, where Markdown text can be written alongside executable code. Jupyter Notebooks offer the equivalent functionality for Python, although, as we'll see, they function quite differently than R Markdown documents. For example, R Markdown assumes you are writing Markdown unless otherwise specified, whereas Jupyter Notebooks assume you are inputting code. This makes it more appealing to use Jupyter Notebooks for rapid development and testing.

From a data science perspective, there are two primary types for a Jupyter Notebook depending on how they are used: lab-style and deliverable.

Lab-style Notebooks are meant to serve as the programming analog of research journals. These should contain all the work you've done to load, process, analyze, and model the data. The idea here is to document everything you've done for future reference, so it's usually not advisable to delete or alter previous lab-style Notebooks. It's also a good idea to accumulate multiple date-stamped versions of the Notebook as you progress through the analysis, in case you want to look back at previous states.

Deliverable Notebooks are intended to be presentable and should contain only select parts of the lab-style Notebooks. For example, this could be an interesting discovery to share with your colleagues, an in-depth report of your analysis for a manager, or a summary of the key findings for stakeholders.

In either case, an important concept is reproducibility. If you've been diligent in documenting your software versions, anyone receiving the reports will be able to rerun the Notebook and compute the same results as you did. In the scientific community, where reproducibility is becoming increasingly difficult, this is a breath of fresh air.

Navigating the Platform

Now, we are going to open up a Jupyter Notebook and start to learn the interface. Here, we will assume you have no prior knowledge of the platform and go over the basic usage.

Introducing Jupyter Notebooks

Navigate to the companion material directory in the terminal.

On Unix machines such as Mac or Linux, command-line navigation can be done using ls to display directory contents and cd to change directories. On Windows machines, use dir to display directory contents and use cd to change directories instead. If, for example, you want to change the drive from C: to D:, you should execute d: to change drives.

Start a new local Notebook server here by typing the following into the terminal:

jupyter notebook

A new window or tab of your default browser will open the Notebook Dashboard to the working directory. Here, you will see a list of folders and files contained therein.

Click on a folder to navigate to that particular path and open a file by clicking on it. Although its main use is editing IPYNB Notebook files, Jupyter functions as a standard text editor as well.

Reopen the terminal window used to launch the app. We can see the NotebookApp being run on a local server. In particular, you should see a line like this:

[I 20:03:01.045 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/?token=e915bb06866f19ce462d959a9193a94c7c088e81765f9d8a

Going to that HTTP address will load the app in your browser window, as was done automatically when starting the app. Closing the window does not stop the app; this should be done from the terminal by typing Ctrl + C.

Close the app by typing Ctrl + C in the terminal. You may also have to confirm by entering y. Close the web browser window as well.

When loading the NotebookApp, there are various options available to you. In the terminal, see the list of available options by running the following:

jupyter notebook --help

One such option is to specify a specific port. Open a NotebookApp at local port 9000 by running the following:

jupyter notebook --port 9000

The primary way to create a new Jupyter Notebook is from the Jupyter Dashboard. Click New in the upper-right corner and select a kernel from the drop-down menu (that is, select something in the Notebooks section).

Kernels provide programming language support for the Notebook. If you have installed Python with Anaconda, that version should be the default kernel. Conda virtual environments will also be available here.

Virtual environments are a great tool for managing multiple projects on the same machine. Each virtual environment may contain a different version of Python and external libraries. Python has built-in virtual environments; however, the Conda virtual environment integrates better with Jupyter Notebooks and boasts other nice features. The documentation is available at https://conda.io/docs/user-guide/tasks/manage-environments.html.
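For reference, here is a minimal sketch of creating a Conda environment and registering it as a Jupyter kernel; the environment name and Python version are placeholders:

conda create -n my-env python=3.6                    # create a new environment (name is hypothetical)
conda activate my-env                                # switch into the environment
pip install ipykernel                                # install the kernel machinery
python -m ipykernel install --user --name my-env     # register it so it appears in the New menu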

With the newly created blank Notebook, click on the top cell and type print('hello world'), or any other code snippet that writes to the screen. Execute it by clicking on the cell and pressing Shift + Enter, or by selecting Run Cell in the Cell menu.

Any stdout or stderr output from the code will be displayed beneath as the cell runs. Furthermore, the string representation of the object written in the final line will be displayed as well. This is very handy, especially for displaying tables, but sometimes we don't want the final object to be displayed. In such cases, a semicolon (;) can be added to the end of the line to suppress the display.
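As a minimal sketch of this display behavior, consider running the following in two separate cells:

# cell 1: the value of the final line is displayed beneath the cell
x = 2 + 2
x    # displays 4

# cell 2: a trailing semicolon suppresses that display
x;   # nothing is displayed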

New cells expect and run code input by default; however, they can be changed to render Markdown instead.

Click into an empty cell and change it to accept Markdown-formatted text. This can be done from the drop-down menu icon in the toolbar or by selecting Markdown from the Cell menu. Write some text in here (any text will do), making sure to utilize Markdown formatting symbols such as #.

Focus on the toolbar at the top of the Notebook. There is a Play icon there, which can be used to run cells. As we'll see later, however, it's handier to use the keyboard shortcut Shift + Enter to run cells. Right next to this is a Stop icon, which can be used to stop cells from running. This is useful, for example, if a cell is taking too long to run.

New cells can be manually added from the Insert menu. Cells can be copied, pasted, and deleted using icons or by selecting options from the Edit menu, and they can also be moved up and down this way. There are useful options under the Cell menu to run a group of cells or the entire Notebook.

Experiment with the toolbar options to move cells up and down, insert new cells, and delete cells.

An important thing to understand about these Notebooks is the shared memory between cells. It's quite simple: every cell existing on the sheet has access to the global set of variables. So, for example, a function defined in one cell could be called from any other, and the same applies to variables. As one would expect, anything within the scope of a function will not be a global variable and can only be accessed from within that specific function.
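As a minimal sketch of this shared scope, a function defined in one cell can be called from any other:

# cell 1: define a function; it joins the Notebook's global scope
def add_one(n):
    return n + 1

# cell 2: call it from an entirely separate cell
add_one(41)    # displays 42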

Open the Kernel menu to see the selections. The Kernel menu is useful for stopping script executions and restarting the Notebook if the kernel dies. Kernels can also be swapped here at any time, but it is inadvisable to use multiple kernels for a single Notebook due to reproducibility concerns.

Open the File menu to see the selections. The File menu contains options for downloading the Notebook in various formats. In particular, it's recommended to save an HTML version of your Notebook, where the content is rendered statically and can be opened and viewed "as you would expect" in web browsers.

The Notebook name will be displayed in the upper-left corner. New Notebooks will automatically be named Untitled.

Change the name of your IPYNB Notebook file by clicking on the current name in the upper-left corner and typing the new name. Then, save the file.

Close the current tab in your web browser (exiting the Notebook) and go to the Jupyter Dashboard tab, which should still be open. (If it's not open, then reload it by copy and pasting the HTTP link from the terminal.)

Since we didn't shut down the Notebook, but rather just saved and exited, it will have a green book symbol next to its name in the Files section of the Jupyter Dashboard and will be listed as Running on the right side, next to the last modified date. Notebooks can be shut down from here.

Quit the Notebook you have been working on by selecting it (checkbox to the left of the name) and clicking the orange Shutdown button.

If you plan to spend a lot of time working with Jupyter Notebooks, it's worthwhile to learn the keyboard shortcuts. This will speed up your workflow considerably. Particularly useful commands to learn are the shortcuts for manually adding new cells and converting cells from code to Markdown formatting. Click on Keyboard Shortcuts from the Help menu to see how.

Jupyter Features

Jupyter has many appealing features that make for efficient Python programming. These include an assortment of things, from methods for viewing docstrings to executing Bash commands. Let's explore some of these features together in this section.

The official IPython documentation can be found here: http://ipython.readthedocs.io/en/stable/. It has details on the features we will discuss here and others.

Exploring some of Jupyter's most useful features

From the Jupyter Dashboard, navigate to the chapter-1 directory and open the chapter-1-workbook.ipynb file by selecting it. The standard file extension for Jupyter Notebooks is .ipynb, which was introduced back when they were called IPython Notebooks.

Scroll down to Subtopic Jupyter Features in the Jupyter Notebook. We start by reviewing the basic keyboard shortcuts. These are especially helpful to avoid having to use the mouse so often, which will greatly speed up the workflow. Here are the most useful keyboard shortcuts. Learning to use these will greatly improve your experience with Jupyter Notebooks as well as your own efficiency:

Shift + Enter is used to run a cell

The Esc key is used to leave a cell

The M key is used to change a cell to Markdown (after pressing Esc)

The Y key is used to change a cell to code (after pressing Esc)

Arrow keys move cells (after pressing Esc)

The Enter key is used to enter a cell

Moving on from shortcuts, the help option is useful for beginners and experienced coders alike. It can help provide guidance at each uncertain step.

Users can get help by adding a question mark to the end of any object and running the cell. Jupyter finds the docstring for that object and returns it in a pop-out window at the bottom of the app.

Run the Getting Help section cells and check out how Jupyter displays the docstrings at the bottom of the Notebook. Add a cell in this section and get help on the object of your choice:
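A minimal sketch of this pattern, using NumPy purely as an example object:

import numpy as np

# appending ? to an object and running the cell pops up its docstring
np.arange?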

Tab completion can be used to do the following:

List available modules when importing external libraries

List available modules of imported external libraries

Function and variable completion

This can be especially useful when you need to know the available input arguments for a module, when exploring a new library, to discover new modules, or simply to speed up your workflow. Tab completion saves time spent writing out variable names or functions and reduces bugs from typos. It works so well that you may have difficulty coding Python in other editors after today!

Click into an empty code cell in the Tab Completion section and try using tab completion in the ways suggested immediately above. For example, the first suggestion can be done by typing import (including the space after) and then pressing the Tab key.

Last but not least of the basic Jupyter Notebook features are magic commands. These consist of one or two percent signs followed by the command. Magics starting with %% will apply to the entire cell, and magics starting with % will only apply to that line. This will make sense when seen in an example.

Scroll to the Jupyter Magic Functions section and run the cells containing %lsmagic and %matplotlib inline:

%lsmagic lists the available options. We will discuss and show examples of some of the most useful ones. The most common magic command you will probably see is %matplotlib inline, which allows matplotlib figures to be displayed in the Notebook without having to explicitly use plt.show().
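Here is a minimal sketch of the inline plotting behavior; the data points are arbitrary:

%matplotlib inline
import matplotlib.pyplot as plt

# the figure renders directly beneath the cell, with no plt.show() needed
plt.plot([0, 1, 2, 3], [0, 1, 4, 9])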

The timing functions are very handy and come in two varieties: a standard timer (%time or %%time) and a timer that measures the average runtime of many iterations (%timeit and %%timeit).

Run the cells in the Timers section. Note the difference between using one and two percent signs.
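As a rough sketch, these are two separate cells, since %%timeit must be the first line of its cell; the summation is an arbitrary workload:

# cell 1: %time runs the statement once and reports the elapsed time
%time total = sum(range(10**6))

%%timeit
# cell 2: %%timeit reports the average runtime over many repeated runs
total = sum(range(10**6))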

Even when using a Python kernel (as you are currently doing), other languages can be invoked using magic commands. The built-in options include JavaScript, R, Perl, Ruby, and Bash. Bash is particularly useful, as you can use Unix commands to find out where you are currently (pwd), what's in the directory (ls), make new folders (mkdir), and write file contents (cat / head / tail).

Run the first cell in the Using bash in the notebook section. This cell writes some text to a file in the working directory, prints the directory contents, prints an empty line, and then writes back the contents of the newly created file before removing it:
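The exact cell is in the workbook; a minimal sketch of what such a cell might look like (the filename is a placeholder):

%%bash
echo "hello from the notebook" > test.txt   # write some text to a file
ls                                          # print the directory contents
echo ""                                     # print an empty line
cat test.txt                                # write back the file contents
rm test.txt                                 # remove the file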

Run the following cells containing only ls and pwd. Note how we did not have to explicitly use the Bash magic command for these to work.

There are plenty of external magic commands that can be installed. A popular one is ipython-sql, which allows for SQL code to be executed in cells.

If you've not already done so, install ipython-sql now. Open a new terminal window and execute the following code:

pip install ipython-sql

Run the %load_ext sql cell to load the external command into the Notebook:

This allows for connections to remote databases so that queries can be executed (and thereby documented) right inside the Notebook.

Run the cell containing the SQL sample query:
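The exact query is in the workbook; as a minimal sketch, the following two cells connect to a temporary in-memory SQLite database and run a query against it (the table and values are placeholders):

# cell 1: connect using a SQLAlchemy-style connection string
%sql sqlite://

%%sql
-- cell 2: create a tiny table, insert a row, and query it
CREATE TABLE prices (day TEXT, price REAL);
INSERT INTO prices VALUES ('2018-01-01', 13000.0);
SELECT * FROM prices;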