E-Book
43,19 €

Mastering IPython 4.0 E-Book

Thomas Bitterman

Publisher: Packt Publishing
Category: Science and New Technologies
Language: English

Description

Get to grips with the advanced concepts of interactive computing to make the most out of IPython

About This Book

The most up-to-date book on interactive computing with IPython 4.0
A detailed, example-rich guide to advanced interactive programming with IPython
A comprehensive guide to flexible interactive programming with IPython

Who This Book Is For

This book is for IPython developers who want to make the most of IPython and perform advanced scientific computing with IPython utilizing the ease of interactive computing.

It is ideal for users who wish to learn about the interactive and parallel computing properties of IPython 4.0, along with its integration with third-party tools and concepts such as testing and documenting results.

What You Will Learn

Develop skills to use IPython for high performance computing (HPC)
Understand the IPython interactive shell
Use ZeroMQ and MPI to pass messages
Integrate third-party tools like R, Julia, and JavaScript with IPython
Visualize the data
Acquire knowledge to test and document the data
Get to grips with the recent developments in the Jupyter notebook system

In Detail

IPython is an interactive computational environment in which you can combine code execution, rich text, mathematics, plots, and rich media.

This book will get IPython developers up to date with the latest advancements in IPython and dive deep into interactive computing with IPython. This advanced guide to interactive and parallel computing with IPython explores advanced visualizations and high-performance computing in detail.

You will quickly brush up on your knowledge of IPython kernels and wrapper kernels, then move on to advanced concepts such as testing, Sphinx, JS events, interactive work, and the ZMQ cluster. The book will cover topics such as IPython Console Lexer, advanced configuration, and third-party tools.

By the end of this book, you will be able to use IPython for interactive and parallel computing in a high-performance computing environment.

Style and approach

This is a comprehensive guide to IPython for interactive, exploratory, and parallel computing. It will bring IPython developers up to date with the latest advancements in IPython and dive deeper into interactive computing with IPython.

Details

Page count: 381

Year of publication: 2016

Sample

Mastering IPython 4.0

Credits

About the Author

About the Reviewer

www.PacktPub.com

eBooks, discount offers, and more

Why subscribe?

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. Using IPython for HPC

The need for speed

FORTRAN to the rescue – the problems FORTRAN addressed

Readability

Portability

Efficiency

The computing environment

Choosing between IPython and Fortran

Fortran

IPython

Object-orientation

Ease of adoption

Popularity – Fortran versus IPython

Useful libraries

The cost of building (and maintaining) software

Requirements and specification gathering

Development

Execution

Testing and maintenance

Alternatives

Cross-language development

Prototyping and exploratory development

An example case – Fast Fourier Transform

Fast Fourier Transform

Fortran

Python

Performance concerns

Software engineering concerns

Complexity-based metrics

Size-based metrics

Where we stand now

High Performance Computing

The HPC learning curve

Cloudy with a chance of parallelism (or Amazon's computer is bigger than yours)

HPC and parallelism

Clouds and HPC

Going parallel

Terminology

A parallel programming example

A serial program

A parallel equivalent

Discussion

Summary

2. Advanced Shell Topics

What is IPython?

Installing IPython

All-in-one distributions

Package management with conda

Canopy Package Manager

What happened to the Notebook?

Starting out with the terminal

IPython beyond Python

Shell integration

History

Magic commands

Creating custom magic commands

Cython

Configuring IPython

Debugging

Post-mortem debugging

Debugging at startup

Debugger commands

Read-Eval-Print Loop (REPL) and IPython architecture

Alternative development environments

Spyder

Canopy

PyDev

Others

Summary

3. Stepping Up to IPython for Parallel Computing

Serial processes

Program counters and address spaces

Batch systems

Multitasking and preemption

Time slicing

Threading

Threading in Python

Example

Limitations of threading

Global Interpreter Lock

What happens in an interpreter?

CPython

Multi-core machines

Kill GIL

Using multiple processors

The IPython parallel architecture

Overview

Components

The IPython Engine

The IPython Controller

The IPython Hub

The IPython Scheduler

Getting started with ipyparallel

ipcluster

Hello world

Using map_sync

Asynchronous calls

Synchronizing imports

Parallel magic commands

%px

%%px

%pxresult

%pxconfig

%autopx

Types of parallelism

SIMD

SPMD

ipcluster and mpiexec/mpirun

ipcluster and PBS

Starting the engines

Starting the controller

Using the scripts

MapReduce

Scatter and gather

A more sophisticated method

MIMD

MPMD

Task farming and load balancing

The @parallel function decorator

Data parallelism

No data dependence

External data dependence

Application steering

Debugging

First to the post

Graceful shutdown

Summary

4. Messaging with ZeroMQ and MPI

The storage hierarchy

Address spaces

Data locality

ZeroMQ

A sample ZeroMQ program

The server

The client

Messaging patterns in ZeroMQ

Pairwise

Server

Client

Discussion

Client/server

Server 1

Server 2

Client

Discussion

Publish/subscribe

Publisher

Subscriber

Discussion

Push/Pull

Ventilator

Worker

Sink

Discussion

Important ZeroMQ features

Issues using ZeroMQ

Startup and shutdown

Discovery

MPI

Hello World

Rank and role

Point-to-point communication

Broadcasting

Reduce

Discussion

Change the configuration

Divide the work

Parcel out the work

Process control

Master

Worker

ZeroMQ and IPython

ZeroMQ socket types

IPython components

Client

Engine(s)

Controller

Hub

Scheduler

Connection diagram

Messaging use cases

Registration

Heartbeat

IOPub

Summary

5. Opening the Toolkit – The IPython API

Performance profiling

Using utils.timing

Using %%timeit

Using %%prun

The AsyncResult class

multiprocessing.pool.Pool

Blocking methods

Nonblocking methods

Obtaining results

An example program using various methods

mp.pool.AsyncResult

Getting results

An example program using various methods

AsyncResultSet metadata

Metadata keys

Other metadata

The Client class

Attributes

Methods

The View class

View attributes

Calling Python functions

Synchronous calls

Asynchronous calls

Configurable calls

Job control

DirectView

Data movement

Dictionary-style data access

Scatter and gather

Push and pull

Imports

Discussion

LoadBalancedView

Data movement

Imports

Summary

6. Works Well with Others – IPython and Third-Party Tools

The rpy2 module/extension

Installing rpy2

Using Rmagic

The %R magic

The %%R magic

Pulling and pushing

Graphics

Using rpy2.robjects

The basics

Interpreting a string as R

Octave

The oct2py module/extension

Installing oct2py

Using Octave magic

The %octave magic

Tricky issues

The %%octave magic

Pushing and pulling

Graphics

Using the Octave module

Pushing and pulling

Running Octave code

The hymagic module/extension

Installing hymagic

Using hymagic

The %hylang magic

The %%hylang magic

A quick introduction to Hy

Hello world!

Get used to parentheses

Arithmetic operations are in the wrong place

Function composition is everywhere

Control structures in Hy

Setting variable values

Defining functions

if statements

Conditionals

Loops

Calling Python

Summary

7. Seeing Is Believing – Visualization

Matplotlib

Starting matplotlib

An initial graph

Modifying the graph

Controlling interactivity

Bokeh

Starting Bokeh

An initial graph

Modifying the graph

Customizing graphs

Interactive plots

An example interactive plot

Installing ggplot2 and pandas

Using DataFrames

An initial graph

Modifying the graph

A different view

Python-nvd3

Starting Python-nvd3

An initial graph

Putting some tools together

A different type of plot

Summary

8. But It Worked in the Demo! – Testing

Unit testing

A quick introduction

Assertions

Environmental issues

Before it starts – setup

While it is running – mocks

After it finishes – teardown

Writing to be tested

unittest

Important concepts

A test using setUp and tearDown

One-time setUp and tearDown

Decorators

pytest

Installation

Back compatibility

Test discovery

Organizing test files

Assertions

A test using setUp and tearDown

Classic xUnit-style

Being verbose

Using fixtures

Skipping and failing

Monkeypatching

nose2

Installation

Back compatibility

Test discovery

Running individual tests

Assertions, setup, and teardown

Modified xUnit-style

Using decorators

Plugins

Generating XML with the junitxml plugin

Summary

9. Documentation

Inline comments

Using inline comments

Function annotations

Syntax

Parameters

Return values

Semantics

Type hints

Syntax

Semantics

Docstrings

Example

Inheriting docstrings

Recommended elements

One-line docstrings

Syntax

Semantics

Multiline docstrings

Syntax

Semantics

The API

Inputs

Functionality

Outputs

Error conditions

Relationship with other parts of the system

Example uses

Example

reStructuredText

History and goals

Customers

The solution

Overview

Paragraphs

Text styles

Literal blocks

Lists

Enumerated lists

Bulleted lists

Definition lists

Hyperlinks

Sections

Docutils

Installation

Usage

Documenting source files

Sphinx

Installation and startup

Specifying the source files

Summary

10. Visiting Jupyter

Installation and startup

The Dashboard

Creating a notebook

Interacting with Python scripts

Working with cells

Cell tricks

Cell scoping

Cell execution

Restart & Run All

Magics

Cell structure

Code cells

Markdown cells

Raw cells

Heading cells

Being graphic

Using matplotlib

Using Bokeh

Using R

Using Python-nvd3

Format conversion

Other formats

nbviewer

Summary

11. Into the Future

Some history

The Jupyter project

The Notebook

The console

Jupyter Client

The future of Jupyter

Official roadmap

Official subprojects

Direct creation

Incorporation

Incubation

External incubation

IPython

Current activity

The rise of parallelism

The megahertz wars end

The problem

A parallel with the past

The present

Problems are getting bigger and harder

Computers are becoming more parallel

Clouds are rolling in

There is no Big Idea

Pragmatic evolution of techniques

Better tools

The Next Big Idea

Growing professionalism

The NSF

Software Infrastructure for Sustained Innovation

Summary

Index

Mastering IPython 4.0

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: May 2016

Production reference: 1240516

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78588-841-0

www.packtpub.com

Credits

Author

Thomas Bitterman

Reviewers

James Davidheiser

Dipanjan Deb

Commissioning Editor

Veena Pagare

Acquisition Editor

Manish Nainani

Content Development Editor

Deepti Thore

Technical Editor

Tanmayee Patil

Copy Editor

Vikrant Phadke

Project Coordinator

Shweta H. Birwatkar

Proofreader

Safis Editing

Indexer

Monica Ajmera Mehta

Graphics

Disha Haria

Production Coordinator

Nilesh Mohite

Cover Work

Nilesh Mohite

About the Author

Thomas Bitterman has a PhD from Louisiana State University and is currently an assistant professor at Wittenberg University. He previously worked in the industry for many years, including a recent stint at the Ohio Supercomputer Center. Thomas has experience in such diverse areas as electronic commerce, enterprise messaging, wireless networking, supercomputing, and academia. He also likes to keep sharp, writing material for Packt Publishing and O'Reilly in his copious free time.

I would like to thank my girlfriend for putting up with the amount of time this writing has taken away.

The Ohio Supercomputer Center has been very generous with their resources. The AweSim infrastructure (https://awesim.org/en/) is truly years ahead of anything else in the field. The original architect must have been a genius.

And last (but by no means least), I would like to thank Deepti Thore, Manish Nainani, Tanmayee Patil and everyone else at Packt, without whose patience and expertise this project would have never come to fruition.

About the Reviewer

Dipanjan Deb is an experienced analytics professional with about 16 years of cumulative experience in machine/statistical learning, data mining, and predictive analytics across the healthcare, maritime, automotive, energy, CPG, and human resource domains. He is highly proficient in developing cutting-edge analytic solutions using open source and commercial packages to integrate multiple systems to provide massively parallelized and large-scale optimization.

He has extensive experience in building analytics teams of data scientists that deliver high-quality solutions. Dipanjan strategizes and collaborates with industry experts, technical experts, and data scientists to build analytic solutions that shorten the transition from a POC to commercial release.

He is well versed in overarching supervised, semi-supervised, and unsupervised learning algorithm implementations in R, Python, Vowpal Wabbit, Julia, and SAS; and distributed frameworks, including Hadoop and Spark, both on-premises and in cloud environments. He is a part-time Kaggler and IoT/IIoT enthusiast (Raspberry Pi and Arduino prototyping).

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Preface

Welcome to the world of IPython 4.0, which is used in high performance and parallel environments. Python itself has been gaining traction in these areas, and IPython builds on these strengths.

High-performance computing (HPC) has a number of characteristics that make it different from the majority of other computing fields. We will start with a brief overview of what makes HPC different and how IPython can be a game-changing technology.

We will then start on the IPython command line. Now that Jupyter has split from the IPython project, this is the primary means by which the developer will interface with the language. This is an important enough topic to devote two chapters to. In the first, we will concentrate on basic commands and gaining an understanding of how IPython carries them out. The second chapter will cover more advanced commands, leaving the reader with a solid grounding in what the command line has to offer.

After that, we will address some particulars of parallel programming. IPython parallel is a package that contains a great deal of functionality required for parallel computing in IPython. It supports a flexible set of parallel programming models and is critical if you want to harness the power of massively parallel architectures.

Programs running in parallel but on separate processors often need to exchange information despite having separate address spaces. They do so by sending messages. We will cover two messaging systems, ZeroMQ and MPI, looking at both how they are used in existing programs and how they interact with IPython.

We will then take a deeper look at libraries that can enhance your productivity, whether included in IPython itself or provided by third parties. There are far too many tools to cover in this book, and more are being written all the time, but a few will be particularly applicable to parallel and HPC projects.

An important feature of IPython is its support for visualization of datasets and results. We will cover some of IPython's extensive capabilities, whether built into the language or provided through external tools.

Rounding off the book will be material on testing and documentation. These oft-neglected topics separate truly professional code from also-rans, and we will look at IPython's support for these phases of development. Finally, we will discuss where the field is going. Part of the fun of programming is that everything changes every other year (if not sooner), and we will speculate on what the future might hold.

What this book covers

Chapter 1, Using IPython for HPC, discusses the distinctive features of parallel and HPC computing and how IPython fits in (and how it does not).

Chapter 2, Advanced Shell Topics, introduces the basics of working with the command line including debugging, shell commands, and embedding, and describes the architecture that underlies it.

Chapter 3, Stepping Up to IPython for Parallel Computing, explores the features of IPython that relate directly to parallel computing. Different parallel architectures will be introduced and IPython's support for them will be described.

Chapter 4, Messaging with ZeroMQ and MPI, covers these messaging systems and how they can be used with IPython and parallel architectures.

Chapter 5, Opening the Toolkit – The IPython API, introduces some of the more useful libraries included with IPython, including performance profiling, AsyncResult, and View.

Chapter 6, Works Well with Others – IPython and Third-Party Tools, describes tools created by third parties, including R, Octave, and Hy. The appropriate magics are introduced, passing data between the languages is demonstrated, and sample programs are examined.

Chapter 7, Seeing Is Believing – Visualization, provides an overview of various tools that can be used to produce visual representations of data and results. Matplotlib, Bokeh, R, and Python-nvd3 are covered.

Chapter 8, But It Worked in the Demo! – Testing, covers issues related to unit testing programs and the tools IPython provides to support this process. Frameworks discussed include unittest, pytest, and nose2.

Chapter 9, Documentation, discusses the different audiences for documentation and their requirements. The use of docstrings with reStructuredText, docutils, and Sphinx is demonstrated in the context of good documentation standards.

Chapter 10, Visiting Jupyter, introduces the Jupyter notebook and describes its use as a laboratory notebook combining data and calculations.

Chapter 11, Into the Future, reflects on the current rapid rate of change and speculates on what the future may hold, both in terms of the recent split between IPython and the Jupyter project and relative to some emerging trends in scientific computing in general.

What you need for this book

This book was written using the IPython 4.0 and 4.0.1 (stable) releases from August 2015 through March 2016; all examples and functions should work in these versions. When third-party libraries are required, the version used will be noted at that time. Given the rate of change of the IPython 4 implementation, the various third-party libraries, and the field in general, it is unfortunately doubtful that every example in this book will run on every reader's machine. Differences in machine architecture and configuration only make the problem worse. Despite efforts to write straightforward, portable code, the reader should not be surprised if some work is required to make the odd example work on their system.

Who this book is for

This book is for IPython developers who want to make the most of IPython and perform advanced scientific computing with IPython, utilizing the ease of interactive computing.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-IPython-4. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/MasteringIPython40_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Using IPython for HPC

In this chapter, we are going to look at why IPython should be considered a viable tool for building high-performance and parallel systems.

This chapter covers the following topics:

The need for speed
Fortran as a solution
Choosing between IPython and Fortran
An example case – the Fast Fourier Transform
High-performance computing and the cloud
Going parallel

The need for speed

Computers have never been fast enough. From their very beginnings in antiquity as abaci to the building-sized supercomputers of today, the cry has gone up "Why is this taking so long?"

This is not an idle complaint. Humanity's ability to control the world depends on its ability to model it and to simulate different courses of action within that model. A medieval trader, before embarking on a trading mission, would pull out his map (his model of the world) and plot a course (a simulation of his journey). To do otherwise was to invite disaster. It took a long period of time and a specialized skill set to use these tools. A good navigator was an important team member. To go where no maps existed was a perilous journey.

The same is true today, except that the models have become larger and the simulations more intricate. Testing a new nuclear missile by actually launching it is ill-advised. Instead, a model of the missile is built in software and a simulation of its launching is run on a computer. Design flaws can be exposed in the computer (where they are harmless), and not in reality.

Modeling a missile is much more complex than modeling the course of a ship. There are more moving parts, the relevant laws of physics are more complicated, the tolerance for error is lower, and so on and so forth. This would not be possible without employing more sophisticated tools than the medieval navigator had access to. In the end, it is our tools' abilities that limit what we can do.

It is the nature of problems to expand to fill the limits of our capability to solve them. When computers were first invented, they seemed like the answer to all our problems. It did not take long before new problems arose.

Choosing between IPython and Fortran

We will start by taking a look at each language in general, and follow that with a discussion on the cost factors that impact a software project and how each language can affect them. No two software development projects are the same, and so the factors discussed next (along with many, many others) should serve as guidelines for the choice of language. This chapter is not an attempt to promote IPython at the expense of Fortran, but it shows that IPython is a superior choice when implementing certain important types of systems.

Fortran

Many of the benefits and drawbacks of Fortran are linked to its longevity. For the kinds of things that have not changed over the decades, Fortran excels (for example, numerical computing, which is what the language was originally designed for). Newer developments (for example, text processing, objects) have been added to the language in its various revisions.

The benefits of Fortran are as follows:

Compilation makes for efficient runtime performance
Existence of many tested and optimized libraries for scientific computing
Highly portable
Optimized for scientific computing (especially matrix operations)
Stable language definition with a well-organized system for revisions

The drawbacks of Fortran are as follows:

Text processing is an add-on
Object-orientation is a recent addition
Shrinking pool of new talent

IPython

IPython/Python is the new kid in town. It began in 2001 when Fernando Perez decided that he wanted some additional features out of Python. In particular, he wanted a more powerful command line and integration with a lab-notebook-style interface. The end result was a development environment that placed greater emphasis on ongoing interaction with the system than what traditional batch processing provided.

The nearly 45-year delay between the advent of Fortran and IPython's birth provided IPython the advantage of being able to natively incorporate ideas about programming that have arisen since Fortran was created (for example, object-orientation and sophisticated data structuring operations). However, its relative newness puts it behind in terms of installed code base and libraries. IPython, as an extension of Python, shares its benefits and drawbacks to a large extent.

The benefits of IPython are as follows:

Good at non-numeric computing
More concise
Many object-oriented features
Ease of adoption
Useful libraries
Sophisticated data structuring capabilities
Testing and documentation frameworks
Built-in visualization tools
Ease of interaction while building and running systems
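
As a small illustration of the conciseness and data structuring benefits listed above, here is a minimal sketch of a text-processing task that uses only the Python standard library. The function name `word_frequencies` and the sample sentence are hypothetical, chosen for illustration and not taken from the book.

```python
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    # Split on whitespace, then trim common punctuation from each token.
    words = (word.strip(".,;:!?\"'()").lower() for word in text.split())
    # Counter is a dictionary subclass specialized for tallying.
    return Counter(word for word in words if word)


sample = "Computers have never been fast enough. Computers never will be."
frequencies = word_frequencies(sample)
print(frequencies.most_common(2))
```

The whole task fits in a few lines because string handling and a counting dictionary are built into the language and its standard library.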

The drawbacks of IPython are as follows:

Its interpreted nature makes for slower runtime
Fewer libraries (although the ones that exist are of high quality)
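
The runtime cost of interpretation can be observed with a rough experiment. The sketch below assumes the standard CPython interpreter. It times the same summation twice: once as an explicit loop whose every iteration is executed by the interpreter, and once with the built-in `sum`, whose loop runs in compiled C code. The function names are hypothetical, and the timings depend on the machine, so no particular numbers are claimed here.

```python
import timeit


def loop_sum(n):
    """Sum 0..n-1 with an explicit loop executed by the interpreter."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Compute the same sum, with the loop running inside built-in sum()."""
    return sum(range(n))


n = 100_000
# Run each version 20 times and report the total elapsed time.
loop_seconds = timeit.timeit(lambda: loop_sum(n), number=20)
builtin_seconds = timeit.timeit(lambda: builtin_sum(n), number=20)
print(f"explicit loop: {loop_seconds:.4f} s, built-in sum: {builtin_seconds:.4f} s")
```

On a typical CPython installation the built-in version is expected to be noticeably faster, which hints at why numerical Python code usually pushes its inner loops into compiled libraries.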

Some of these benefits deserve more extensive treatment here, while others merit entire chapters.

Object-orientation

Object-oriented programming (OOP) was designed for writing simulations. While some simulations reduce to computational application of physical laws (for example, fluid dynamics), other types of simulation (for example, traffic patterns and neural networks) require modeling the entities involved at a more abstract level. This is more easily accomplished with a language that supports classes and objects (such as Python) than with a purely imperative language.

The ability to match a program structure to a problem's structure makes it easier to write, test, and debug a system. The OOP paradigm is simply superior when simulating a large number of individually identifiable, complex elements.
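
To make the traffic example concrete, here is a minimal sketch of a one-lane road on which each car is an individually identifiable object. The `Car` and `Road` classes and the simple "never close the gap completely" rule are assumptions made for illustration, not a model taken from the book.

```python
class Car:
    """A hypothetical vehicle on a one-lane road, identified by name."""

    def __init__(self, name, position, speed):
        self.name = name
        self.position = position  # distance along the road, arbitrary units
        self.speed = speed        # maximum distance covered per time step

    def step(self, gap_ahead):
        """Advance one time step, staying at least one unit behind the car ahead."""
        self.position += min(self.speed, max(gap_ahead - 1, 0))


class Road:
    """Holds the cars and advances the whole simulation."""

    def __init__(self, cars):
        # Keep cars ordered from the front of the queue to the back.
        self.cars = sorted(cars, key=lambda car: car.position, reverse=True)

    def step(self):
        leader_position = None
        for car in self.cars:
            if leader_position is None:
                gap = float("inf")  # the lead car has open road
            else:
                gap = leader_position - car.position
            car.step(gap)
            leader_position = car.position


road = Road([Car("fast", 0, 3), Car("slow", 2, 1)])
for _ in range(5):
    road.step()
print([(car.name, car.position) for car in road.cars])
# → [('slow', 7), ('fast', 6)]
```

The structure of the program mirrors the structure of the problem: the faster car ends up stuck behind the slower one, and that behavior lives in the `Car` class rather than being spread across arrays of positions and speeds.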

Ease of adoption

Python is easy to learn. It is currently the most popular introductory programming language among the top 39 computer science departments in the United States (http://cacm.acm.org/blogs/blog-cacm/176450-python-is-now-the-most-popular-introductory-teaching-language-at-top-us-universities/fulltext):

Note that Fortran is not on the list.

This is no accident, nor is Python limited to a "teaching language." Rather, it is a well-designed language with an easy-to-learn syntax and a gentle learning curve. It is much easier to learn Python than Fortran, and it is also easier to move from Fortran to Python than the reverse. This has led to an increasing use of Python in many areas.

Popularity – Fortran versus IPython

The trend toward teaching Python has meant that there is a much larger pool of potential developers who know Python. This is an important consideration when staffing a project.

TIOBE Software ranks the popularity of programming languages based on the number of skilled engineers, courses, and third-party vendors. Their rankings for October 2015 put Python in fifth place and rising. Fortran is 22nd (behind COBOL, which is 21st).

IEEE uses its own methods, and they produced the following graph:

The column on the left is the 2015 ranking, and the column on the right is the 2014 ranking, for comparison. Fortran came in 29th, with a Spectrum ranking of 39.5.

Useful libraries

The growing number of Python coders has led to an increasing number of libraries written in or for Python. SciPy, NumPy, and Sage are leading the way, with new open source libraries coming out on a regular basis. The usefulness of a language is heavily dependent on its libraries, and while Python cannot boast the depth in this field that Fortran can, the sheer number of Python developers means that it is closing the gap rapidly.

The cost of building (and maintaining) software

If developers were all equal in talent and worked for free, development time were no object, all code were bug-free, and all programs needed to run only once before being thrown away, Fortran would be the clear winner given its efficiency and installed library base.

This is not how commercial software is developed. At a first approximation, a software project's cost can be broken down into the cost of several parts:

Requirements and specification gathering
Development
Execution
Testing and maintenance

Requirements and specification gathering

There is no clear difference between IPython and Fortran in the difficulty of producing good requirements and specifications. These activities are language-independent. While the availability of prewritten software packages may impact parts of the specification, both languages are equally capable of reducing requirements and specifications to a working system.

Development

As discussed previously, Python code tends to be more concise, leading to higher programmer productivity. Combine this with the growing numbers of developers already fluent in Python and Python is the clear winner in terms of reducing development time.

Execution

If it is costly to run on the target system (which is true for many supercomputers), or the program takes a long time to run (which is true for some large-scale simulations such as weather prediction), then the runtime efficiency of Fortran is unmatched. This consideration looms especially large when development on a program has largely concluded and the majority of the time spent on it is in waiting for it to complete its run.

Testing and maintenance

There are many different styles of testing: unit, coverage, mocks, web, and GUI, to name just a few. Good tests are hard to write, and the effort put into them is often unappreciated. Most programmers will avoid writing tests if they can. To that end, it is important to have a set of good, easy-to-use testing tools.

Python has the advantage in this area, particularly because of quality unit testing frameworks such as unittest, nose, and Pythoscope. The introspection capabilities of the Python language make the writing and use of testing frameworks much easier than those available for Fortran.
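To give a flavor of how little ceremony a unit test requires, here is a minimal sketch using the standard library's unittest module. The function celsius_to_kelvin and the test class are hypothetical examples invented for illustration. The loader finds the test methods by inspecting the class for names that start with test, which is one small instance of the introspection mentioned above.

```python
import unittest


def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvins (example function)."""
    if celsius < -273.15:
        raise ValueError("temperature below absolute zero")
    return celsius + 273.15


class CelsiusToKelvinTest(unittest.TestCase):
    def test_freezing_point(self):
        self.assertAlmostEqual(celsius_to_kelvin(0.0), 273.15)

    def test_rejects_impossible_temperature(self):
        with self.assertRaises(ValueError):
            celsius_to_kelvin(-300.0)


# The loader discovers test methods by introspecting the class.
loader = unittest.defaultTestLoader
test_names = list(loader.getTestCaseNames(CelsiusToKelvinTest))

# Build a suite from the class and run it, collecting the results.
suite = loader.loadTestsFromTestCase(CelsiusToKelvinTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(test_names, result.wasSuccessful())
```

No test had to be registered by hand: naming a method test_something is enough for the framework to find and run it.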

You could always just skip testing (it is, after all, expensive and unpopular), or do it the old-fashioned way: try a few values and check whether they work. This leads to an important consideration governing how much testing to do: the cost of being wrong. This type of cost is especially important in scientific and engineering computing. While the legal issues surrounding software liability are in flux, moral and practical considerations are important. No one wants to be the developer who was responsible for lethally overdosing chemotherapy patients because of a bug. There are types of programming for which this is not important (word processors come to mind), but any system that involves human safety or financial risk incurs a high cost when something goes wrong.

Maintenance costs are similar to testing costs in that maintenance programming tends to be unpopular and allows new errors to creep into previously correct code. Python's conciseness reduces maintenance costs by reducing the number of lines of code that need to be maintained. The superior testing tools allow the creation of comprehensive regression testing suites to minimize the chances of errors being introduced during maintenance.

Alternatives

There are alternatives to the stark IPython/Fortran choice: cross-language development and prototyping.

Cross-language development

Python began as a scripting language. As such, it was always meant to be able to interoperate with other languages. This can be a great advantage in several situations:

A divided development team: If some of your developers know only Fortran and some know only Python, it can be worth it to partition the system between the groups and define a well-structured interface between them. Functionality can then be assigned to the appropriate team:

Runtime-intensive sections to the Fortran group
Process coordination, I/O, and others to the Python group

Useful existing libraries: It always seems like there is a library that does exactly what is needed but it is written in another language. Python's heritage as a scripting language means that there are many tools that can be used to make this process easier. Of particular interest in this context is F2Py (part of NumPy), which makes interfacing with Fortran code easier.

Specialized functionality: Even without a pre-existing library, it may be advantageous to write some performance-sensitive modules in Fortran. This can raise development, testing, and maintenance costs, but it can sometimes be worth it. Conversely, IPython provides specialized functionality in several areas (testing, introspection, and graphics) that Fortran projects could use.
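The well-structured interface between the two groups can be sketched in Python as an abstract base class. Everything named here (Integrator, PythonIntegrator, run_simulation) is a hypothetical illustration. The pure-Python implementation stands in for a Fortran-backed one, which would typically wrap a compiled module (for example, one generated by F2Py) and is not shown because it needs a Fortran compiler.

```python
from abc import ABC, abstractmethod


class Integrator(ABC):
    """Hypothetical interface agreed upon by the Fortran and Python groups."""

    @abstractmethod
    def integrate(self, values, dx):
        """Return the trapezoidal integral of equally spaced samples."""


class PythonIntegrator(Integrator):
    """Pure-Python reference implementation, handy for tests and prototypes.

    A Fortran-backed subclass could replace it without changing any caller.
    """

    def integrate(self, values, dx):
        if len(values) < 2:
            return 0.0
        interior = sum(values[1:-1])
        return dx * (0.5 * (values[0] + values[-1]) + interior)


def run_simulation(integrator, samples, dx):
    """Coordination code owned by the Python group.

    It depends only on the Integrator interface, not on who implements it.
    """
    return integrator.integrate(samples, dx)


area = run_simulation(PythonIntegrator(), [0.0, 1.0, 2.0, 3.0, 4.0], dx=1.0)
print(area)
```

Because the coordination code sees only the interface, the runtime-intensive part can be moved to Fortran later, and the reference implementation remains available for checking its results.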

Prototyping and exploratory development

It is often the case that it is not clear before writing a program how useful that program will turn out to be. Experience with the finished product would provide important feedback, but building the entire system would be prohibitively costly.

Similarly, there may be several different ways to build a system. Without clear guidelines to start with, the only way to decide between alternatives is to build several different versions and see which one is the best.

These cases share the problem of needing the system to be complete before being able to decide whether to build the system in the first place.

The solution is to build a prototype: a partially functional system that nevertheless incorporates important features of the finished product as envisioned. The primary virtue of a prototype is its short development time and concomitant low cost. It is often the case that the prototype (or prototypes) will be thrown away after a short period of evaluation. Errors, maintainability, and software quality in general are important only insofar as they matter to evaluating the prototype (say, for use in estimating the schedule for the entire project).

Python excels as a prototyping language. It is flexible and easy to work with (reducing development time) while being powerful enough to implement sophisticated algorithms. Its interpreted nature is not an issue, as prototypes are generally not expected to be efficient (only quick and cheap).

It is possible to adopt an approach known as Evolutionary Prototyping. In this approach, an initial prototype is built and evaluated. Based on this evaluation, changes are decided upon. The changes are made to the original prototype, yielding an improved version. This cycle repeats until the software is satisfactory. Among other advantages, this means that a working version of the system is always available for benchmarking, testing, and so on. The results of the ongoing evaluations may point out functionality that would be better implemented in one language or another, and these changes could be made as described in the section on cross-language development.
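One way such an evaluation might identify functionality worth moving to another language is to profile the prototype. The following is a minimal sketch using the standard library's cProfile and pstats modules. The functions slow_kernel and prototype are hypothetical stand-ins for real prototype code.

```python
import cProfile
import io
import pstats


def slow_kernel(n):
    """Numeric loop standing in for a runtime-intensive section."""
    total = 0.0
    for i in range(n):
        total += i * i
    return total


def prototype(n):
    """Stand-in for the prototype: light coordination plus one heavy kernel."""
    data = list(range(10))
    return slow_kernel(n) + sum(data)


# Profile a single run of the prototype.
profiler = cProfile.Profile()
profiler.enable()
prototype(200_000)
profiler.disable()

# Collect the five entries with the largest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Functions that dominate the cumulative time in such a report are the natural candidates for reimplementation in a faster language, while the rest of the prototype can stay in Python.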
