39,59 €
This book is for developers who want to use Python to write programs that lean heavily on functional programming design patterns. You should be comfortable with Python programming, but no knowledge of functional programming paradigms is needed.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 518
Veröffentlichungsjahr: 2015
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: January 2015
Production reference: 1270115
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78439-699-2
www.packtpub.com
Author
Steven Lott
Reviewers
Oleg Broytman
Rui Carmo
Ian Cordasco
Julien Danjou
Amoatey Harrison
Shivin Kapur
Gong Yi
Commissioning Editor
Ed Gordon
Acquisition Editor
Owen Roberts
Content Development Editor
Sumeet Sawant
Technical Editor
Abhishek R. Kotian
Copy Editors
Roshni Banerjee
Pranjali Chury
Deepa Nambiar
Karuna Narayanan
Nithya P
Project Coordinator
Danuta Jones
Proofreaders
Stephen Copestake
Maria Gould
Bernadette Watkins
Indexer
Hemangini Bari
Graphics
Abhinash Sahu
Production Coordinator
Melwyn D'sa
Cover Work
Melwyn D'sa
Steven Lott has been programming since the 1970s, when computers were large, expensive, and rare. As a contract software developer and architect, he has worked on hundreds of projects, from very small to very large. He's been using Python to solve business problems for over 10 years. He's particularly adept struggling with gnarly data representation problems. His other titles include Mastering Object-Oriented Python and Python for Secret Agents, both by Packt Publishing. After spending years as a technomad—living in various places on the east coast of the US—he has dropped the hook in the Chesapeake Bay. He blogs at http://slott-softwarearchitect.blogspot.com.
Oleg Broytman is a software developer currently working with web technologies on Unix/Linux using the Python programming language on the server side and Javascript on the client side. Oleg started to work with computers even before the IBM PC era. He worked with DOS for some time and then switched to Unix, mostly Linux and FreeBSD. For about 25 years, he has been working in the medicine field in Moscow, Russia.
You can contact him at <[email protected]>.
I'd like to thank my wife Olga! She supports everything I do any way I do it (well, mostly!). Her love and support make me happy and allow me to achieve as much as I do.
Rui Carmo is a systems architect with 20 years' experience in telecoms and Internet, having worked in software development, product management, mobile network planning, systems engineering, virtualization, cloud services, and a lot of what the industry is currently trying to roll into the "DevOps" moniker. He has been coding in Python for over a decade (starting with Python 2.3) and has gravitated towards Clojure, Erlang, and Hy (an LISP that leverages the Python AST) in the past few years due to the intrinsic advantages of functional programming.
He currently lives in the wonderful city of Lisbon, Portugal, with his wife and two children, and he blogs at http://taoofmac.com. You can find him on GitHub, Twitter, and Hacker News as @rcarmo.
Julien Danjou is an open source hacker working at Red Hat. He started his career as a Debian developer and contributed to a lot of free software (GNU Emacs, Freedesktop, and so on), writing some software on his own, such as the awesome window manager.
Nowadays, Julien contributes to OpenStack, an open source cloud platform entirely written in Python. He has been a Python developer since then, worked on Hy (an LISP dialect in Python), and written a self-published book titled The Hacker's Guide to Python in 2014.
Amoatey Harrison is a Python programmer with a passion for building software systems to solve problems. When he is not programming, he is playing video games, swimming, or simply hanging out with friends.
After graduating from the Kwame Nkrumah University of Science and Technology with a degree in computer engineering, he is doing his national service at the GCB Bank Head Office in Accra, Ghana.
He would like to think of himself as a cool nerd.
ShivinKapur is an aspiring computer science student who is passionate about learning new things.
Gong Yi is a software developer working in Shanghai, China. He maintains an open source project at https://github.com/topikachu/python-ev3, which can control LEGO® MINDSTORMS® EV3 by Python language.
I thank my wife Zhu Xiaoling for her patience and love, and I thank my son Yang Yang for being the best thing ever.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
Programming languages sometimes fit neatly into tidy categories like imperative and functional. Imperative languages might further subdivide into those that are procedural and those that include features for object-oriented programming. The Python language, however, contains aspects of all of these three language categories. Though Python is not a purely functional programming language, we can do a great deal of functional programming in Python.
Most importantly, we can leverage many design patterns and techniques from other functional languages and apply them to Python programming. These borrowed concepts can lead us to create succinct and elegant programs. Python's generator expressions, in particular, avoid the need to create large in-memory data structures, leading to programs which may execute more quickly because they use fewer resources.
We can't easily create purely functional programs in Python. Python lacks a number of features that would be required for this. For example, we don't have unlimited recursion, lazy evaluation of all expressions, and an optimizing compiler.
Generally, Python emphasizes strict evaluation rules. This means that statements are executed in order and expressions are evaluated from left to right. While this deviates from functional purity, it allows us to perform manual optimizations when writing in Python. We'll take a hybrid approach to functional programming using Python's functional features when they can add clarity or simplify the code and use ordinary imperative features for optimization.
There are several key features of functional programming languages that are available in Python. One of the most important is the idea that functions are first-class objects. In some languages, functions exist only as a source code construct: they don't exist as proper data structures at runtime. In Python, functions can use functions as arguments and return functions as results.
Python offers a number of higher-order functions. Functions like map(), filter(), and functools.reduce() are widely used in this role. Less obvious functions like sorted(), min(), and max() are also higher-order functions; they have a default function and, consequently, different syntax from the more common examples.
Functional programs often exploit immutable data structures. The emphasis on stateless objects permits flexible optimization. Python offers tuples and namedtuples as complex but immutable objects. We can leverage these structures to adapt some design practices from other functional programming languages.
Many functional languages emphasize recursion but exploit Tail-Call Optimization (TCO). Python tends to limit recursion to a relatively small number of stack frames. In many cases, we can think of a recursion as a generator function. We can then simply rewrite it to use a yield from statement, doing the tail-call optimization ourselves.
We'll look at the core features of functional programming from a Python point of view. Our objective is to borrow good ideas from functional programming languages, and use these ideas to create expressive and succinct applications in Python.
Chapter 1, Introducing Functional Programming, introduces some of the techniques that characterize functional programming. We'll identify some of the ways to map these features to Python, and finally, we'll also address some ways that the benefits of functional programming accrue when we use these design patterns to build Python applications.
Chapter 2, Introducing Some Functional Features, will delve into six central features of the functional programming paradigm. We'll look at each in some detail to see how they're implemented in Python. We'll also point out some features of functional languages that don't apply well to Python. In particular, many functional languages have complex type-matching rules required to support compilation and optimization.
Chapter 3, Functions, Iterators, and Generators, will show how to leverage immutable Python objects and generator expressions, and adapt functional programming concepts to the Python language. We'll look at some of the built-in Python collection and how we can leverage them without departing too far from functional programming concepts.
Chapter 4, Working with Collections, shows how we can use a number of built-in Python functions to operate on collections of data. This section will focus on a number of relatively simple functions such as any() and all(), which will reduce a collection of values to a single result.
Chapter 5, Higher-order Functions, examines the commonly used higher order functions such as map() and filter(). The chapter also includes a number of other functions that are also higher-order functions, as well as how we can create our own higher-order functions.
Chapter 6, Recursions and Reductions, shows how we can design an algorithm using recursion and then optimize it into a high performance for loop. We'll also look at some other reductions that are widely used, including the collections.Counter() function.
Chapter 7, Additional Tuple Techniques, shows a number of ways in which we can use immutable tuples and namedtuples instead of stateful objects. Immutable objects have a much simpler interface: we never have to worry about abusing an attribute and setting an object into some inconsistent or invalid state.
Chapter 8, The Itertools Module, examines a number of functions in the standard library module. This collection of functions simplifies writing programs that deal with collections or generator functions.
Chapter 9, More Itertools Techniques, covers the combinatoric functions in the itertools module. These functions are somewhat less useful. This chapter includes some examples that illustrate ill-considered uses of these functions and the consequences of combinatoric explosion.
Chapter 10, The Functools Module, will show how to use some of the functions in this module for functional programming. A few of the functions in this module are more appropriate for building decorators, and are left for the next chapter. The other functions, however, provide several more ways to design and implement function programs.
Chapter 11, Decorator Design Techniques, shows how we can look at a decorator as a way to build a composite function. While there is considerable flexibility here, there are also some conceptual limitations: we'll look at ways in which overly complex decorators can become confusing rather than helpful.
Chapter 12, The Multiprocessing and Threading Modules, points out an important consequence of good functional design: we can distribute the processing workload. Using immutable objects means that we can't corrupt an object because of poorly synchronized write operations.
Chapter 13, Conditional Expressions and the Operator Module, will show some ways in which we can break out of Python's strict order of evaluation. There are limitations to what we can achieve here. We'll also look at the operator module and how the operator module can lead to some slight clarification of some simple kinds of processing.
Chapter 14, The PyMonad Library, examines some of the features of the PyMonad library. This provides some additional functional programming features. This also provides a way to learn more about monads. In some functional languages, monads are an important way to force a particular order for operations that might get optimized into an undesirable order. Since Python already has strict ordering of expressions and statements, the monad feature is more instructive than practical.
Chapter 15, A Functional Approach to Web Services, shows how we can think of web services as a nested collection of functions that transform a request into a reply. We'll see ways in which we can leverage functional programming concepts for building responsive, dynamic web content.
Chapter 16, Optimizations and Improvements, includes some additional tips on performance and optimization. We'll emphasize techniques like memoization because they're easy to implement and can—in the right context—yield dramatic performance improvements.
This book presumes some familiarity with Python 3 and general concepts of application development. We won't look deeply at subtle or complex features of Python; we'll avoid much consideration of the internals of the language.
We'll presume some familiarity with functional programming. Since Python is not a functional programming language, we can't dig deeply into functional concepts. We'll pick and choose the aspects of functional programming that fit well with Python and leverage just those that seem useful.
Some of the examples use Exploratory Data Analysis (EDA) as a problem domain to show the value of functional programming. Some familiarity with basic probability and statistics will help with this. There are only a few examples that move into more serious data science.
You'll need to have Python 3.3 or 3.4 installed and running. For more information on Python, visit http://www.python.org/.
In Chapter 14, The PyMonad Library, we'll look at installing this additional library. If you have Python 3.4 ,which includes pip and Easy Install, this will be very easy. If you have Python 3.3, you might have already installed pip or Easy Install or both. Once you have an installer, you can add PyMonad. Visit https://pypi.python.org/pypi/PyMonad/ for more details.
This book is for programmers who want to create succinct, expressive Python programs by borrowing techniques and design patterns from functional programming languages. Some algorithms can be expressed elegantly in a functional style; we can—and should—adapt this to make Python programs more readable and maintainable.
In some cases, a functional approach to a problem will also lead to extremely high performance algorithms. Python makes it too easy to create large intermediate data structures, tying up memory and processor time. With functional programming design patterns, we can often replace large lists with generator expressions that are equally expressive, but take up much less memory and run much more quickly.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title through the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the erratasubmissionform link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website, or added to any list of existing errata, under the Errata section of that title.
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.
Functional programming defines a computation using expressions and evaluation—often encapsulated in function definitions. It de-emphasizes or avoids the complexity of state change and mutable objects. This tends to create programs that are more succinct and expressive. In this chapter, we'll introduce some of the techniques that characterize functional programming. We'll identify some of the ways to map these features to Python. Finally, we'll also address some ways in which the benefits of functional programming accrue when we use these design patterns to build Python applications.
Python has numerous functional programming features. It is not a purely functional programming language. It offers enough of the right kinds of features that it confers to the benefits of functional programming. It also retains all optimization power available from an imperative programming language.
We'll also look at a problem domain that we'll use for many of the examples in this book. We'll try to stick closely to Exploratory Data Analysis (EDA) because its algorithms are often good examples of functional programming. Furthermore, the benefits of functional programming accrue rapidly in this problem domain.
Our goal is to establish some essential principles of functional programming. The more serious Python code will begin in Chapter 2, Introducing Some Functional Features.
We'll focus on Python 3 features in this book. However, some of the examples might also work in Python 2.
It's difficult to be definitive on what fills the universe of programming paradigms. For our purposes, we will distinguish between just two of the many programming paradigms: Functionalprogrammingand Imperativeprogramming. One important distinguishing feature between these two is the concept of state.
In an imperative language, like Python, the state of the computation is reflected by the values of the variables in the various namespaces. The values of the variables establish the state of a computation; each kind of statement makes a well-defined change to the state by adding or changing (or even removing) a variable. A language is imperative because each statement is a command, which changes the state in some way.
Our general focus is on the assignment statement and how it changes state. Python has other statements, such as global or nonlocal, which modify the rules for variables in a particular namespace. Statements like def, class, and import change the processing context. Other statements like try, except, if, elif, and else act as guards to modify how a collection of statements will change the computation's state. Statements like for and while, similarly, wrap a block of statements so that the statements can make repeated changes to the state of the computation. The focus of all these various statement types, however, is on changing the state of the variables.
Ideally, each statement advances the state of the computation from an initial condition toward the desired final outcome. This "advances the computation" assertion can be challenging to prove. One approach is to define the final state, identify a statement that will establish this final state, and then deduce the precondition required for this final statement to work. This design process can be iterated until an acceptable initial state is derived.
In a functional language, we replace state—the changing values of variables—with a simpler notion of evaluating functions. Each function evaluation creates a new object or objects from existing objects. Since a functional program is a composition of a function, we can design lower-level functions that are easy to understand, and we will design higher-level compositions that can also be easier to visualize than a complex sequence of statements.
Function evaluation more closely parallels mathematical formalisms. Because of this, we can often use simple algebra to design an algorithm, which clearly handles the edge cases and boundary conditions. This makes us more confident that the functions work. It also makes it easy to locate test cases for formal unit testing.
It's important to note that functional programs tend to be relatively succinct, expressive, and efficient when compared to imperative (object-oriented or procedural) programs. The benefit isn't automatic; it requires a careful design. This design effort is often easier than functionally similar procedural programming.
As part of our introduction, we'll look at a classic example of functional programming. This is based on the classic paper Why Functional Programming Matters by John Hughes. The article appeared in a paper called Research Topics in Functional Programming, edited by D. Turner, published by Addison-Wesley in 1990.
Here's a link to the paper Research Topics in Functional Programming:
http://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf
This discussion of functional programming in general is profound. There are several examples given in the paper. We'll look at just one: the Newton-Raphson algorithm for locating the roots of a function. In this case, the function is the square root.
It's important because many versions of this algorithm rely on the explicit state managed via loops. Indeed, the Hughes paper provides a snippet of the Fortran code that emphasizes stateful, imperative processing.
The backbone of this approximation is the calculation of the next approximation from the current approximation. The next_() function takes x, an approximation to the sqrt(n) method and calculates a next value that brackets the proper root. Take a look at the following example:
This function computes a series of values . The distance between the values is halved each time, so they'll quickly get to converge on the value such that, which means . We don't want to call the method next() because this name would collide with a built-in function. We call it the next_() method so that we can follow the original presentation as closely as possible.
Here's how the function looks when used in the command prompt:
We've defined the f() method as a lambda that will converge on . We started with 1.0 as the initial value for . Then we evaluated a sequence of recursive evaluations: , and so on. We evaluated these functions using a generator expression so that we could round off each value. This makes the output easier to read and easier to use with doctest. The sequence appears to converge rapidly on .
We can write a function, which will (in principle) generate an infinite sequence of values converging on the proper square root:
This function will generate approximations using a function, f(), and an initial value, a. If we provide the next_() function defined earlier, we'll get a sequence of approximations to the square root of the n argument.
The repeat() function expects the f() function to have a single argument, however, our next_() function has two arguments. We can use a lambda object, lambda x: next_(n, x), to create a partial version of the next_() function with one of two variables bound.
The Python generator functions can't be trivially recursive, they must explicitly iterate over the recursive results, yielding them individually. Attempting to use a simple return repeat(f, f(a)) will end the iteration, returning a generator expression instead of yielding the sequence of values.
We have two ways to return all the values instead of returning a generator expression, which are as follows:
Both techniques of yielding the values of a recursive generator function are equivalent. We'll try to emphasize yield from. In some cases, however, the yield with a complex expression will be more clear than the equivalent mapping or generator expression.
Of course, we don't want the entire infinite sequence. We will stop generating values when two values are so close to each other that we can call either one the square root we're looking for. The common symbol for the value, which is close enough, is the Greek letter Epsilon, ε, which can be thought of as the largest error we will tolerate.
In Python, we'll have to be a little clever about taking items from an infinite sequence one at a time. It works out well to use a simple interface function that wraps a slightly more complex recursion. Take a look at the following code snippet:
We've defined an internal function, head_tail(), which accepts the tolerance, ε, an item from the iterable sequence, a, and the rest of the iterable sequence, iterable. The next item from the iterable bound to a name b. If , then the two values that are close enough together that we've found the square root. Otherwise, we use the b value in a recursive invocation of the head_tail() function to examine the next pair of values.
Our within() function merely seeks to properly initialize the internal head_tail() function with the first value from the iterable parameter.
Some functional programming languages offer a technique that will put a value back into an iterable sequence. In Python, this might be a kind of unget() or previous() method that pushes a value back into the iterator. Python iterables don't offer this kind of rich functionality.
We can use the three functions next_(), repeat(), and within() to create a square root function, as follows:
We've used the repeat() function to generate a (potentially) infinite sequence of values based on the next_(n,x) function. Our within() function will stop generating values in the sequence when it locates two values with a difference less than ε.
When we use this version of the sqrt() method, we need to provide an initial seed value, a0, and an ε value. An expression like sqrt(1.0, .0001, 3) will start with an approximation of 1.0 and compute the value of to within 0.0001. For most applications, the initial a0 value can be 1.0. However, the closer it is to the actual square root, the more rapidly this method converges.
The original example of this approximation algorithm was shown in the Miranda language. It's easy to see that there are few profound differences between Miranda and Python. The biggest difference is Miranda's ability to construct cons, a value back into an iterable, doing a kind of unget. This parallelism between Miranda and Python gives us confidence that many kinds of functional programming can be easily done in Python.
Later in this book, we'll use the field of EDA as a source for concrete examples of functional programming. This field is rich with algorithms and approaches to working with complex datasets; functional programming is often a very good fit between the problem domain and automated solutions.
While details vary from author to author, there are several widely accepted stages of EDA. These include the following:
Visit the following link to get more information how SciPY works and its usage:
https://www.packtpub.com/big-data-and-business-intelligence/learning-scipy-numerical-and-scientific-computing or https://www.packtpub.com/big-data-and-business-intelligence/learning-python-data-visualization
Data modeling and machine learning: This tends to be proscriptive as it involves extending a model to new data. We're going to skirt this because some of the models can become mathematically complex. If we spend too much time on these topics, we won't be able to focus on functional programming.Evaluation and comparison: When there are alternative models, each must be evaluated to determine which is a better fit for the available data. This can involve ordinary descriptive statistics of model outputs. This can benefit from functional design techniques.The goal of EDA is often to create a model that can be deployed as a decision support application. In many cases, a model might be a simple function. A simple functional programming approach can apply the model to new data and display results for human consumption.
We've looked at programming paradigms with an eye toward distinguishing the functional paradigm from two common imperative paradigms. Our objective in this book is to explore the functional programming features of Python. We've noted that some parts of Python don't allow purely functional programming; we'll be using some hybrid techniques that meld the good features of succinct, expressive functional programming with some high-performance optimizations in Python.
In the next chapter, we'll look at five specific functional programming techniques in detail. These techniques will form the essential foundation for our hybridized functional programming in Python.
Since we're not using variables to track the state of a computation, our focus needs to stay on immutable objects. We can make extensive use of tuples and namedtuples to provide more complex data structures that are immutable.
The idea of immutable objects is not foreign to Python. There can be a performance advantage to using immutable tuples instead of more complex mutable objects. In some cases, the benefits come from rethinking the algorithm to avoid the costs of object mutation.
We will avoid class definitions (almost) entirely. It can seem like it's anathema to avoid objects in an Object-OrientedProgramming (OOP) language. Functional programming simply doesn't need stateful objects. We'll see this throughout this book. There are reasons for defining callable objects; it is a tidy way to provide namespace for closely-related functions, and it supports a pleasant level of configurability.
We'll look at a common design pattern that works well with immutable objects: the wrapper() function. A list of tuples is a fairly common data structure. We will often process this list of tuples in one of the two following ways:
For example, look at the following command snippet:
This fits the three-part pattern, although it might not be obvious how well it fits.
First, we wrap, using map(lambda yc: (yc[1],yc), year_cheese). This will transform each item into a two tuple with a key followed by the original item. In this example, the comparison key is merely yc[1].
Second, do the processing using the max() function. Since each piece of data has been simplified to a two tuple with position zero used for comparison, we don't really need the higher-order function feature of the max() function. The default behavior of the max() function is exactly what we require.
Finally, we unwrap using the subscript [1]. This will pick the second element of the two tuple selected by the max() function.
This kind of wrap and unwrap is so common that some languages have special functions with names like fst() and snd() that we can use as a function prefix instead of a syntactic suffix of [0] or [1]. We can use this idea to modify our wrap-process-unwrap example, as follows:
We defined a snd() function to pick the second item from a tuple. This provides us with an easier-to-read version of unwrap(process(wrap())). We used map(lambda... , year_cheese) to wrap our raw data items. We used max() function as the processing and, finally, the snd()