Boost your scientific and analytic capabilities in no time at all by discovering how to build real-world applications with NumPy
If you are an experienced Python developer who intends to drive your numerical and scientific applications with NumPy, this book is for you. Prior experience or knowledge of working with the Python language is required.
In today's world of science and technology, it's all about speed and flexibility. When it comes to scientific computing, NumPy tops the list. NumPy gives you both the speed and high productivity you need.
This book will walk you through NumPy using clear, step-by-step examples and just the right amount of theory. We will guide you through wider applications of NumPy in scientific computing and will then focus on the fundamentals of NumPy, including array objects, functions, and matrices, each of them explained with practical examples.
You will then learn about different NumPy modules while performing mathematical operations such as calculating the Fourier Transform; solving linear systems of equations, interpolation, extrapolation, regression, and curve fitting; and evaluating integrals and derivatives. We will also introduce you to using Cython with NumPy arrays and writing extension modules for NumPy code using the C API. This book will give you exposure to the vast NumPy library and help you build efficient, high-speed programs using a wide range of mathematical features.
This quick guide will help you get to grips with the nitty-gritty of NumPy using practical programming examples. Each topic is explained in both theoretical and practical terms, with hands-on examples that give you an efficient way of learning and adequate knowledge to support your professional work.
Page count: 181
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: April 2016
Production reference: 1220416
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78439-367-0
www.packtpub.com
Authors
Leo (Liang-Huan) Chin
Tanmay Dutta
Copy Editor
Sonia Cheema
Reviewers
Miklós Prisznyák
Pruthuvi Maheshakya Wijewardena
Project Coordinator
Izzat Contractor
Commissioning Editor
Kartikey Pandey
Proofreader
Safis Editing
Acquisition Editor
Larissa Pinto
Indexer
Rekha Nair
Content Development Editor
Rohit Singh
Graphics
Kirk D'Penha
Disha Haria
Jason Monteiro
Technical Editor
Murtaza Tinwala
Production Coordinator
Melwyn Dsa
Leo (Liang-Huan) Chin is a data engineer with more than 5 years of experience in the field of Python. He works for Gogoro smart scooter, Taiwan, where his job entails discovering new and interesting biking patterns. His previous work experience includes ESRI, California, USA, which focused on spatial-temporal data mining. He loves data, analytics, and the stories behind data and analytics. He received an MA degree of GIS in geography from State University of New York, Buffalo. When Leo isn't glued to a computer screen, he spends time on photography, traveling, and exploring some awesome restaurants across the world. You can reach Leo at http://chinleock.github.io/portfolio/.
Tanmay Dutta is a seasoned programmer with expertise in programming languages such as Python, Erlang, C++, Haskell, and F#. He has extensive experience in developing numerical libraries and frameworks for investment banking businesses. He was also instrumental in the design and development of a risk framework in Python (pandas, NumPy, and Django) for a wealth fund in Singapore. Tanmay has a master's degree in financial engineering from Nanyang Technological University, Singapore, and a certification in computational finance from Tepper Business School, Carnegie Mellon University.
I would like to thank my wife and my brother for invaluable technical guidance and the rest of my family for supporting and encouraging me to write this book. I would like to express my gratitude to the editors, who provided me with support, encouragement, and valuable comments regarding the content and format and assisted me in the editing of this book. I would like to thank Packt Publishing for giving me the opportunity to coauthor this book.
Miklós Prisznyák is a senior software engineer with a scientific background. He graduated as a physicist and worked on his MSc thesis on Monte Carlo simulations of non-Abelian lattice quantum field theories in 1992. Having worked for 3 years at the Central Research Institute for Physics in Hungary, he joined MultiRáció Kft. in Budapest, a company founded by other physicists, which specialized in mathematical data analysis and the forecasting of economic data. It was here that he discovered the Python programming language in 2000. He set up his own consulting company in 2002 and worked on various projects for insurance, pharmacy, and e-commerce companies, using Python whenever he could. He also worked for a European Union research institute in Italy, testing, debugging, and developing a distributed, Python-based Zope/Plone web application. He moved to Great Britain in 2007, and at first, he worked for a Scottish start-up using Twisted Python. He then worked in the aerospace industry in England using, among others, the PyQt windowing toolkit, the Enthought application framework, and the NumPy and SciPy libraries. He returned to Hungary in 2012 and rejoined MultiRáció. Since then, he's mainly worked on a Python extension to OpenOffice/EuroOffice using NumPy and SciPy again, which allows users to solve nonlinear and stochastic optimization problems with the spreadsheet software Calc. He has also used Django, which is the most popular Python web framework currently. Miklós likes to travel and read books, and he is interested in the sciences, mathematics, linguistics, history, politics, go (the board game), and a few other topics. Besides this, he enjoys a good cup of coffee. However, he thinks nothing beats spending time with his brilliant, maths-savvy, Minecraft-programming, 13-year-old son, Zsombor, who also learned English on his own.
Pruthuvi Maheshakya Wijewardena holds a bachelor's degree in engineering from University of Moratuwa, Sri Lanka. He has contributed to the scikit-learn machine learning library as a Google Summer of Code participant and has experience working with the Python language, especially the NumPy, SciPy, pandas, and statsmodels libraries. While studying for his undergraduate degree, he was able to publish his thesis on machine learning. Currently, he works as a software engineer at WSO2, as a part of the data analytics team.
I would like to thank my mother, brothers, teachers, and friends.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
If you have an account with Packt at www.packtpub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
Whether you are new to scientific/analytic programming, or a seasoned expert, this book will provide you with the skills you need to successfully create, optimize, and distribute your Python/NumPy analytical modules.
Starting from the beginning, this book will cover the key features of NumPy arrays and the details of tuning the data format to best fit your analytical needs. You will then get a walkthrough of the core and submodules that are common to various multidimensional, data-typed analysis. Next, you will move on to key technical implementations, such as linear algebra and Fourier analysis. Finally, you will learn about extending your NumPy capabilities for both functionality and performance by using Cython and the NumPy C API. The last chapter of this book also provides advanced material to help you learn further by yourself.
This guide is an invaluable tutorial if you are planning to use NumPy in analytical projects.
Chapter 1, An Introduction to NumPy, is a Getting Started chapter of this book, which provides the instructions to help you set up the environment. It starts with introducing the Scientific Python Module family (SciPy Stack) and explains the key role NumPy plays in scientific computing with Python.
Chapter 2, The NumPy ndarray Object, covers the essential usage of the NumPy ndarray object, including initialization, the fundamental attributes, data types, and memory layout. It also covers the theory underneath these operations, which gives you a clear picture of ndarray.
Chapter 3, Using NumPy Arrays, is an advanced chapter on NumPy ndarray usage, which continues Chapter 2, The NumPy ndarray Object. It covers the universal functions in NumPy and shows you tricks to speed up your code. It also covers shape manipulation and the broadcasting rules.
Chapter 4, NumPy Core and Libs Submodules, includes two sections. The first section gives a detailed explanation of the relationship between the way a NumPy ndarray allocates memory and the way it interacts with the CPU cache. The second section covers the special NumPy array that contains multiple data types (the structured/record array). This chapter also explores the experimental datetime64 module in NumPy.
Chapter 5, Linear Algebra in NumPy, starts by utilizing matrix and mathematical computation using linear algebra modules. It shows you multiple ways to solve a mathematical problem: using Matrix, vector decomposition, and polynomials. It also provides concrete practice for curve fitting and regression.
Chapter 6, Fourier Analysis in NumPy, covers signal processing with the NumPy FFT module and the application of Fourier analysis to amplifying signals and enlarging images without distortion. It also provides the basic usage of the matplotlib package in Python.
Chapter 7, Building and Distributing NumPy Code, covers the basic details around packaging and publishing the code in Python. It provides a basic introduction to NumPy-specific setup files and how to build extension modules.
Chapter 8, Speeding Up NumPy with Cython, introduces readers to the Cython programming language and to techniques that can be used to speed up existing Python code.
Chapter 9, Introduction to the NumPy C-API, provides a basic introduction to the NumPy C API and, in general, how to write wrappers around the existing C/C++ library. The chapter aims to provide a gentle introduction along with equipping the readers with a basic knowledge of how to create new wrappers and understand the existing programs.
Chapter 10, Further Reading, is the last chapter of this book. It gives a summary of what we've learned in the book and explores 4 SciPy stack Python modules relying on NumPy arrays, which give you ideas about further scientific Python programming.
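As a rough preview of the kind of code the chapters above work toward, here is a minimal sketch that touches ndarray attributes, broadcasting, structured arrays, linear algebra, curve fitting, and the FFT. The data values and variable names are made up for illustration and are not taken from the book's own examples.

```python
import numpy as np

# Chapter 2: an ndarray and a few of its fundamental attributes.
x = np.array([[1, 2, 3], [2, 3, 4]])
print(x.shape, x.ndim, x.dtype)

# Chapter 3: broadcasting stretches the 1-D row across both rows of x.
row = np.array([10, 20, 30])
scaled = x * row

# Chapter 4: a structured array holding more than one data type.
# The field names "name" and "age" are illustrative only.
people = np.array([("Ann", 31), ("Bob", 27)],
                  dtype=[("name", "U10"), ("age", "i4")])
mean_age = people["age"].mean()

# Chapter 5: solve the linear system A @ s = b, then fit a straight line.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = np.linalg.solve(A, b)

t = np.arange(5.0)
y = 2.0 * t + 1.0
coeffs = np.polyfit(t, y, 1)  # slope and intercept of the fitted line

# Chapter 6: find the dominant frequency bin of a sampled sine wave.
n = 64
signal = np.sin(2 * np.pi * 5 * np.arange(n) / n)
spectrum = np.fft.rfft(signal)
peak = int(np.argmax(np.abs(spectrum)))

print(scaled, mean_age, solution, coeffs, peak, sep="\n")
```

Each step is a single vectorized call rather than an explicit Python loop, which is the style of NumPy usage the outline above describes.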
For this book, you will need the following setup:
This book is for you if you know Python but are new to scientific programming and want to enter the world of scientific computation, or if you are a Python developer with experience in analytics who wants to enhance your analytical skills. Learning NumPy and how to apply it to your Python programs is a natural next step towards building professional analytical applications. It would be helpful to have a bit of familiarity with basic programming concepts and mathematics, but no prior experience is required. The later chapters cover concepts such as package distribution, speeding up code, and C/C++ integration, which require a certain amount of programming and debugging know-how. Readers are assumed to be able to build C/C++ programs on their preferred OS (for example, with gcc on Linux, or with Cygwin/MinGW and others on Windows).
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Note that SciPy can mean a number of things, like the Python module named scipy."
A block of code is set as follows:
In [42]: print("Hello, World!")
Any command-line input or output is written as follows:
In [6]: x
Out[6]: array([[1, 2, 3], [2, 3, 4]])
In [7]: x[0,0]
Out[7]: 1
In [8]: x[1,2]
Out[8]: 4
New terms and important words are shown in bold.
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/NumPyEssentials_ColoredImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
"I'd rather do math in a general-purpose language than try to do general-purpose programming in a math language."
--John D Cook
Python has become one of the most popular programming languages in scientific computing over the last decade. The reasons for its success are numerous, and these will gradually become apparent as you proceed with this book. Unlike many other mathematical languages, such as MATLAB, R, and Mathematica, Python is a general-purpose programming language. As such, it provides a suitable framework to build scientific applications and extend them further into any commercial or academic domain. For example, consider a (somewhat) simple application that requires you to write a piece of software that predicts the popularity of a blog post. Usually, these would be the steps that you'd take to do this:
Normally, as you move through these steps, you will find yourself jumping between different software stacks. Step 1 requires a lot of web scraping. Web scraping is a very common problem, and there are tools in almost every programming language to scrape the Web (if you are already using Python, you would probably choose Beautiful Soup or Scrapy). Steps 2 and 3 involve solving a machine learning problem and require the use of sophisticated mathematical languages or frameworks, such as Weka or MATLAB, which are only a few of the vast variety of tools that provide machine learning functionality. Similarly, step 4 can be implemented in many ways using many different tools. There isn't one right answer. Since this is a problem that has been amply studied and solved (to a reasonable extent) by a lot of scientists and software developers, getting a working solution would not be difficult. However, there are issues, such as stability and scalability, that might severely restrict your choice of programming languages, web frameworks, or machine learning algorithms in each step of the problem. This is where Python wins over most other programming languages. All the preceding steps (and more) can be accomplished with only Python and a few third-party Python libraries. This flexibility and ease of developing software in Python is precisely what makes it a comfortable host for a scientific computing ecosystem. A very interesting interpretation of Python's prowess as a mature application development language can be found in Python Data Analysis, Ivan Idris, Packt Publishing. In short, Python is a language that is used for rapid prototyping, and it is also used to build production-quality software because of the vast scientific ecosystem it has acquired over time. The cornerstone of this ecosystem is NumPy.
Numerical Python (NumPy) is a successor to the Numeric package. It was originally written by Travis Oliphant to be the foundation of a scientific computing environment in Python. It branched off from the much wider SciPy module in early 2005 and had its first stable release in mid-2006. Since then, it has enjoyed growing popularity among Pythonists who work in the mathematics, science, and engineering fields. The goal of this book is to make you conversant enough with NumPy so that you're able to use it and can build complex scientific applications with it.
Let's begin by taking a brief tour of the Scientific Python (SciPy) stack.
