Develop deep neural networks in Theano with practical code examples for image classification, machine translation, reinforcement agents, or generative models.
This book is intended to provide a full overview of deep learning, from the beginner in deep learning and artificial intelligence to the data scientist who wants to become familiar with Theano and its supporting libraries, or to gain a deeper understanding of deep neural nets.
Some basic skills in Python programming and computer science will help, as well as skills in elementary algebra and calculus.
This book offers a complete overview of Deep Learning with Theano, a Python-based library that makes optimizing numerical expressions and deep learning models easy on CPU or GPU.
The book provides practical code examples that help the beginner understand how easy it is to build complex neural networks, while more experienced data scientists will appreciate its scope, addressing supervised and unsupervised learning, generative models, and reinforcement learning in the fields of image recognition, natural language processing, and game strategy.
The book also discusses image recognition tasks ranging from simple digit recognition through image classification, object localization, and image segmentation, to image captioning. Natural language processing examples include text generation, chatbots, machine translation, and question answering. The last examples deal with generating random data that looks real and solving games such as those in the OpenAI Gym.
At the end, this book sums up the best-performing nets for each task. While early research results were based on deep stacks of neural layers, in particular convolutional layers, the book presents the principles that improved the efficiency of these architectures, in order to help the reader build new custom nets.
It is an easy-to-follow example book that teaches you how to perform fast, efficient computations in Python. Starting with the very basics, NumPy and installing Theano, this book takes you on a smooth journey to implementing Theano for advanced computations in machine learning and deep learning.
Page count: 276
Year of publication: 2017
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2017
Production reference: 1280717
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78646-582-5
www.packtpub.com
Author
Christopher Bourez
Reviewers
Matthieu de Beaucorps
Frederic Bastien
Arnaud Bergeron
Pascal Lamblin
Commissioning Editor
Amey Varangaonkar
Acquisition Editor
Veena Pagare
Content Development Editor
Amrita Noronha
Technical Editor
Akash Patel
Copy Editor
Safis Editing
Project Coordinator
Shweta H Birwatkar
Proofreader
Safis Editing
Indexer
Pratik Shirodkar
Graphics
Tania Dutta
Production Coordinator
Shantanu Zagade
Cover Work
Shantanu N. Zagade
Christopher Bourez graduated from Ecole Polytechnique and Ecole Normale Supérieure de Cachan in Paris in 2005 with a Master of Science in Math, Machine Learning and Computer Vision (MVA).
For 7 years, he led a computer vision company that launched Pixee, a visual recognition application for iPhone, in 2007, in partnership with a major movie theater brand, the city of Paris, and a major ticket broker: with the snap of a picture, the user could get information about events and products, as well as access to purchase.
While working on computer vision missions with Caffe, TensorFlow, or Torch, he helped other developers succeed by writing about computer science on a blog. One of his blog posts, a tutorial on the Caffe deep learning technology, has become the most successful tutorial on the web after the official Caffe website.
At the initiative of Packt Publishing, the same recipes that made his Caffe tutorial a success have been adapted to write this book on the Theano technology. In the process, a wide range of deep learning problems are studied to give more practice with Theano and its applications.
This book has been written in less than a year, and I would like to thank Mohammed Jabreel for his help with writing texts and code examples for chapters 3 and 5.
Mohammed Hamood Jabreel is a PhD student in Computer Science Engineering at the Department of Computer Science and Mathematics, Universitat Rovira i Virgili. He received a Master's degree in Computer Engineering: Computer Security and Intelligent Systems from Universitat Rovira i Virgili, Spain, in 2015, and a Bachelor's degree in Computer Science in 2009 from Hodiedha University. His main research interests are natural language processing, text mining, and sentiment analysis.
Second, I would like to thank IBM for their tremendous support through the Global Entrepreneur Program. Their infrastructure of dedicated GPUs has been of incomparable quality and performance for training the neural networks.
Last, I would like to thank the reviewers, Matthieu de Beaucorps and Pascal Lamblin, as well as the Packt employees Amrita and Vinay for their ideas and follow-up.
Happy reading.
Matthieu de Beaucorps is a machine learning specialist with an engineering background. Since 2012, he has been working on developing deep neural nets to enhance identification and recommendation tasks in computer vision, audio, and NLP.
Pascal Lamblin is a software analyst at MILA (Montreal Institute for Learning Algorithms). After completing his engineering degree at École Centrale Paris, Pascal has done some research under the supervision of Yoshua Bengio at Université de Montréal and is now working on the development of Theano.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1786465825. If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Gain insight and practice with neural net architecture design to solve problems with artificial intelligence. Understand the concepts behind the most advanced networks in deep learning. Leverage the Python language with the Theano technology to easily compute derivatives and minimize objective functions of your choice.
Chapter 1, Theano Basics, helps the reader learn the main concepts of Theano, in order to write code that can compile on different hardware architectures and automatically optimize complex mathematical objective functions.
Chapter 2, Classifying Handwritten Digits with a Feedforward Network, will introduce a simple, well-known, historical example that was the starting proof of the superiority of deep learning algorithms. The initial problem was to recognize handwritten digits.
Chapter 3, Encoding Word into Vector, addresses one of the main challenges with neural nets: how to connect real-world data to the input of a neural net, in particular for categorical and discrete data. This chapter presents an example of how to build an embedding space through training with Theano.
Such embeddings are very useful in machine translation, robotics, image captioning, and so on because they translate the real world data into arrays of vectors that can be processed by neural nets.
Chapter 4, Generating Text with a Recurrent Neural Net, introduces recurrency in neural nets with a simple example in practice, to generate text.
Recurrent neural nets (RNN) are a popular topic in deep learning, enabling more possibilities for sequence prediction, sequence generation, machine translation, and connected objects. Natural Language Processing (NLP) is a second field of interest that has driven the research for new machine learning techniques.
Chapter 5, Analyzing Sentiments with a Bidirectional LSTM, applies embeddings and recurrent layers to a new task of natural language processing, sentiment analysis. It acts as a kind of validation of the prior chapters.
In the meantime, it demonstrates an alternative way to build neural nets on Theano, with a higher level library, Keras.
Chapter 6, Locating with Spatial Transformer Networks, applies recurrency to images to read multiple digits on a page at once. This time, we take the opportunity to rewrite the classification network for handwritten digit images, and our recurrent models, with the help of Lasagne, a library of built-in modules for deep learning with Theano.
The Lasagne library helps design neural networks for faster experimenting. With its help, we'll address object localization, a common computer vision challenge, with Spatial Transformer modules to improve our classification scores.
Chapter 7, Classifying Images with Residual Networks, classifies any type of image at the best accuracy. In the meantime, to build more complex nets with ease, we introduce a library based on the Theano framework, Lasagne, with many already implemented components, to help implement neural nets faster for Theano.
Chapter 8, Translating and Explaining through Encoding-decoding Networks, presents encoding-decoding techniques: applied to text, these techniques are heavily used in machine translation and simple chatbot systems. Applied to images, they serve scene segmentation and object localization. Lastly, image captioning is a mix of both, encoding images and decoding to text.
This chapter goes one step further with a very popular high-level library, Keras, which simplifies even more the development of neural nets with Theano.
Chapter 9, Selecting Relevant Inputs or Memories with the Mechanism of Attention, covers the machine learning world's search, for solving more complicated tasks, for higher levels of intelligence inspired by nature: reasoning, attention, and memory. In this chapter, the reader will discover memory networks applied to the main purpose of artificial intelligence for natural language processing (NLP): language understanding.
Chapter 10, Predicting Times Sequences with Advanced RNN, deals with time sequences, an important field where machine learning has been used heavily. This chapter will go into advanced techniques with recurrent neural networks (RNN) to get state-of-the-art results.
Chapter 11, Learning from the Environment with Reinforcement, covers reinforcement learning, the vast area of machine learning that consists of training an agent to behave in an environment (such as a video game) so as to optimize a quantity (maximizing the game score), by performing certain actions in the environment (pressing buttons on the controller) and observing what happens.
This new paradigm of reinforcement learning opens a completely new path for designing algorithms and interactions between computers and the real world.
Chapter 12, Learning Features with Unsupervised Generative Networks, deals with unsupervised learning, which consists of training algorithms that do not require the data to be labeled. These algorithms try to infer the hidden labels from the data, called the factors, and, for some of them, to generate new synthetic data.
Unsupervised training is very useful in many cases: either when no labeling exists, when labeling the data with humans is too expensive, or lastly, when the dataset is too small and feature engineering would overfit the data. In this last case, extra amounts of unlabeled data help train better features as a basis for supervised learning.
Chapter 13, Extending Deep Learning with Theano, extends the set of possibilities in deep learning with Theano. It addresses the ways to create new operators for the computation graph, either in Python for simplicity or in C to overcome the Python overhead, for either the CPU or the GPU. It also introduces the basic concepts of parallel programming for the GPU. Lastly, we open the field of general intelligence, based on the first skills developed in this book, to develop new skills in a gradual way, improving one step further.
Investing time and development in Theano is very valuable, and to understand why, it is important to explain that Theano belongs among the best deep learning technologies and is also much more than a deep learning library. Three reasons make Theano a good choice of investment:
Let us first focus on the performance of the technology itself. The most popular libraries in deep learning are Theano (for Python), Torch (for Lua), TensorFlow (for Python), and Caffe (for C++, with a Python wrapper). There have been lots of benchmarks comparing deep learning technologies.
In Bastien et al. 2012 (Theano: new features and speed improvements, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian Goodfellow, Arnaud Bergeron, Nicolas Bouchard, David Warde-Farley, Yoshua Bengio, Nov 2012), Theano made significant progress in speed, but the comparison on different tasks does not point to a clear winner among the challenged technologies. Bahrampour et al. 2016 (Comparative Study of Deep Learning Software Frameworks, Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah, March 2016) concludes that:
These results are confirmed in the open-source rnn-benchmarks (https://github.com/glample/rnn-benchmarks), where, for training (forward + backward passes), Theano outperforms Torch and TensorFlow. Also, Theano crushes Torch and TensorFlow for smaller batch sizes with larger numbers of hidden units. For bigger batch sizes and hidden layer sizes, the differences are smaller, since they rely more on the performance of CUDA, the underlying NVIDIA graphics library common to all frameworks. Lastly, in the up-to-date soumith benchmarks (https://github.com/soumith/convnet-benchmarks), the fftconv in Theano performs the best on CPU, while the best-performing convolution implementations on GPU, cuda-convnet2 and fbfft, are CUDA extensions of the underlying library. These results should convince the reader that, although results are mixed, Theano plays a leading role in the speed competition.
The second reason to prefer Theano over Torch is that it comes with a rich ecosystem, benefiting from the Python ecosystem, but also from a large number of libraries that have been developed for Theano. This book will present two of them, Lasagne and Keras. Theano and Torch are the most extensible frameworks, both in terms of supporting various deep architectures and in terms of supported libraries. Lastly, Theano does not have a reputation for being complex to debug, contrary to other deep learning libraries.
The third point makes Theano an incomparable tool for the computer scientist: it is not specific to deep learning. Although Theano presents the same methods for deep learning as other libraries, its underlying principles are very different: in fact, Theano compiles the computation graph on the target architecture. This compilation step accounts for Theano's specificity; it should be defined as a mathematical expression compiler, designed with machine learning in mind. Symbolic differentiation is one of the most useful features that Theano offers for implementing non-standard deep architectures. Therefore, Theano is able to address a much larger range of numerical problems, and can be used to find the solution that minimizes any problem expressed with a differentiable loss or energy function, given an existing dataset.
Theano installation requires conda or pip, and the install process is the same under Windows, Mac OS and Linux.
The code has been tested under Mac OS and Linux Ubuntu. There might be some specificities for Windows, such as modifying the paths, that the Windows developer will solve quite easily.
The code examples suppose that there exists on your computer a shared folder in which to download, uncompress, and preprocess database files, which can be very voluminous and should not be left inside code repositories. This practice helps spare some disk space, while multiple code directories and users can use the same copy of the database. The folder is usually shared between user spaces:
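A minimal sketch of such a setup, assuming a folder named /sharedfiles (the name and the permissive mode are just an example; adapt them to your system and security requirements):

```shell
# Create a folder shared by all users for datasets (example name and permissions)
sudo mkdir /sharedfiles
sudo chmod 777 /sharedfiles
```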
This book is intended to provide the widest overview of deep learning, with Theano as the supporting technology. The book is designed for the beginner in deep learning and artificial intelligence, as well as the computer programmer who wants to get cross-domain experience and become familiar with Theano and its supporting libraries. This book helps the reader begin with deep learning, as well as get the relevant and practical information on deep learning.
Some basic skills in Python programming and computer science are required, as well as skills in elementary algebra and calculus. The underlying technology for all experiments is Theano: the book first provides an in-depth presentation of the core technology, then later on introduces some libraries for reusing existing modules.
The approach of this book is to introduce the reader to deep learning by describing the different types of networks and their applications, while in the meantime exploring the possibilities offered by Theano, a deep learning technology that will be the support for all the implementations. This book sums up some of the best-performing nets and state-of-the-art results, and helps the reader get the global picture of deep learning, taking her gradually from the simple to the complex nets.
Since Python has become the main programming language in data science, this book tries to cover all that a Python programmer needs to know to do deep learning with Python and Theano.
The book will introduce two abstraction frameworks on top of Theano, Lasagne and Keras, which can simplify the development of more complex nets, but do not prevent you from understanding the underlying concepts.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Deep-Learning-with-Theano. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
This chapter presents Theano as a compute engine and the basics for symbolic computing with Theano. Symbolic computing consists of building graphs of operations that will be optimized later on for a specific architecture, using the computation libraries available for this architecture.
Although this chapter might appear to be a long way from practical applications, an understanding of the technology is essential for the following chapters: what is it capable of, and what value does it bring? All the following chapters address the applications of Theano when building all possible deep learning architectures.
Theano may be defined as a library for scientific computing; it has been available since 2007 and is particularly suited to deep learning. Two important features are at the core of any deep learning library: tensor operations, and the capability to run the code on CPU or Graphics Processing Unit (GPU). These two features enable us to work with massive amounts of multi-dimensional data. Moreover, Theano proposes automatic differentiation, a very useful feature that can solve a wider range of numerical optimization problems than just deep learning.
The chapter covers the following topics:
Usually, input data is represented with multi-dimensional arrays:
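To make this concrete, here is a short sketch using NumPy (the exact dimensions are assumptions, chosen as typical examples):

```python
import numpy as np

# A grayscale image: a 2D array (height x width)
image = np.zeros((28, 28), dtype=np.float32)

# A color image: a 3D array (channels x height x width)
color_image = np.zeros((3, 224, 224), dtype=np.float32)

# A mini-batch of color images: a 4D array (batch x channels x height x width)
batch = np.zeros((32, 3, 224, 224), dtype=np.float32)

# A sentence as a sequence of word indices: a 1D integer array
sentence = np.array([12, 5, 784, 2], dtype=np.int32)

print(image.ndim, color_image.ndim, batch.ndim, sentence.ndim)
```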
We'll see more examples of input data arrays in the future chapters.
In Theano, multi-dimensional arrays are implemented with an abstraction class, named tensor, with many more transformations available than traditional arrays in a computer language such as Python.
At each stage of a neural net, computations such as matrix multiplications involve multiple operations on these multi-dimensional arrays.
Classical arrays in programming languages do not have enough built-in functionalities to quickly and adequately address multi-dimensional computations and manipulations.
Computations on multi-dimensional arrays have a long history of optimizations, with tons of libraries and hardware. One of the most important gains in speed has been permitted by the massive parallel architecture of the GPU, with computation ability on a large number of cores, from a few hundred to a few thousand.
Compared to a traditional CPU, for example a quad-core, 12-core, or 32-core engine, the gain with GPU can range from 5x to 100x, even if part of the code is still executed on the CPU (data loading, GPU piloting, and result outputting). The main bottleneck with the use of the GPU is usually the transfer of data between the memory of the CPU and the memory of the GPU; still, when well programmed, the use of the GPU helps bring a significant increase in speed of an order of magnitude. Getting results in days rather than months, or hours rather than days, is an undeniable benefit for experimentation.
The Theano engine has been designed to address the challenges of multi-dimensional arrays and architecture abstraction from the beginning.
There is another undeniable benefit of Theano for scientific computation: the automatic differentiation of functions of multi-dimensional arrays, a well-suited feature for model parameter inference via objective function minimization. Such a feature facilitates experimentation by relieving the pain of computing derivatives by hand, which might not be very complicated, but is prone to many errors.
In this section, we'll install Theano, run it on the CPU and GPU devices, and save the configuration.
The easiest way to install Theano is to use conda, a cross-platform package and environment manager.
If conda is not already installed on your operating system, the fastest way to install conda is to download the miniconda installer from https://conda.io/miniconda.html. For example, for conda under Linux 64 bit and Python 2.7, use this command:
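For example, a command of this kind (the installer filename is an assumption and changes with each release; check the Miniconda download page for the current name):

```shell
# Download and run the Miniconda installer for Linux 64-bit, Python 2.7
wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh
bash Miniconda2-latest-Linux-x86_64.sh
```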
Conda enables us to create new environments in which versions of Python (2 or 3) and the installed packages may differ. The conda root environment uses the same version of Python as the version installed on the system on which you installed conda.
Let's install Theano:
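With conda available, the installation is a one-liner (pip install theano is an alternative if you prefer pip):

```shell
conda install theano
```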
Run a Python session and try the following commands to check your configuration:
The last command prints all the configuration of Theano. The theano.config object contains keys to many configuration options.
To infer the configuration options, Theano looks first at the ~/.theanorc file, then at any environment variables that are available, which override the former options, and lastly at the variables set in the code, which are first in order of precedence:
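As an illustration, a minimal ~/.theanorc file (the values shown are common choices, not required ones):

```ini
[global]
floatX = float32
device = cpu
```

The same options can be overridden at launch time through the THEANO_FLAGS environment variable, for example THEANO_FLAGS='floatX=float64' python script.py.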
Some of the properties might be read-only and cannot be changed in the code, but floatX, which sets the default floating-point precision, is among the properties that can be changed directly in the code.
It is advised to use float32, since GPUs have a long history without float64: float64 execution speeds on GPU are slower, sometimes much slower (2x to 32x on the latest-generation Pascal hardware), and float32 precision is enough in practice.
Theano enables the use of GPU, units that are usually used to compute the graphics to display on the computer screen.
To have Theano work on the GPU as well, a GPU backend library is required on your system.
The CUDA library (for NVIDIA GPU cards only) is the main choice for GPU computations. There is also the OpenCL standard, which is open source but far less developed, and much more experimental and rudimentary on Theano.
Most scientific computations still occur on NVIDIA cards at the moment. If you have an NVIDIA GPU card, download CUDA from the NVIDIA website, https://developer.nvidia.com/cuda-downloads, and install it. The installer will install the latest version of the GPU drivers first, if they are not already installed. It will install the CUDA library in the /usr/local/cuda directory.
Install the cuDNN library, a library by NVIDIA, that offers faster implementations of some operations for the GPU. To install it, I usually copy the /usr/local/cuda directory to a new directory, /usr/local/cuda-{CUDA_VERSION}-cudnn-{CUDNN_VERSION}, so that I can choose the version of CUDA and cuDNN, depending on the deep learning technology I use and its compatibility.
In your .bashrc profile, add the following line to set the $PATH and $LD_LIBRARY_PATH variables:
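For example (assuming CUDA is installed under /usr/local/cuda; adjust the path if you used a versioned directory as described above):

```shell
# Make the CUDA toolchain and libraries visible to the shell and the loader
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```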
N-dimensional GPU arrays have been implemented in Python in six different GPU libraries (Theano/CudaNdarray, PyCUDA/GPUArray, CUDAMAT/CUDAMatrix, PyOpenCL/GPUArray, Clyther, Copperhead), each implementing a subset of NumPy.ndarray. Libgpuarray is a backend library that exposes them through a common interface with the same properties.
To install libgpuarray with conda, use this command:
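The conda package is usually named pygpu, which ships the libgpuarray backend together with its Python bindings (an assumption to verify against the libgpuarray install documentation):

```shell
conda install pygpu
```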
To run Theano in GPU mode, you need to configure the config.device variable before execution since it is a read-only variable once the code is run. Run this command with the THEANO_FLAGS environment variable:
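For example (the device name cuda0 is an assumption; use the index of your own GPU, or plain cuda for the first one):

```shell
THEANO_FLAGS="device=cuda0,floatX=float32" python -c "import theano; print(theano.config.device)"
```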
The first line of the output shows that the GPU device has been correctly detected, and specifies which GPU is used.
