E-Book
28,14 €

Mastering Predictive Analytics with scikit-learn and TensorFlow E-Book

Alan Fontaine

Publisher: Packt Publishing
Category: Science and new technologies
Language: English

Description

Learn advanced techniques to improve the performance and quality of your predictive models

Key Features

Use ensemble methods to improve the performance of predictive analytics models

Implement feature selection, dimensionality reduction, and cross-validation techniques

Develop neural network models and master the basics of deep learning

Book Description

Python is a programming language that provides a wide range of features that can be used in the field of data science. Mastering Predictive Analytics with scikit-learn and TensorFlow covers various implementations of ensemble methods, how they are used with real-world datasets, and how they improve prediction accuracy in classification and regression problems.

This book starts with ensemble methods and their features. You will see that scikit-learn provides tools for choosing hyperparameters for models. As you make your way through the book, you will cover the nitty-gritty of predictive analytics and explore its features and characteristics. You will also be introduced to artificial neural networks and TensorFlow, and how it is used to create neural networks. In the final chapter, you will explore factors such as computational power, along with improvement methods and software enhancements for efficient predictive analytics.

By the end of this book, you will be well-versed in using deep neural networks to solve common problems in big data analysis.

What you will learn

Use ensemble algorithms to obtain accurate predictions

Apply dimensionality reduction techniques to combine features and build better models

Choose the optimal hyperparameters using cross-validation

Implement different techniques to solve current challenges in the predictive analytics domain

Understand various elements of deep neural network (DNN) models

Implement neural networks to solve both classification and regression problems

Who this book is for

Mastering Predictive Analytics with scikit-learn and TensorFlow is for data analysts, software engineers, and machine learning developers who are interested in implementing advanced predictive analytics using Python. Business intelligence experts will also find this book indispensable as it will teach them how to progress from basic predictive models to building advanced models and producing more accurate predictions. Prior knowledge of Python and familiarity with predictive analytics concepts are assumed.

Details

You can read this e-book in Legimi apps or in any app that supports the following format:

EPUB

Number of pages: 126

Year of publication: 2018

Excerpt

Mastering Predictive Analytics with scikit-learn and TensorFlow

Implement machine learning techniques to build advanced predictive models using Python

Alan Fontaine

BIRMINGHAM - MUMBAI

Mastering Predictive Analytics with scikit-learn and TensorFlow

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Sunith Shetty
Acquisition Editor: Namrata Patil
Content Development Editor: Athikho Sapuni Rishana
Technical Editor: Joseph Sunil
Copy Editor: Safis Editing
Project Coordinator: Kirti Pisat
Proofreader: Safis Editing
Indexer: Rekha Nair
Graphics: Jisha Chirayil
Production Coordinator: Deepika Naik

First published: September 2018

Production reference: 1280918

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78961-774-0

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributor

About the author

Alan Fontaine is a data scientist with more than 12 years of experience in analytical roles. He has been a consultant for many projects in fields such as business, education, medicine, and mass media, among others. He is a big Python fan and has been using it routinely for five years for analyzing data, building models, producing reports, making predictions, and building interactive applications that transform data into intelligence.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Title Page

Mastering Predictive Analytics with scikit-learn and TensorFlow

Packt Upsell

Why subscribe?

Packt.com

Contributor

About the author

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Ensemble Methods for Regression and Classification

Ensemble methods and their working

Bootstrap sampling

Bagging

Random forests

Boosting

Ensemble methods for regression

The diamond dataset

Training different regression models

KNN model

Bagging model

Random forests model

Boosting model

Using ensemble methods for classification

Predicting a credit card dataset 

Training different regression models

Logistic regression model

Bagging model

Random forest model

Boosting model

Summary

Cross-validation and Parameter Tuning

Holdout cross-validation

K-fold cross-validation

Implementing k-fold cross-validation

Comparing models with k-fold cross-validation

Introduction to hyperparameter tuning

Exhaustive grid search

Hyperparameter tuning in scikit-learn

Comparing tuned and untuned models

Summary

Working with Features

Feature selection methods 

Removing dummy features with low variance

Identifying important features statistically

Recursive feature elimination

Dimensionality reduction and PCA

Feature engineering

Creating new features

Improving models with feature engineering

Training your model

Reducible and irreducible error

Summary

Introduction to Artificial Neural Networks and TensorFlow

Introduction to ANNs

Perceptrons

Multilayer perceptron

Elements of a deep neural network model

Deep learning

Elements of an MLP model

Introduction to TensorFlow

TensorFlow installation

Core concepts in TensorFlow

Tensors

Computational graph

Summary

Predictive Analytics with TensorFlow and Deep Neural Networks

Predictions with TensorFlow

Introduction to the MNIST dataset

Building classification models using MNIST dataset

Elements of the DNN model

Building the DNN

Reading the data

Defining the architecture

Placeholders for inputs and labels

Building the neural network

The loss function

Defining optimizer and training operations

Training strategy and valuation of accuracy of the classification

Running the computational graph

Regression with Deep Neural Networks (DNN)

Elements of the DNN model

Building the DNN

Reading the data

Objects for modeling

Training strategy

Input pipeline for the DNN

Defining the architecture

Placeholders for input values and labels

Building the DNN

The loss function

Defining optimizer and training operations

Running the computational graph

Classification with DNNs

Exponential linear unit activation function

Classification with DNNs

Elements of the DNN model

Building the DNN

Reading the data

Producing the objects for modeling

Training strategy

Input pipeline for DNN

Defining the architecture

Placeholders for inputs and labels

Building the neural network

The loss function

Evaluation nodes

Optimizer and the training operation

Run the computational graph

Evaluating the model with a set threshold

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

Python is a programming language that provides various features in the field of data science. In this book, we will be touching upon two Python libraries, scikit-learn and TensorFlow. We will learn about the various implementations of ensemble methods, how they are used with real-world datasets, and how they improve prediction accuracy in classification and regression problems.

This book starts with studying ensemble methods and their features. We will look at how scikit-learn provides the right tools to choose hyperparameters for models. From there, we will get down to the nitty-gritty of predictive analytics and explore its various features and characteristics. We will be introduced to artificial neural networks, TensorFlow, and the core concepts used to build neural networks. In the final section, we will consider factors such as computational power, improved methods, and software enhancements for efficient predictive analytics. You will become well versed in using DNNs to solve common challenges.

Who this book is for

This book is for data analysts, software engineers, and machine learning developers who are interested in implementing advanced predictive analytics using Python. Business intelligence experts will also find this book indispensable as it will teach them how to go from basic predictive models to building advanced models and producing better predictions. Knowledge of Python and familiarity with predictive analytics concepts are assumed.

What this book covers

Chapter 1, Ensemble Methods for Regression and Classification, covers the application of ensemble methods, or ensemble algorithms, to produce more accurate model predictions. We will go through the application of ensemble methods to regression and classification problems.

Chapter 2, Cross-validation and Parameter Tuning, explores various techniques to combine and build better models. We will learn different methods of cross-validation, including holdout cross-validation and k-fold cross-validation. We will also discuss what hyperparameter tuning is.

Chapter 3, Working with Features, explores feature selection methods, dimensionality reduction, PCA, and feature engineering. We will also study methods to improve models with feature engineering.

Chapter 4, Introduction to Artificial Neural Networks and TensorFlow, is an introduction to ANNs and TensorFlow. We will explore the various elements in the network and their functions. We will also learn the basic concepts of TensorFlow in this chapter.

Chapter 5, Predictive Analytics with TensorFlow and Deep Neural Networks, explores predictive analytics with the help of TensorFlow and deep learning. We will study the MNIST dataset and build classification models using this dataset. We will learn about DNNs, their functions, and the application of DNNs to the MNIST dataset.

To get the most out of this book

This book presents some of the most advanced predictive analytics tools, models, and techniques. The main goal is to show the reader how to improve the performance of predictive models, first by showing how to build more complex models, and second by showing how to use related techniques that dramatically improve the quality of predictive models.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packt.com.

2. Select the SUPPORT tab.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Predictive-Analytics-with-scikit-learn-and-TensorFlow. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789617740_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The following screenshot shows the lines of code used for importing the train_test_split function and the RobustScaler method."

A block of code is set as follows:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "The method used to choose the best estimators for a particular dataset or choosing the best values for all hyperparameters is called hyperparameter tuning."

Warnings or important notes appear like this.

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Ensemble Methods for Regression and Classification

Advanced analytical tools are widely used by business enterprises to solve problems using data. The goal of these tools is to analyze data and extract relevant information that can be used to solve problems or improve the performance of some aspect of the business. This work also involves various machine learning algorithms, with which we can create predictive models for better results.

In this chapter, we are going to explore a simple idea that can drastically improve the performance of basic predictive models.

We are going to cover the following topics in this chapter:

Ensemble methods and their working

Ensemble methods for regression

Ensemble methods for classification

Ensemble methods and their working

Ensemble methods are based on a very simple idea: instead of using a single model to make a prediction, we use many models and then use some method to aggregate the predictions. Having different models is like having different points of view, and it has been demonstrated that by aggregating models that offer different points of view, predictions can be more accurate. These methods further improve generalization over a single model because they reduce the risk of selecting a poorly performing classifier.

In the preceding diagram, we can see that each object belongs to one of three classes: triangles, circles, and squares. In this simplified example, we have two features to separate or classify the objects into the different classes. As you can see here, we can use three different classifiers, each of which represents a different approach and produces a different kind of decision boundary.
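
The aggregation idea described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not code from the book: the three prediction lists (model_a, model_b, model_c), the true labels, and the helper names majority_vote and accuracy are all hypothetical, and the predictions are made up so that each classifier is wrong about a different object.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted by most of the individual models."""
    return Counter(votes).most_common(1)[0][0]


def accuracy(predicted, truth):
    """Fraction of objects whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical true classes of six objects (illustrative data only).
true_labels = ["triangle", "circle", "square", "circle", "triangle", "square"]

# Hypothetical predictions from three different classifiers.
# Each one misclassifies exactly one object, and never the same one.
model_a = ["triangle", "circle", "circle", "circle", "triangle", "square"]
model_b = ["triangle", "square", "square", "circle", "triangle", "square"]
model_c = ["triangle", "circle", "square", "circle", "square", "square"]

# Aggregate: for every object, take the label that most models agree on.
ensemble = [majority_vote(votes) for votes in zip(model_a, model_b, model_c)]

for name, predicted in [("model_a", model_a), ("model_b", model_b),
                        ("model_c", model_c), ("ensemble", ensemble)]:
    print(f"{name}: {accuracy(predicted, true_labels):.2f}")
```

In this made-up example, each classifier alone gets five of the six objects right, while the majority vote gets all six right because the individual mistakes do not overlap. Real classifiers often make correlated errors, so the gain is usually smaller. scikit-learn provides the same kind of hard-voting aggregation through sklearn.ensemble.VotingClassifier with voting="hard".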