E-Book
28,14 €

Machine Learning for Data Mining E-Book

Jesus Salcedo

Publisher: Packt Publishing
Category: Science and New Technologies
Language: English

Description

Become efficient at data mining and machine learning using IBM SPSS Modeler

Key Features

Learn how to apply machine learning techniques in the field of data science

Understand when to use different data mining techniques, how to set up different analyses, and how to interpret the results

A step-by-step approach to improving model development and performance

Book Description

Machine learning (ML) combined with data mining can give you amazing results in your data mining work by empowering you with several ways to look at data. This book will help you improve your data mining techniques by using smart modeling techniques.

This book will teach you how to implement ML algorithms and techniques in your data mining work. It will enable you to pair the best algorithms with the right tools and processes. You will learn how to identify patterns and make predictions with minimal human intervention. You will build different types of ML models, such as neural networks, Support Vector Machines (SVMs), and decision trees. You will see how each of these models works and what kind of data it is suited for. You will learn how to combine the results of different models in order to improve accuracy. Topics such as removing noise and handling errors will give you an added edge in model building and optimization.

By the end of this book, you will be able to build predictive models and extract information of interest from the dataset.

What you will learn

Hone your model-building skills and create the most accurate models

Understand how predictive machine learning models work

Prepare your data to acquire the best possible results

Combine models in order to suit the requirements of different types of data

Analyze single and multiple models and understand their combined results

Derive worthwhile insights from your data using histograms and graphs

Who this book is for

If you are a data scientist, data analyst, or data mining professional and are keen to achieve a 30% higher salary by adding machine learning to your skill set, then this is the ideal book for you. You will learn to apply machine learning techniques to various data mining challenges. No prior knowledge of machine learning is assumed.

Details

You can read this e-book in the Legimi apps or in any app that supports the following format:

EPUB

Number of pages: 120

Year of publication: 2019

Sample

Machine Learning for Data Mining

Improve your data mining capabilities with advanced predictive modeling

Jesus Salcedo

BIRMINGHAM - MUMBAI

Machine Learning for Data Mining

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Sunith Shetty
Acquisition Editor: Devika Battike
Content Development Editor: Unnati Guha
Technical Editor: Dinesh Chaudhary
Copy Editor: Safis Editing
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Jisha Chirayil
Production Coordinator: Arvindkumar Gupta

First published: April 2019

Production reference: 1300419

Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.

ISBN 978-1-83882-897-4

www.packtpub.com

Contributors

About the author

Jesus Salcedo has a PhD in psychometrics from Fordham University. He is an independent statistical consultant and has been using SPSS products for over 20 years. He is a former SPSS Curriculum Team Lead and Senior Education Specialist who has written numerous SPSS training courses and trained thousands of users.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Title Page

Machine Learning for Data Mining

Contributors

About the author

Packt is searching for authors like you

About Packt

Why subscribe?

Packt.com

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Introducing Machine Learning Predictive Models

Characteristics of machine learning predictive models

Types of machine learning predictive models

Working with neural networks

Advantages of neural networks

Disadvantages of neural networks

Representing the errors

Types of neural network models

Multi-layer perceptron

Why are weights important?

An example representation of a multilayer perceptron model

The linear regression model

A sample neural network model

Feed-forward backpropagation

Model training ethics

Summary

Getting Started with Machine Learning

Demonstrating a neural network

Running a neural network model

Interpreting results

Analyzing the accuracy of the model

Model performance on testing partition

Support Vector Machines

Working with Support Vector Machines

Kernel transformation

But what is the best solution?

Types of kernel functions

Demonstrating SVMs

Interpreting the results

Trying additional solutions

Summary

Understanding Models

Models

Statistical models

Decision tree models

Machine learning models

Using graphs to interpret machine learning models

Using statistics to interpret machine learning models

Understanding the relationship between a continuous predictor and a categorical outcome variable

Using decision trees to interpret machine learning models

Summary

Improving Individual Models

Modifying model options

Using a different model to improve results

Removing noise to improve models

How to remove noise

Doing additional data preparation

Preparing the data

Balancing data

The need for balancing data

Implementing balance in data

Summary 

Advanced Ways of Improving Models

Combining models

Combining by voting

Combining by highest confidence

Implementing combining models

Combining models in Modeler

Combining models outside Modeler

Using propensity scores

Implementations of propensity scores

Meta-level modeling

Error modeling

Boosting and bagging

Boosting

Bagging

Predicting continuous outcomes

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

30% of data mining vacancies also involve machine learning. And those that do are 30% better paid than the rest. If you’re involved in data mining, you need to get on top of machine learning, before it gets on top of you.

Machine Learning for Data Mining gives you everything you need to bring the power of machine learning into your data mining work. This book will enable you to pair the best algorithms with the right tools and processes. You will see how systems can learn from data, identify patterns, and make predictions, all with minimal human intervention.

Who this book is for

If you are a data mining professional who wishes to get a ticket to a 30% higher salary by adding machine learning to your skill set, then this is the ideal book for you. No prior knowledge of machine learning is assumed.

What this book covers

Chapter 1, Introducing Machine Learning Predictive Models, introduces you to the theory behind predictive models, looking at how they work and providing an insight into types of predictive modeling, such as the neural network model, which is explained in brief in this chapter.

Chapter 2, Getting Started with Machine Learning, introduces you to the implementation of a neural network model, and gives an insight into the implementation of Support Vector Machines (SVMs) as well.

Chapter 3, Understanding Models, explains different types of models and the situations in which each of them should ideally be used.

Chapter 4, Improving Individual Models, shows you four methods for improving the accuracy of an individual model.

Chapter 5, Advanced Ways of Improving Models, focuses on combining different models in different ways to get increasingly better results. In this chapter, we will see how a certain part of a dataset, which doesn't contribute much to the results of a neural network model, performs very well on the CHAID and C5.0 decision tree models. We will also see how to model the errors to prepare our models.

To get the most out of this book

Some knowledge of what data mining is, and of the basic concepts of machine learning, will serve as a starting point for this book.

Familiarity with a machine learning modeling tool, particularly IBM SPSS Modeler, will be a plus, but isn't necessary.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Go to www.packt.com.

2. Select the SUPPORT tab.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-for-Data-Mining. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781838828974_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."

Warnings or important notes appear like this.

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Introducing Machine Learning Predictive Models

A large percentage of data mining opportunities involve machine learning, and these opportunities often come with greater financial rewards. This chapter will give you the basic knowledge that you need to bring the power of machine learning into your data mining work. In this chapter, we're going to talk about the characteristics of machine learning models and also see some examples of these models.

The following are the topics that we will be covering in this chapter:

Characteristics of machine learning predictive models

Types of machine learning predictive models

Working with neural networks

A sample neural network model

Characteristics of machine learning predictive models

Knowing the characteristics of machine learning predictive models will help you understand their advantages and limitations compared with statistical or decision tree models.

Let's look at a few characteristics of predictive models in machine learning:

Optimized to learn complex patterns: Machine learning models are designed to learn complex patterns. Compared with statistical models or decision tree models, they excel when the data contains very complex patterns.

Account for interactions and nonlinear relationships: Machine learning predictive models can account for interactions in the data and nonlinear relationships to an even better degree than decision tree models.

Few assumptions: These models are powerful because they make very few assumptions. They can also be used with different types of data.

A black box model's interpretation is not straightforward: Predictive models are black box models. This is one of the drawbacks of predictive machine learning models, because it means that their interpretation is not straightforward. If we end up building many different equations and combining them, it becomes very difficult to see exactly how each variable interacts with the others and affects the output variable. So, predictive machine learning models are great when it comes to predictive accuracy, but they're not that good for understanding the mechanics behind a prediction.

If you want to predict something, these models do a pretty good job and can be very accurate. But if you want to know why something is being predicted, or you want to change something so that you no longer get a particular prediction, the reasons behind the prediction are difficult to decipher.
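To make the points about interactions and black box behavior concrete, here is a minimal sketch in plain Python. It is not taken from the book, which works in IBM SPSS Modeler, and the network is only an illustration: the function names (step, tiny_network, effect_of_x1) are hypothetical, and the weights are picked by hand rather than learned from data. The network predicts 1 only when exactly one of its two inputs is switched on, a pattern that a model with separate additive terms for each input cannot represent.

```python
def step(z):
    """Threshold activation: the unit fires (1.0) only when its input is positive."""
    return 1.0 if z > 0 else 0.0


def tiny_network(x1, x2):
    """Two inputs, two hidden units, one output; weights are hand-picked, not trained."""
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # fires when at least one input is on
    h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # fires only when both inputs are on
    return step(1.0 * h1 - 1.0 * h2 - 0.5)  # fires when h1 is on and h2 is off


def effect_of_x1(model, x2):
    """Change in the prediction when x1 goes from 0 to 1 while x2 is held fixed."""
    return model(1, x2) - model(0, x2)


# Predictions for all four input combinations.
predictions = {(x1, x2): tiny_network(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}

# The effect of x1, measured separately at each value of x2.
effects = {x2: effect_of_x1(tiny_network, x2) for x2 in (0, 1)}
average_effect = sum(effects.values()) / len(effects)

print(predictions)     # the network's output for each input pair
print(effects)         # effect of x1 when x2 = 0 and when x2 = 1
print(average_effect)  # effect of x1 averaged over x2
```

Switching on x1 raises the prediction when x2 is 0 and lowers it when x2 is 1, so its average effect is zero even though x1 and x2 together fully determine the output. That reversal is the interaction, and it is not visible from any single weight in the network. With hundreds of units instead of three, tracing how each variable ends up affecting the output becomes much harder, which is what makes these models black boxes.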