Become efficient at data mining and machine learning using IBM SPSS Modeler
Key Features
Book Description
Machine learning (ML) combined with data mining gives you several new ways to look at your data and can markedly improve the results of your data mining work. This book will help you improve your data mining by applying smart modeling techniques.
This book will teach you how to implement ML algorithms and techniques in your data mining work. It will enable you to pair the best algorithms with the right tools and processes. You will learn how to identify patterns and make predictions with minimal human intervention. You will build different types of ML models, such as neural networks, Support Vector Machines (SVMs), and decision trees. You will see how each of these models works and what kind of data it is suited for. You will learn how to combine the results of different models in order to improve accuracy. Topics such as removing noise and handling errors will give you an added edge in model building and optimization.
By the end of this book, you will be able to build predictive models and extract information of interest from a dataset.
What you will learn
Who this book is for
If you are a data scientist, data analyst, or data mining professional and are keen to achieve a 30% higher salary by adding machine learning to your skill set, then this is the ideal book for you. You will learn to apply machine learning techniques to various data mining challenges. No prior knowledge of machine learning is assumed.
Page count: 120
Year of publication: 2019
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Devika Battike
Content Development Editor: Unnati Guha
Technical Editor: Dinesh Chaudhary
Copy Editor: Safis Editing
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Jisha Chirayil
Production Coordinator: Arvindkumar Gupta
First published: April 2019
Production reference: 1300419
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-83882-897-4
www.packtpub.com
Jesus Salcedo has a PhD in psychometrics from Fordham University. He is an independent statistical consultant and has been using SPSS products for over 20 years. He is a former SPSS Curriculum Team Lead and Senior Education Specialist who has written numerous SPSS training courses and trained thousands of users.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Title Page
Copyright and Credits
Machine Learning for Data Mining
Contributors
About the author
Packt is searching for authors like you
About Packt
Why subscribe?
Packt.com
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Introducing Machine Learning Predictive Models
Characteristics of machine learning predictive models
Types of machine learning predictive models
Working with neural networks
Advantages of neural networks
Disadvantages of neural networks
Representing the errors
Types of neural network models
Multi-layer perceptron
Why are weights important?
An example representation of a multilayer perceptron model
The linear regression model
A sample neural network model
Feed-forward backpropagation
Model training ethics
Summary
Getting Started with Machine Learning
Demonstrating a neural network
Running a neural network model
Interpreting results
Analyzing the accuracy of the model
Model performance on testing partition
Support Vector Machines
Working with Support Vector Machines
Kernel transformation
But what is the best solution?
Types of kernel functions
Demonstrating SVMs
Interpreting the results
Trying additional solutions
Summary
Understanding Models
Models
Statistical models
Decision tree models
Machine learning models
Using graphs to interpret machine learning models
Using statistics to interpret machine learning models
Understanding the relationship between a continuous predictor and a categorical outcome variable
Using decision trees to interpret machine learning models
Summary
Improving Individual Models
Modifying model options
Using a different model to improve results
Removing noise to improve models
How to remove noise
Doing additional data preparation
Preparing the data
Balancing data
The need for balancing data
Implementing balance in data
Summary 
Advanced Ways of Improving Models
Combining models
Combining by voting
Combining by highest confidence
Implementing combining models
Combining models in Modeler
Combining models outside Modeler
Using propensity scores
Implementations of propensity scores
Meta-level modeling
Error modeling
Boosting and bagging
Boosting
Bagging
Predicting continuous outcomes
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
30% of data mining vacancies also involve machine learning. And those that do are 30% better paid than the rest. If you’re involved in data mining, you need to get on top of machine learning, before it gets on top of you.
Hands-On Machine Learning for Data Mining gives you everything you need to bring the power of machine learning into your data mining work. This book will enable you to pair the best algorithms with the right tools and processes. You will see how systems can learn from data, identify patterns, and make predictions on data, all with minimal human intervention.
If you are a data mining professional who wishes to get a ticket to a 30% higher salary by adding machine learning to your skill set, then this is the ideal book for you. No prior knowledge of machine learning is assumed.
Chapter 1, Introducing Machine Learning Predictive Models, introduces you to the theory behind predictive models, looking at how they work and providing an insight into types of predictive modeling, such as the neural network model, which is explained in brief in this chapter.
Chapter 2, Getting Started with Machine Learning, introduces you to the implementation of a neural network model, and gives an insight into the implementation of Support Vector Machines (SVMs) as well.
Chapter 3, Understanding Models, explains different types of models and the situations in which each of them should ideally be used.
Chapter 4, Improving Individual Models, shows you four methods for improving the accuracy of an individual model.
Chapter 5, Advanced Ways of Improving Models, focuses on combining different models in different ways to get increasingly better results. In this chapter, we will see how a certain part of a dataset, which doesn't contribute much to the results of a neural network model, performs very well on the CHAID and C5.0 decision tree models. We will also see how to model the errors to prepare our models.
Some knowledge of what data mining is, and of the basic concepts of machine learning, will serve as a starting point for this book.
Familiarity with any machine learning modeler, specifically the SPSS Modeler provided by IBM, will be a plus, but isn't necessary.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-for-Data-Mining. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781838828974_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
A large percentage of data mining opportunities involve machine learning, and these opportunities often come with greater financial rewards. This chapter will give you the basic knowledge that you need to bring the power of machine learning into your data mining work. In this chapter, we're going to talk about the characteristics of machine learning models and also see some examples of these models.
The following are the topics that we will be covering in this chapter:
Characteristics of machine learning predictive models
Types of machine learning predictive models
Working with neural networks
A sample neural network model
Knowing the characteristics of machine learning predictive models will help you understand their advantages and limitations in comparison to statistical or decision tree models.
Let's get some insights on a few characteristics of predictive models in machine learning:
Optimized to learn complex patterns: Machine learning models are designed to learn complex patterns. Compared to statistical models or decision tree models, they excel when the patterns in the data are very complex.
Account for interactions and nonlinear relationships: Machine learning predictive models can account for interactions in the data and nonlinear relationships to an even greater degree than decision tree models.
Few assumptions: These models are powerful because they make very few assumptions. They can also be used with different types of data.
A black box model's interpretation is not straightforward: Predictive models are black box models. This is one of their drawbacks, because it means that interpretation is not straightforward. If we end up building many different equations and combining them, it becomes very difficult to see exactly how each of the variables ended up interacting and impacting an output variable. So, predictive machine learning models are great when it comes to predictive accuracy, but they're not that good for understanding the mechanics behind a prediction.
If you want to predict something, these models do a pretty good job and can be highly accurate. But if you want to know why something is being predicted, or you want to change something so that you don't get a particular prediction, the reasons are difficult to decipher.
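As a rough illustration of the points about interactions, nonlinear relationships, and combined equations, here is a minimal Python sketch. Python is not the tool used in this book, which works in IBM SPSS Modeler, and the function names and hand-picked weights below are assumptions made for this example only; nothing here is learned from data. The sketch uses the XOR pattern, where the outcome is 1 only when exactly one of two inputs is 1. The best single linear equation for that pattern (by least squares) is a constant 0.5, whereas three simple equations combined in layers reproduce it exactly.

```python
# Illustrative sketch only: the weights are hand-picked, not trained.

def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0

def linear_rule(x1, x2):
    """Least-squares linear fit to XOR: intercept 0.5, both slopes 0."""
    return 0.5 + 0.0 * x1 + 0.0 * x2

def tiny_network(x1, x2):
    """Two hidden equations combined by a third output equation."""
    h_or = step(x1 + x2 - 0.5)        # fires when at least one input is 1
    h_and = step(x1 + x2 - 1.5)       # fires only when both inputs are 1
    return step(h_or - h_and - 0.5)   # "OR but not AND" is XOR

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 0]

linear_predictions = [linear_rule(a, b) for a, b in inputs]
network_predictions = [tiny_network(a, b) for a, b in inputs]

for (a, b), t, lin, net in zip(inputs, targets,
                               linear_predictions, network_predictions):
    print(f"x1={a} x2={b} target={t} linear={lin} network={net}")
```

Each equation inside `tiny_network` is simple on its own, yet the reason for a given output only becomes visible when all three are traced together. With many more equations and learned weights, that tracing quickly becomes impractical, which is the sense in which these models are called black boxes.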
The following are some of the different types of machine learning predictive models:
Neural networks
Support Vector Machines
