Data Mining Techniques - Gordon S. Linoff - E-Book

Description

The leading introductory book on data mining, fully updated and revised! When Berry and Linoff wrote the first edition of Data Mining Techniques in the late 1990s, data mining was just starting to move out of the lab and into the office; it has since grown to become an indispensable tool of modern business. This new edition, more than 50% new and revised, is a significant update from the previous one and shows you how to harness the newest data mining methods and techniques to solve common business problems. The authors share invaluable advice for improving response rates to direct marketing campaigns, identifying new customer segments, and estimating credit risk. In addition, they cover more advanced topics such as preparing data for analysis and creating the necessary infrastructure for data mining at your company.

* Features significant updates since the previous edition and updates you on best practices for using data mining methods and techniques for solving common business problems
* Covers a new data mining technique in every chapter along with clear, concise explanations on how to apply each technique immediately
* Touches on core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, survival analysis, and more
* Provides best practices for performing data mining using simple tools such as Excel

Data Mining Techniques, Third Edition covers a new data mining technique with each successive chapter and then demonstrates how you can apply that technique for improved marketing, sales, and customer support to get immediate results.


Page count: 1487

Year of publication: 2011

Table of Contents

Title Page

Copyright

Dedication

About the Authors

Credits

Acknowledgments

Introduction

Chapter 1: What Is Data Mining and Why Do It?

What Is Data Mining?

Why Now?

Skills for the Data Miner

The Virtuous Cycle of Data Mining

A Case Study in Business Data Mining

Steps of the Virtuous Cycle

Data Mining in the Context of the Virtuous Cycle

Lessons Learned

Chapter 2: Data Mining Applications in Marketing and Customer Relationship Management

Two Customer Lifecycles

Organize Business Processes Around the Customer Lifecycle

Data Mining Applications for Customer Acquisition

A Data Mining Example: Choosing the Right Place to Advertise

Data Mining to Improve Direct Marketing Campaigns

Using Current Customers to Learn About Prospects

Data Mining Applications for Customer Relationship Management

Retention

Beyond the Customer Lifecycle

Lessons Learned

Chapter 3: The Data Mining Process

What Can Go Wrong?

Data Mining Styles

Goals, Tasks, and Techniques

Formulating Data Mining Problems: From Goals to Tasks to Techniques

What Techniques for Which Tasks?

Lessons Learned

Chapter 4: What You Should Know About Data

Occam's Razor

Looking At and Measuring Data

Measuring Response

Multiple Comparisons

Chi-Square Test

An Example: Chi-Square for Regions and Starts

Case Study: Comparing Two Recommendation Systems with an A/B Test

Data Mining and Statistics

Lessons Learned

Chapter 5: Descriptions and Prediction: Profiling and Predictive Modeling

Directed Data Mining Models

Directed Data Mining Methodology

Step 1: Translate the Business Problem into a Data Mining Problem

Step 2: Select Appropriate Data

Step 3: Get to Know the Data

Step 4: Create a Model Set

Step 5: Fix Problems with the Data

Step 6: Transform Data to Bring Information to the Surface

Step 7: Build Models

Step 8: Assess Models

Step 9: Deploy Models

Step 10: Assess Results

Step 11: Begin Again

Lessons Learned

Chapter 6: Data Mining Using Classic Statistical Techniques

Similarity Models

Table Lookup Models

RFM: A Widely Used Lookup Model

Naïve Bayesian Models

Linear Regression

Multiple Regression

Logistic Regression

Fixed Effects and Hierarchical Effects

Lessons Learned

Chapter 7: Decision Trees

What Is a Decision Tree and How Is It Used?

Decision Trees Are Local Models

Growing Decision Trees

Finding the Best Split

Pruning

Extracting Rules from Trees

Decision Tree Variations

Assessing the Quality of a Decision Tree

When Are Decision Trees Appropriate?

Case Study: Process Control in a Coffee Roasting Plant

Lessons Learned

Chapter 8: Artificial Neural Networks

A Bit of History

The Biological Model

Artificial Neural Networks

A Sample Application: Real Estate Appraisal

Training Neural Networks

Radial Basis Function Networks

Neural Networks in Practice

Choosing the Training Set

Preparing the Data

Interpreting the Output from a Neural Network

Neural Networks for Time Series

Can Neural Network Models Be Explained?

Lessons Learned

Chapter 9: Nearest Neighbor Approaches: Memory-Based Reasoning and Collaborative Filtering

Memory-Based Reasoning

Challenges of MBR

Case Study: Using MBR for Classifying Anomalies in Mammograms

Measuring Distance and Similarity

The Combination Function: Asking the Neighbors for Advice

Case Study: Shazam — Finding Nearest Neighbors for Audio Files

Collaborative Filtering: A Nearest-Neighbor Approach to Making Recommendations

Lessons Learned

Chapter 10: Knowing When to Worry: Using Survival Analysis to Understand Customers

Customer Survival

Hazard Probabilities

From Hazards to Survival

Proportional Hazards

Survival Analysis in Practice

Lessons Learned

Chapter 11: Genetic Algorithms and Swarm Intelligence

Optimization

Genetic Algorithms

The Traveling Salesman Problem

Case Study: Using Genetic Algorithms for Resource Optimization

Case Study: Evolving a Solution for Classifying Complaints

Lessons Learned

Chapter 12: Tell Me Something New: Pattern Discovery and Data Mining

Undirected Techniques, Undirected Data Mining

What Is Undirected Data Mining?

Methodology for Undirected Data Mining

Lessons Learned

Chapter 13: Finding Islands of Similarity: Automatic Cluster Detection

Searching for Islands of Simplicity

Customer Segmentation and Clustering

The K-Means Clustering Algorithm

Interpreting Clusters

Evaluating Clusters

Case Study: Clustering Towns

Variations on K-Means

Data Preparation for Clustering

Lessons Learned

Chapter 14: Alternative Approaches to Cluster Detection

Shortcomings of K-Means

Gaussian Mixture Models

Divisive Clustering

Agglomerative (Hierarchical) Clustering

Self-Organizing Maps

The Search Continues for Islands of Simplicity

Lessons Learned

Chapter 15: Market Basket Analysis and Association Rules

Defining Market Basket Analysis

Case Study: Spanish or English

Association Analysis

Building Association Rules

Extending the Ideas

Association Rules and Cross-Selling

Sequential Pattern Analysis

Lessons Learned

Chapter 16: Link Analysis

Basic Graph Theory

Social Network Analysis

Mining Call Graphs

Case Study: Tracking Down the Leader of the Pack

Case Study: Who Is Using Fax Machines from Home?

How Google Came to Rule the World

Lessons Learned

Chapter 17: Data Warehousing, OLAP, Analytic Sandboxes, and Data Mining

The Architecture of Data

A General Architecture for Data Warehousing

Analytic Sandboxes

Where Does OLAP Fit In?

Where Data Mining Fits in with Data Warehousing

Lessons Learned

Chapter 18: Building Customer Signatures

Finding Customers in Data

Designing Signatures

What a Signature Looks Like

Process for Creating Signatures

Dealing with Missing Values

Lessons Learned

Chapter 19: Derived Variables: Making the Data Mean More

Handset Churn Rate as a Predictor of Churn

Single-Variable Transformations

Combining Variables

Extracting Features from Time Series

Extracting Features from Geography

Using Model Scores as Inputs

Handling Sparse Data

Capturing Customer Behavior from Transactions

Lessons Learned

Chapter 20: Too Much of a Good Thing? Techniques for Reducing the Number of Variables

Problems with Too Many Variables

The Sparse Data Problem

Flavors of Variable Reduction Techniques

Sequential Selection of Features

Other Directed Variable Selection Methods

Principal Components

Variable Clustering

Lessons Learned

Chapter 21: Listen Carefully to What Your Customers Say: Text Mining

What Is Text Mining?

Working with Text Data

Case Study: Ad Hoc Text Mining

Classifying News Stories Using MBR

From Text to Numbers

Text Mining and Naïve Bayesian Models

DIRECTV: A Case Study in Customer Service

Lessons Learned

Index

Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management

Published by

Wiley Publishing, Inc.

10475 Crosspoint Boulevard

Indianapolis, IN 46256

www.wiley.com

Copyright © 2011 by Wiley Publishing, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-0-470-65093-6

ISBN: 978-1-118-08745-9 (ebk)

ISBN: 978-1-118-08747-3 (ebk)

ISBN: 978-1-118-08750-3 (ebk)

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that Internet websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Control Number: 2011921769

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc. is not associated with any product or vendor mentioned in this book.

To Stephanie, Sasha, and Nathaniel. Without your patience and understanding, this book would not have been possible.

—Michael

To Puccio.

Grazie per essere paziente con me.

Ti amo.

About the Authors

Gordon S. Linoff and Michael J. A. Berry are well known in the data mining field. They are the founders of Data Miners, Inc., a boutique data mining consultancy, and they have jointly authored several influential and widely read books in the field. The first of their jointly authored books was the first edition of Data Mining Techniques, which appeared in 1997. Since that time, they have been actively mining data in a wide variety of industries. Their continuing hands-on analytical work allows the authors to keep abreast of developments in the rapidly evolving fields of data mining, forecasting, and predictive analytics. Gordon and Michael are scrupulously vendor-neutral. Through their consulting work, the authors have been exposed to data analysis software from all of the major software vendors (and quite a few minor ones as well). They are convinced that good results are not determined by whether the software employed is proprietary or open-source, command-line or point-and-click; good results come from creative thinking and sound methodology.

Gordon and Michael specialize in applications of data mining in marketing and customer relationship management — applications such as improving recommendations for cross-sell and up-sell, forecasting future subscriber levels, modeling lifetime customer value, segmenting customers according to their behavior, choosing optimal landing pages for customers arriving at a website, identifying good candidates for inclusion in marketing campaigns, and predicting which customers are at risk of discontinuing use of a software package, service, or drug regimen. Gordon and Michael are dedicated to sharing their knowledge, skills, and enthusiasm for the subject. When not mining data themselves, they enjoy teaching others through courses, lectures, articles, on-site classes, and of course, the book you are about to read. They can frequently be found speaking at conferences and teaching classes. The authors also maintain a data mining blog at blog.data-miners.com.

Gordon lives in Manhattan. His most recent book before this one is Data Analysis Using SQL and Excel, which was published by Wiley in 2008.

Michael lives in Cambridge, Massachusetts. In addition to his consulting work with Data Miners, he teaches Marketing Analytics at the Carroll School of Management at Boston College.

Credits

Executive Editor

Robert Elliott

Senior Project Editor

Adaobi Obi Tulton

Production Editor

Daniel Scribner

Copy Editor

Paula Lowell

Editorial Director

Robyn B. Siesky

Editorial Manager

Mary Beth Wakefield

Freelancer Editorial Manager

Rosemarie Graham

Marketing Manager

Ashley Zurcher

Production Manager

Tim Tate

Vice President and Executive Group Publisher

Richard Swadley

Vice President and Executive Publisher

Barry Pruett

Associate Publisher

Jim Minatel

Project Coordinator, Cover

Katie Crocker

Proofreaders

Word One New York

Indexer

Ron Strauss

Cover Designer

Ryan Sneed

Cover Image

© PhotoAlto/Alix Minde/GettyImages

Acknowledgments

We are fortunate to be surrounded by some of the most talented data miners anywhere, so our first thanks go to our colleagues, past and present, at Data Miners, Inc., from whom we have learned so much: Will Potts, Dorian Pyle, and Brij Masand. There are also clients with whom we work so closely that we consider them our colleagues and friends as well: Harrison Sohmer, Stuart E. Ward, III, and Michael Benigno are in that category. Our editor, Bob Elliott, kept us (more or less) on schedule and helped us maintain a consistent style.

SAS Institute and the Data Warehouse Institute have given us unparalleled opportunities over the past 12 years for teaching. We owe special thanks to Herb Edelstein (now retired), Herb Kirk, Anne Milley, Bob Lucas, Hillary Kokes, Karen Washburn, and many others who have made these classes possible.

Over the past year, while we were writing this book, several friends and colleagues have been very supportive. We would like to acknowledge Diane and Savvas Mavridis, Steve Mullaney, Lounette Dyer, Maciej Zworski, John Wallace, Paul Rosenblum, and Don Wedding.

We also want to acknowledge all the people with whom we have worked in scores of data mining engagements over the years. We have learned something from every one of them. Among the many who have helped us throughout the years:

Alan Parker, Gary King, Dave Waltz, Tim Manns, Craig Stanfill, Jeremy Pollock, Dirk De Roos, Richard James, Michael Alidio, Georgia Tourasi, Michael Cavaretta, Avery Wang, Dave Duling, Eric Jiang, Jeff Hammerbacher, Bruce Rylander, Andrew Gelman, Daryl Berry, Doug Newell, Adam Schwebber, Ed Freeman, Tiha Ghyczy, Erin McCarthy, Usama Fayyad, Josh Goff, Patrick Ott, Karen Kennedy, John Muller, Ronnie Rowton, Frank Travisano, Kurt Thearling, Jim Stagnito, Mark Smith, Stephen Boyer, Nick Radcliffe, Yugo Kanazawa, Patrick Surry, Xu He, Ronny Kohavi, Kiran Nagarur, Terri Kowalchuk, Ramana Thumu, Victor Lo, Jacob Hauskens, Yasmin Namini, Jeremy Pollock, Zai Ying Huang, Lutz Hamel, Amber Batata

And, of course, all the people we thanked in the first edition are still deserving of acknowledgment:

Bob Flynn, Marc Goodman, Bryan McNeely, Marc Reifeis, Claire Budden, Marge Sherold, David Isaac, Mario Bourgoin, David Waltz, Prof. Michael Jordan, Dena d'Ebin, Patsy Campbell, Diana Lin, Paul Becker, Don Peppers, Paul Berry, Ed Horton, Rakesh Agrawal, Edward Ewen, Ric Amari, Fred Chapman, Rich Cohen, Gary Drescher, Robert Groth, Gregory Lampshire, Robert Utzschnieder, Janet Smith, Roland Pesch, Jerry Modes, Stephen Smith, Jim Flynn, Sue Osterfelt, Kamran Parsaye, Susan Buchanan, Karen Stewart, Syamala Srinivasan, Larry Bookman, Wei-Xing Ho, Larry Scroggins, William Petefish, Lars Rohrberg, Yvonne McCollin, Lounette Dyer

Finally, we would like to thank our family and friends, particularly Stephanie and Giuseppe, who have endured with grace the sacrifices in writing this book.

Introduction

Fifteen years ago, Michael and I wrote the first version of this book. A little more than 400 pages, the book fulfilled our goal of surveying the field of data mining by bridging the gap between the technical and the practical, by helping business people understand the data mining techniques and by helping technical people understand the business applications of these techniques. When Bob Elliott, our editor at Wiley, asked us to write the third edition of Data Mining Techniques, we happily said “yes,” conveniently forgetting the sacrifices that writing a book requires in our personal lives. We also knew that the new edition would be considerably reworked from the previous two editions.

In the past 15 years, the field has broadened and so has the book, both figuratively and literally. The second edition, published in 2004 and expanded to 600 pages, introduced two key new technical chapters covering survival analysis and statistical algorithms that had then become (and still are) increasingly important for data miners. Once again, this version introduces new technical areas, particularly text mining and principal components, and a wealth of new examples and enhanced technical descriptions in all the chapters. These examples come from a broad section of industries, including financial services, retailing, telecommunications, media, insurance, health care, and web-based services.

As practitioners in the field, we have also continued to learn. Between us, we now have about half a century of experience in data mining. Since 1999, Michael and I have been teaching courses through the Business Knowledge Series at SAS Institute (this series is separate from the software side of the business and brings in outside experts to teach non-software-specific courses), the Data Warehouse Institute, and onsite classes at many different companies. Our role as instructors in these courses has introduced us to thousands of diverse business people working in many industries. One of these courses, “Business Data Mining Techniques,” was based on the second edition of this book. These courses provide a wealth of feedback about the subject of data mining, about what people are doing in the real world, and how best to present these ideas so they can be readily understood. Much of this feedback is reflected in this new edition. We seem to learn as much from our students as our students learn from us.

Michael has also been teaching a course on marketing analysis at Boston College's Carroll School of Management for the past two years. The first two editions of Data Mining Techniques are also popular in courses in many colleges and universities, including both business courses and, increasingly, the data mining programs that have appeared at various universities over the past decade. Although not intended as a textbook, Data Mining Techniques offers an excellent overview for students of all types. Over the years, we have made various data sets available on our website, which instructors use for their courses.

This book is divided into four parts. The first part talks about the business context of data mining. Chapter 1 introduces data mining, along with examples of how it is used in the real world. Chapter 2 explains the virtuous cycle of data mining and how data mining can help understand customers. This chapter has several examples showing how data mining is used throughout the customer lifecycle. Chapter 3 is an outline of the methodology of data mining. This overall methodology is refined by Chapters 5 and 12, for directed and undirected data mining, respectively. Chapter 4 covers business statistics, introducing some key technical ideas that are used throughout the rest of the book. This chapter also has an extended case study from MyBuys, showing the strengths and weaknesses of different methods for analyzing the results of A/B marketing tests.

Earlier editions placed all the data mining techniques in a single section. We have decided to split the techniques into two distinct categories, so directed and undirected techniques each have their own sections. The section on directed data mining starts by refining the data mining methodology of Chapter 3 for directed data mining. The following chapters cover directed data mining techniques, including statistical techniques, decision trees, neural networks, memory-based reasoning, survival analysis, and genetic algorithms.

The directed data mining techniques were all covered in the second edition. However, we have enhanced them in several important ways, particularly by including more examples of their use in the real world. The decision tree chapter (Chapter 7) now includes a case study on uplift modeling from US Bank and also introduces support vector machines. The neural network chapter (Chapter 8) discusses radial basis function neural networks. The memory-based reasoning chapter (Chapter 9) now has two very interesting case studies, one on how Shazam identifies songs and another on using MBR to help radiologists determine whether mammograms are normal or abnormal. Chapter 10 on survival analysis includes a much-needed discussion on customer value. Chapter 11 on genetic algorithms includes swarm intelligence, another related concept from the world of “computational biology” that has promising applications for data mining.

The third section is devoted to undirected data mining techniques. Chapter 12 explains four different flavors of undirected data mining. Clustering algorithms have been split into two chapters. The first (Chapter 13) focuses on the most common technique, k-means clustering, and three variants: k-medians, k-medoids, and k-modes. It also has an enhanced discussion of interpreting clusters, which is important regardless of the technique used for identifying them. The second chapter on clustering (Chapter 14) introduces many techniques, including hierarchical clustering, divisive clustering, self-organizing networks, and Gaussian mixture models (expectation maximization clustering), which is new in this edition. Chapter 15 on market basket analysis has been enhanced with examples that extend beyond association rules, including a case study on ethnic marketing. Chapter 16, “Link Analysis,” the last chapter in the undirected data mining section, was almost peripheral in the 1990s when we wrote the first edition of this book. Now, it is quite central, as exemplified by the three case studies in this chapter.

The final section of the book is devoted to data — data mining's first name, so to speak. Chapter 17 covers the computer architectures that support data, such as relational databases, data warehouses, and data marts. It also covers Hadoop and analytic sandboxes, both of which are used to process data not suitable for relational databases and traditional data mining tools. The two earlier editions had one chapter on preparing data for data mining. This subject is so important that this edition splits the topic into three chapters. Chapter 18 is about finding the customer in the data and building customer signatures, the data structure used by many data mining algorithms. Chapter 19 covers derived variables, with hints and tips on defining variables that help models perform better. Chapter 20 focuses on reducing the number of variables, whether for techniques such as neural networks that prefer fewer variables or for data visualization purposes. One of the key techniques in this chapter, principal components, is new in this edition.

Chapter 21 covers a topic that could be a book by itself — text mining. Analyzing text builds on so many of the ideas found earlier in the book that we felt that the chapter covering text mining had to go later in the book. Its position at the end highlights text mining as the culmination of topics covered throughout the book. The final case study from DIRECTV is not only an interesting application of text mining to the customer service side of the business, but also an excellent example of data mining in practice.

Like the first two editions, this book is aimed at current and future data mining practitioners and their managers. It is not intended for software developers looking for detailed instructions on how to implement the various data mining algorithms, nor for researchers trying to improve upon these algorithms, although both these groups can benefit from understanding how such software gets used. Ideas are presented in nontechnical language, with minimal use of mathematical formulas and arcane jargon. Throughout the book, the emphasis is as much on the real-world applications of data mining as on the technical explanations, so the techniques include examples with real business context.

In short, we have tried to write the book that we would have liked to read when we began our own data mining careers.

— Gordon S. Linoff, New York, January 2011

Chapter 1

What Is Data Mining and Why Do It?

In the first edition of this book, the first sentence of the first chapter began with the words, “Somerville, Massachusetts, home to one of the authors of this book…” and went on to tell of two small businesses in that town and how they had formed learning relationships with their customers. One of those businesses, a hair braider, no longer braids the hair of the little girl. In the years since the first edition, the little girl grew up, and moved away, and no longer wears her hair in cornrows. Her father, one of the authors, moved to nearby Cambridge. But one thing has not changed. The author is still a loyal customer of the Wine Cask, where some of the same people who first introduced him to cheap Algerian reds in 1978 and later to the wine-growing regions of France are now helping him to explore the wines of Italy and Germany.

Decades later, the Wine Cask still has a loyal customer. That loyalty is no accident. The staff learns the tastes of their customers and their price ranges. When asked for advice, the response is based on accumulated knowledge of that customer's tastes and budgets as well as on their knowledge of their stock.

The people at the Wine Cask know a lot about wine. Although that knowledge is one reason to shop there rather than at a big discount liquor store, their intimate knowledge of each customer is what keeps customers coming back. Another wine shop could open across the street and hire a staff of expert oenophiles, but achieving the same level of intimate customer knowledge would take them months or years.

Well-run small businesses naturally form learning relationships with their customers. Over time, they learn more and more about their customers, and they use that knowledge to serve them better. The result is happy, loyal customers and profitable businesses.

Larger companies, with hundreds of thousands or millions of customers, do not enjoy the luxury of actual personal relationships with each one. Larger firms must rely on other means to form learning relationships with their customers. In particular, they must learn to take full advantage of something they have in abundance — the data produced by nearly every customer interaction. This book is about analytic techniques that can be used to turn customer data into customer knowledge.

What Is Data Mining?

Although some data mining techniques are quite new, data mining itself is not a new technology, in the sense that people have been analyzing data on computers since the first computers were invented — and without computers for centuries before that. Over the years, data mining has gone by many different names, such as knowledge discovery, business intelligence, predictive modeling, predictive analytics, and so on. The definition of data mining as used by the authors is:

Data mining is a business process for exploring large amounts of data to discover meaningful patterns and rules.

This definition has several parts, all of which are important.

Data Mining Is a Business Process

Data mining is a business process that interacts with other business processes. In particular, a process does not have a beginning and an end: it is ongoing. Data mining starts with data, then through analysis informs or inspires action, which, in turn, creates data that begets more data mining.

The practical consequence is that organizations that want to excel at using their data to improve their business do not view data mining as a sideshow. Instead, their business strategy must include collecting data, analyzing data for long-term benefit, and acting on the results.
