Praise for the First Edition
“...a well-written book on data analysis and data mining that provides an excellent foundation...”
—CHOICE
“This is a must-read book for learning practical statistics and data analysis...”
—Computing Reviews.com
A proven go-to guide for data analysis, Making Sense of Data I: A Practical Guide to Exploratory Data Analysis and Data Mining, Second Edition focuses on basic data analysis approaches that are necessary to make timely and accurate decisions in a diverse range of projects. Based on the authors’ practical experience in implementing data analysis and data mining, the new edition provides clear explanations that guide readers from almost every field of study.
Page count: 308
Year of publication: 2014
Second Edition
GLENN J. MYATT
WAYNE P. JOHNSON
Copyright © 2014 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Myatt, Glenn J., 1969–
[Making sense of data]
Making sense of data I : a practical guide to exploratory data analysis and data mining / Glenn J. Myatt, Wayne P. Johnson. – Second edition.
pages cm
Revised edition of: Making sense of data. c2007.
Includes bibliographical references and index.
ISBN 978-1-118-40741-7 (paper)
1. Data mining. 2. Mathematical statistics. I. Johnson, Wayne P. II. Title.
QA276.M92 2014
006.3′12–dc23
2014007303
ISBN: 9781118407417
PREFACE
1 INTRODUCTION
1.1 Overview
1.2 Sources of Data
1.3 Process for Making Sense of Data
1.4 Overview of Book
1.5 Summary
Further Reading
2 DESCRIBING DATA
2.1 Overview
2.2 Observations and Variables
2.3 Types of Variables
2.4 Central Tendency
2.5 Distribution of the Data
2.6 Confidence Intervals
2.7 Hypothesis Tests
Exercises
Further Reading
3 PREPARING DATA TABLES
3.1 Overview
3.2 Cleaning the Data
3.3 Removing Observations and Variables
3.4 Generating Consistent Scales Across Variables
3.5 New Frequency Distribution
3.6 Converting Text to Numbers
3.7 Converting Continuous Data to Categories
3.8 Combining Variables
3.9 Generating Groups
3.10 Preparing Unstructured Data
Exercises
Further Reading
4 UNDERSTANDING RELATIONSHIPS
4.1 Overview
4.2 Visualizing Relationships Between Variables
4.3 Calculating Metrics About Relationships
Exercises
Further Reading
5 IDENTIFYING AND UNDERSTANDING GROUPS
5.1 Overview
5.2 Clustering
5.3 Association Rules
5.4 Learning Decision Trees from Data
Exercises
Further Reading
6 BUILDING MODELS FROM DATA
6.1 Overview
6.2 Linear Regression
6.3 Logistic Regression
6.4 K-Nearest Neighbors
6.5 Classification and Regression Trees
6.6 Other Approaches
Exercises
Further Reading
APPENDIX A ANSWERS TO EXERCISES
APPENDIX B HANDS-ON TUTORIALS
B.1 Tutorial Overview
B.2 Access and Installation
B.3 Software Overview
B.4 Reading in Data
B.5 Preparation Tools
B.6 Tables and Graph Tools
B.7 Statistics Tools
B.8 Grouping Tools
B.9 Models Tools
B.10 Apply Model
B.11 Exercises
BIBLIOGRAPHY
INDEX
END USER LICENSE AGREEMENT
Chapter 2
TABLE 2.1
TABLE 2.2
TABLE 2.3
TABLE 2.4
TABLE 2.5
TABLE 2.6
Chapter 3
TABLE 3.1
TABLE 3.2
TABLE 3.3
Chapter 4
TABLE 4.1
TABLE 4.2
TABLE 4.3
TABLE 4.4
TABLE 4.5
TABLE 4.6
TABLE 4.7
TABLE 4.8
Chapter 5
TABLE 5.1
TABLE 5.2
TABLE 5.3
TABLE 5.4
TABLE 5.5
TABLE 5.6
TABLE 5.7
TABLE 5.8
TABLE 5.9
TABLE 5.10
TABLE 5.11
TABLE 5.12
TABLE 5.13
TABLE 5.14
TABLE 5.15
TABLE 5.16
TABLE 5.17
TABLE 5.18
Chapter 6
TABLE 6.1
TABLE 6.2
TABLE 6.3
TABLE 6.4
TABLE 6.5
TABLE 6.6
TABLE 6.7
TABLE 6.8
TABLE 6.9
TABLE 6.10
TABLE 6.11
TABLE 6.12
TABLE 6.13
TABLE 6.14
TABLE 6.15
TABLE 6.16
TABLE 6.17
Appendix A
TABLE A.1
TABLE A.2
TABLE A.3
TABLE A.4
TABLE A.5
TABLE A.6
TABLE A.7
TABLE A.8
Appendix B
TABLE B.1
TABLE B.2
Chapter 1
FIGURE 1.1
Summary of a general framework for a data analysis project.
FIGURE 1.2
Summary of some of the issues to consider when defining and planning a data analysis project.
FIGURE 1.3
Summary of steps to consider when preparing the data.
FIGURE 1.4
Summary of tasks to consider when analyzing the data.
FIGURE 1.5
Summary of deployment options.
FIGURE 1.6
Summary of steps to consider in developing a data analysis or data mining project.
Chapter 2
FIGURE 2.1
Spreadsheet showing a sample of car observations.
FIGURE 2.2
Bar chart for the Origin variable from the auto-MPG data table.
FIGURE 2.3
Bar charts for the Origin variables from the auto-MPG data table showing the proportion and percentage.
FIGURE 2.4
Bar chart for a variable measured on an ordinal scale, PLT.
FIGURE 2.5
Frequency histogram for the variable “Acceleration.”
FIGURE 2.6
Examples of frequency distributions.
FIGURE 2.7
More complex frequency distributions.
FIGURE 2.8
Overview of elements of a box plot.
FIGURE 2.9
Box plot for the variable MPG.
FIGURE 2.10
Comparison of frequency histogram and a box plot for the variable MPG.
FIGURE 2.11
A box plot with extreme values explicitly shown as circles.
FIGURE 2.12
Frequency distribution showing a positive skew.
FIGURE 2.13
Skewness estimates for two variables.
FIGURE 2.14
Kurtosis estimates for two variables.
FIGURE 2.15
Illustration of the standard z-distribution to calculate z_{α/2}.
FIGURE 2.16
Standard z-distribution.
Chapter 3
FIGURE 3.1
Histogram showing an outlier.
FIGURE 3.2
Log transformation converting a variable (IC50) to adjust the frequency distribution.
Chapter 4
FIGURE 4.1
Example of a scatterplot where each point corresponds to an observation.
