29,99 €
This manual is for business analysts to enhance their statistical analysis skills, with case studies focusing mainly on Excel. It covers basic descriptive techniques, linear regression, forecasting, t-Test, chi-square, A/B testing, text data analysis, and Big Data management. Companion files include solution spreadsheets, sample files, and data sets.
The course starts with data shaping and cleaning, installing the Analysis ToolPak, and descriptive statistics. It progresses through histograms, scatter plots, Pareto analysis, correlation, linear and multivariate regression, and forecasting. Advanced topics include inferential statistics, contingency analysis, and A/B testing. The final chapters cover text analytics, big data sets, and data visualization.
These techniques are crucial for informed business decisions. This book guides users from basic to advanced analysis, blending theory with practical skills. Companion files enhance learning, making this manual essential for mastering statistical analysis in business.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 125
Veröffentlichungsjahr: 2024
DATA ANALYSIS FOR BUSINESS DECISIONS
LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY
By purchasing or using this book and its companion files (the “Work”), you agree that this license grants permission to use the contents contained herein, but does not give you the right of ownership to any of the textual content in the book or ownership to any of the information, files, or products contained in it. This license does not permit uploading of the Work onto the Internet or on a network (of any kind) without the written consent of the Publisher. Duplication or dissemination of any text, code, simulations, images, etc. contained herein is limited to and subject to licensing terms for the respective products, and permission must be obtained from the Publisher or the owner of the content, etc., in order to reproduce or network any portion of the textual material (in any media) that is contained in the Work.
MERCURY LEARNING AND INFORMATION (“MLI” or “the Publisher”) and anyone involved in the creation, writing, production, accompanying algorithms, code, or computer programs (“the software”), and any accompanying Web site or software of the Work, cannot and do not warrant the performance or results that might be obtained by using the contents of the Work. The author, developers, and the Publisher have used their best efforts to insure the accuracy and functionality of the textual material and/or programs contained in this package; we, however, make no warranty of any kind, express or implied, regarding the performance of these contents or programs. The Work is sold “as is” without warranty (except for defective materials used in manufacturing the book or due to faulty workmanship).
The author, developers, and the publisher of any accompanying content, and anyone involved in the composition, production, and manufacturing of this work will not be liable for damages of any kind arising out of the use of (or the inability to use) the algorithms, source code, computer programs, or textual material contained in this publication. This includes, but is not limited to, loss of revenue or profit, or other incidental, physical, or consequential damages arising out of the use of this Work.
The sole remedy in the event of a claim of any kind is expressly limited to replacement of the book and only at the discretion of the Publisher. The use of “implied warranty” and certain “exclusions” vary from state to state, and might not apply to the purchaser of this product.
Companion files also available for downloading from the publisher by writing to [email protected].
Copyright ©2020 by MERCURY LEARNING AND INFORMATION LLC. All rights reserved.
This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.
Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA 20166
www.merclearning.com
1-800-232-0223
A. Fortino. Data Analysis for Business Decisions: A Laboratory Manual 2/E.
ISBN: 978-1-68392-592-7
The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.
Library of Congress Control Number: 2020942356
202122321 Printed on acid-free paper in the United States of America.
Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at 800-232-0223 (toll free).
All of our titles are available for sale in digital format at academiccourseware.com and other digital vendors. Companion files for this title can also be downloaded by writing to [email protected]. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the book, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.
Dedicated to Kathleen for her patience and support
Contents
Preface
Chapter 1 Shaping and Cleaning Data
Analysis Case 1.1 – Shaping the Data File
Analysis Case 1.2 – Cleaning the Data File
Chapter 2 Installing the Analysis Toolpak
Analysis Case 2.1 – Excel Analysis ToolPak
Installation and Activation
Chapter 3 Descriptive Statistics
Analysis Case 3.1 – Descriptive Statistics
Five-Point Summaries
Analysis Case 3.2 – Additional Analysis Case Using the ORDERS File
Analysis Case 3.3 – Descriptive Statistics
Tabulation and Pivot Tables
Analysis Case 3.4 – Additional Case Using Titanic Data
Analysis Case 3.5 – Additional Case Using SFO Airport Survey Data
Chapter 4 Histograms
Analysis Case 4.1 – Histograms
Frequency Distributions
Analysis Case 4.2 – Additional Case Using Titanic Data
Chapter 5 Pareto Analysis
Analysis Case 5.1 – Pareto Analysis
Which Are the Most Important?
Analysis Case 5.2 – Additional Case Using MOVIES Data
Analysis Case 5.3 – Additional Case Using ORDERS
Analysis Case 5.4 – Additional Case Using SFO Airport Survey Data
Chapter 6 Scatter Plots
Analysis Case 6.1 – Scatter Plots and 2x2 Analysis
Which Are the “Best”?
Analysis Case 6.2 – Additional Case Using World Bank Data
Analysis Case 6.3 – Additional Case Using SFO Airport Survey Data
Chapter 7 Correlation and Linear Regression
Analysis Case 7.1 – Correlation and Linear Regression
How Are the Variables Related?
Analysis Case 7.2 – Additional Case Using ORDERS
Analysis Case 7.3 – Additional Case Using SFO Airport Survey Data
Chapter 8 Multivariate Regression
Analysis Case 8.1 – Multivariate Regression
Predictive Modeling
Analysis Case 8.2 – Additional Case Using SFO Airport Survey Data
Chapter 9 Forecasting and Time Series
Analysis Case 9.1 – Forecasting and Time Series
Predicting Trends and Future Values
Analysis Case 9.2 – Additional Case Using ORDERS
Analysis Case 9.3 – Additional Case Using MOVIES
Analysis Case 9.4 – Additional Forecasting Case Study
Chapter 10 Inferential Statistics
Analysis Case 10.1 – Assurance of Results
Inferential Statistics
Analysis Case 10.2 – Additional Case Using FOOTBALLTRIALS
Analysis Case 10.3 – Additional Exercise Using SFO Airport Survey Data
Chapter 11 Contingency Analysis
Analysis Case 11.1 – Contingency Analysis and Chi-Squared
Test of Variable Independence
Analysis Case 11.2 – Additional Case Using SFO Airport Survey Data
Chapter 12 A/B Testing
Analysis Case 12.1 – Design and Analysis of Trials
A/B Testing
Analysis Case 12.2 – Additional Case Using ORDERS
Analysis Case 12.3 – Additional Case Using SFO Airport Survey Data
Analysis Case 12.4 – Additional Analysis Cases Using Titanic Data
Chapter 13 Text Analytics
Analysis Case 13.1 – Unstructured Text Analysis I
Analysis Case 13.2 – Unstructured Text Analysis II
Using Excel for Sentiment Analysis
Chapter 14 Analyzing Big Data Sets
Analysis Case 14.1 – Big Data Analysis
Using Sampling to Work with Large Data Files
Analysis Case 14.2 – Additional Case Using the BankComplaints Big Data File
Chapter 15 Data Visualization
The Case
The Analysis Template
Analysis Case 15.1 – Story
Create a Visual Story − Is the point of the visual very clear?
Make It a Prop − Has the visual has been simplified and focused?
Emulate Legendary Storytellers − Are past masters and the basic charts that they pioneered emulated?
Case Analysis
Analysis Case 15.2 – Signs
Give Them a Sign − Is the use of signs and symbols appropriate?
It’s Like a Communication System − Is the signal-to-noise ratio high?
Design for Function − Is the chart functionally informational rather than beautiful art?
Case Analysis
Analysis Case 15.3 – Purpose
Consider the Information Need – Does the chart fulfill organizational information needs?
Consider the Audience – Does the chart allow for audience biases, needs, and journeys?
Answer Well-Framed Analytical Questions – Does the visual answer a well-framed analytical question?
Case Analysis
Analysis Case 15.4 – Perception
Use the Eye-Brain System of Seeing – Does the eye of the viewer focus on the most important point being made?
Employ the Gestalt Principles of Perception – Have the principles of the Gestalt psychology of perception been thoughtfully employed in the visual?
Design with Quality – Does the visual inform the viewer and dispel his ignorance?
Case Analysis
Analysis Case 15.5 – Method
Use Color Effectively – Is color used judiciously and sparsely?
Remove All Chart Junk – Is the visual is clear of unnecessary visual elements not leading to a clear point being made?
Tell the Story with the Title – Does the title of the chart convey the point being made with the chart?
Case Analysis
Analysis Case 15.6 – Charts
Use the Right Chart – Does the type of chart being used match the level of judgment required?
Select the Chart Type Effectively – Does the chart type used match the business question being answered?
Enhance Table Data for Emphasis – Are referenceable visuals (tables) readable with appropriate conditional formatting and thumbnail graphs used for emphasis?
Case Analysis
Analysis Case 15.7 – Putting It All Together
A Possible Solution
Chapter 16 Summary of Analysis Techniques
Business Questions and Analysis Techniques for Addressing Them
Data Sources
Index
Preface
This laboratory manual was written for business analysts who wish to increase their skills in conducting statistical analysis of data sets to support business decision-making. Most of the exercises use Excel, today’s most common analysis tool. They range from the most basic descriptive statistical techniques to more advanced techniques, such as multivariate linear regression and forecasting.
Advanced exercises cover inferential statistics for continuous variables (t-Test) and categorical variables (Chi-square), as well as A/B testing. The manual ends with techniques to deal with the analysis of text data (text data mining) and tools to manage the analysis of large data sets (Big Data) using Excel. A set of cases is provided to assist the analyst to improving their data visualization skills.
On the Companion Files
The exercises require access to the data sets used in analyzing the cases. They may be accessed on companion disc. with this book or for downloading by writing to the publisher at [email protected]. A file folder Lab Data has all the files referenced in the exercises. A zip file Lab Data.zip, found in the same repository, can be downloaded to make data available on a local drive. The solution folders within each exercise folder contain some illustrative charts and tables as well as solution spreadsheets. All of the figures (including those in full color) are on the companion files for enlargement and easy-viewing. The analysis techniques presented in each chapter have short companion videos you may use to understand the ideas further. The video lessons may be found on the companion disc. If you wish to stream the video rather than download it, there is a document on the disc with links to all the companion videos to be found on a streaming service. The companion files are also available for download from the publisher by writing to [email protected].
Acknowledgments
Practical books such as this, that are full of cases, are created by many years of trying them out on students until you get them right. It’s a matter of keep changing the exercises and trying things out until they seem to work, and in the end, they help people learn. I wish to thank the legion of students who were very patient with me and helped me perfect these cases. Both my graduate students at the NYU School of Professional Studies and the many American Management Association professionals who attended my AMA seminars deserve my gratitude.
I also wish to thank my colleague, Nicole Morgenstern, for taking a chance with me at AMA. Thank you, Nicole, for sponsoring this work and running interference for me. My thanks to my graduate student, Karen pey-rong Hong, who did a superb job updating all the exercises to the latest version of Excel. The entire team of editors and artists at Mercury Learning was terrific and has my gratitude. A special thanks to Jim Walsh, my editor, who kept asking for more and more and helped shape an excellent book. In the end, it paid off, Jim. Finally, to my loving and patient wife, Kathleen, who not only labored over the manuscript by copyediting, but provided much-needed advice. You were always right, dear.
Dr. Andres Fortino June 2020
CHAPTER 1
SHAPING AND CLEANING DATA
In the first set of exercises, we will look at the importance of shaping and cleaning data files. The initial image in this chapter shows the Data Cleansing Cycle with many activities, starting with importing the data; merging the data sets; standardizing and normalizing data; rebuilding missing data; de-duplicating; and last, verifying and enriching the data set. The object is to produce a data set in Excel in what is called a flat-file format. When expressed in that format, the first row of the table must contain all the variable names (with none missing); every row is of the same nature, and there are no empty rows or columns. All other rows and columns outside of the table area should be clear of data. Once in that format, the table is ready for analysis, and we can safely apply many of the Excel analytic tools.
The source of the data table varies; sometimes we extract it from a DBMS using the SQL language using queries. Other times we may obtain a comma-separated values file (with a .CSV extension), or a formatted text file (with a .TXT extension), or we may have scraped it from an HMTL formatted Web page. In Analysis Case 1.1, we explore loading and shaping data files from several sources. We study how long it takes to load data files of different sizes, including some huge files that tax the limit of Excel. They not only take a long time to load, but they are unwieldy to analyze.
Once we practice loading data in various formats, we explore cleaning it in Analysis Case 1.2. We practice using a small data file that contains several errors that need to be corrected. You are directed to the original data to find the original values. The exercise allows you to utilize many tools in Excel that make the data cleaning process efficient.
The whole process of scraping, uploading, cleaning, annotating, and shaping the data file is referred to as data wrangling. Many studies have shown that this process is not only tedious but can take up to 80% of the overall time needed to perform the analysis. But it is critical for success in the analysis. The more skilled you are in the use of cleaning and shaping tools, and the smarter you are in their use, the sooner you will start the analysis, and the less time you will need to find answers.
Analysis Case 1.1 – Shaping the Data File
1. Using the Lab Data set provided, open the Analysis Case 1.1 folder in it, and find the file ORDERS.csv