26,99 €
Turn jumbles of numbers into graphics, insights, and answers with Excel
With Microsoft Excel, you can, well, excel at data analysis. And Excel Data Analysis For Dummies can help, with clear and easy explanations of the mountain of features for creating, visualizing, and analyzing data. PivotTables, charts, what-if analysis, statistical functions—it's all in here, with examples and ideas for Excel users of all skill levels.
This latest edition covers the most recent updates to Excel and Microsoft 365. You'll beef up your data skills and learn powerful techniques for turning numbers into knowledge. For students, researchers, and business professionals, Excel is the spreadsheet and data application of choice—and Dummies is the best choice for learning how to make those numbers sing.
Excel Data Analysis For Dummies is the go-to resource for Excel users who are looking for better ways to crunch the numbers.
Sie lesen das E-Book in den Legimi-Apps auf:
Seitenzahl: 485
Veröffentlichungsjahr: 2022
Excel® Data Analysis For Dummies®, 5th Edition
Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com
Copyright © 2022 by John Wiley & Sons, Inc., Hoboken, New Jersey
Media and software compilation copyright © 2022 by John Wiley & Sons, Inc. All rights reserved.
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be used without written permission. Microsoft and Excel are registered trademarks of Microsoft Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: WHILE THE PUBLISHER AND AUTHORS HAVE USED THEIR BEST EFFORTS IN PREPARING THIS WORK, THEY MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES REPRESENTATIVES, WRITTEN SALES MATERIALS OR PROMOTIONAL STATEMENTS FOR THIS WORK. THE FACT THAT AN ORGANIZATION, WEBSITE, OR PRODUCT IS REFERRED TO IN THIS WORK AS A CITATION AND/OR POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE PUBLISHER AND AUTHORS ENDORSE THE INFORMATION OR SERVICES THE ORGANIZATION, WEBSITE, OR PRODUCT MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING PROFESSIONAL SERVICES. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR YOUR SITUATION. YOU SHOULD CONSULT WITH A SPECIALIST WHERE APPROPRIATE. FURTHER, READERS SHOULD BE AWARE THAT WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ. NEITHER THE PUBLISHER NOR AUTHORS SHALL BE LIABLE FOR ANY LOSS OF PROFIT OR ANY OTHER COMMERCIAL DAMAGES, INCLUDING BUT NOT LIMITED TO SPECIAL, INCIDENTAL, CONSEQUENTIAL, OR OTHER DAMAGES.
For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit https://hub.wiley.com/community/support/dummies.
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.
Library of Congress Control Number is available from the publisher.
ISBN 978-1-119-84442-6 (pbk); ISBN 978-1-119-84446-4 (ebk); ISBN 978-1-119-84447-1 (ebk)
Cover
Title Page
Copyright
Introduction
About This Book
What You Can Safely Ignore
Foolish Assumptions
Icons Used in This Book
Beyond the Book
Where to Go from Here
Part 1: Getting Started with Data Analysis
Chapter 1: Learning Basic Data-Analysis Techniques
What Is Data Analysis, Anyway?
Analyzing Data with Conditional Formatting
Summarizing Data with Subtotals
Grouping Related Data
Consolidating Data from Multiple Worksheets
Chapter 2: Working with Data-Analysis Tools
Working with Data Tables
Analyzing Data with Goal Seek
Analyzing Data with Scenarios
Optimizing Data with Solver
Chapter 3: Introducing Excel Tables
What Is a Table and Why Should I Care?
Building a Table
Analyzing Table Information
Chapter 4: Grabbing Data from External Sources
What’s All This about External Data?
Exporting Data from Other Programs
Importing External Data into Excel
Querying External Databases
Chapter 5: Analyzing Table Data with Functions
The Database Functions: Some General Remarks
Retrieving a Value from a Table
Summing a Column's Values
Counting a Column’s Values
Averaging a Column's Values
Determining a Column’s Maximum and Minimum Values
Multiplying a Column’s Values
Deriving a Column’s Standard Deviation
Calculating a Column’s Variance
Part 2: Analyzing Data Using PivotTables and PivotCharts
Chapter 6: Creating and Using PivotTables
Understanding PivotTables
Exploring PivotTable Features
Building a PivotTable from an Excel Range or Table
Creating a PivotTable from External Data
Refreshing PivotTable Data
Adding Multiple Fields to a PivotTable Area
Pivoting a Field to a Different Area
Grouping PivotTable Values
Filtering PivotTable Values
Chapter 7: Performing PivotTable Calculations
Messing around with PivotTable Summary Calculations
Working with PivotTable Subtotals
Introducing Custom Calculations
Inserting a Custom Calculated Field
Inserting a Custom Calculated Item
Editing a Custom Calculation
Deleting a Custom Calculation
Chapter 8: Building PivotCharts
Introducing the PivotChart
Creating a PivotChart
Working with PivotCharts
Part 3: Discovering Advanced Data-Analysis Tools
Chapter 9: Dealing with Data Models
Understanding Excel Data Models
Managing a Data Model with Power Pivot
Transforming Data
Creating a PivotTable or PivotChart from Your Data Model
Chapter 10: Tracking Trends and Making Forecasts
Plotting a Best-Fit Trend Line
Calculating Best-Fit Values
Plotting Forecasted Values
Extending a Linear Trend
Calculating Forecasted Linear Values
Plotting an Exponential Trend Line
Calculating Exponential Trend Values
Plotting a Logarithmic Trend Line
Plotting a Power Trend Line
Plotting a Polynomial Trend Line
Creating a Forecast Sheet
Chapter 11: Analyzing Data Using Statistics
Counting Things
Averaging Things
Finding the Rank
Determining the Nth Largest or Smallest Value
Creating a Grouped Frequency Distribution
Calculating the Variance
Calculating the Standard Deviation
Finding the Correlation
Chapter 12: Analyzing Data Using Descriptive Statistics
Loading the Analysis ToolPak
Generating Descriptive Statistics
Calculating a Moving Average
Determining Rank and Percentile
Generating Random Numbers
Creating a Frequency Distribution
Chapter 13: Analyzing Data Using Inferential Statistics
Sampling Data
Using the t-Test Tools
Performing a z-Test
Determining the Regression
Calculating the Correlation
Calculating the Covariance
Using the Anova Tools
Performing an f-Test
Part 4: The Part of Tens
Chapter 14: Ten Things You Ought to Know about Statistics
Descriptive Statistics Are Straightforward
Averages Aren’t So Simple Sometimes
Standard Deviations Describe Dispersion
An Observation Is an Observation
A Sample Is a Subset of Values
Inferential Statistics Are Cool But Complicated
Probability Distributions Aren't Always Confusing
Parameters Aren't So Complicated
Skewness and Kurtosis Describe a Probability Distribution’s Shape
Confidence Intervals Seem Complicated at First But Are Useful
Chapter 15: Ten Ways to Analyze Financial Data
Calculating Future Value
Calculating Present Value
Determining Loan Payments
Calculating a Loan Payment's Principal and Interest
Calculating Cumulative Loan Principal and Interest
Finding the Required Interest Rate
Determining the Internal Rate of Return
Calculating Straight-Line Depreciation
Returning the Fixed-Declining Balance Depreciation
Determining the Double-Declining Balance Depreciation
Chapter 16: Ten Ways to Raise Your PivotTable Game
Turn the PivotTable Fields Task Pane On and Off
Change the PivotTable Fields Task Pane Layout
Display the Details Behind PivotTable Data
Apply a PivotTable Style
Create a Custom PivotTable Style
Preserve PivotTable Formatting
Rename the PivotTable
Turn Off Grand Totals
Reduce the Size of PivotTable Workbooks
Use a PivotTable Value in a Formula
Appendix: Glossary of Data Analysis and Excel Terms
Index
Author Bio
Advertisement Page
Connect with Dummies
End User License Agreement
Chapter 3
TABLE 3-1 Quick Statistical Measures Available on the Status Bar
TABLE 3-2 Excel’s Comparison Operators
Chapter 12
TABLE 12-1 Measures Generated by the Descriptive Statistics Tool
TABLE 12-2 The Rank and Percentile Tool’s Output Columns
Chapter 1
FIGURE 1-1: The Greater Than dialog box and some highlighted values.
FIGURE 1-2: Use the Duplicate Values rule to highlight worksheet duplicates.
FIGURE 1-3: The Top 10 Items dialog box with the top 5 values highlighted.
FIGURE 1-4: The higher the value, the longer the data bar.
FIGURE 1-5: Excel applies an icon based on each cell’s value.
FIGURE 1-6: Use the New Formatting Rule dialog box to create a custom rule.
FIGURE 1-7: Use the Conditional Formatting Rules Manager to edit your rules.
FIGURE 1-8: Use the Subtotal dialog box to apply subtotals to a range.
FIGURE 1-9: Some subtotals applied to the Total column for each customer.
FIGURE 1-10: When you group a range, Excel displays its outlining tools.
FIGURE 1-11: Consolidate multiple worksheets by adding a range from each one.
FIGURE 1-12: When consolidating by category, tell Excel where your labels are l...
Chapter 2
FIGURE 2-1: This data table has the input values in a column.
FIGURE 2-2: Enter the address of the input cell.
FIGURE 2-3: The data table results.
FIGURE 2-4: For a two-input data table, enter one set of values in a column and...
FIGURE 2-5: Enter the addresses of the input cells.
FIGURE 2-6: The two-input data table results.
FIGURE 2-7: Using Goal Seek to calculate the annual deposit required to end up ...
FIGURE 2-8: Goal Seek took all of a second or two to find a solution.
FIGURE 2-9: Creating a scenario for a mortgage analysis.
FIGURE 2-10: Example values for a scenario’s changing cells.
FIGURE 2-11: The goal for this data model is to find the break-even point (wher...
FIGURE 2-12: The completed Solver Parameters dialog box.
FIGURE 2-13: The Solver Results dialog box and the solution to the break-even p...
FIGURE 2-14: The completed Add Constraint dialog box.
FIGURE 2-15: The Solver Results dialog box and the final solution to the break-...
Chapter 3
FIGURE 3-1: Some table terminology you should know.
FIGURE 3-2: Excel takes a stab at the full range you want to convert.
FIGURE 3-3: The range is now a full-fledged table, just like that.
FIGURE 3-4: Select a column’s cells, and Excel displays a few stats in the stat...
FIGURE 3-5: Excel adds a Total row to the bottom of the table and displays the ...
FIGURE 3-6: Selecting the Quantity column’s Sort & Filter button displays this ...
FIGURE 3-7: The Sort dialog box set up for a two-level sort.
FIGURE 3-8: Select the check box beside each column value that you want to incl...
FIGURE 3-9: The Parts table filtered to show only the records from Division 3.
FIGURE 3-10: The Custom AutoFilter dialog box.
FIGURE 3-11: The filter shown in Figure 3-10 applied to the Parts table.
FIGURE 3-12: A table set up for an advanced filter.
FIGURE 3-13: Put the comparison expressions on one row to match them all.
FIGURE 3-14: Put the comparison expressions on separate rows to match one or mo...
FIGURE 3-15: An advanced filter all set to go.
FIGURE 3-16: The results of the advanced filter in Figure 3-15.
Chapter 4
FIGURE 4-1: Select a table or query in the Navigator dialog box to see a previe...
FIGURE 4-2: Select the delimiter that gets the data into nice, neat columns.
FIGURE 4-3: If needed, edit the column starting points until your columns are c...
FIGURE 4-4: Choose the method you want to use to access the web data.
FIGURE 4-5: Select a table to preview its data.
FIGURE 4-6: Select an XML table to preview its data.
FIGURE 4-7: The Choose Data Source dialog box.
FIGURE 4-8: The completed Create New Data Source dialog box for the Northwind d...
FIGURE 4-9: Any data sources you’ve created appear in the Databases tab.
FIGURE 4-10: The completed Query Wizard — Choose Columns dialog box.
FIGURE 4-11: The Query Wizard – Filter Data dialog box with a filter added.
FIGURE 4-12: The Query Wizard — Sort Order dialog box.
Chapter 5
FIGURE 5-1: A table and its criteria range.
FIGURE 5-2: Use DGET to retrieve a value from a table based on your criteria.
FIGURE 5-3: Use DSUM to add a column’s values based on your criteria.
FIGURE 5-4: Use DCOUNT (or DCOUNTA) to tally a column's values based on your cr...
FIGURE 5-5: Use DAVERAGE to average a column’s values based on your criteria.
FIGURE 5-6: Use DMAX and DMIN to return a column's largest and smallest values ...
FIGURE 5-7: Use DPRODUCT to multiply a column’s values based on your criteria.
FIGURE 5-8: Use DSTDEV (or DSTDEVP) to derive the standard deviation of a colum...
FIGURE 5-9: Use DVAR (or DVARP) to derive the variance of a column’s values bas...
Chapter 6
FIGURE 6-1: Some great data, but how do you make sense of it?
FIGURE 6-2: The PivotTable creates order out of data chaos.
FIGURE 6-3: The features of a typical PivotTable.
FIGURE 6-4: You start with a blank PivotTable and the PivotTable Fields task pa...
FIGURE 6-5: The features of a typical PivotTable.
FIGURE 6-6: Import the external data to a PivotTable Report.
FIGURE 6-7: Select Refresh Data When Opening the File to tell Excel to refresh ...
FIGURE 6-8: The Grouping dialog box.
FIGURE 6-9: From the filter field’s drop-down list, select a report filter.
FIGURE 6-10: Excel filters the PivotTable to show just the selected items in ea...
Chapter 7
FIGURE 7-1: Use the Value Field Settings dialog box to choose a summary calcula...
FIGURE 7-2: A PivotTable that shows sales in two years: 2021 and 2022.
FIGURE 7-3: Use the Value Field Settings dialog box to choose a summary calcula...
FIGURE 7-4: The PivotTable from Figure 7-2 is now using the Difference From cal...
FIGURE 7-5: A PivotTable that shows quarterly sales by region.
FIGURE 7-6: The PivotTable from Figure 7-5, now using the % Of calculation.
FIGURE 7-7: A PivotTable showing monthly order totals.
FIGURE 7-8: The PivotTable from Figure 7-7, with the Running Total In calculati...
FIGURE 7-9: A PivotTable showing units sold by category and region.
FIGURE 7-10: The PivotTable from Figure 7-9, with the Index calculation applied...
FIGURE 7-11: A PivotTable with multiple subtotals.
FIGURE 7-12: A custom calculated field, ready for insertion into the PivotTable...
FIGURE 7-13: The custom calculated field in action.
FIGURE 7-14: A custom calculated item, ready for action.
FIGURE 7-15: Two custom calculated items added to the row area.
Chapter 8
FIGURE 8-1: The major sights to see in the PivotChart landscape.
FIGURE 8-2: Excel kicks things off with a blank PivotTable and PivotChart, and ...
FIGURE 8-3: An embedded PivotChart and its PivotTable.
FIGURE 8-4: Use the Move Chart dialog box to move a PivotChart to another works...
FIGURE 8-5: Pull down the Report Filter field list and then select an item.
FIGURE 8-6: The PivotChart from Figure 8-5 now displayed as a pie chart.
FIGURE 8-7: A data table shown below a PivotChart.
Chapter 9
FIGURE 9-1: In the Orders table, the first record has an Order ID value of 1024...
FIGURE 9-2: In the Order Details table, the first three records have an Order I...
FIGURE 9-3: In the Create Relationship dialog box, relate your two tables by us...
FIGURE 9-4: You can click Select Related Tables to have Excel figure out which ...
FIGURE 9-5: When you import multiple, related tables, you can opt to create onl...
FIGURE 9-6: Be sure to select the Add This Data to the Data Model check box.
FIGURE 9-7: A PivotTable that summarizes data from two related tables: Orders a...
FIGURE 9-8: Excel displays this message if you try to add a field from an unrel...
FIGURE 9-9: Excel can automatically detect the required relationship.
FIGURE 9-10: Click Select Related Tables and Power Pivot will automatically sel...
FIGURE 9-11: Visualize your Data Model tables and relationships with diagram vi...
FIGURE 9-12: See the details of your Data Model relationships by using the Mana...
FIGURE 9-13: The Data Refresh dialog box shows you the progress of the refresh.
Chapter 10
FIGURE 10-1: A chart with a trend line and some geeky numbers.
FIGURE 10-2: An array of best-fit trend values (column F).
FIGURE 10-3: A best-fit trend line, extended four periods into the future.
FIGURE 10-4: The Series dialog box: your best-fit trend best friend.
FIGURE 10-5: An array of forecasted trend values (see the range D14:D17).
FIGURE 10-6: A chart with an exponential trend line.
FIGURE 10-7: An array of exponential trend values (column C) and forecast value...
FIGURE 10-8: A chart with a logarithmic trend line.
FIGURE 10-9: A chart with a power trend line.
FIGURE 10-10: A chart with a polynomial trend line.
FIGURE 10-11: You use the Create Forecast Worksheet dialog box to set up your f...
FIGURE 10-12: The new forecast worksheet showing a table and line chart for the...
Chapter 11
FIGURE 11-1: COUNT returns the tally of the numeric values in a range.
FIGURE 11-2: A parts database.
FIGURE 11-3: MEDIAN returns the median value of a set of numeric values.
FIGURE 11-4: RANK returns the ranking of a value in a set.
FIGURE 11-5: LARGE returns the
n
th largest value in a range or array.
FIGURE 11-6: FREQUENCY tells you how many items in a range appear in each bin.
FIGURE 11-7: VAR.P returns the variance of data that represents an entire popul...
FIGURE 11-8: CORREL calculates the correlation between two sets of values.
Chapter 12
FIGURE 12-1: Loading the Analysis ToolPak add-in.
FIGURE 12-2: The Analysis ToolPak’s tools appear in the Data Analysis dialog bo...
FIGURE 12-3: The Descriptive Statistics dialog box, ready to run.
FIGURE 12-4: The statistics generated by the Descriptive Statistics tool.
FIGURE 12-5: The Moving Average dialog box, good to go.
FIGURE 12-6: Some moving averages generated by the Moving Average tool.
FIGURE 12-7: A completed version of the Rank and Percentile dialog box.
FIGURE 12-8: The product defects with their respective ranks and percentiles.
FIGURE 12-9: The parameters change depending on the Distribution value.
FIGURE 12-10: Random test scores generated by the Random Number Generation tool...
FIGURE 12-11: A worksheet with student grades and a bin range for the distribut...
FIGURE 12-12: The Histogram dialog box, all dressed up.
FIGURE 12-13: A frequency distribution and histogram for some student grades.
Chapter 13
FIGURE 13-1: A table of invoice data from which you might select a sample.
FIGURE 13-2: The Sampling dialog box, ready to sample.
FIGURE 13-3: The sample extracted by the Sampling tool.
FIGURE 13-4: Some sample data you can use to perform t-test calculations.
FIGURE 13-5: The t-Test: Two-Sample Assuming Equal Variances dialog box.
FIGURE 13-6: The results of a t-test.
FIGURE 13-7: The z-Test: Two Sample for Means dialog box.
FIGURE 13-8: The z-test calculation results.
FIGURE 13-9: The completed Regression dialog box.
FIGURE 13-10: The regression analysis results.
FIGURE 13-11: The completed Correlation dialog box.
FIGURE 13-12: The worksheet showing the correlation results for the list price ...
FIGURE 13-13: The completed Covariance dialog box.
FIGURE 13-14: The worksheet showing the covariance results for the list price a...
FIGURE 13-15: The completed Anova: Single Factor dialog box.
FIGURE 13-16: The worksheet showing the Anova results for the list price and un...
FIGURE 13-17: The completed F-Test Two-Sample for Variances dialog box.
FIGURE 13-18: The worksheet showing the F-Test Two-Sample for Variances results...
Chapter 14
FIGURE 14-1: A standard normal distribution.
FIGURE 14-2: A uniform distribution generated by 60,000 dice rolls.
FIGURE 14-3: Data sets that are negatively skewed (left) and positively skewed ...
FIGURE 14-4: Data sets that have negative kurtosis (left) and positive kurtosis...
Chapter 16
FIGURE 16-1: Click Tools to see the PivotTable Fields task pane layout options.
FIGURE 16-2: You've sold 792 earbuds, but what are the details behind this numb...
FIGURE 16-3: The details behind the earbud sales shown in Figure 16-2.
FIGURE 16-4: Use the New PivotTable Style dialog box to define your custom styl...
FIGURE 16-5: The GETPIVOTDATA function doing its thing.
Cover
Title Page
Copyright
Table of Contents
Begin Reading
Appendix: Glossary of Data Analysis and Excel Terms
Index
Author Bio
iii
iv
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
99
100
101
102
103
104
105
106
107
108
109
110
111
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
317
318
319
320
321
322
323
324
325
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
The world is bursting at the seams with data. It’s on our computers, it’s in our networks, it’s on the web. Some days, it seems to be in the very air itself, borne on the wind. But here’s the thing: No one actually cares about data. A collection of data — whether it resides on your PC or some giant server somewhere — is really just a bunch of numbers and text, dates and times. No one cares about data because data doesn’t mean anything. Data isn’t cool. You know what’s cool? Knowledge is cool. Insight is cool.
So how do you turn data into knowledge? How do you tweak data to generate insight? You need to organize that data, and then you need to sort it, filter it, run calculations on it, and summarize it. In a word, you need to analyze the data.
Now for the good news: If you have (or can get) that data into Excel, you have a giant basket of data-analysis tools at your disposal. Excel really seems to have been made with data analysis in mind, because it offers such a wide variety of features and techniques for organizing, manipulating, and summarizing just about anything that resides in a worksheet. If you can get your data into Excel, it will help you turn that data into knowledge and insight.
This book takes you on a tour of Excel’s data-analysis tools. You learn everything you need to know to make your data spill its secrets and to uncover your data’s hidden-in-plain-sight wisdom. Best of all, if you already know how to perform the basic Excel chores, you don’t need to learn any other fancy-schmancy Excel techniques to get started in data analysis. Sweet? You bet.
This book contains 16 chapters (and a bonus appendix), but that doesn’t mean that you have to, as the King says gravely in Alice’s Adventures in Wonderland, “Begin at the beginning and go on till you come to the end: Then stop.” If you’ve done a bit of data-analysis work in the past, please feel free to dip into the book wherever it strikes your fancy. The chapters all present their data-analysis info and techniques in readily digestible, bite-sized chunks, so you can certainly graze your way through this book.
However, if you’re brand spanking new to data analysis — particularly if you’re not even sure what data analysis even is — no problem: I’m here to help. To get your data-analysis education off to a solid start, I highly recommend reading the book’s first three chapters to get some of the basics down cold. From there, you can travel to more advanced territory, safe in the knowledge that you’ve got some survival skills to fall back on.
This book consists of several hundred pages. Do I expect you to read every word on every page? Yes, I do. Just kidding! No, of course I don’t. Entire sections — heck, maybe even entire chapters — might contain information that’s not relevant to what you do. That’s fine and my feelings won’t be hurt if you skim through (or — who’s kidding whom? — skip over) those parts of the book.
If time (or attention) is short, what else might you want to ignore? Okay, in many places throughout the book I provide step-by-step instructions to complete some task. Each of those steps includes some bold type that gives you the basic instruction. In many cases, however, below that bold text I offer supplementary information to flesh out or extend or explain the bold instruction. Am I just showing off how much I know about all this stuff? Yes, sometimes. Do you have to read these extended instructions? Nope. Read the bold stuff, for sure, but feel free to skip the details if they seem unnecessary or unimportant.
This book is for people who are new (or relatively new) to Excel data analysis. That doesn’t mean, however, that the book is suitable to people who have never used a PC, Microsoft Windows, or even Excel. So first I assume not only that you have a PC running Microsoft Windows but also that you’ve had some experience with both. (For the purposes of this book, that just means you know how to start and switch between programs.) I also assume that your PC has a recent version of Excel installed. What does “recent” mean? Well, this book is based on Excel 2021, but you should be fine if you’re running Excel 365, Excel 2019, Excel 2016, or even Excel 2013.
As I said before, I do not assume that you’re an Excel expert, but I do assume that you know at least the following Excel basics:
Creating, saving, opening, and switching between workbooks
Creating and switching between worksheets
Finding and running commands on the Ribbon
Entering numbers, text, dates, times, and formulas into worksheet cells
Working with Excel’s basic worksheet functions
Like other books in the For Dummies series, this book uses icons, or little margin pictures, to flag things that don’t quite fit into the flow of the chapter discussion. Here are the icons that I use:
This icon marks text that contains some things that are useful or important enough that you’d do well to store the text somewhere safe in your memory for later recall.
This icon marks text that contains some for-nerds-only technical details or explanations that you’re free to skip.
This icon marks text that contains a shortcut or an easier way to do things, which I hope will make your life — or, at least, the data-analysis portion of your life — more efficient.
This icon marks text that contains a friendly but unusually insistent reminder to avoid doing something. You have been warned.
Examples:
This book’s sample Excel workbooks can be found by going to
www.dummies.com/go/exceldataanalysisfd5e
or at my website:
www.paulmcfedries.com
.
Cheat Sheet:
To locate this book's cheat sheet, go to
www.dummies.com
and search for
Excel Data Analysis For Dummies.
See the cheat sheet for info on Excel database functions, Boolean expressions, and important statistical terms.
Updates:
If this book has any updates after printing, they will be posted to this book's page at
www.dummies.com
.
If you’re just getting your feet wet with Excel data analysis, flip the page and start perusing the first chapter.
If you have some experience with Excel data analysis or you have a special problem or question, use the Table of Contents or the index to find out where I cover that topic and then turn to that page.
Either way, happy analyzing!
Part 1
IN THIS PART …
Understand data analysis and get to know basic analysis features such as conditional formatting and subtotals.
Discover Excel’s built-in tools for analyzing data.
Learn how to build Excel tables that hold and store the data you need to analyze.
Find quick and easy ways to begin your analysis using simple statistics, sorting, and filtering.
Get practical stratagems and common-sense tactics for grabbing data from extra sources.
Chapter 1
IN THIS CHAPTER
Learning about data analysis
Analyzing data by applying conditional formatting
Adding subtotals to summarize data
Grouping related data
Combining data from multiple worksheets
You are awash in data. Information multiplies around you so fast that you wonder how to make sense of it all. You think, “I know what to do. I'll paste the data into Excel. That way, at least the data will be nicely arranged in the worksheet cells, and I can add a little formatting to make things somewhat palatable.” That’s a fine start, but you’re often called upon to do more with your data than make it merely presentable. Your boss, your customer, or perhaps just your curiosity requires you to divine some inner meaning from the jumble of numbers and text that litter your workbooks. In other words, you need to analyze your data to see what nuggets of understanding you can unearth.
This chapter gets you started down that data-analysis path by exploring a few straightforward but useful analytic techniques. After discovering what data analysis entails, you investigate a number of Excel data-analysis techniques, including conditional formatting, data bars, color scales, and icon sets. From there, you dive into some useful methods for summarizing your data, including subtotals, grouping, and consolidation. Before you know it, that untamed wilderness of a worksheet will be nicely groomed and landscaped.
Are you wondering, “What is data analysis, anyway?” That’s an excellent question! Here’s an answer that I unpack for you as I go along: Data analysis is the application of tools and techniques to organize, study, reach conclusions, and sometimes make predictions about a specific collection of information.
For example, a sales manager might use data analysis to study the sales history of a product, determine the overall trend, and produce a forecast of future sales. A scientist might use data analysis to study experimental findings and determine the statistical significance of the results. A family might use data analysis to find the maximum mortgage it can afford or how much it must put aside each month to finance retirement or the kids’ education.
The point of data analysis is to understand information on some deeper, more meaningful level. By definition, raw data is a mere collection of facts that by themselves tell you little or nothing of any importance. To gain some understanding of the data, you must manipulate the data in some meaningful way. The purpose of manipulating data can be something as simple as finding the sum or average of a column of numbers or as complex as employing a full-scale regression analysis to determine the underlying trend of a range of values. Both are examples of data analysis, and Excel offers a number of tools — from the straightforward to the sophisticated — to meet even the most demanding needs.
The data part of data analysis is a collection of numbers, dates, and text that represents the raw information you have to work with. In Excel, this data resides inside a worksheet, which makes the data available for you to apply Excel’s satisfyingly large array of data-analysis tools.
Most data-analysis projects involve large amounts of data, and the fastest and most accurate way to get that data onto a worksheet is to import it from a non-Excel data source. In the simplest scenario, you can copy the data from a text file, a Word table, or an Access datasheet and then paste it into a worksheet. However, most business and scientific data is stored in large databases, so Excel offers tools to import the data you need into your worksheet. I talk about all this in more detail later in the book.
After you have your data in the worksheet, you can use the data as is to apply many data-analysis techniques. However, if you convert the range into a table, Excel treats the data as a simple database and enables you to apply a number of database-specific analysis techniques to the table.
In many cases, you perform data analysis on worksheet values by organizing those values into a data model, a collection of cells designed as a worksheet version of some real-world concept or scenario. The model includes not only the raw data but also one or more cells that represent some analysis of the data. For example, a mortgage amortization model would have the mortgage data — interest rate, principal, and term — and cells that calculate the payment, principal, and interest over the term. For such calculations, you use formulas and Excel’s built-in worksheet functions.
One of the most common data-analysis techniques is what-if analysis, for which you set up worksheet models to analyze hypothetical situations. The “what-if” part means that these situations usually come in the form of a question: “What happens to the monthly payment if the interest rate goes up by 2 percent?” “What will the sales be if you increase the advertising budget by 10 percent?” Excel offers four what-if analysis tools: data tables, Goal Seek, Solver, and scenarios, all of which I cover in this book.
Many Excel worksheets contain hundreds of data values. You could try to make sense of such largish sets of data by creating complex formulas and wielding Excel’s powerful data-analysis tools. However, just as you wouldn’t use a steamroller to crush a tin can, sometimes these sophisticated techniques are too much tool for the job. For example, what if all you want are answers to simple questions such as the following:
Which cell values are less than 0?
What are the top 10 values?
Which cell values are above average, and which are below average?
These simple questions aren’t easy to answer just by glancing at the worksheet, and the more numbers you’re dealing with, the harder it gets. To help you eyeball your worksheets and answer these and similar questions, Excel lets you apply conditional formatting to the cells. Excel applies this special format only to cells that satisfy some condition, which Excel calls a rule. For example, you could apply formatting to display all negative values in a red font, or you could apply a filter to show only the top 10 values.
A conditional format is formatting that Excel applies only to cells that meet the criteria you specify. For example, you can tell Excel to apply the formatting only if a cell’s value is greater or less than some specified amount, between two specified values, or equal to some value. You can also look for cells that contain specified text, dates that occur during a specified time frame, and more.
When you set up your conditional format, you can specify the font, border, and background pattern. This formatting helps to ensure that the cells that meet your criteria stand out from the other cells in the range. Here are the steps to follow:
Select the range you want to work with.
Select just the data values you want to format. Don’t select any surrounding data.
Choose Home ⇒ Conditional Formatting.
Choose Highlight Cells Rules and then select the rule you want to use for the condition.
You have six rules to play around with:
Greater Than:
Applies the conditional format to cells that have a value larger than a value that you specify.
Less Than:
Applies the conditional format to cells that have a value smaller than a value that you specify.
Between:
Applies the conditional format to cells that have a value that is greater than or equal to a minimum value that you specify and less than or equal to a maximum value that you specify.
Equal To:
Applies the conditional format to cells that have a value that is the same as a value that you specify.
Text that Contains:
Applies the conditional format to cells that include the text that you specify.
A Date Occurring:
Applies the conditional format to cells that have a date value that meets the condition that you specify (such as Yesterday, Last Week, or Next Month).
(I cover a seventh rule — Duplicate Values — later in this chapter.) A dialog box appears, the name of which depends on the rule you click in Step 3. For example, Figure 1-1 shows the dialog box for the Greater Than rule.
FIGURE 1-1: The Greater Than dialog box and some highlighted values.
Type the value to use for the condition.
You can also click the up arrow button that appears to the right of the text box and select a worksheet cell that contains the value. Also, depending on the operator, you might need to specify two values.
Use the right drop-down list to select the formatting to apply to cells that match your condition.
If you’re feeling creative, you can make up your own format by selecting the Custom Format command.
Click OK.
Excel applies the formatting to cells that meet the condition you specified.
Excel enables you to specify multiple conditional formats for the same range. For example, you can set up one condition for cells that are greater than some value and a separate condition for cells that are less than some other value. You can apply unique formats to each condition. Keep the range selected and follow Steps 2 through 6 to configure the new condition.
You use conditional formatting mostly to highlight numbers greater than or less than some value, or dates occurring within some range. However, you can use conditional formatting also to look for duplicate values in a range. Why would you want to do that? The main reason is that many range or table columns require unique values. For example, a column of student IDs or part numbers shouldn’t have duplicates.
Unfortunately, scanning such numbers and picking out the repeat values is hard. Not to worry! With conditional formatting, you can specify a font, border, and background pattern that ensures that any duplicate cells in a range or table stand out from the other cells. Here’s what you do:
Select the range that you want to check for duplicates.
Choose Home ⇒ Conditional Formatting.
Choose Highlight Cells Rules ⇒ Duplicate Values.
The Duplicate Values dialog box appears. The left drop-down list has Duplicate selected by default, as shown in Figure 1-2. However, if you want to highlight all the unique values instead of the duplicates, select Unique from this list.
FIGURE 1-2: Use the Duplicate Values rule to highlight worksheet duplicates.
In the right drop-down list, select the formatting to apply to the cells with duplicate values.
You can create your own format by choosing the Custom Format command. In the Format Cells dialog box, use the Font, Border, and Fill tabs to specify the formatting you want to apply, and then click OK.
Click OK.
Excel applies the formatting to any cells that have duplicate values in the range.
When analyzing worksheet data, looking for items that stand out from the norm is often useful. For example, you might want to know which sales reps sold the most last year, or which departments had the lowest gross margins. To quickly and easily view the extreme values in a range, you can apply a conditional format to the top or bottom values of that range.
You can apply such a format by setting up a top/bottom rule, in which Excel applies a conditional format to those items that are at the top or bottom of a range of values. For the top or bottom values, you can specify a number, such as the top 5 or 10, or a percentage, such as the bottom 20 percent. Here’s how it works:
Select the range you want to work with.
Choose Home ⇒ Conditional Formatting.
Choose Top/Bottom Rules and then select the type of rule you want to create.
You have six rules to mess with:
Top 10 Items:
Applies the conditional format to cells that rank in the top
X
, where
X
is a number that you specify (the default is 10).
Top 10 %:
Applies the conditional format to cells that rank in the top
X
%, where
X
is a number that you specify (the default is 10).
Bottom 10 Items:
Applies the conditional format to cells that rank in the bottom
X
, where
X
is a number that you specify (the default is 10).
Bottom 10 %:
Applies the conditional format to cells that rank in the bottom
X
%, where
X
is a number that you specify (the default is 10).
Above Average:
Applies the conditional format to cells that rank above the average value of the range.
Below Average:
Applies the conditional format to cells that rank below the average value of the range.
A dialog box appears, the name of which depends on the rule you selected in Step 3. For example, Figure 1-3 shows the dialog box for the Top Ten Items rule.
Type the value to use for the condition.
You can also click the spin buttons that appear to the right of the text box. Note that you don’t need to enter a value for the Above Average and Below Average rules.
FIGURE 1-3: The Top 10 Items dialog box with the top 5 values highlighted.
In the right drop-down list, select the formatting to apply to cells that match your condition.
When you set up your top/bottom rule, select a format that ensures that the cells that meet your criteria will stand out from the other cells in the range. If none of the predefined formats suits your needs, you can always choose Custom Format and then use the Format Cells dialog box to create a suitable formatting combination. Use the Font, Border, and Fill tabs to specify the formatting you want to apply, and then click OK.
Click OK.
Excel applies the formatting to cells that meet the condition you specified.
In some data-analysis scenarios, you might be interested more in the relative values within a range than the absolute values. For example, if you have a table of products that includes a column showing unit sales, you might want to compare the relative sales of all products.
Comparing relative values is often easiest if you visualize the values, and one of the easiest ways to visualize data in Excel is to use data bars, a data visualization feature that applies colored horizontal bars to each cell in a range of values; these bars appear “behind” (that is, in the background of) the values in the range. The length of the data bar in each cell depends on the value in that cell: the larger the value, the longer the data bar.
Follow these steps to apply data bars to a range:
Select the range you want to work with.
Choose Home ⇒ Conditional Formatting.
Choose Data Bars and then select the fill type of data bars you want to create.
You can apply two type of data bars:
Gradient fill:
The data bars begin with a solid color and then gradually fade to a lighter color.
Solid fill:
The data bars are a solid color.
Excel applies the data bars to each cell in the range. Figure 1-4 shows an example in the Units column.
FIGURE 1-4: The higher the value, the longer the data bar.
If your range includes right-aligned values, gradient-fill data bars are a better choice than solid-fill data bars. Why? Because even the longest gradient-fill bars fade to white toward the right edge of the cell, so your range values will mostly appear on a white background, making them easier to read.
Getting some idea about the overall distribution of values in a range is often useful. For example, you might want to know whether a range has many low values and just a few high values. Color scales can help you analyze your data in this way. A color scale compares the relative values in a range by applying shading to each cell, where the color reflects each cell’s value.
Color scales can also tell you whether your data includes outliers: values that are much higher or lower than the others. Similarly, color scales can help you make value judgments about your data. For example, high sales and low numbers of product defects are good, whereas low margins and high employee turnover rates are bad.
To apply a color scale to a range of values, do the following:
Select the range you want to format.
Choose Home ⇒ Conditional Formatting.
Choose Color Scales and then select the color scale that has the color scheme you want to apply.
Color scales come in two varieties: three-color scales and two-color scales.
If your goal is to look for outliers, go with a three-color scale because it helps the outliers stand out more. A three-color scale is also useful if you want to make value judgments about your data, because you can assign your own values to the colors (such as positive, neutral, and negative). Use a two-color scale when you want to look for patterns in the data, because a two-color scale offers less contrast.
Excel applies the color scale to each cell in your selected range.
Symbols that have common or well-known associations are often useful for analyzing large amounts of data. For example, a check mark usually means that something is good or finished or acceptable, whereas an X means that something is bad or unfinished or unacceptable. Similarly, a green circle is positive, whereas a red circle is negative (think traffic lights). Excel puts these and other symbolic associations to good use with the icon sets feature. You use icon sets to visualize the relative values of cells in a range.
With icon sets, Excel adds a particular icon to each cell in the range, and that icon tells you something about the cell’s value relative to the rest of the range. For example, the highest values might be assigned an upward-pointing arrow, the lowest values a downward-pointing arrow, and the values in between a horizontal arrow.
Here’s how you apply an icon set to a range:
Select the range you want to format with an icon set.
Choose Home ⇒ Conditional Formatting.
Choose Icon Sets and then select the type of icon set you want to apply.
Icon sets come in four categories:
Directional:
Indicates trends and data movement
Shapes:
Points out the high (green) and low (red) values in the range
Indicators:
Adds value judgments
Ratings:
Shows where each cell resides in the overall range of data values
Excel applies the icons to each cell in the range, as shown in Figure 1-5.
FIGURE 1-5: Excel applies an icon based on each cell’s value.
The conditional-formatting rules in Excel — highlight cells rules, top/bottom rules, data bars, color scales, and icon sets — offer an easy way to analyze data through visualization. However, you can tailor your formatting-based data analysis also by creating a custom conditional-formatting rule that suits how you want to analyze and present the data.
Custom conditional-formatting rules are ideal for situations in which normal value judgments — that is, that higher values are good and lower values are bad — don’t apply. In a database of product defects, for example, lower values are better than higher ones. Similarly, data bars are based on the relative numeric values in a range, but you might prefer to base them on the relative percentages or on percentile rankings.
To get the type of data analysis you prefer, follow these steps to create a custom conditional-formatting rule and apply it to your range:
Select the range you want to analyze with a custom conditional-formatting rule.
Choose Home ⇒ Conditional Formatting ⇒ New Rule.
The New Formatting Rule dialog box appears.
In the Select a Rule Type box, select the type of rule you want to create.
Use the controls in the Edit the Rule Description box to edit the rule’s style and formatting.
The controls you see depend on the rule type you selected in Step 3. For example, if you select Icon Sets, you see the controls shown in Figure 1-6.
With Icon Sets, select Reverse Icon Order (as shown in the figure) if you want to reverse the normal icon assignments.
Click OK.
Excel applies the conditional formatting to each cell in the range.
FIGURE 1-6: Use the New Formatting Rule dialog box to create a custom rule.
You can apply conditional formatting based on the results of a formula. That is, you set up a logical formula as the conditional-formatting criteria. For each cell in which that formula returns TRUE, Excel applies the formatting you specify; for all the other cells, Excel doesn't apply the formatting.
In most cases, you use a comparison formula, or you use an IF function, often combined with another logical function such as AND or OR. In each case, your formula's comparison value must reference only the first value in the range. For example, if the range you are working with is a set of dates in A2:A100, the comparison formula =WEEKDAY(A2)=6 would apply conditional formatting to every cell in the range that occurs on a Friday.
The following steps show you how to apply conditional formatting based on the results of a formula:
Select the range you want to work with.Choose Home ⇒ Conditional Formatting ⇒ New Rule.
The New Formatting Rule dialog box appears.
Select Use a Formula to Determine Which Cells to Format.In the Format Values Where this Formula Is True text box, type the logical formula.
The figure shows an example of using a formula to apply conditional formatting.
Choose Format, use the Format Cells dialog box to define the rule’s style and formatting, and then click OK.Click OK.
Excel applies the conditional formatting to each cell in the range in which the logical formula returns TRUE.
When you're messing around with formula-based rules, one useful technique is to apply a conditional format based on a formula that compares all the cells in a range to one value in that range. The simplest case is a formula that applies conditional formatting to those range cells that are equal to a cell value in the range. Here’s the logical formula to use for such a comparison:
rangecell
Here, range is an absolute reference to the range of cells you want to work with, and cell is a relative reference to the comparison cell. For example, to apply a conditional format to those cells in the range A1:A50 that are equal to the value in cell A1, you would use the following logical formula:
=$A$1:$A$50=A1
Conditional-formatting rules are excellent data-visualization tools that can make analyzing your data easier and faster. Whether you're highlighting cells based on criteria, showing cells in the top or bottom of a range, or using features such as data bars, color scales, and icon sets, conditional formatting enables you to interpret your data quickly.
But it doesn't follow that all your conditional-formatting experiments will be successful ones. For example, you might find that the conditional formatting you used isn’t working out because it doesn’t let you visualize your data the way you’d hoped. Similarly, a change in data might require a change in the condition you used. Whatever the reason, you can edit your conditional-formatting rules to ensure that you get the best visualization for your data. Here’s how:
Select a cell in the range that includes the conditional-formatting rule you want to edit.
You can select a single cell, multiple cells, or the entire range.
Choose Home ⇒ Conditional Formatting ⇒ Manage Rules.
The Conditional Formatting Rules Manager dialog box appears, as shown in Figure 1-7.
FIGURE 1-7: Use the Conditional Formatting Rules Manager to edit your rules.
Select the rule you want to modify.
If you don’t see the rule, click the Show Formatting Rules For drop-down list and then select This Worksheet. The list that appears displays every conditional-formatting rule that you’ve applied in the current worksheet.
Choose Edit Rule.
The Edit Formatting Rule dialog box appears.
Make your changes to the rule.
Click OK.
Excel returns you to the Conditional Formatting Rules Manager dialog box.
Select OK.
Excel updates the conditional formatting.
If you have multiple conditional-formatting rules applied to a range, the visualization is affected by the order in which Excel applies the rules. Specifically, if a cell already has a conditional format applied, Excel does not overwrite that format with a new one. For example, suppose that you have two conditional-formatting rules applied to a list of student grades: one for grades over 90 and one for grades over 80. If you apply the over-80 conditional format first, Excel will never apply the over-90 format because those values are already covered by the over-80 format. The solution is to change the order of the rule. In the Conditional Formatting Rules Manager dialog box, select the rule that you want to modify and then click the Move Up and Move Down button to set the order you want. If you want Excel to stop processing the rest of the rules after it has applied a particular rule, select that rule’s Stop If True check box.
Conditional-formatting rules are useful critters, but they don’t work in all scenarios. For example, if your data is essentially random, conditional-formatting rules won’t magically produce patterns in that data. You might also find that