The Analytics Lifecycle Toolkit - Gregory S. Nelson - E-Book

The Analytics Lifecycle Toolkit E-Book

Gregory S. Nelson

Description

An evidence-based organizational framework for exceptional analytics team results

The Analytics Lifecycle Toolkit provides managers with a practical manual for integrating data management and analytic technologies into their organization. Author Gregory Nelson has encountered hundreds of unique perspectives on analytics optimization from across industries; over the years, successful strategies have proven to share certain practices, skillsets, expertise, and structural traits. In this book, he details the concepts, people and processes that contribute to exemplary results, and shares an organizational framework for analytics team functions and roles. By merging analytic culture with data and technology strategies, this framework creates understanding for analytics leaders and a toolbox for practitioners. Focused on team effectiveness and the design thinking surrounding product creation, the framework is illustrated by real-world case studies to show how effective analytics team leadership works on the ground. Tools and templates include best practices for process improvement, workforce enablement, and leadership support, while guidance includes both conceptual discussion of the analytics life cycle and detailed process descriptions.

Readers will be equipped to:

* Master fundamental concepts and practices of the analytics life cycle
* Understand the knowledge domains and best practices for each stage
* Delve into the details of analytical team processes and process optimization
* Utilize a robust toolkit designed to support analytic team effectiveness

The analytics life cycle includes a diverse set of considerations involving the people, processes, culture, data, and technology, and managers needing stellar analytics performance must understand their unique role in the process of winnowing the big picture down to meaningful action. The Analytics Lifecycle Toolkit provides expert perspective and much-needed insight to managers, while providing practitioners with a new set of tools for optimizing results.


Page count: 515

Publication year: 2018




Table of Contents

Cover

Title Page

Preface

Acknowledgments

REFERENCES

PART I: The Foundation of Analytics

CHAPTER 1: Analytics Overview

FUNDAMENTAL CONCEPTS

ANALYTICS CONCEPTS

THE METHODS OF ANALYTICS

THE GOAL OF ANALYTICS

CHAPTER SUMMARY

REFERENCES

NOTE

CHAPTER 2: The People of Analytics

WHO DOES ANALYTICS?

ROLES PEOPLE PLAY

CRITICAL COMPETENCIES FOR ANALYTICS

CHAPTER SUMMARY

REFERENCES

NOTES

CHAPTER 3: Organizational Context for Analytics

ORGANIZATIONAL STRATEGY AND ANALYTICS ALIGNMENT

ORGANIZATIONAL CULTURE

ORGANIZATIONAL DESIGN FOR ANALYTICS

WHICH ORGANIZATIONAL DESIGN IS BEST?

CHAPTER SUMMARY

REFERENCES

CHAPTER 4: Data Strategy, Platforms, and Architecture

DATA STRATEGY

STRATEGY DEVELOPMENT PROCESS

DEVELOPING A DATA STRATEGY ROADMAP

AN AGILE APPROACH TO DEVELOPING YOUR DATA STRATEGY

DATA STRATEGY SUMMARY

PLATFORMS AND ARCHITECTURE ANALYTICS

ANALYTICS ARCHITECTURE

DATA FOR PURPOSE OR POTENTIAL

CHAPTER SUMMARY

REFERENCES

PART II: Analytics Lifecycle Best Practices

CHAPTER 5: The Analytics Lifecycle Toolkit

ANALYTICS LIFECYCLE BEST PRACTICE AREAS

ANALYTICS AS A PRODUCT OF DATA SCIENCE

GOALS OF ANALYTICS

SCALE AND SCOPE OF ANALYTICS PRODUCTS

HOW THE ANALYTICS LIFECYCLE TOOLKIT IS ORGANIZED

DESIGN THINKING FOR ANALYTICS

CHAPTER SUMMARY

REFERENCES

CHAPTER 6: Problem Framing

PROCESS OVERVIEW

WHY DO THIS?

PROCESS AREAS

CHAPTER SUMMARY

TOOLKIT SUMMARY

REFERENCES

CHAPTER 7: Data Sensemaking

PROCESS OVERVIEW

PROCESS AREAS

MAINTAINING AN ANALYTICS JOURNAL

CHAPTER SUMMARY

TOOLKIT SUMMARY

REFERENCES

CHAPTER 8: Analytics Model Development

PROCESS OVERVIEW

PROCESS AREAS

MAKING COMPARISONS

MEASURING ASSOCIATIONS

MAKING PREDICTIONS

CHAPTER SUMMARY

PROBLEM SUMMARY AND EXERCISES

TOOLKIT SUMMARY

REFERENCES

CHAPTER 9: Results Activation

PROCESS OVERVIEW

SOLUTION EVALUATION

OPERATIONALIZATION

PRESENTATION AND STORYTELLING

CHAPTER SUMMARY

EXERCISE

TOOLKIT SUMMARY

REFERENCES

NOTE

CHAPTER 10: Analytics Product Management

PROCESS OVERVIEW

PROCESS AREAS

CHAPTER SUMMARY

REFERENCES

PART III: Sustaining Analytics Success

CHAPTER 11: Actioning Analytics

THE POWER OF ANALYTICS

EFFICIENT AND EFFECTIVE ANALYTICS PROGRAMS

WHY OPERATIONALIZATION OF ANALYTICS FAILS

CHANGE MANAGEMENT

BEST PRACTICES FOR LEADING CHANGE

TROUBLESHOOTING CHANGE

CHAPTER SUMMARY

REFERENCES

CHAPTER 12: Core Competencies for Analytics Teams

INTRODUCTION

ANALYTICS COMPETENCIES DETAILED

IDEALIZED COMPETENCIES FOR ANALYTICS JOB FAMILIES BY KNOWLEDGE DOMAIN

CHAPTER SUMMARY

REFERENCE

CHAPTER 13: The Future of Analytics

THE ANALYTICS LIFECYCLE AS A FRAMEWORK

THE ROLE OF ANALYTICS IN OUR FUTURE

FINAL THOUGHTS

REFERENCES

About the Author

About the Companion Web Site

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Styles of machine learning

Chapter 3

Table 3.1 Eight Elements of Organization Design

Chapter 4

Table 4.1 Six Stages for Data Strategy Execution

Chapter 5

Table 5.1 Types of analytics “project types”

Table 5.2 Best practice area: Problem framing

Table 5.3 Best practice area: Data sensemaking

Table 5.4 Best practice area: Analytics model development

Table 5.5 Best practice area: Results activation

Table 5.6 Best practice area: Analytics Product Lifecycle management

Chapter 6

Table 6.1 Types of diagrams useful in identifying potential cause and effect relationships

Table 6.2 Design thinking tools useful in analytics problem framing

Table 6.3 Problem Statement Table

Table 6.4 Project prioritization matrix

Chapter 7

Table 7.1 Measures important while examining the distributions of quantitative variables

Chapter 8

Table 8.1 Major issues in achieving analytics maturity

Table 8.2 Model development best process areas, common questions, and common methods

Chapter 10

Table 10.1 Objective and subjective measures for calculating return on investment

Chapter 11

Table 11.1 Differences between effectiveness and efficiency in analytics

Table 11.2 Stakeholder perspectives on analytics effectiveness and efficiency

Table 11.3 Change management best practice areas for analytics

Table 11.4 Examples of analytics impacting change

Table 11.5 Examples of communications goals by change curve

List of Illustrations

Chapter 1

Figure 1.1 BI dashboard

Figure 1.2 The difference between health information management, health IT, and informatics

Figure 1.3 The relationship between statistics and other quantitative disciplines

Figure 1.4 Inductive reasoning compared to deductive reasoning

Figure 1.5 Approaches to forecasting and time series analysis

Figure 1.6 Natural language processing conceptual workflow

Figure 1.7 Techniques found in machine learning

Chapter 2

Figure 2.1 The analytics lifecycle supports the decision lifecycle.

Figure 2.2 Using data to support decisions includes a variety of roles

Figure 2.3 Five job families for analytics

Figure 2.4 Competencies for analytics vary by job family.

Figure 2.5 The knowledge domains for analytics. (The Analytics Competency Model)

Figure 2.6 The elusive data scientist combines competencies in technology, data, business domain, and methods.

Figure 2.7 Key competencies needed across analytics roles

Figure 2.8 Decomposition diagram for why a car won't start

Figure 2.9 Analysis and synthesis are two processes used in problem solving.

Figure 2.10 Analytics innovation planning model

Figure 2.11 Cultural alignment—patient-centered care

Chapter 3

Figure 3.1 Analytics alignment: relationship between strategy, organizational capabilities, resources, and management systems

Figure 3.2 Data pipeline supports the Analytics Lifecycle

Figure 3.3 Culture and competencies of the modern analytics organization

Figure 3.4 Analytics culture and readiness rests on five dimensions

Figure 3.5 Organizational design is a function of structure, people, and processes

Figure 3.6 Galbraith's Star Model for the design of organizations

Figure 3.7 Centralized organizational structure for analytics

Figure 3.8 Decentralized organizational structure for analytics

Figure 3.9 Center of excellence organizational structure for analytics

Figure 3.10 The analytics concierge model

Figure 3.11 Staff configuration in the analytics concierge model

Chapter 4

Figure 4.1 Relationship between corporate strategy and the analytics function

Figure 4.2 Analytics strategy development process

Figure 4.3 Relationship between the data pipeline and the Analytics Lifecycle

Figure 4.4 Analytics maturity assessment pyramid

Figure 4.5 Percent of problems by differing complexities

Figure 4.6 The dimensions of crisis analytics

Chapter 5

Figure 5.1 Analytics Lifecycle

Figure 5.2 The motivation for analytics

Figure 5.3 The five best practice areas in the Analytics Lifecycle Toolkit

Figure 5.4 Five steps in design thinking

Figure 5.5 Adaptation of various toolkits (such as design thinking, product thinking, and agile methods) can enable better analytics solutions

Figure 5.6 Design thinking processes applied to analytics

Chapter 6

Figure 6.1 Problem framing best practice area in context

Figure 6.2 Problem framing relates to the empathy stage in design thinking

Figure 6.3 Affinity diagram

Figure 6.4 Tree diagram map from root cause analysis (Determining why sales are slow)

Figure 6.5 Divergent vs. convergent thinking in problem framing

Figure 6.6 Three-step process for brainstorming in problem framing (prioritizing software features)

Figure 6.7 Belief statements translated to SMART goals

Figure 6.8 Impact and risk matrix

Figure 6.9 Potential study designs used in applied analytics settings

Figure 6.10 Visual presentation of the relative priorities for a project

Chapter 7

Figure 7.1 Hypothesis-based versus data-driven analysis

Figure 7.2 Data sensemaking focuses on describing, exploring, and explaining

Figure 7.3 Example of Markov microsimulation for the comparison of various treatment strategies

Figure 7.4 In analytics, we often augment analytics with data that lives outside of the traditional data pipeline.

Figure 7.5 The analytics sandbox

Figure 7.6 Types of variables

Figure 7.7 Descriptive analysis options

Figure 7.8 Distribution of quantitative variables

Chapter 8

Figure 8.1 Vectors of analytics maturity

Figure 8.2 Regression model for predicting grades based on study time

Figure 8.3 Plotting the regression line can help visualize the relationships between two variables

Figure 8.4 Cross Industry Standard Process for Data Mining (CRISP-DM)

Figure 8.5 Statistical modeling cultures: Data modeling culture vs. algorithmic modeling culture

Figure 8.6 Five questions for selecting a statistical model

Figure 8.7 Steps used in the evaluation of a hypothesis

Figure 8.8 Example scatter plot to visualize the relationships between two quantitative variables

Figure 8.9 Example of correlation matrix heat map showing the strength of the relationships between variables (Hemedinger, 2013)

Figure 8.10 Example of scatter plot matrix showing the relationships between variables

Figure 8.11 2 × 2 matrix to help classify the type of question being asked

Figure 8.12 2 × 2 matrix outlining common statistical tests for differences and associations

Figure 8.13 2 × 2 matrix of common statistical tests for prediction

Figure 8.14 Scatter plot revealing patterns in the data

Figure 8.15 A depiction of where feature engineering factors into the analytics process

Figure 8.16 Pattern detection process flow for supervised and unsupervised model development

Chapter 9

Figure 9.1 Results activation in the context of design thinking

Figure 9.2 Planning for telling your data story

Figure 9.3 Example of audience transformation plan

Chapter 10

Figure 10.1 Model assembly approach illustrated

Figure 10.2 Well-rounded analytics product manager maintains balance in business, analytics, and product thinking.

Figure 10.3 Product thinking explained

Figure 10.4 Create-measure-learn feedback loop is at the core to the lean startup model and essential to building an MVP for analytics.

Figure 10.5 Visible progress is tracked and shared.

Figure 10.6 Seattle Children's analytics home page

Figure 10.7 Testimonial from satisfied customer at Seattle Children's

Figure 10.8 Measuring return on investment for analytics can come in many forms.

Figure 10.9 Analytics Lifecycle is iterative.

Figure 10.10 Analytics product prioritization matrix

Figure 10.11 Analytics products must balance the user experience with the value.

Figure 10.12 Structure-process-outcome model for analytics quality

Figure 10.13 Example measures for quality using structure-process-output model

Figure 10.14 Align talent development to analytics aspirations.

Chapter 11

Figure 11.1 2 × 2 Effectiveness and efficiency matrix

Figure 11.2 The Analytics Lifecycle

Figure 11.3 Common challenges in analytics

Figure 11.4 Dimensions of change

Figure 11.5 Emotional reactions to change

Figure 11.6 Change commitment curve

Figure 11.7 Agility in change can occur by focusing on those with the most influence (positive and negative).

Chapter 12

Figure 12.1 The point at which there is a shift from toolset to mindset

Figure 12.2 Analytics career lattice

Chapter 13

Figure 13.1 The linkage between analytics strategy and execution


Wiley & SAS Business Series

The Wiley & SAS Business Series presents books that help senior-level managers with their critical management decisions.

Titles in the Wiley & SAS Business Series include:

The Analytic Hospitality Executive

by Kelly A. McGuire

The Analytics Lifecycle Toolkit: A Practical Guide for an Effective Analytics Capability

by Gregory S. Nelson

Analytics: The Agile Way

by Phil Simon

Analytics in a Big Data World: The Essential Guide to Data Science and Its Applications

by Bart Baesens

Bank Fraud: Using Technology to Combat Losses

by Revathi Subramanian

Big Data Analytics: Turning Big Data into Big Money

by Frank Ohlhorst

Big Data, Big Innovation: Enabling Competitive Differentiation through Business Analytics

by Evan Stubbs

Business Analytics for Customer Intelligence

by Gert Laursen

Business Intelligence Applied: Implementing an Effective Information and Communications Technology Infrastructure

by Michael Gendron

Business Intelligence and the Cloud: Strategic Implementation Guide

by Michael S. Gendron

Business Transformation: A Roadmap for Maximizing Organizational Insights

by Aiman Zeid

Connecting Organizational Silos: Taking Knowledge Flow Management to the Next Level with Social Media

by Frank Leistner

Data-Driven Healthcare: How Analytics and BI Are Transforming the Industry

by Laura Madsen

Delivering Business Analytics: Practical Guidelines for Best Practice

by Evan Stubbs

Demand-Driven Forecasting: A Structured Approach to Forecasting, Second Edition

by Charles Chase

Demand-Driven Inventory Optimization and Replenishment: Creating a More Efficient Supply Chain

by Robert A. Davis

Developing Human Capital: Using Analytics to Plan and Optimize Your Learning and Development Investments

by Gene Pease, Barbara Beresford, and Lew Walker

The Executive's Guide to Enterprise Social Media Strategy: How Social Networks Are Radically Transforming Your Business

by David Thomas and Mike Barlow

Economic and Business Forecasting: Analyzing and Interpreting Econometric Results

by John Silvia, Azhar Iqbal, Kaylyn Swankoski, Sarah Watt, and Sam Bullard

Economic Modeling in the Post Great Recession Era: Incomplete Data, Imperfect Markets

by John Silvia, Azhar Iqbal, and Sarah Watt House

Enhance Oil & Gas Exploration with Data Driven Geophysical and Petrophysical Models

by Keith Holdaway and Duncan Irving

Foreign Currency Financial Reporting from Euros to Yen to Yuan: A Guide to Fundamental Concepts and Practical Applications

by Robert Rowan

Harness Oil and Gas Big Data with Analytics: Optimize Exploration and Production with Data Driven Models

by Keith Holdaway

Health Analytics: Gaining the Insights to Transform Health Care

by Jason Burke

Heuristics in Analytics: A Practical Perspective of What Influences Our Analytical World

by Carlos Andre Reis Pinheiro and Fiona McNeill

Human Capital Analytics: How to Harness the Potential of Your Organization's Greatest Asset

by Gene Pease, Boyce Byerly, and Jac Fitz-enz

Implement, Improve, and Expand Your Statewide Longitudinal Data System: Creating a Culture of Data in Education

by Jamie McQuiggan and Armistead Sapp

Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards, Second Edition

by Naeem Siddiqi

JMP Connections

by John Wubbel

Killer Analytics: Top 20 Metrics Missing from Your Balance Sheet

by Mark Brown

Machine Learning for Marketers: Hold the Math

by Jim Sterne

On-Camera Coach: Tools and Techniques for Business Professionals in a Video-Driven World

by Karin Reed

A Practical Guide to Analytics for Governments: Using Big Data for Good

by Marie Lowman

Predictive Analytics for Human Resources

by Jac Fitz-enz and John Mattox II

Predictive Business Analytics: Forward-Looking Capabilities to Improve Business Performance

by Lawrence Maisel and Gary Cokins

Profit Driven Business Analytics: A Practitioner's Guide to Transforming Big Data into Added Value

by Wouter Verbeke, Cristian Bravo, and Bart Baesens

Retail Analytics: The Secret Weapon

by Emmett Cox

Social Network Analysis in Telecommunications

by Carlos Andre Reis Pinheiro

Statistical Thinking: Improving Business Performance, Second Edition

by Roger W. Hoerl and Ronald D. Snee

Strategies in Biomedical Data Science: Driving Force for Innovation

by Jay Etchings

Style & Statistics: The Art of Retail Analytics

by Brittany Bullard

Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics

by Bill Franks

Too Big to Ignore: The Business Case for Big Data

by Phil Simon

Using Big Data Analytics: Turning Big Data into Big Money

by Jared Dean

The Value of Business Analytics: Identifying the Path to Profitability

by Evan Stubbs

The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions

by Phil Simon

Win with Advanced Business Analytics: Creating Business Value from Your Data

by Jean Paul Isson and Jesse Harriott

For more information on any of the above titles, please visit www.wiley.com.

The Analytics Lifecycle Toolkit

A Practical Guide for an Effective Analytics Capability

Gregory S. Nelson

Copyright © 2018 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the Web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993, or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Cataloging-in-Publication Data is Available:

ISBN 978-1-119-42506-9 (Hardcover)

ISBN 978-1-119-42509-0 (ePDF)

ISBN 978-1-119-42510-6 (ePub)

Cover Design: Wiley

Cover Image: © mattjeacock/Getty Images

To Nick and MaryLu, for showing me what it means to be a part of something bigger than yourself.

Preface

The modern enterprise is often characterized as “data rich, but information poor.” This challenge is exacerbated by the sheer volume and variety of data generated at the point of interaction (e.g., customers, patients, suppliers) and careening outward. Whether you are preparing, analyzing, presenting, or consuming data, having a strong foundation in data and analytics is paramount for conveying ideas effectively.

In this book, I translate the world of big data, data science, and analytics into a practical, comprehensive guide where you can explore the art and science of analytics best practices through a proven framework for managing analytics teams and processes.

The focus of the book is on creating effective and efficient analytics organizations and processes in order to strengthen the role of data and analytics in producing organizational success.

When I started thinking about writing about this specific topic, it was primarily in response to the lack of information about “the people and process” side of analytics. That is, for over a decade, authors have written about the concept of analytics, its importance in business, and specific implementations of technologies such as Python, R, or SAS, among others. However, those resources generally do not address the tactics of analytics model development or business case development, nor do they address the impact of analytics on operational processes.

The issues that organizations have grappled with over the past 10 years since Tom Davenport and Jeanne Harris published their seminal work Competing on Analytics (Davenport & Harris, 2007) have shifted from “What problems can we solve with analytics?” to “How do we find, nurture, and retain analytics professionals?” This shift from the “what” to the “how” supports the basic premise of this book. I also think the timing is right for the book, as entire industries are transforming themselves with the use of data and analytics. While many organizations have solved the barriers of effectively using analytics in everyday operations as well as strategic decision making, other industries are just now getting on the “analytics bandwagon,” and they see the promise of analytics without a clear roadmap for getting there. For the former, the challenge is one of effectiveness and improved efficiencies. For the latter, the real struggle can often be with creating an organizational culture—or mindset—for analytics, justifying the development of an analytics capability, and organizing for success.

My personal inspiration for this book came from the works of Ralph Kimball. I remember reading his first edition of the Data Warehouse Toolkit (Kimball, 1996) and thinking to myself, “This makes sense.” It was so very different from the conceptual treatments often found in business and technology books, in that Kimball gave us the language, tools, and processes to actually do data warehousing. He provided a solid overview of the areas relevant to someone who was either familiar with or completely new to data warehousing, along with a framework for the data warehousing lifecycle and key process areas. I hope that you will find that The Analytics Lifecycle Toolkit lives up to this inspiration and that it provides a comprehensive and practical guide to the Analytics Lifecycle with focus on creating an effective analytics capability for your organization.

This book differs from other “how-to” books in that it is not designed as a cookbook of analytics models, but rather, is a primer on the best practices and processes used in analytics. It is intended for:

Organizational leaders and analytics executives who need to understand what it means to build and maintain an analytics capability and culture, including those in newly minted chief analytics officer or chief data officer positions.

Analytics teams on the front lines of designing, developing, and delivering analytics as a service or as a product. This group includes analytics product managers, team leads, analysts, project managers, statisticians, scientists, engineers, data scientists, and the “quants” who build analytics models.

Aspiring data champions, those who use data or consume analytics products in their role as fact-based problem solvers. The data champion is anyone who wishes to use data to improve performance, support a decision, or change the trajectory of some business process.

This book is organized in three sections:

The Foundation of Analytics: Starts by outlining what analytics is and how it can be applied to a number of problems in the organization. The focus shifts to analytics as an organizational capability, outlining a different perspective on how analytics can serve the organization's purpose, and how analytics (and data) strategy informs what we do and how we deliver those capabilities. Then this section will address how to deliver analytics capabilities through resources—that is, people, processes, technology, and data.

Analytics Lifecycle Best Practices: Introduces analytics products and how to support the design, development, and delivery of analytics products and/or services. The lifecycle is then broken down into five best practice areas with specific processes that support analytics product development.

Sustaining Analytics Success: Rounds out the discussion of how to ensure that analytics products have the greatest impact on the organization and sustain improvements. The discussion includes how to measure effectiveness and efficiency for analytics programs and apply lessons learned from other disciplines such as behavioral economics, social psychology, and change management.

In the first chapter, you will see that the language of analytics can be confusing and even downright daunting. Terms like the science of, the discipline of, and the best practice of generally refer to the usual manner in which analytics are conceptualized.

However, terms like method, methodology, or approach typically mean the processes used in common practice.

One of my goals in writing The Analytics Lifecycle Toolkit is to assume nothing and to clarify things along the way. To that end, I will do my best to make analytics accessible by providing explicit examples and using precise language wherever possible.

You've made it this far, so perhaps you agree that this topic is interesting and worth the price of admission. But if you need 10 more reasons, here they are:

This book:

Offers a practical guide to understanding the complete analytics lifecycle and how to translate that into organizational design and efficient processes.

Provides a framework for building an analytics team in the organization, including functions and team design.

Explores the people and process side of analytics with a focus on analytics team effectiveness and design thinking around the creation of analytics products.

Discusses the analytics job families and roles needed for a successful analytics program.

Includes case studies from real-world experiences.

Bridges concepts appropriate to an analytics culture such as data-centrism and numeracy with data and technology strategies.

Creates understanding and awareness for analytics leaders and a toolbox for practitioners.

Provides access to a library of tools and templates that include areas of best practice that support leadership, process improvement, and workforce enablement.

Begins with fundamentals of the analytics lifecycle, discusses the knowledge domains and best practice areas, and then details the analytics team processes.

Was written by someone who does analytics for a living and has seen hundreds of unique customer perspectives and applications across multiple industries.

Hopefully, this book will provide some useful guidance for those just starting their analytics journey and some tips for those more experienced. Happy trails!

Acknowledgments

This work would not have been possible without the support of my colleagues and clients who gave me the space to write. I am especially indebted to Monica Horvath, PhD, for picking up the pieces I dropped along the way. Not only did she provide scrutiny during technical review of this book, but she was also my sounding board and co-conspirator for the past several years at ThotWave as we helped clients improve the “people and process side of analytics.” Much of the content around organizational design and our analytics competency model was rooted in these efforts.

I am grateful to all of those with whom I have had the pleasure to work during this project. I learn from each of my clients at ThotWave and my professional colleagues throughout the industry as they continue to teach me a great deal about the real-world implications of analytics and the real struggles that organizations have.

I am indebted to those who agreed to review drafts of this book. In particular, I want to thank Anne Milley from JMP Software; Marc Vaglio-Luarin, analytics product manager from Qlik Software; Linda Burtch, founder of Burtch Works; Mark Tabladillo, lead data scientist from Microsoft; Randy Betancourt from Accenture; Robert Gladden, chief analytics officer at Highmark Health; Mary Beth Ainsworth, product marketing at SAS for artificial intelligence and text analytics; and Teddy Benson from the Walt Disney Company. Your contributions to this work made it a better product.

I would especially like to thank my personal copyeditor, MaryLu Giver. Despite the massive amount of red ink, she was encouraging, thorough, and incredibly kind. In addition, thanks go to the editorial team at Wiley and, in particular, Julie Kerr, who made the process of publishing a book easy and allowed me to focus on the writing.

Nobody has been more important to me in the pursuit of this project than the members of my family. I would like to thank my family, whose love and guidance are with me in whatever I pursue. They are the ultimate role models. Most importantly, I wish to thank my loving and supportive wife, Susan, who makes me a better person, and my daughter and grandson, who give me hope.

REFERENCES

Davenport, T. H., & Harris, J. G. (2007). Competing on analytics: the new science of winning. Boston: Harvard Business School Press.

Kimball, R. (1996). The data warehouse toolkit: practical techniques for building dimensional data warehouses. New York: John Wiley & Sons.

PART I The Foundation of Analytics

CHAPTER 1 Analytics Overview

…what enables the wise commander to strike and conquer, and achieve things beyond the reach of ordinary men, is foreknowledge. Now, this foreknowledge cannot be elicited from spirits…

The Art of War, Sun Tzu (as seen in Giles, 1994)

FUNDAMENTAL CONCEPTS

Peter Drucker first spoke of the “knowledge economy” in his book The Age of Discontinuity (Drucker, 1969). The knowledge economy refers to the use of knowledge “to generate tangible and intangible value.” Nearly 50 years later, organizations have virtually transformed themselves to meet this challenge, and data and analytics have become central to that transformation.

In this chapter, we highlight the “fundamentals” of analytics, with the hope of creating a level playing field for those interested in moving from the concept of analytics to the practice of analytics. The fundamentals include defining both data and analytics using terms that I hope resonate. In addition, I think it is important to consider analytics in the wider context of how it is used and the value derived from these efforts. Finally, in this chapter, I relate analytics to other widely used terms as a way to find both common ground and differentiation with often-confused terminologies.

Data

Data permeates just about every part of our lives, from the digital footprint we leave with our cell phones, to health records, purchase history, and utilization of resources such as energy. While not impossible, it would require dedication and uncanny persistence to live “off-the-grid” in this digital world. Beyond the pure generation of data, we are also voracious consumers of data, reviewing our online spending habits, monitoring our fitness regimes, and reviewing those frequent flyer points for that Caribbean vacation.

But what is data? In its most general form, data is simply information that has been stored for later use. The earliest forms of recording information might have been notches on bones (Sack, 2012). Fast forward to the 1950s, and people recorded digital information on Mylar strips (magnetic tape), then punch cards, and later disks. Modern data processing is relatively young but has set the foundation for how we think about the collection, storage, management, and use of information.

Until recently, we cataloged information that wasn't necessarily computable (e.g., videos, images); but through massive technological change, the class of “unstorable” data is quickly vanishing. Stored information, or data, is simply a model of the real world encoded in a manner that is usable, or for our purposes “computable” (Wolfram, 2010).

The fact that data is a persistent record or “model” of something that happened in the real world is an important distinction in analytics. George Box, a statistician considered by many as “one of the greatest statistical minds of the 20th century” (Champkin, 2013) was often quoted as saying: “All models are wrong, but some are useful.” All too often, we find something in the data that doesn't make sense or is just plain wrong. Remember that data has been translated from the real, physical world into something that represents the real world—George's “model.” The mechanical speedometer, for example, is a standard for measuring speed (and a pretty good proxy for measuring velocity), yet the instrument is really measuring tire rotation, not speed. (For those interested in a late-night distraction, I refer you to Woodford's 2016 article “Speedometers” that explains how speedometers work.) In sum, data is stored information and serves as the foundation for all of analytics. In visual analytics, for example, we make sense out of the data using visualization techniques that enable us to perform analytical reasoning through interactive, visual interfaces.

Analytics

Analytics may be one of the most overused yet least understood terms used in business. For some, it relates to the technologies used to “beat data into submission,” or it is simply an extension of business intelligence and data warehousing. And yet for others, it relates to the statistical, mathematical, or quantitative methods used in the development of models.

According to Merriam-Webster (Merriam-Webster, 2017), analytics is “the method of logical analysis.” Dictionary.com (dictionary.com, 2017) defines analytics as “the science of logical analysis.” Unfortunately, both definitions use the root word of analysis in the definition, which seems a bit like cheating.

The origin of the word analysis goes all the way back to the 1580s, where the term is rooted in Medieval Latin (analyticus) and Greek (analytikós), and means to break up or to loosen. Throughout this book, I frame analytics as a structured approach to data-driven problem solving—one that helps us break up problems through careful consideration of the facts.

What Is Analytics?

There has been much debate over the definition of analytics (Rose, 2016). While the purpose of this book is not to redefine or challenge anyone's definition, for the current discussion I define analytics as:

a comprehensive, data-driven strategy for problem solving

I intentionally resist using a definition that views analytics as a “process,” a “science,” or a “discipline.” Instead, I cast analytics as a comprehensive strategy, and as you will see in Part II of this book, it encompasses best practice areas that contain processes, along with roles and deliverables.

Analytics uses logic, inductive and deductive reasoning, critical thinking, and quantitative methods—along with data—to examine phenomena and determine their essential features. Analytics is rooted in the scientific method (Shuttleworth, 2009), including problem identification and understanding, theory generation, hypothesis testing, and the communication of results.

Inductive Reasoning

Inductive reasoning refers to the idea that accumulated evidence is used to support a conclusion but with some level of uncertainty. That is, there is a chance (probability) that the final conclusions may differ from the underlying premises. With inductive reasoning, we make broad generalizations from specific observations or data.

Deductive Reasoning

Deductive reasoning, on the other hand, makes an assertion about some general case and then seeks to prove or disprove it with data (using statistical inference or experimentation). We propose a theory about the way the world works and then test our hypotheses.

We will explore this in more detail later in this chapter.

Analytics can be used to solve big hairy problems such as those faced by UPS, where it helped the company “save more than 1.5 million gallons of fuel and reduced carbon dioxide emissions by 14,000 metric tons” (Schlangenstein, 2013), as well as operational problems like optimizing the scheduling of operating rooms for Cleveland Clinic (Schouten, 2013). With success stories like those, it is no wonder that analytics is an attractive bedfellow with technology vendors (hardware and software) and other various proponents. Of course, the danger in the overuse of analytics can be seen in the pairing of the term with other words such as:

Big data analytics

Prescriptive analytics

Business analytics

Operational analytics

Advanced analytics

Real-time analytics

Edge or ambient analytics

While these pairings offer distinctive qualifiers on the type and context to which analytics is applied, they often create confusion, especially in C-suites, where technology vendors offer the latest analytics solutions to solve their every pain. My perspective (and one that is shared with lots of other like-minded, rational beings) is that analytics is not a technology but that technology serves as an enabler.

Analytics is also often referred to as “any solution that supports the identification of meaningful patterns and relationships among data.” Analytics is used to parse large or small, simple or complex, structured and unstructured, quantitative or qualitative data for the express purposes of understanding, predicting, and optimizing. Advanced analytics is a subset of analytics that uses highly developed and computationally sophisticated techniques with the intent of supporting the fact-based decision process—usually in an automated or semi-automated manner.

Advanced analytics typically incorporates techniques such as data mining, econometrics, forecasting, optimization, predictive modeling, simulations, statistics, and text mining.

How Analytics Differs from Other Concepts

Vincent Granville, who operates Data Science Central, a social network for data scientists, compared 16 analytics disciplines to data science (Granville, 2014). Without repeating those here (but definitely worth the read!), it is useful to highlight the differences between analytics and similar concepts as a way to clarify the meaning of analytics. Here, analytics will be described as it relates to concepts and methods:

Concepts

Business intelligence and reporting

Big data

Data science

Edge (and ambient) analytics

Informatics

Artificial intelligence and cognitive computing

Methods

Applied statistics and mathematics

Forecasting and time series

NLP, text mining, and text analytics

Machine learning and data mining

To start with, it is important to distinguish between concepts and methods.

Concepts

Concepts are generalized constructs that help us understand what something is or how it works.

Methods

Methods, in this context, are the specific techniques or approaches that are used to implement an analytic solution.

Another way to think about this is that methods describe approaches to different types of problems. For example, we might consider something as an optimization problem or a forecasting problem, whereas big data is a mental model that helps us understand the complexity of modern data challenges. Similarly, as we will see later in this chapter, machine learning can be thought of as simply the current state of artificial intelligence—the latter being the concept and the former being the method.

ANALYTICS CONCEPTS

An analytics concept can be thought of as an abstract idea or a general notion. We differentiate concepts from implementation to highlight the fact that the idea necessarily can take on multiple forms when implemented. For example, the concept of artificial intelligence can be seen in self-driving cars, chatbots, or recommendation engines. The specific implementations are essentially the current state implementation of the concept.

In the following section, I outline my interpretation of what I see as the fundamental definition of business intelligence, reporting, big data, data science, edge analytics, informatics, and the world of artificial intelligence and cognitive computing.

Business Intelligence and Reporting

There is little consensus as to how analytics and business intelligence differ. Some categorize analytics as a subset of business intelligence, while others position analytics in an entirely different box. In a paper I wrote in 2010 (Nelson, 2010), I defined business intelligence (BI) as “a management strategy used to create a more structured and effective approach to decision making…BI includes those common elements of reporting, querying, Online Analytical Processing (OLAP), dashboards, scorecards and even analytics. The umbrella term ‘BI’ also can refer to the processes of acquiring, cleansing, integrating, and storing data.”

There are those who would classify the difference between analytics and business intelligence as differences between (1) the complexity of the quantitative methods used (i.e., algorithms, mathematics, statistics) and (2) whether the focus of the results is historical or future-oriented. That is to say, business intelligence is focused on the presentation of historical results using relatively simple math, while analytics is thought of as much more computationally sophisticated and capable of predicting outcomes of interest, determining causal relationships, identifying optimal solutions, and sometimes prescribing actions to take.

The limit of most business intelligence applications lies not in the constraints of technology but, rather, in the depth of analysis and the true insights created that inform action. Telling me, for example, that something happened doesn't help me understand what to do to change the future—often, that is left for offline analysis. The promise of analytics is that it creates actionable insights about what happened (and where, why, and under what conditions), what is likely to occur in the future, and then what can be done to influence and optimize that future.

Note that the BI dashboard depicted in Figure 1.1 relays facts about the past such as sales, call volumes, products, and accounts, making it easy to get a quick snapshot of the current state of the organization's sales position or activities.

Figure 1.1 BI dashboard

Source: © QlikTech International AB. Reprinted with permission.

Business intelligence and its little sister, “reporting,” are the techniques used to display information about a phenomenon, usually at the tail end of the data pipeline where visual access to data and results are surfaced. Analytics, on the other hand, goes beyond description to actually understand the phenomenon to predict, optimize, and prescribe appropriate actions.

Business intelligence has traditionally suffered from two shortcomings. These shortcomings are related to the fact that BI typically (1) focuses on creating awareness of the facts of the past in that it measures and monitors rather than predicting and optimizing; and (2) is often not quantitatively sophisticated enough to build accurate insights that could be used to influence meaningful change (although the right report or visualization can influence change).

In cases where business intelligence is properly coupled with in-depth “analysis” rather than the mere awareness of facts, it gets closer to analytics. But it often lacks the advanced statistical and mathematical sophistication or “learning” seen in advanced analytics solutions.

To that end, I view analytics as a natural evolution of the concepts contained within the general framework of business intelligence. It places more emphasis on the full pipeline of activities necessary to create insights that fuel action. Analytics is more than just the predefined visual elements used in self-service dashboards or reporting interfaces.

Big Data

Big data is a way to describe the cacophony of information that organizations must deal with in their efforts to turn data into insights. The phrase big data was first used by Michael Cox and David Ellsworth in 1997 (Cox, 1997) who referred to the “problem” as follows:

Visualization provides an interesting challenge for computer systems: data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data. When data sets do not fit in main memory (in core), or when they do not fit even on local disk, the most common solution is to acquire more resources.

Think of big data as a concept that highlights the challenge of utilizing traditional methods of data analysis because of the size and complexity of that data. We contrast big data with traditional “small” data by its volume (how much data we have), velocity (how fast the data is coming at us), and variety (numbers, text, images, video).1

If big data is a concept used to describe the complexity of today's information, analytics is used to help us analyze that complexity in proactive (predictive and prescriptive) ways versus reactive ways (i.e., the realm of business intelligence).

Data Science

It would seem that defining big data was a cakewalk as compared with data science, as so little consistency can be found in the term. There is a lot of debate about what it means and whether it is different at all from analytics. Even those who would attempt a definition might do so by discussing the people (data scientists), the skills they need to have, the roles they play, the tools and technologies used, where they work, and their educational backgrounds. But this doesn't give one a meaningful definition.

Rather than describing data science by the people or the types of problems they address, it might be more accurate to define it as follows:

Data Science is the scientific discipline of using quantitative methods from fields like statistics and mathematics along with modern technologies to develop algorithms designed to discover patterns, predict outcomes, and find optimal solutions to complex problems.

The difference between data science and analytics is that data science can help support or even automate the analysis of data, but analytics is a human-centered strategy that takes advantage of a variety of tools, including those found in data science, to understand the true nature of the phenomenon.

Data science is perhaps the broadest of these concepts in that it relates to the entirety of the science and practice of dealing with “data.” I think data science is analytics engineered by computer scientists. In practice, however, data science tends to focus on macro, generalized problems, whereas analytics tends to address particular challenges within an industry or problem space. In Chapter 10, I extend this by defining the relationship between data science and analytics by referring to data science as an enabler of analytics.

Edge (and Ambient) Analytics

Analytics is a predominant activity for most modern organizations that see it as their directive to democratize data through data-driven, human-centered processes. Edge analytics refers to distributed analytics where the analytics are built into the machinery and systems where information is generated or collected as part of the “unconscious” activities of an organization.

Edge analytics is often associated with smart devices where the analytical computation is performed on data at the point of collection (e.g., equipment, sensor, network switch, or other device). Rather than relying on traditional data-pipeline methods where data is collected, transmitted, cleansed, integrated, and warehoused, analytics are embedded within the device or nearby.

DEMOCRATIZE DATA

The democratization of data refers to “freeing” data so that everyone who can and should have access to it is given the tools and the rights to explore it, rather than limiting access to the privileged few.

As an example, consider traditional credit card fraud detection. A machine (e.g., a card reader) sends a request over a connection to an authorization “broker,” which very quickly (hundredths of a millisecond) applies an algorithm to authorize or flag the transaction and returns the result to the device. In edge analytics, the algorithm would run on the instrument itself (think smart chip reader with embedded analytics).

Edge analytics is often linked with the Internet of Things (IoT), and a recent IDC FutureScape for IoT report found that “by 2018, 40% of IoT data will be stored, processed, analyzed and acted upon at the edge of the network where it is created” (Marr, 2016). As IoT evolves, we will likely see more attention paid to the Analytics of Things (AoT), which refers to the opportunity of analytics to bring unique value to IoT data.

Ambient analytics is a related term whose name implies that “analytics are everywhere.” Just as the lighting or acoustics of a room often go unnoticed but set the stage for mood, ambient analytics will support and influence the context of how we work and play. We are seeing ambient intelligence play out in everyday scenarios, such as detecting glucose levels and administering insulin. Similarly, home automation devices can detect when you are nearing your home and adjust the temperature and turn on lighting. Ambient analytics goes beyond simple decision rules and utilizes algorithms to decide on the appropriate course of action.

There is little doubt that edge and ambient analytics will continue to challenge the traditional human-centered processes for operationalizing (e.g., understanding, deciding, and acting) analytics.

Informatics

Informatics is a discipline that lies at the intersection of information technology (IT) and information management. In practice, informatics relates to the technologies used to process data for storage and retrieval. In essence, informatics deals with the realm of how information is managed and refers to the ecosystem of data and systems that support process workflows rather than the analysis of the data found therein.

Often used in information sciences and used heavily in healthcare and research, health informatics is a specialization that sits between health IT and health information management and links information technology, communications, and health care to improve the quality and safety of patient care. It lies at the heart of where people, information, and technology intersect.

Health policy refers to decisions, plans, and actions that are undertaken to achieve specific health-care goals within a society. Because health policy makers want to see health care become more affordable, safer, and of higher quality, information technology and health informatics are often the means prescribed to do this. In fact, one of the biggest mandates is to position data resources so as to enable a 360-degree view of every patient, and only data sharing can accomplish this (see Figure 1.2).

Figure 1.2 The difference between health information management, health IT, and informatics

Analytics integrates with all of these concepts and relies on the underlying data, supporting technologies, and information management processes.

Artificial Intelligence and Cognitive Computing

Artificial intelligence (AI) is “the science of making computers do things that require intelligence when done by humans” (Copeland, 2000).

The difference between artificial intelligence (AI) and machine learning is that AI refers to the broad concept of using computers to perform the “intelligence” work of discovering patterns, whereas machine learning is a part of AI that relates to the notion that computers can learn from data.

Machine learning is a subset of artificial intelligence that can learn from and make predictions based on data. Rather than following a particular set of rules or instructions, an algorithm is trained to spot patterns in large amounts of data.
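The contrast between following a fixed rule and training an algorithm on data can be made concrete with a small sketch. The data, the function names, and the single-cutoff model below are assumptions chosen for illustration; they are far simpler than anything used in practice and are not taken from any particular product.

```python
# Hypothetical labeled history: (transaction amount, 1 if fraud else 0).
training = [(12, 0), (25, 0), (40, 0), (55, 0), (60, 0),
            (300, 1), (420, 1), (515, 1)]


def rule_based(amount):
    """Hand-written rule: a person picked the cutoff in advance."""
    return 1 if amount > 1000 else 0


def train_threshold(examples):
    """Learn the cutoff that misclassifies the fewest training examples."""
    amounts = sorted(amount for amount, _ in examples)
    # Candidate cutoffs are the midpoints between neighboring amounts.
    candidates = [(low + high) / 2 for low, high in zip(amounts, amounts[1:])]

    def errors(cut):
        return sum((1 if amount > cut else 0) != label
                   for amount, label in examples)

    return min(candidates, key=errors)


cutoff = train_threshold(training)  # 180.0 for the data above


def learned(amount):
    """Rule whose cutoff came from the data rather than from a person."""
    return 1 if amount > cutoff else 0


print(rule_based(350), learned(350))  # prints "0 1"
```

The hand-written rule misses the 350 transaction because its cutoff was fixed in advance, whereas the learned cutoff moves whenever the training data changes.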

Artificial intelligence (and machine learning) can be used in the Analytics Lifecycle to support discovery (e.g., how data is structured, what patterns exist). The application of artificial intelligence found in analytics usually comes in the form of machine learning (as seen above) or cognitive computing.

Cognitive computing is a unique case that combines artificial intelligence and machine-learning algorithms in an approach that attempts to reproduce (or mimic) the behavior of the human brain (Feldman, 2017).

Cognitive computing systems are designed to solve problems the way people solve problems, by thinking, reasoning, and remembering. This approach gives cognitive computing systems an advantage that allows them to “learn and adapt as new data arrives” and to “explore and uncover things you would never know to ask about.” (Saffron Technologies, 2017). The advantage of cognitive computing is that once it learns—unlike humans—it never forgets.

In the battle of man vs. algorithm, unfortunately, man often loses. The promise of Artificial Intelligence is just that. So if we're going to be smart humans, we must learn to be humble in situations where our intuitive judgment simply is not as good as a set of simple rules.

Farnham Street Blog (Parrish, 2017, Do Algorithms Beat Us at Complex Decision Making?)

In slightly pejorative terms, artificial intelligence acts on behalf of a human, whereas cognitive computing provides information to help people decide.

Learn More

To learn more about the difference between AI and cognitive computing, please review the referenced article by Steve Hoffenberg (Hoffenberg, 2016).

THE METHODS OF ANALYTICS

In the prior section, we discussed analytics and some of the related concepts such as big data and data science. We now turn our attention to the practical methods used in analytics, including the tools in the analytics toolbox.

Specifically, in this section, I will outline the methods found in statistics, time series analysis, natural language processing, machine learning, and operations research.

Applied Statistics and Mathematics

Like many of the concepts that have already been discussed, there is wide disparity in how people define statistics and how it differs from mathematics in general. Some would argue that statistics is a branch of mathematics (Merriam-Webster, 2017b), and others (like John Tukey (Brillinger, 2002)) suggest that it is a science. Most would agree that like physics, statistics uses mathematics but is not math (Milley, 2012).

For present purposes, statistics deals with the collection, organization, analysis, interpretation, and presentation of data. Using that broad definition, it sounds an awful lot like analytics. However, while analytics and data science both use the quantitative underpinnings of statistics, their focus is wider than that of traditional statistics. While there are dozens of perspectives about the conceptual relationship between statistics and other disciplines (Taylor, 2016), I have represented what I see as the relationships among those concepts discussed here in Figure 1.3.

Figure 1.3 The relationship between statistics and other quantitative disciplines

Mathematics has a certain absolute and determinable quality about it, and the way that math is taught (at least in US schools) instills in students a deterministic way of viewing the quantitative world around them. That is, we are taught to believe that all facts and events can be explained. Statistics, on the other hand, views quantitative data as probabilistic or stochastic. That is, facts may lead to conclusions that may be generally true (beyond simple randomness), but it must be acknowledged that there is some random probability distribution or pattern that cannot be predicted precisely.

Learn More

To learn more about the history of statistics and how it transformed science, please see David Salsburg's book The Lady Tasting Tea (Salsburg, 2002).

As shown in Figure 1.4, mathematical thinking is deductive (i.e., it infers a particular instance by applying a general law or principle) whereas statistical reasoning is inductive (i.e., it infers general laws from specific instances).

Figure 1.4 Inductive reasoning compared to deductive reasoning

This difference is important in the context of analytics in that we apply both inductive and deductive reasoning to analytics problem solving. Thus, the application of both mathematics and statistics to analytics is appropriate and necessary. If analytics is a comprehensive strategy, then statistics and mathematics are tools in our proverbial analytics toolbox that help deliver on that strategy.

Linear programming, for example, can be used to support special types of problems in analytics that are loosely defined as optimization problems. For example, The Walt Disney Company uses linear, nonlinear, mixed integer, and dynamic programming in its data science work to support the optimization of restaurant seating, reduce wait times for park rides, and schedule staff (i.e., Cast Members). Note that I do not call out operations research, mathematical optimization, decision sciences, or actuarial sciences separately for the purposes of this discussion, as my perspective is that they serve as tools in our proverbial analytics toolkit—just as critical thinking and problem solving do.

LINEAR PROGRAMMING

Linear programming is a mathematical method for problem solving where the output is a function of a linear model. For example, we might want to optimize emergency department throughput by looking at several factors, including surgical complexity, number of staff required, and potential complications.
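To make the idea tangible, here is a minimal sketch of a two-variable linear program. The numbers are hypothetical: x and y stand for counts of two kinds of appointments, the objective 3x + 5y is an assumed measure of value, and the three capacity limits are invented for illustration. The helper `solve_lp_2d` is not a library function. It simply checks every corner of the feasible region, which works for small, bounded two-variable problems; production solvers rely on methods such as simplex instead.

```python
from fractions import Fraction
from itertools import combinations


def solve_lp_2d(objective, constraints):
    """Maximize p*x + q*y subject to constraints of the form a*x + b*y <= c.

    Brute-force vertex enumeration: an optimum of a bounded, feasible linear
    program lies at a corner where two constraint boundaries intersect.
    Coefficients are expected to be integers so the arithmetic stays exact.
    """
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries do not meet in a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        # Keep the corner only if it satisfies every constraint.
        if all(a * x + b * y <= c for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


# Hypothetical limits, each written as (a, b, c) meaning a*x + b*y <= c.
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

value, x, y = solve_lp_2d((3, 5), constraints)
print(value, x, y)  # prints "36 2 6"
```

The best plan under these assumed limits is x = 2 and y = 6, which sits at the corner where the second and third limits meet.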

Forecasting and Time Series

In discussing the methods that support analytics, forecasting and time series methods are grouped together, not because they are the same thing but, rather, because they both fall into the same class of problem—the process of characterizing and predicting time-ordered data based on historical information.

Forecasting and time series refer to methods for analyzing time-sequenced data to extract meaningful characteristics from the data. Most often, forecasts are seen as trends represented as a visual display of historical data values, with some providing future predictions. Time series analysis is different than forecasting. You need time series data to make forecasts, but not all time series analysis is done to make a forecast. For example, time series analysis can be used to find patterns or similar features in multiple time series, or to perform statistical process control. Similarly, seasonality can be used to identify patterns.
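The distinction between analyzing a time series and forecasting from it can be sketched in a few lines. The monthly call volumes are made up, and the two techniques shown (a moving average to expose the trend, and simple exponential smoothing for a one-step-ahead forecast) are common textbook choices picked for illustration rather than methods prescribed here.

```python
# Hypothetical monthly call volumes, oldest first.
calls = [100, 120, 110, 130, 140, 135]


def moving_average(series, window):
    """Time series analysis: smooth the data to reveal the underlying trend."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def exponential_smoothing_forecast(series, alpha):
    """Forecasting: one-step-ahead prediction by simple exponential smoothing.

    alpha (between 0 and 1) controls how heavily recent values are weighted.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


trend = moving_average(calls, window=3)
forecast = exponential_smoothing_forecast(calls, alpha=0.5)

print(trend[0], trend[-1])  # prints "110.0 135.0"
print(forecast)             # prints "132.5"
```

The moving average describes what has already happened, while the smoothed level is used as a guess about the next period.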

Time series analysis utilizes a variety of approaches, including both quantitative and qualitative methods. The objective of time series analysis is to discover a pattern in the historical data (or time series) and then extrapolate the trend into the future. In Figure 1.5, note that there are generally four types of time series approaches.

Figure 1.5 Approaches to forecasting and time series analysis