The Data Warehouse Toolkit - Ralph Kimball - E-Book

The Data Warehouse Toolkit E-Book

Ralph Kimball

Description

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence!

The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.

* Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
* Begins with fundamental design recommendations and progresses through increasingly complex scenarios
* Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
* Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more

Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.

Page count: 970

Year of publication: 2013

Contents

Cover

Title Page

Copyright

About the Authors

Credits

Acknowledgments

Introduction

Intended Audience

Chapter Preview

Website Resources

Summary

Chapter 1: Data Warehousing, Business Intelligence, and Dimensional Modeling Primer

Different Worlds of Data Capture and Data Analysis

Goals of Data Warehousing and Business Intelligence

Dimensional Modeling Introduction

Kimball’s DW/BI Architecture

Alternative DW/BI Architectures

Dimensional Modeling Myths

More Reasons to Think Dimensionally

Agile Considerations

Summary

Chapter 2: Kimball Dimensional Modeling Techniques Overview

Fundamental Concepts

Basic Fact Table Techniques

Basic Dimension Table Techniques

Integration via Conformed Dimensions

Dealing with Slowly Changing Dimension Attributes

Dealing with Dimension Hierarchies

Advanced Fact Table Techniques

Advanced Dimension Techniques

Special Purpose Schemas

Chapter 3: Retail Sales

Four-Step Dimensional Design Process

Retail Case Study

Dimension Table Details

Retail Schema in Action

Retail Schema Extensibility

Factless Fact Tables

Dimension and Fact Table Keys

Resisting Normalization Urges

Summary

Chapter 4: Inventory

Value Chain Introduction

Inventory Models

Fact Table Types

Value Chain Integration

Enterprise Data Warehouse Bus Architecture

Conformed Dimensions

Conformed Facts

Summary

Chapter 5: Procurement

Procurement Case Study

Procurement Transactions and Bus Matrix

Slowly Changing Dimension Basics

Hybrid Slowly Changing Dimension Techniques

Slowly Changing Dimension Recap

Summary

Chapter 6: Order Management

Order Management Bus Matrix

Order Transactions

Invoice Transactions

Accumulating Snapshot for Order Fulfillment Pipeline

Summary

Chapter 7: Accounting

Accounting Case Study and Bus Matrix

General Ledger Data

Budgeting Process

Dimension Attribute Hierarchies

Consolidated Fact Tables

Role of OLAP and Packaged Analytic Solutions

Summary

Chapter 8: Customer Relationship Management

CRM Overview

Customer Dimension Attributes

Bridge Tables for Multivalued Dimensions

Complex Customer Behavior

Customer Data Integration Approaches

Low Latency Reality Check

Summary

Chapter 9: Human Resources Management

Employee Profile Tracking

Headcount Periodic Snapshot

Bus Matrix for HR Processes

Packaged Analytic Solutions and Data Models

Recursive Employee Hierarchies

Multivalued Skill Keyword Attributes

Survey Questionnaire Data

Summary

Chapter 10: Financial Services

Banking Case Study and Bus Matrix

Dimension Triage to Avoid Too Few Dimensions

Supertype and Subtype Schemas for Heterogeneous Products

Hot Swappable Dimensions

Summary

Chapter 11: Telecommunications

Telecommunications Case Study and Bus Matrix

General Design Review Considerations

Design Review Guidelines

Draft Design Exercise Discussion

Remodeling Existing Data Structures

Geographic Location Dimension

Summary

Chapter 12: Transportation

Airline Case Study and Bus Matrix

Extensions to Other Industries

Combining Correlated Dimensions

More Date and Time Considerations

Localization Recap

Summary

Chapter 13: Education

University Case Study and Bus Matrix

Accumulating Snapshot Fact Tables

Factless Fact Tables

More Educational Analytic Opportunities

Summary

Chapter 14: Healthcare

Healthcare Case Study and Bus Matrix

Claims Billing and Payments

Electronic Medical Records

Facility/Equipment Inventory Utilization

Dealing with Retroactive Changes

Summary

Chapter 15: Electronic Commerce

Clickstream Source Data

Clickstream Dimensional Models

Integrating Clickstream into Web Retailer’s Bus Matrix

Profitability Across Channels Including Web

Summary

Chapter 16: Insurance

Insurance Case Study

Policy Transactions

Premium Periodic Snapshot

More Insurance Case Study Background

Claim Transactions

Claim Accumulating Snapshot

Policy/Claim Consolidated Periodic Snapshot

Factless Accident Events

Common Dimensional Modeling Mistakes to Avoid

Summary

Chapter 17: Kimball DW/BI Lifecycle Overview

Lifecycle Roadmap

Lifecycle Launch Activities

Lifecycle Technology Track

Lifecycle Data Track

Lifecycle BI Applications Track

Lifecycle Wrap-up Activities

Common Pitfalls to Avoid

Summary

Chapter 18: Dimensional Modeling Process and Tasks

Modeling Process Overview

Get Organized

Design the Dimensional Model

Summary

Chapter 19: ETL Subsystems and Techniques

Round Up the Requirements

The 34 Subsystems of ETL

Extracting: Getting Data into the Data Warehouse

Cleaning and Conforming Data

Delivering: Prepare for Presentation

Managing the ETL Environment

Summary

Chapter 20: ETL System Design and Development Process and Tasks

ETL Process Overview

Develop the ETL Plan

Develop One-Time Historic Load Processing

Develop Incremental ETL Processing

Real-Time Implications

Summary

Chapter 21: Big Data Analytics

Big Data Overview

Recommended Best Practices for Big Data

Summary

Index

Advertisement

End User License Agreement

List of Illustrations

Figure 1.1

Figure 1.2

Figure 1.3

Figure 1.4

Figure 1.5

Figure 1.6

Figure 1.7

Figure 1.8

Figure 1.9

Figure 1.10

Figure 3.1

Figure 3.2

Figure 3.3

Figure 3.4

Figure 3.5

Figure 3.6

Figure 3.7

Figure 3.8

Figure 3.9

Figure 3.10

Figure 3.11

Figure 3.12

Figure 3.13

Figure 3.14

Figure 3.15

Figure 3.16

Figure 3.17

Figure 4.1

Figure 4.2

Figure 4.3

Figure 4.4

Figure 4.5

Figure 4.6

Figure 4.7

Figure 4.8

Figure 4.9

Figure 4.10

Figure 4.11

Figure 4.12

Figure 4.13

Figure 4.14

Figure 4.15

Figure 5.1

Figure 5.2

Figure 5.3

Figure 5.4

Figure 5.5

Figure 5.6

Figure 5.7

Figure 5.8

Figure 5.9

Figure 5.10

Figure 5.11

Figure 5.12

Figure 5.13

Figure 5.14

Figure 5.15

Figure 5.16

Figure 5.17

Figure 6.1

Figure 6.2

Figure 6.3

Figure 6.4

Figure 6.5

Figure 6.6

Figure 6.7

Figure 6.8

Figure 6.9

Figure 6.10

Figure 6.11

Figure 6.12

Figure 6.13

Figure 6.14

Figure 6.15

Figure 6.16

Figure 6.17

Figure 6.18

Figure 6.19

Figure 6.20

Figure 7.1

Figure 7.2

Figure 7.3

Figure 7.4

Figure 7.5

Figure 7.6

Figure 7.7

Figure 7.8

Figure 7.9

Figure 7.10

Figure 7.11

Figure 7.12

Figure 7.13

Figure 7.14

Figure 7.15

Figure 7.16

Figure 7.17

Figure 7.18

Figure 7.19

Figure 8.1

Figure 8.2

Figure 8.3

Figure 8.4

Figure 8.5

Figure 8.6

Figure 8.7

Figure 8.8

Figure 8.9

Figure 8.10

Figure 8.11

Figure 8.12

Figure 8.13

Figure 8.14

Figure 8.15

Figure 9.1

Figure 9.2

Figure 9.4

Figure 9.3

Figure 9.5

Figure 9.6

Figure 9.7

Figure 9.8

Figure 9.9

Figure 9.10

Figure 10.1

Figure 10.2

Figure 10.3

Figure 10.5

Figure 10.4

Figure 10.6

Figure 10.7

Figure 10.8

Figure 10.9

Figure 11.1

Figure 11.2

Figure 11.3

Figure 12.1

Figure 12.2

Figure 12.3

Figure 12.4

Figure 12.5

Figure 12.6

Figure 12.7

Figure 12.8

Figure 12.9

Figure 13.1

Figure 13.2

Figure 13.3

Figure 13.4

Figure 13.5

Figure 13.6

Figure 14.1

Figure 14.2

Figure 14.3

Figure 14.4

Figure 14.5

Figure 14.6

Figure 15.1

Figure 15.2

Figure 15.3

Figure 15.4

Figure 15.5

Figure 15.6

Figure 15.7

Figure 15.8

Figure 15.9

Figure 16.1

Figure 16.2

Figure 16.3

Figure 16.4

Figure 16.5

Figure 16.6

Figure 16.7

Figure 16.8

Figure 16.9

Figure 16.10

Figure 16.11

Figure 17.1

Figure 17.2

Figure 18.1

Figure 18.2

Figure 18.3

Figure 18.4

Figure 19.1

Figure 19.2

Figure 19.3

Figure 19.4

Figure 19.5

Figure 19.6

Figure 19.7

Figure 19.8

Figure 19.9

Figure 19.10

Figure 19.11

Figure 19.12

Figure 20.1

Figure 20.2

Figure 20.3

Figure 20.4

Figure 20.5

Figure 20.6

Figure 21.1

Figure 21.2

Figure 21.3

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, Third Edition

Published by
John Wiley & Sons, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2013 by Ralph Kimball and Margy Ross

Published by John Wiley & Sons, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-1-118-53080-1
ISBN: 978-1-118-53077-1 (ebk)
ISBN: 978-1-118-73228-1 (ebk)
ISBN: 978-1-118-73219-9 (ebk)

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that Internet websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2013936841

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

About the Authors

Ralph Kimball founded the Kimball Group. Since the mid-1980s, he has been the data warehouse and business intelligence industry’s thought leader on the dimensional approach. He has educated tens of thousands of IT professionals. The Toolkit books written by Ralph and his colleagues have been the industry’s best sellers since 1996. Prior to working at Metaphor and founding Red Brick Systems, Ralph coinvented the Star workstation, the first commercial product with windows, icons, and a mouse, at Xerox’s Palo Alto Research Center (PARC). Ralph has a PhD in electrical engineering from Stanford University.

Margy Ross is president of the Kimball Group. She has focused exclusively on data warehousing and business intelligence since 1982 with an emphasis on business requirements and dimensional modeling. Like Ralph, Margy has taught the dimensional best practices to thousands of students; she also coauthored five Toolkit books with Ralph. Margy previously worked at Metaphor and cofounded DecisionWorks Consulting. She graduated with a BS in industrial engineering from Northwestern University.

Credits

Executive Editor

Robert Elliott

Project Editor

Maureen Spears

Senior Production Editor

Kathleen Wisor

Copy Editor

Apostrophe Editing Services

Editorial Manager

Mary Beth Wakefield

Freelancer Editorial Manager

Rosemarie Graham

Associate Director of Marketing

David Mayhew

Marketing Manager

Ashley Zurcher

Business Manager

Amy Knies

Production Manager

Tim Tate

Vice President and Executive Group Publisher

Richard Swadley

Vice President and Executive Publisher

Neil Edde

Associate Publisher

Jim Minatel

Project Coordinator, Cover

Katie Crocker

Proofreader

Word One, New York

Indexer

Johnna VanHoose Dinse

Cover Image

iStockphoto.com / teekid

Cover Designer

Ryan Sneed

Acknowledgments

First, thanks to the hundreds of thousands who have read our Toolkit books, attended our courses, and engaged us in consulting projects. We have learned as much from you as we have taught. Collectively, you have had a profoundly positive impact on the data warehousing and business intelligence industry. Congratulations!

Our Kimball Group colleagues, Bob Becker, Joy Mundy, and Warren Thornthwaite, have worked with us to apply the techniques described in this book literally thousands of times, over nearly 30 years of working together. Every technique in this book has been thoroughly vetted by practice in the real world. We appreciate their input and feedback on this book—and more important, the years we have shared as business partners, along with Julie Kimball.

Bob Elliott, our executive editor at John Wiley & Sons, project editor Maureen Spears, and the rest of the Wiley team have supported this project with skill and enthusiasm. As always, it has been a pleasure to work with them.

To our families, thank you for your unconditional support throughout our careers. Spouses Julie Kimball and Scott Ross and children Sara Hayden Smith, Brian Kimball, and Katie Ross all contributed in countless ways to this book.

Introduction

The data warehousing and business intelligence (DW/BI) industry certainly has matured since Ralph Kimball published the first edition of The Data Warehouse Toolkit (Wiley) in 1996. Although large corporate early adopters paved the way, DW/BI has since been embraced by organizations of all sizes. The industry has built thousands of DW/BI systems. The volume of data continues to grow as warehouses are populated with increasingly atomic data and updated with greater frequency. Over the course of our careers, we have seen databases grow from megabytes to gigabytes to terabytes to petabytes, yet the basic challenge of DW/BI systems has remained remarkably constant. Our job is to marshal an organization’s data and bring it to business users for their decision making. Collectively, you’ve delivered on this objective; business professionals everywhere are making better decisions and generating payback on their DW/BI investments.

Since the first edition of The Data Warehouse Toolkit was published, dimensional modeling has been broadly accepted as the dominant technique for DW/BI presentation. Practitioners and pundits alike have recognized that the presentation of data must be grounded in simplicity if it is to stand any chance of success. Simplicity is the fundamental key that allows users to easily understand databases and software to efficiently navigate databases. In many ways, dimensional modeling amounts to holding the fort against assaults on simplicity. By consistently returning to a business-driven perspective and by refusing to compromise on the goals of user understandability and query performance, you establish a coherent design that serves the organization’s analytic needs. This dimensionally modeled framework becomes the platform for BI. Based on our experience and the overwhelming feedback from numerous practitioners from companies like your own, we believe that dimensional modeling is absolutely critical to a successful DW/BI initiative.
