Data as a Service shows how organizations can leverage "data as a service" by providing real-life case studies on the various and innovative architectures and related patterns.
* Comprehensive approach to introducing data as a service in any organization
* A reusable and flexible SOA-based architecture framework
* Roadmap to introduce 'big data as a service' for potential clients
* Presents a thorough description of each component in the DaaS reference architecture so readers can implement solutions
Page count: 558
Year of publication: 2015
IEEE Press Editorial Board
Tariq Samad, Editor in Chief
George W. Arnold
Jeffrey Nanzer
Dmitry Goldgof
Ray Perez
Ekram Hossain
Linda Shafer
Mary Lanzerotti
Zidong Wang
Vladimir Lumelsky
MengChu Zhou
Pui-In Mak
George Zobrist
Technical Reviewer
Frank Ferrante, College of William and Mary
IEEE Computer Society is the world's leading computing membership organization and the trusted information and career-development source for a global workforce of technology leaders including: professors, researchers, software engineers, IT professionals, employers, and students. The unmatched source for technology information, inspiration, and collaboration, the IEEE Computer Society is the source that computing professionals trust to provide high-quality, state-of-the-art information on an on-demand basis. The Computer Society provides a wide range of forums for top minds to come together, including technical conferences, publications, and a comprehensive digital library, unique training webinars, professional training, and the TechLeader Training Partner Program to help organizations increase their staff's technical knowledge and expertise, as well as the personalized information tool myComputer. To find out more about the community for technology leaders, visit http://www.computer.org.
The IEEE Computer Society and Wiley partnership allows the CS Press authored book program to produce a number of exciting new titles in areas of computer science, computing, and networking with a special focus on software engineering. IEEE Computer Society members continue to receive a 15% discount on these titles when purchased through Wiley or at wiley.com/ieeecs.
To submit questions about the program or send proposals, please contact Mary Hatcher, Editor, Wiley-IEEE Press: Email: [email protected], Telephone: 201-748-6903, John Wiley & Sons, Inc., 111 River Street, MS 8-01, Hoboken, NJ 07030-5774.
Pushpak Sarkar
Copyright © 2015 by the IEEE Computer Society. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN: 978-1-119-04658-5
Dedicated to my parents and family for making me believe that
Guest Introduction
Guest Introduction
Preface (Includes the Reader's Guide)
The Reader's Guide
PART 1: Overview of Fundamental Concepts Includes Chapters 1 to 3
PART 2: DaaS Architecture Framework and Components Includes Chapters 4 to 8
PART 3: DaaS Solution Blueprints Includes Chapters 9 to 11
PART 4: Ensuring Organizational Success Includes Chapters 12 to 14
What Is Not Covered in this Book
Acknowledgments
Part One: Overview of Fundamental Concepts
Chapter 1: Introduction to DaaS
Topics covered in this chapter
Data-Driven Enterprise
Defining a Service
Drivers for Providing Data as a Service
Data as a Service Framework: A Paradigm Shift
Chapter 2: DaaS Strategy and Reference Architecture
Topics Covered in this Chapter
Enterprise Data Strategy, Goals, and Principles
Critical Success Factors
Reference Architecture of the DaaS Framework
How to leverage the DaaS Reference Architecture
Summary
Chapter 3: Data Asset Management
Topics Covered in this Chapter
Introduction to Major Categories of Enterprise Data
Transaction Data (Includes Big Data)
Significance of EIM in Supporting the DaaS Program
Role of Enterprise Data Architect
Summary
Part Two: DaaS Architecture Framework and Components
Chapter 4: Enterprise Data Services
Topics Covered in this Chapter
Emergence of Enterprise Data Services
Need for an Enterprise Perspective
Emergence of Enterprise Data Services
Publication of Enterprise Data
Interdependencies between DaaS, EIM, and SOA
Case Study: Amazon's Adoption of Public Data Service Interfaces
Summary
Chapter 5: Enterprise and Canonical Modeling
Topics Covered in this Chapter
A Model-Driven Approach Toward Developing Reusable Data Services
Defining a Standards-Driven Approach toward Developing New Data Services
Role of the Enterprise Data Model
Developing the Canonical Model
Enterprise Data Model
Canonical Model
Implementing the Canonical Model
Publishing Data Services with the Canonical Model as a Foundation
Implementing the Canonical Model in Real-life Projects
Data Services Roll Out and Future Releases
Case Study: DaaS in Real Life, Electronic-Data Interchange in U.S. Healthcare Exchanges
Summary
Chapter 6: Business Glossary for DaaS
Topics Covered in this Chapter
Problem of Meaning and the Case for a Shared Business Glossary
Using Metadata in Various Disciplines
Role of an Organization's Business Glossary
Enterprise Metadata Repository
Implementing the Enterprise Metadata Repository
Metadata Standards for Enterprise Data Services
Metadata Governance
Summary
Chapter 7: SOA and Data Integration
Topics Covered in this Chapter
SOA as an Enabler of Data Integration
Role of Enterprise Service Bus
What is a Data Service?
Foundational Components of a Data Service
Service Interface
Major Service Categories
Overview of Data Virtualization
Consolidated Data Infrastructure Platform
Summary
Chapter 8: Data Quality and Standards
Topics Covered in this Chapter
Where to Begin Data Standardization Efforts in Your Organization
Role of Data Discovery/Profiling to Identify DaaS Quality Issues
Data Quality and the Investment Paradox
Quality of a Data Service
Setting Up Standards in a DaaS environment
Summary
Part Three: DaaS Solution Blueprints
Chapter 9: Reference Data Services
Topics Covered in this Chapter
Delivering Market and Reference Data Using Real-Time Data Services
Comparing Usage of Reference Data Against Master Data
Understanding Challenges of Reference Data Management
Other Reference Data Management Challenges
Role of Reference Data Standards and Vocabulary Management
Collaborative Reference Data Management Implementation Using Business Process Management/Workflow
Summary
Chapter 10: Master Data Services
Topics Covered in this Chapter
Introduction to Master Data Services
Pros and Cons of Master Data Services (Virtual Master Data Management)
Leveraging the Golden Source to Resolve Deep-Rooted Source Differences
Future Trends in Master Data Management Using DaaS
Comparing Master Data Services Approach (Virtual) with Master Data Management Approach Involving Physical Consolidation
Case Study: Master Data Services for a Premier Investment Bank
Detailed Scope and Benefits
Proposed Solution Architecture for Master Data Services
Enterprise and Canonical Model for Master Data Management Implementation
Summary
Chapter 11: Big Data and Analytical Services
Topics Covered in this Chapter
Big Data
Big Data Analytics
Relationship Between DaaS and Big Data Analytics
Future Impact of DaaS on Big Data Analytics
Extending DaaS Reference Architecture for Big Data and Cloud Services
Fostering an Enterprise Data Mindset
Case Study: Big DaaS in the Automotive Industry
Summary
Part Four: Ensuring Organizational Success
Chapter 12: DaaS Governance Framework
Topics Covered in this Chapter
Role of Data Governance
Data Governance
People Governance
Process Governance
Service Governance
Technology Governance
Summary
Chapter 13: Securing the DaaS Environment
Topics covered in this chapter
Impact of Data Breach on DaaS Operations
Major Security Considerations for DaaS
Multilayered Security for the DaaS Environment
Identity and Access Management
Data Entitlements to Safeguard Privacy
Impact of Increased Privacy Regulations on Data Providers
Information Risk Management
Important Data Security and Privacy Regulations that Impact DaaS
Checklist to Protect Data Providers from Data Breaches
Summary
Chapter 14: Taking DaaS from Concept to Reality
Topics Covered in this Chapter
Service Performance Measurement Using the Balanced Scorecard
Implementing the Performance Scorecard to Improve Data Services
Embarking on the DaaS Journey with a Vision
Using AGILE Principles for New Data Services Development
Sustaining DaaS in an Organization: How to Keep the Program Going
In Conclusion
Appendix A: Data Standards Initiatives and Resources
Appendix B: Data Privacy & Security Regulations
Appendix C: Terms and Acronyms
Appendix D: Bibliography
Internet Resources and Further Reading
Index
EULA
Preface
Figure 1 Key topics covered in the book by chapter
Figure 2 Roadmap of the book's different chapters
Chapter 1
Figure 1.1 DaaS in the business environment
Figure 1.2 Data Service Bus
Figure 1.3 Key features of a service
Figure 1.4 Real-life example of data services sold by D&B Hoovers (company search and results)
Figure 1.5 Overview of Cloud-based Data Services
Figure 1.6 Example of Data Services provided by a leading UN Data agency
Figure 1.7 Example of DaaS in the Retail Sector
Figure 1.8 Key phases of enabling the DaaS vision phase
Figure 1.9 Data Services Blueprint: key activities and deliverables
Figure 1.10 Service Delivery Model (SDM)
Chapter 2
Figure 2.1 Accessing and sharing data with Enterprise Data Services
Figure 2.2 Identifying critical success factors
Figure 2.3 DaaS reference architecture description
Figure 2.4 Reference architecture for DaaS framework
Figure 2.5 Evolution of data analysis needs within an organization
Figure 2.6 Trade-offs between data sharing and data privacy/legal considerations
Figure 2.7 Linking DaaS reference architecture components
Chapter 3
Figure 3.1 Scope of data asset management within a typical enterprise
Figure 3.2 Key considerations for data asset management
Figure 3.3 Key categories of enterprise data and their usage in the real world
Figure 3.4 Differentiating between enterprise data and other types of data
Figure 3.5 Examples of master data elements
Figure 3.6 Reference data example for ISO-specified currency codes
Figure 3.7 Example of real-life usage of enterprise data
Figure 3.8 Role of EIM in building future DaaS roadmaps
Chapter 4
Figure 4.1 Business and technology trends driving the future growth of DaaS
Figure 4.2 Building blocks of a typical enterprise data service
Figure 4.3 Publication of enterprise data services within the DaaS framework
Figure 4.4 Getting virtual enterprise access with data services
Figure 4.5 Multi-disciplinary approach to building reusable data services
Figure 4.6 Evolving role of big data analytics
Figure 4.7 Overview of data services at Amazon
Figure 4.8 Amazon's use of standardized data services to exchange data
Chapter 5
Figure 5.1 Role of canonical models in efficient data access and exchange
Figure 5.2 Mapping a standardized data exchange model to the XML messages used
Figure 5.3 Overview of an enterprise and canonical model in a DaaS environment
Figure 5.4 Developing the canonical model
Figure 5.5 Comparing the EDM with the canonical model
Figure 5.6 Service integration and reuse with and without a canonical model
Figure 5.7 Critical success factors for developing the canonical model
Figure 5.8 Responsibility assignment matrix for canonical model-related tasks
Figure 5.9 Standard process for deployment of reusable data services across the enterprise
Figure 5.10 Sample canonical model-based XML schema of an address messaging block
Figure 5.11 Evolving role of healthcare information exchanges
Figure 5.12 Major list of healthcare EDI transactions
Figure 5.13 Healthcare claim processing using EDI
Chapter 6
Figure 6.1 Role of the business glossary in the organization
Figure 6.2 Example of metadata required by DaaS consumers in the online retail sector
Figure 6.3 Major components stored in the business glossary
Figure 6.4 Real-life example of a business glossary
Figure 6.5 Example of varied business term definitions across multiple divisions
Figure 6.6 Varying instances of product definitions
Figure 6.7 Metadata repository: Initial setup
Figure 6.8 Mapping semantic inconsistencies across customer applications in an enterprise
Figure 6.9 Structural definition of data
Figure 6.10 Illustrative example of value domains
Chapter 7
Figure 7.1 Functional components of SOA
Figure 7.2 Data service to check airline flight status
Figure 7.3 Multilayer services used for airline reservations
Figure 7.4 Data service interface is deployed based on enterprise-data definitions
Figure 7.5 Service categories and types
Figure 7.6 Comparing data integration and data virtualization approaches
Figure 7.7 Conceptual framework for data virtualization
Figure 7.8 Representative technology for hosting data services
Chapter 8
Figure 8.1 Achieving data interoperability in a real-life environment
Figure 8.2 Identifying the root cause of quality problems for a data service
Figure 8.3 Data profiling results on reference data
Figure 8.4 Data profiling process flow
Figure 8.5 Periodic data assessments can drive data quality improvement efforts
Figure 8.6 Major dimensions of DaaS quality assessment
Figure 8.7 Major categories of data standards
Figure 8.8 Usage of KPIs for data service quality monitoring and improvement
Chapter 9
Figure 9.1 An example of reference data from the airline sector
Figure 9.2 Reference data services for market data
Figure 9.3 Overview of reference data management
Figure 9.4 Example of hierarchies associated with reference data for ISO country codes and state codes
Figure 9.5 Exchange of LEI reference data using data services
Figure 9.6 Leveraging BRMS reference data management
Figure 9.7 Diagnosis codes for sprained and strained ankles: ICD-9 CM versus ICD-10 CM
Figure 9.8 Healthcare claim submission and payment exchanges
Chapter 10
Figure 10.1 Role of the enterprise data model
Figure 10.2 Virtual master data management approach leveraging data services
Figure 10.3 Role of ODS/staging area in master data services implementation
Figure 10.4 Solution blueprint for master data services in banking
Figure 10.5 DaaS supports a 360-degree view of a customer with party identifier
Figure 10.6 ELDM for customer (demographics) subject area
Figure 10.7 Canonical messaging blocks (XML) for data exchange of customer master data
Figure 10.8 Mapping XML message schemas to existing financial applications in the organization
Figure 10.9 Overview of data integration environment in the global bank
Chapter 11
Figure 11.1 Real-life applications of big data and analytics
Figure 11.2 Major components of a big data analytics solution
Figure 11.3 Kayak's price trend predictor
Figure 11.4 Integrating big data and Cloud services with enterprise data warehouse
Figure 11.5 Reference architecture components required to support big data analytics
Figure 11.6 Real-time mobile DaaS application with data visualization
Figure 11.7 Lifecycle of real-time analytical processing of big data using stream computing
Figure 11.8 Use of data virtualization in big data analytics environment
Figure 11.9 Leveraging the metadata repository for ensuring big data privacy
Figure 11.10 Data collection from automobile sensors to support big DaaS
Chapter 12
Figure 12.1 Critical pillars for IT governance to support a DaaS program
Figure 12.2 Why do we need governance?
Figure 12.3 Major components of enterprise information management (EIM)
Figure 12.4 Mission statement for DaaS governance
Figure 12.5 High-level process for governance of enterprise data
Figure 12.6 Enterprise and local governance committees
Figure 12.7 Sub-committees set up for specialized areas in data governance
Figure 12.8 Co-ordination across individual data services development teams
Chapter 13
Figure 13.1 Multi-layered security framework for data services
Figure 13.2 Multiple levels of IT security
Figure 13.3 Multiple identity profiles for a consumer
Figure 13.4 Data privacy and entitlements for sensitive client data (external)
Chapter 14
Figure 14.1 Initial assessment and planning for setting up a DaaS framework
Figure 14.2 Example of balanced scorecard for driving data service quality improvements
Figure 14.3 Data maturity curve for data providers
Figure 14.4 Role of feedback/learning process in data services scorecard initiatives
Figure 14.5 Key benefits of adopting DaaS
With the advent of social media and the Internet of Things (IoT), businesses are receiving far more data than they ever did in the past. The volume of data is increasing exponentially, the variety is increasing, and so is the velocity of its arrival. Companies that can analyze this data, derive insights, and share what they learn effectively, both across business lines within the company and with the external ecosystem of partners, in order to transform their businesses are invariably the ones that are going to win. This specific trend has been captured in Accenture's Technology Vision 2013 as “Data Velocity” and “Design for Analytics” and again in 2014 as “Data Supply Chain.” Personally, as a Managing Director of Accenture, I have seen this trend resonate with our Fortune 500 clients across Accenture's five Operating Groups: Communications, Media & High Tech; Financial Services; Health and Public Services; Resources; and Products.
Companies need to consume data from heterogeneous sources, both internal and external, hosted either on premises or in the cloud, and, on the flip side, to make their own data available in exactly the same reusable form for partners to consume. They can therefore no longer afford to keep data locked in application silos, nor can they treat it as a second-class object when architecting their IT infrastructure. Data needs to be decoupled from applications so that the data generated by one application can be used effectively by a completely different set of applications, and the insights generated by analyzing the data within one business line of a company can be shared with other business lines in order to maximize the Return on Investment (RoI) on the data available to the company as a whole. I have seen this happen with a leading drugstore in the United States, where sharing data between the store's loyalty program and the sales department enabled better targeting of products, leading to significantly increased sales.
The most effective way of sharing data and insights is to make data a first-class object in the design of IT architecture and make it available as a service. Once data is exposed as a service, any application, whether internal or external to a company, can consume it in a seamless manner and use it creatively to make a tangible difference to business. In fact, there are several examples, across industries from healthcare to insurance to automotive to real estate, of completely new businesses fuelled by a company sharing data with its ecosystem of partners in the form of APIs. The ecosystem, in turn, has a huge impact on the company's existing business, leading to mutual business benefits. For example, GM exposed its OnStar Application Programming Interface (API) to power a new business service via a start-up called RelayRides that enabled individuals to rent out their personal cars, thereby disrupting the rental car business. We have seen the same trend with Walgreens, which is offering access to its data through a variety of APIs and Software Development Kits (SDKs) to fuel new businesses with its ecosystem of partners.
Similarly, there is a plethora of examples of how companies have successfully exploited the synergy across their business lines by sharing data and insights within the company, leading to higher efficiency and the creation of new revenue streams. The previously cited example of the leading drugstore sharing data between the customer loyalty program and the sales department fits this category. Thus, data sharing, internally as well as externally, has proven to be transformational for businesses across industries.
With business transformations happening across the globe based on the availability of huge amounts of data and its analysis, this book on Data as a Service, which provides a comprehensive view into the world of Data Engineering and its implications for business, is a must-read for every IT professional and business leader.
SANJOY PAUL, PhD Managing Director – Accenture Technology Labs
When I wrote my first book, Data Crush, I attempted to capture the ways in which the technical innovations of mobility, Cloud computing, and big data were leading to entirely new social and business phenomena. Several of the impacts that these new technologies have had on our world are driving the demand for Data as a Service; hence, I was elated when Pushpak asked me to introduce his work, which you now hold in your hands. Three social forces are making Data as a Service a new business imperative: quantification, appification, and cloudification. Let us look at each in turn.
Quantification is the growing trend of measuring absolutely everything, across all aspects of business. I recently met the CIO of a commercial property management company that is spending over $1 billion to quantify its business. Over a two-year period, his company will connect every lightbulb in every one of its buildings to the Internet. When I asked him what he hoped to learn from these connected bulbs, his response was, “I have no idea, but what I do know is that if I don't have the data there's nothing to analyze.” You will likely see this sort of pervasive data collection occurring throughout every process in every organization over the coming decade.
Appification is our growing expectation of instant gratification, at little or no cost, regardless of how irrational this expectation may be. Indeed, we are becoming so appified that we expect our needs to be met predictively. Delivering on this expectation demands that organizations not only analyze data but do so perpetually and rapidly. The notion that business insights only come from a Research and Development department, or from IT, is outdated, because there simply is not time to push analytics to a central organization. Rather, appification means that organizations must collect, digest, and act upon data as close to the customer as possible, in both time and space.
Finally, Cloudification is the notion that the paradigm of building and owning the assets of your business has become obsolete. Cloud initially entered the world of applications with Software as a Service, and is rapidly spreading to all other aspects of business operations. More and more, companies will simply aggregate third-party services in order to meet customer needs, rather than produce those outputs themselves. Data management and analysis will follow this trend, leading to Data as a Service being the standard mode of putting data to work in organizations.
Acting upon these societal forces is challenging. Much of this mode of operating runs counter to how we have run IT for half a century. Nonetheless, it is imperative that organizations embrace Data as a Service if they hope to remain relevant in our accelerating world. This book provides a practical, implementable approach to reaching this goal. I trust that you will find Pushpak's guidance valuable as you work to meet the new expectations of an ever-more-competitive world.
CHRISTOPHER SURDAK Engineer, ex-Rocket Scientist, Juris Doctor, Technology Evangelist and author of “Data Crush,” GetAbstract's International Book of the Year for 2014
Typically, once every couple of decades a disruptive new technology emerges that fundamentally changes the business landscape. Innovative, high-tech products that often start a trend come to the mainstream market with such rapidity that they transform the existing way of doing business. These trends also create a new market that eventually disrupts the existing market and related network, often displacing the earlier technology.
In most cases, organizations that understand the underlying competitive dynamics of innovation and adapt to these disruptive trends win. Today such fundamental shifts take place in the world of data and analytics daily, and they are changing the global business landscape significantly.
If one closely observes the global marketplace, it is safe to say that many businesses are trying to harness an unprecedentedly large amount of data to derive new insights that support their competitive analyses. The huge amount of data gathered from diverse channels (e.g., social media, clickstream analysis) needs to be translated by businesses into concrete actions. Organizations that understand the competitive dynamics at play and can predictively analyze that data will win, whereas those that fail to recognize this challenge and respond to it will become extinct.
While data has always been considered an essential part of IT infrastructure across most organizations to support their business operations, today it is recognized as the key commodity upon which an enterprise runs its business and day-to-day operations. A complete paradigm shift has occurred in which data is increasingly recognized as an asset that can be commercially sold as a service, in and of itself.
Based on the author's first-hand experience and expertise, this book offers a proven framework for sharing core enterprise data using reusable data services. The book covers how organizations can generate business revenues by providing Data as a Service to their clients through fee-based subscriptions. The book goes on to explain in detail how to acquire and distribute data across heterogeneous platforms effectively using enterprise SOA principles, industry data standards, and new technologies such as data virtualization, cloud, and big data stream computing. The book also offers the following:
A comprehensive approach for introducing Data as a Service (DaaS) in any organization for the first time.
Recommended best practices and industry standards for sharing master, reference, and big data with data consumers.
Commercialization aspects of Data as a Service and its potential for generating revenues.
Real-world applications of DaaS such as big Data as a Service.
Real-life case studies on various innovative architecture blueprints and related patterns.
The topics covered in this book are wide ranging, starting with a presentation on the need for providing DaaS and the technical challenges involved in making that transformation. Some of the areas of the book that may particularly appeal to readers include:
How DaaS can become a strategic enabler for sharing data with customers about company products they are interested in purchasing, browsing online, or viewing on social media.
How the DaaS framework can help many organizations recognize the monetizable intent of their customers, and those customers' dependency on accessing company data, when buying the company's products.
How enhanced on-demand data services can generate potential clients for organizations that plan on mining customer, social media, and online conversations over a big data platform, using sophisticated predictive algorithms and data analytics tools.
How to adopt best practices for successfully deploying reusable data services in your organization along with a reference architecture comprising common sets of data standards, guidelines, and processes.
Covering so much ground, from canonical modeling to data governance and XML-based services, can be challenging for some readers, so the book offers a roadmap to help guide you through it.
The Reader's Guide helps readers determine who should read the book and why. It also summarizes each chapter to explain the step-by-step approach required for the successful introduction of DaaS in any organization.
The successful adoption of DaaS in any organization rests on three fundamental areas: defining the architecture, adopting organizational processes, and deploying the appropriate technology components. However, this should be based on real-world experiences and lessons learned from prior IT/DaaS implementations. This is one of the reasons this book includes case studies in several chapters.
The next section will guide readers on how best to use the book by sharing details of every chapter. It will also help guide readers to determine the best approach to use the DaaS framework in their current IT landscape within their organization. Figures 1.1 and 1.2 illustrate key topics in the book along with the suggested roadmap.
Figure 1.1 Key topics covered in the book by chapter
Figure 1.2 Roadmap of the book's different chapters
The introductory section of the book introduces you to Data as a Service (DaaS). It also provides readers with a clear overview on how an organization can deliver on the promise of providing DaaS to its business stakeholders and end customers.
Chapter 1: “Introduction to DaaS” provides a high-level overview on the core concepts of the DaaS framework. It also explores commercialization aspects of Data as a Service, its immense potential for generating revenues for most organizations, as well as some of its common limitations. It describes the details of service delivery management while suggesting necessary key steps for preparing the blueprint for enterprise data services in your organization.
Chapter 2: “DaaS Strategy and Reference Architecture” provides an overview of DaaS reference architecture along with the key components that make up the DaaS framework. It also explains the long-term significance of formally creating an enterprise data strategy in an organization that formulates a long-term roadmap to deliver Data as a Service (DaaS).
Chapter 3: “Data Asset Management” explores the significance of enterprise data and the foundational role it plays to make enterprise data services successful in any organization. It explains the underlying principles of data asset management and why companies need to treat data as a corporate asset. It also examines the various major types of enterprise data and contrasts their major features.
This section of the book focuses on the architecture framework and components required to deploy DaaS in your organization. It also describes in detail common patterns, standards, and processes that can help shape the DaaS Reference Architecture. This section also provides readers with a high-level overview on best practices from a few related disciplines (e.g., EIM, EA, SOA, data services) to make DaaS a scalable data delivery mechanism for organizations.
Chapter 4: “Enterprise Data Services” describes the core concepts about enterprise data services as a fundamental component of the DaaS framework. It illustrates with examples how several organizations have successfully developed a set of standardized service interfaces (termed EDS) to enable data sharing with their various stakeholders (customers, vendors, regulatory agencies, government, etc.).
Chapter 5: “Enterprise and Canonical Modeling” explains the significance of enterprise and canonical modeling and its foundational role to promote consistent and reliable data exchange across disparate systems spread out over the organization. It also explains the significance of the enterprise data model (EDM) as the foundational component required for building a robust and mature set of data structures that can be reused across the entire organization.
Chapter 6: “Business Glossary for DaaS” provides a detailed overview of the underlying reasons why organizations need to develop a standardized business glossary for data services published for user consumption. Storing glossary terms in a shared metadata repository across the organization will improve the overall productivity of both the businesses and the external subscribers to enterprise data services (EDS).
Chapter 7: “SOA and Data Integration” provides a high-level overview on key data acquisition and integration patterns with service-oriented architecture (SOA) as the underlying foundation. It also covers a few technologies, e.g., data virtualization, stream computing for big data, data federation, which can be leveraged by the DaaS framework to publish data services with enhanced efficiency, performance, and a scalable architecture.
Chapter 8: “Data Quality and Standards” provides details on how to ensure that the quality of data published by enterprise data services is suitable and fit for public consumption. It explains the significance of data standards for the success of any DaaS program. The chapter also discusses the role of data profiling as a foundational process for the success of any DaaS quality program. Finally, it looks at some of the major data profiling and quality measures that are critical for implementing a DaaS project in real life.
This section of the book provides a number of important solution blueprints where the DaaS framework can benefit organizations across several industries. Solution blueprints of data services can be very useful for readers because they help explain the relationship between the architecture patterns explained earlier and the specific business requirements of organizations that exchange various types of enterprise data. The solution blueprints are based on the DaaS reference architecture explained in the earlier sections of the book. Finally, this section covers a variety of real-life case studies on how organizations have successfully utilized the DaaS framework and its architectural patterns to improve their business efficiency over the long term.
Chapter 9: “Reference Data Services” presents a detailed overview on how DaaS can be deployed successfully in organizations for disseminating shared reference data to downstream data subscribers and consumers. It also presents real-life case studies on reference data services from the financial and healthcare sectors.
Chapter 10: “Master Data Services” provides a detailed architectural pattern for designing and developing Master Data Services (MDS) that can be reused across an enterprise by using common design components and standards. It also evaluates how MDS can be utilized by organizations as an effective alternative to the existing styles of MDM implementation without physically consolidating master data in a single hub. A detailed case study on a MDS implementation at a large financial institution is presented.
Chapter 11: “Big Data and Analytical Services” explains how big data analytics users can leverage data services to access data they need for advanced analytics and take decisions in real time. This chapter includes several case studies presented from organizations that have successfully implemented big data and mobile-based analytics services, leveraging the DaaS framework. It provides a detailed solution blueprint for designing and developing big Data as a Service that can be reused across the enterprise by using the design components and standards proposed under the DaaS framework.
Introducing DaaS is uncharted territory for many organizations. Not all businesses are likely to face the same urgency for providing Data as a Service to their consumers, nor will they encounter the same challenges. An organizational roadmap has been included containing several best practices for DaaS program management and service delivery. Adopting these best practices and guidelines will help ensure that the DaaS program continues to be useful and provides business value to stakeholders over the long term.
Chapter 12: “DaaS Governance” explores the critical nature of data governance in DaaS and how people, process, and technology factors can be leveraged to successfully deploy data services within any organization. This chapter also suggests various governance policies and controls that an organization can utilize to track and monitor the overall user experience while using a reusable enterprise data service (EDS). It examines the emerging role of the chief data officer (CDO) across organizations, as a key change agent to align data initiatives with the business strategy of an organization.
Chapter 13: “Securing the DaaS Environment” explains why data security and privacy-related issues have become such a critical consideration for any organization interested in publishing data services. It also demonstrates the key features of a comprehensive information risk management program that can mitigate risks to the DaaS program. It provides a practical list of data security and privacy measures that can be deployed by any organization planning to set up DaaS operations.
Chapter 14: “Taking DaaS from Concept to Reality” discusses best practices with respect to DaaS project management and delivery. Adopting these best practices and guidelines will ensure that the DaaS program continues to be useful and relevant to stakeholders over the long term. It discusses the benefits of employing AGILE methodology for new data services development as an alternative to the traditional software development life cycle. The chapter also illustrates steps to build a DaaS performance scorecard monitoring overall service performance of a data provider organization.
Again, I strongly reiterate that adopting DaaS will decouple data from underlying business and application complexities, although technology constraints will not become entirely irrelevant. The flexibility gained from this decoupling should help IT organizations react more quickly to technological changes. At the same time, business decision makers can focus on what they really need from their data organization and not on how to circumvent their existing system or platform-related constraints. As is explained with numerous illustrative examples from the real world, DaaS can potentially also offer a new monetization capability to some organizations by leveraging data as a revenue-generating service. In short, reading this book will provide an excellent overview of the exciting possibilities of leveraging data assets in your organization as well as uncover their inherent commercial value in the business market.
This book should appeal to any practitioner interested in implementing or selling the value of the DaaS program to business stakeholders. It should be of value to a diverse business and technical audience, ranging from business executives to experienced IT architects to those new to the topic of DaaS. Given the wide range of readers who may benefit from reading this book, there is no predetermined order or sequence suggested on how to read it.
Some of the ways this book can be useful to specific reader communities are listed here.
Business executives: If you are a stakeholder responsible for providing direction or governing data in your organization, then this book gives you an excellent overview of the exciting possibilities to leverage your organization's data so as to meet the needs of your consumers as well as formulate the economic value proposition of providing Data as a Service. If your organization has plans to become a DaaS service provider, this book will help you understand the requirements of your data customers and suggest service-based solutions that can help address the customer's data needs.
Enterprise architects: If you are an enterprise architect, the book provides a good introduction to the key enterprise design considerations while developing a data services strategy. In addition to this benefit, you will learn how DaaS can add to your overall business strategy, by ensuring long-term improvements to the data infrastructure of an enterprise.
Data architects: If you are a data architect, this book gives you valuable advice on the design of a data foundation layer. You will learn how to ensure long-term improvements to the data infrastructure of an enterprise while leveraging the DaaS framework for fulfilling the master data, reference data, and analytical data needs of your consumers.
SOA architects: If you are a SOA/data services architect, this book provides detailed guidelines on how to apply various technology and architecture patterns while deploying DaaS in your organization. It will also make you aware of the various data security standards and best practices to ensure integrity of published data services.
IT applications designers or developers: If you are an experienced applications designer or developer, then you will find this book useful for understanding the entire process of developing data services, with an awareness of the specific benefits of data reuse and how reusing service patterns can help with quicker deployment of applications in your organization. The book also gives practical advice and detailed guidelines on how your business applications can save development time and costs by leveraging reusable data services.
Systems management and IT/MIS students: If you are relatively unfamiliar with the role of data in IT Systems Management, this book gives you an excellent introduction to key data-related disciplines like enterprise modeling, data governance, metadata, and SOA from a data practitioner's perspective.
As mentioned earlier, this book should serve most readers as a comprehensive guide for setting up DaaS in their organizations. While the book attempts to cover all the key business and technical aspects of DaaS, one size rarely fits all. Consequently, the book does not attempt to cover any physical implementation or related details, such as those recommended by software products and vendor tools, that are specific to your individual organization's needs. There are several organizational and IT aspects that are unique to every industry and country regarding implementation and deployment of DaaS solutions. Therefore, such detailed decision-making at the organizational level is best left to the people who know their organization's needs closely. However, guidance has been provided throughout this book on how to address some of these implementation challenges from a larger perspective.
The creation of this book on an area as complex and innovative as Data as a Service required the participation and support of a number of individuals. In fact, this book would not have been possible without their active support and encouragement.
I want to thank a number of thought leaders in data management, architecture, and analytics who have provided me their guidance and insights while writing the book: John Zachman, Prof. Peter Aiken, Dr. Sanjoy Paul, Aaron Zornes, Steve Hoberman, Krish Krishnan, and John Ladley.
I also want to thank Shiraz Kassam, Dr. Arka Mukherjee, and Dr. S. Kaisar Alam for helping me stay inspired while writing this book and sustain the effort. I want to acknowledge the contributions of Prithvijit Mazumder and Aditya Mehta in helping review and enhance various portions of the work. I want to thank Ms. Shreya Sarkar for her terrific edits to the initial manuscript and also the editorial team from Wiley, Mary Hatcher and Brady Chin, for their continued advice, help, and support during the authorship of this book.
Last but not least, I owe special gratitude to my family and friends for their time, patience, encouragement, and support in innumerable ways.
This chapter introduces the Data as a Service (DaaS) framework and the approach taken by several organizations to introduce DaaS into their organization.
It provides an introductory overview of the underlying drivers for transformation of data as a monetized asset and evaluates how commercial trends in the marketplace will further drive this service trend.
It also suggests several key steps for preparing the blueprint for Enterprise Data Services in your organization. These steps include establishing a service delivery model (SDM) comprised of a service catalog, service governance, and a resourcing strategy.
Finally, this chapter looks at commercialization aspects of data as a service, its potential for generating revenues as well as some of its common limitations.
The most profound technologies are those that disappear. They weave themselves into the fabric of our everyday life until they are indistinguishable from it.
—Late Prof. Mark Weiser (Father of Ubiquitous Computing)
This book offers a huge undertaking to its readers. It aims to offer a definitive roadmap on how to significantly transform your organization by providing Data as a Service (DaaS) to consumers of your data across the enterprise. It also suggests ways to explore the promise of data and its expanded role as a strategic business enabler.
Using DaaS as the unifying conceptual framework, the book shows readers how they can successfully integrate distributed systems across heterogeneous platforms virtually and publish data to subscribers securely using industry data standards and governance mechanisms.
This introductory chapter provides an overview of the exciting possibilities around leveraging reusable data services across any organization as well as the economic value proposition of providing DaaS to your customers. It also explains the overall approach and necessary steps for any data provider to establish a service delivery model (SDM) for offering DaaS to subscribers.
In the words of Peter Drucker, a world-renowned management visionary, an information-based organization requires “clear, simple, and common objectives that translate into actions.”
In this chapter, we examine what these guiding objectives are and how they define the new persona of a successful information-based organization.
The DaaS framework presented in this book entails a paradigm shift in a fundamental sense, a shift that can help any organization transform itself into a data services-driven organization. Indeed, the DaaS framework can offer end users convenient and timely access to data from multiple, heterogeneous data sources within the company as reusable data services. These data services can be useful to external and internal data subscribers, business partners, regulatory agencies, etc. (Figure 1.1). Additionally, this capability can be leveraged by organizations interested in becoming commercial data providers, by publishing data for their customers and subscribers as a marketable service.
Figure 1.1 DaaS in the business environment
For example, if we look at the high-tech sector, the underlying shift toward IT services is being driven by new advances in technology and its resulting societal consequences. In effect, many organizations need to change how they do business. They will need to respect demands from an increasingly tech-savvy generation of customers who now spend more time interacting with each other on mobile devices, through texting, and on social media sites.
All these factors have created a marketplace that will be dominated by organizations that understand new trends driving the global market. Organizations need to anticipate these changes before their competitors do and provide services rapidly whenever requested by their customers. Companies that undergo this business transformation are data-driven enterprises.
To become more prompt and effective in responding to business or market demands, any service-based organization needs to place a larger emphasis on information sharing. The challenges faced while exchanging data usually result from a fragmented data environment made up of different platforms having no common standards. Consequently, the data entities and attributes of these systems often do not share the same syntax and semantics or even a common meaning, which is a necessary condition for systems to reliably share information. Currently, the majority of systems also have not been designed for data interoperability and sharing. This is where the DaaS framework can enhance the implementation of data services with the basic concept of a real-world Data Service Bus. The Data Service Bus can act as a key foundation for data reuse in any DaaS deployment.
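To make the syntax-and-semantics mismatch concrete, the following minimal Python sketch maps records from two source systems onto one shared shape. Everything in it is an assumption made up for illustration: the system names (`crm`, `billing`), the field names, the sample values, and the `to_canonical` helper are hypothetical and are not taken from the DaaS framework itself.

```python
from datetime import datetime

# Hypothetical records: two systems describe the same customer with
# different field names, types, and date formats.
crm_record = {"cust_id": "42", "full_name": "Ada Lovelace", "dob": "1815-12-10"}
billing_record = {"CustomerNo": 42, "Name": "LOVELACE, ADA", "BirthDate": "10/12/1815"}


def _billing_name(raw):
    """Turn 'LAST, FIRST' into 'First Last'."""
    last, first = [part.strip() for part in raw.split(",")]
    return f"{first.title()} {last.title()}"


# For each (assumed) source system: canonical field -> extraction function.
FIELD_MAPPINGS = {
    "crm": {
        "customer_id": lambda r: int(r["cust_id"]),
        "name": lambda r: r["full_name"],
        "birth_date": lambda r: datetime.strptime(r["dob"], "%Y-%m-%d").date(),
    },
    "billing": {
        "customer_id": lambda r: int(r["CustomerNo"]),
        "name": lambda r: _billing_name(r["Name"]),
        "birth_date": lambda r: datetime.strptime(r["BirthDate"], "%d/%m/%Y").date(),
    },
}


def to_canonical(system, record):
    """Translate a source-specific record into the shared canonical shape."""
    mapping = FIELD_MAPPINGS[system]
    return {field: extract(record) for field, extract in mapping.items()}


canonical_from_crm = to_canonical("crm", crm_record)
canonical_from_billing = to_canonical("billing", billing_record)
```

Once both records are expressed in the same canonical shape, they can be compared or merged directly, which is the precondition for reliable sharing described above.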
For effective sharing of enterprise data across divisions, it is essential for large organizations to build an underlying data foundation (similar to a bus architecture) that provides a consistent view of enterprise-level data in the organization. The concept of a data service bus, which is a logical data abstraction layer created at the enterprise level, can act as a foundation for virtually sharing and reusing information across IT applications. However, it should not be confused with the enterprise service bus (ESB). In some ways, the Data Service Bus can be compared to a data broker that facilitates exchange of enterprise data from a DaaS Provider, or Data Provider, to its subscribers.
In my view, the true potential of DaaS can be realized by an organization if it sets up a well-architected Data Service Bus, comprising common data modules for reuse by downstream applications and customers as well as using standardized Enterprise Data Services. In addition to the data foundation layer, successful DaaS deployments also need to maintain standardized business logic and rules to process data that downstream systems can exploit (Figure 1.2).
Figure 1.2 Data Service Bus
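The broker role of the Data Service Bus can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the class name `DataServiceBus`, its `register` and `request` methods, the `customer_provider` function, and the sample rows are all hypothetical, and a real deployment would sit on top of actual source systems rather than an in-memory list.

```python
from typing import Callable, Dict, List, Optional

Provider = Callable[[dict], List[dict]]


class DataServiceBus:
    """Toy logical abstraction layer between data providers and subscribers."""

    def __init__(self) -> None:
        # Maps an enterprise entity name to the provider that serves it.
        self._providers: Dict[str, Provider] = {}

    def register(self, entity: str, provider: Provider) -> None:
        """A data provider publishes a reusable service for one entity."""
        self._providers[entity] = provider

    def request(self, entity: str, criteria: Optional[dict] = None) -> List[dict]:
        """A subscriber asks for data without knowing which system holds it."""
        if entity not in self._providers:
            raise KeyError(f"no data service registered for {entity!r}")
        return self._providers[entity](criteria or {})


# Hypothetical in-memory source standing in for a real system of record.
_CUSTOMERS = [
    {"customer_id": 1, "name": "Acme Corp", "country": "US"},
    {"customer_id": 2, "name": "Globex", "country": "UK"},
]


def customer_provider(criteria: dict) -> List[dict]:
    """Return the rows whose fields match every key/value in criteria."""
    return [
        row for row in _CUSTOMERS
        if all(row.get(key) == value for key, value in criteria.items())
    ]


bus = DataServiceBus()
bus.register("customer", customer_provider)
uk_customers = bus.request("customer", {"country": "UK"})
all_customers = bus.request("customer")
```

The design point is that subscribers depend only on the entity name and the shape of the returned data, so the provider behind `customer` could be replaced without changing any downstream application.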
To align the Data Service Bus with long-term business strategy, an organization interested in setting up DaaS should also establish an overall data strategy that integrates data from both internal and external data sources (social media, Twitter feeds, etc.). Also recommended is the adoption of a few architectural principles and goals that will enable data sharing and interoperability across the enterprise as part of the DaaS architectural framework. This topic is explained in greater detail in Chapter 2 of this book.
Let us now try to understand the concept of a data-driven organization and what it means in the context of data-oriented services.
Over the last few years, businesses have increasingly felt pressure to transform into providers of value-added services. Often, these services become necessary for customers to fulfill some of their daily needs. This concept is not entirely new or radically different from the traditional definition of a service. Merriam-Webster's Collegiate Dictionary defines a service as a “facility supplying some public demand.” In real life, for example, a utility company provides households with water or electricity services. Similarly, a life insurance company exists in the service marketplace primarily to fulfill the need felt by most people for security and well-being (Figure 1.3).
Figure 1.3 Key features of a service
Any type of service displays a few common characteristics:
It provides a means of delivering clear value to customers.
It facilitates outcomes that the customers want to achieve.
It is delivered through a few capabilities, while managing associated risks.
In the context of DaaS, a data service is a remotely accessible, self-contained module that provides data to authorized service consumers to help them carry out their business. Consumers can access the service in a standardized manner that is well documented and listed in a service catalog. The catalog lets consumers find out whether a service exists and what functionality it offers.
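As a rough illustration of this definition, the Python sketch below models a service catalog in which services are published, discovered, and called only by authorized consumers. The names `DataService`, `ServiceCatalog`, `country_codes`, and `billing_app` are hypothetical, and the in-process function call merely stands in for what would be a remote call in a real DaaS deployment.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class DataService:
    """A self-contained module that returns data to authorized consumers."""
    name: str
    description: str
    fetch: Callable[[], List[dict]]
    authorized_consumers: Set[str] = field(default_factory=set)


class ServiceCatalog:
    """Lists published data services so consumers can discover and call them."""

    def __init__(self) -> None:
        self._services: Dict[str, DataService] = {}

    def publish(self, service: DataService) -> None:
        self._services[service.name] = service

    def exists(self, name: str) -> bool:
        """Lets a consumer find out whether a service exists."""
        return name in self._services

    def describe(self, name: str) -> str:
        """Lets a consumer read what a service does before using it."""
        return self._services[name].description

    def call(self, name: str, consumer: str) -> List[dict]:
        """Return the service's data, but only to an authorized consumer."""
        service = self._services[name]
        if consumer not in service.authorized_consumers:
            raise PermissionError(f"{consumer!r} may not call {name!r}")
        return service.fetch()


# Hypothetical reference-data service with a single authorized consumer.
catalog = ServiceCatalog()
catalog.publish(
    DataService(
        name="country_codes",
        description="Returns country names with their two-letter codes.",
        fetch=lambda: [{"code": "US", "name": "United States"},
                       {"code": "GB", "name": "United Kingdom"}],
        authorized_consumers={"billing_app"},
    )
)

codes = catalog.call("country_codes", consumer="billing_app")
```

Keeping the description and the access list next to the service entry is one simple way to make the catalog the single place where consumers learn what exists and whether they may use it.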
The increasing pressure to provide data services to customers is being confronted by organizations around the world. Along with other business drivers, this pressure is often caused by several technology advances in the IT sector.
Over the last few years, we have witnessed a large trend toward “social shopping.” Many online shoppers embrace the social-media ecosystem as their preferred channel. These shoppers usually conduct their own informal research by browsing products that they need, or they discover the latest products or services through what others find interesting on social media. For example, Facebook makes this process quite convenient by registering our likes and dislikes. Shoppers then compare online prices offered by different retailers before committing to their actual purchase. Consequently, a larger segment of customers has become dependent on the social network ecosystem, and their online behavior will affect businesses on a significant scale in the future (Shih, 2009).
As an outcome of this new trend, customers are likely to feel encouraged to take a more proactive role themselves when deciding on their day-to-day purchases. Over the past few years, several online retailers (e.g., Amazon, Groupon, Alibaba) have seen huge growth in their business globally by providing customers with useful data that can help them decide what products to purchase. In the face of new competition, many traditional retailers such as Walmart and Target have also followed suit. Similarly, supermarket chains such as UK-based Tesco have grown into market leaders in recent years by transforming themselves into data-driven enterprises.
Leveraging data, predictive analytics, and customer insight has become part of retailers' competitive weaponry. In most of these cases, however, the customer has become the real beneficiary because they can now take fuller advantage of personalized discounts and reward coupons offered by web-based and traditional retailers.
While the majority of business organizations offer DaaS to their customers as a complimentary service, some companies have been able to identify corporate data assets that they can rent to customers on a fee-based model, a practice also called monetization. Using monetization, several data providers within the DaaS market have generated revenues to seize the initiative and grow their data services commercially. A good example of a business monetizing DaaS in the current market is Dun & Bradstreet (D&B), in particular its subsidiary Hoovers (Figure 1.4). This pioneering organization provides business data to its corporate clients and individual subscribers for a specific service fee. The D&B Hoovers website can stream data to its client organizations in the form of a list of specific leads, which go directly to sales teams who then contact people to make sales. Several other firms in the market have also been taking the lead as DaaS pioneers, providing various kinds of data services to interested subscribers. These data services range from providing financial data to supplying data on a manufacturer's parts catalog for distributors as part of supply-chain and logistics management (Soderling, 2010).
Figure 1.4 Real-life example of data services sold by D&B Hoovers (company search and results)
Another good example of an organization monetizing DaaS in the current market is cloud-based data services provider Treasure Data, a company recently named among the coolest big data vendors by Gartner. This company provides DaaS to several clients charging them a flat monthly rate for data offerings.
As part of their services, Treasure Data collects, manages, and analyzes massive volumes of big data for their clients (Figure 1.5). They can also store the client's data on the Cloud, based on a pre-built data model that supports easy data integration and export (storing different types of data formats).
Figure 1.5 Overview of Cloud-based Data Services
The data provider can quickly set up the data requirements for their client in the cloud environment in a matter of weeks. The client can then focus on analyzing data without worrying about database administration or the other underlying DB infrastructure-related maintenance issues. This includes 24-hour support and monitoring, seven days a week, after the initial implementation.
Today, a similar story is taking shape in the public sector and government. Data is delivered by these agencies to their consumers in several innovative ways. For example, the United Nations Statistics Division now provides statistical data as an online data service to its members across the world (Figure 1.6). They disseminate information on country-specific statistics such as country gross domestic product (GDP), population, education, life expectancy, crime, and so on.
Figure 1.6 Example of Data Services provided by a leading UN Data agency
Similarly, several community-based organizations in the healthcare sector are making results from big-data analyses of patient data accessible to physicians and healthcare workers in real time through data services, helping to save innumerable lives. A prime example of this was witnessed recently when Harvard's HealthMap service (http://healthmap.org) spotted the Ebola outbreak and alerted the medical community before the World Health Organization formally announced the epidemic. HealthMap's role in tracking Ebola was heavily dependent on using big data analytics to harness public health information. HealthMap compiles, collates, and creates a visual report of global disease outbreaks after sifting through millions of social media posts from health care workers in the affected African countries blogging about their work.
Finally, the adoption of new technology (e.g., mobile computing, big data) will expand exponentially as more customers around the world become tech-savvy. For example, in the insurance sector, customers are finding it convenient to use automobile insurers such as geico.com
