MC Microsoft Certified Azure Data Fundamentals Study Guide - Jake Switzer - E-Book

Description

The most authoritative and complete study guide for people beginning to work with data in the Azure cloud.

In MC Azure Data Fundamentals Study Guide: Exam DP-900, expert Cloud Solution Architect Jake Switzer delivers a hands-on blueprint for acing the DP-900 Azure data certification. The book prepares you for the test - and for a new career in Azure data analytics, architecture, science, and more - with a laser focus on the job roles and responsibilities of Azure data professionals. You'll gain foundational knowledge of core data concepts, such as relational and non-relational data and transactional and analytical data workloads, while diving deep into every competency covered on the DP-900 exam. You'll also get:

* Access to complimentary online study tools, including hundreds of practice exam questions, electronic flashcards, and a searchable glossary

* Additional prep assistance with access to Sybex's superior interactive online learning environment and test bank

* Walkthroughs of skills and knowledge that are absolutely necessary for current and aspiring Azure data pros in introductory roles

Perfect for anyone just beginning to work with data in the cloud, MC Azure Data Fundamentals Study Guide: Exam DP-900 is a can't-miss resource for anyone prepping for the DP-900 exam or considering a new career working with Azure data.


Page count: 632

Publication year: 2022




Table of Contents

Cover

Title Page

Copyright

Acknowledgments

About the Author

About the Technical Editor

Introduction

Who Should Read This Book?

Interactive Online Learning Environment and Test Bank

DP-900 Exam Objectives

Domain 1: Describe Core Data Components

Domain 2: Describe How to Work with Relational Data on Azure

Domain 3: Describe How to Work with Nonrelational Data on Azure

Domain 4: Describe an Analytics Workload on Azure

Assessment Test

Answers to the Assessment Test

Chapter 1: Core Data Concepts

Describe Types of Core Data Workloads

Describe Data Analytics Core Concepts

Summary

Exam Essentials

Review Questions

Chapter 2: Relational Databases in Azure

Relational Database Features

Relational Database Offerings in Azure

Management Tasks for Relational Databases in Azure

Query Techniques for SQL

Summary

Exam Essentials

Review Questions

Chapter 3: Nonrelational Databases in Azure

Nonrelational Database Features

Azure Cosmos DB

Management Tasks for Azure Cosmos DB

Summary

Exam Essentials

Review Questions

Chapter 4: File, Object, and Data Lake Storage

File and Object Storage Features

Azure Storage

Management Tasks for Azure Storage

Summary

Exam Essentials

Review Questions

Chapter 5: Modern Data Warehouses in Azure

Analytical Workload Features

Modern Data Warehouse Components

End-to-End Analytics with Azure Synapse Analytics

Summary

Exam Essentials

Review Questions

Chapter 6: Reporting with Power BI

Power BI at a Glance

Summary

Exam Essentials

Review Questions

Appendix: Answers to the Review Questions

Chapter 1: Core Data Concepts

Chapter 2: Relational Databases in Azure

Chapter 3: Nonrelational Databases in Azure

Chapter 4: File, Object, and Data Lake Storage

Chapter 5: Modern Data Warehouses in Azure

Chapter 6: Reporting with Power BI

Index

End User License Agreement

List of Tables

Chapter 2

TABLE 2.1 Available Pay-As-You-Go SQL Server images

TABLE 2.2 Available bring your own license SQL Server images

TABLE 2.3 Azure SQL MI service tier characteristics

TABLE 2.4 DTU-based purchasing model service tier characteristics

TABLE 2.5 vCore-based purchasing model service tier characteristics

TABLE 2.6 Dedicated SQL pool service level objectives

TABLE 2.7 Azure Database for MySQL service tier resource options

TABLE 2.8 Common sqlcmd switches

TABLE 2.9 DDL commands

TABLE 2.10 Common SQL data types

TABLE 2.11 Common SQL constraints

TABLE 2.12 DML commands

Chapter 3

TABLE 3.1 Azure Cosmos DB API-specific names for containers

TABLE 3.2 Azure Cosmos DB API-specific names for items

TABLE 3.3 Azure Table storage vs. Azure Cosmos DB Table API

Chapter 4

TABLE 4.1 Storage service URL endpoint patterns

TABLE 4.2 Storage account redundancy options

TABLE 4.3 Blob and Directory ACL Permissions

TABLE 4.4 Supported AzCopy authorization methods

Chapter 5

TABLE 5.1 Standard and premium tier DBU prices

List of Illustrations

Chapter 1

FIGURE 1.1 Key-value store

FIGURE 1.2 Document database

FIGURE 1.3 Columnar database

FIGURE 1.4 Graph database

FIGURE 1.5 Structured data

FIGURE 1.6 JSON example

FIGURE 1.7 Common architecture for batch processing in Azure

FIGURE 1.8 Live stream processing

FIGURE 1.9 On-demand stream processing

FIGURE 1.10 Lambda architecture

FIGURE 1.11 ETL workflow

FIGURE 1.12 ADF control flow

FIGURE 1.13 Ordering data flow processing with a control flow

FIGURE 1.14 ADF mapping data flow

FIGURE 1.15 Azure Databricks and SQL stored procedure control flow

FIGURE 1.16 ELT workflow

FIGURE 1.17 Analytics Maturity Model

FIGURE 1.18 Table

FIGURE 1.19 Matrix

FIGURE 1.20 Column chart

FIGURE 1.21 Line chart

FIGURE 1.22 Pie chart

FIGURE 1.23 Scatter plot

FIGURE 1.24 Map

Chapter 2

FIGURE 2.1 OLTP ERD

FIGURE 2.2 Star schema

FIGURE 2.3 View definition

FIGURE 2.4 Azure SQL abstraction vs. administrative effort

FIGURE 2.5 Select a SQL virtual machine image.

FIGURE 2.6 Create a SQL virtual machine: Basics tab.

FIGURE 2.7 Create a SQL virtual machine: Networking tab.

FIGURE 2.8 Create a SQL virtual machine: Settings tab.

FIGURE 2.9 Create Azure SQL Database Managed Instance: Basics tab.

FIGURE 2.10 Create Azure SQL Database Managed Instance: Networking tab.

FIGURE 2.11 Configuring an Azure SQL Database

FIGURE 2.12 Create Azure SQL Database: Basics tab.

FIGURE 2.13 Create Azure SQL Database: Networking tab.

FIGURE 2.14 Scaling an Azure SQL MI

FIGURE 2.15 Create MySQL server: Basics tab.

FIGURE 2.16 Azure Cloud Shell icon

FIGURE 2.17 Adding an AAD Administrator

FIGURE 2.18 Connecting to a user database with SSMS

FIGURE 2.19 Script View as ALTER To statement

FIGURE 2.20 SSMS execution plan

FIGURE 2.21 Types of SQL joins

Chapter 3

FIGURE 3.1 Key-Value store

FIGURE 3.2 Document database

FIGURE 3.3 Columnar database

FIGURE 3.4 Graph database

FIGURE 3.5 Select Azure Cosmos DB API.

FIGURE 3.6 Create an Azure Cosmos DB Account: Basics tab.

FIGURE 3.7 Create an Azure Cosmos DB Account: Networking tab.

FIGURE 3.8 Create an Azure Cosmos DB Account: Backup Policy tab.

FIGURE 3.9 Azure Cosmos DB Data Explorer button

FIGURE 3.10 Azure Cosmos DB Data Explorer splash page

FIGURE 3.11 New Database

FIGURE 3.12 New Container

FIGURE 3.13 Adding a new write region replica

FIGURE 3.14 Updating the default consistency to bounded staleness

FIGURE 3.15 Azure Cosmos DB Explorer Open Full Screen icon

FIGURE 3.16 Azure Cosmos DB Explorer pop-up screen

Chapter 4

FIGURE 4.1 Create a Storage Account: Basics tab.

FIGURE 4.2 Create a Storage Account: Advanced tab security configurations.

FIGURE 4.3 Create a Storage Account: Advanced tab storage configurations.

FIGURE 4.4 Create a Storage Account: Networking tab.

FIGURE 4.5 Create a Storage Account: Data Protection tab.

FIGURE 4.6 File shares button

FIGURE 4.7 Create a New File Share button.

FIGURE 4.8 New file share

FIGURE 4.9 Connect button

FIGURE 4.10 Connect page

FIGURE 4.11 Containers button

FIGURE 4.12 Create a New Container button.

FIGURE 4.13 New Container

FIGURE 4.14 Upload button

FIGURE 4.15 Upload blob

FIGURE 4.16 Networking button

FIGURE 4.17 Virtual Network and Firewall Access

FIGURE 4.18 Exceptions and Networking Routing

FIGURE 4.19 Shared access signature configuration options

FIGURE 4.20 Configure Active Directory for Azure Files.

FIGURE 4.21 Manage ACLs in the Azure Portal.

FIGURE 4.22 How ADLS evaluates identity access

FIGURE 4.23 Connect to Azure Storage button

FIGURE 4.24 Storage account display

FIGURE 4.25 Upload Files pop-up page

Chapter 5

FIGURE 5.1 Batch processing example

FIGURE 5.2 On-demand stream processing example

FIGURE 5.3 Lambda architecture workflow

FIGURE 5.4 Apache Spark distributed architecture

FIGURE 5.5 Create an Azure Databricks workspace: Basics tab.

FIGURE 5.6 Create an Azure Databricks workspace: Networking tab.

FIGURE 5.7 Launch Workspace button

FIGURE 5.8 Azure Databricks home page

FIGURE 5.9 Azure Databricks workspace personas

FIGURE 5.10 Azure Databricks Workspace tab

FIGURE 5.11 Azure Databricks Compute page

FIGURE 5.12 Azure Databricks Jobs page

FIGURE 5.13 Azure Databricks Create Cluster page

FIGURE 5.14 Azure Databricks Create Notebook page

FIGURE 5.15 Creating a mount point with Azure AD credential passthrough

FIGURE 5.16 Create an ADF Instance: Basics tab.

FIGURE 5.17 Create an ADF Instance: Git configuration tab.

FIGURE 5.18 Azure Data Factory overview page

FIGURE 5.19 Azure Data Factory Studio home page

FIGURE 5.20 Azure Data Factory Studio Author page

FIGURE 5.21 Azure Data Factory Studio Monitor page

FIGURE 5.22 Azure Data Factory Studio Manage page

FIGURE 5.23 Creating a blank ADF pipeline

FIGURE 5.24 The ADF Pipeline Creation page

FIGURE 5.25 Copy Data Activity: General tab

FIGURE 5.26 New Dataset page

FIGURE 5.27 New linked service page: Azure SQL Database

FIGURE 5.28 Set properties page for a new dataset: Azure SQL Database

FIGURE 5.29 Copy Data Activity: Source tab

FIGURE 5.30 New linked service page: ADLS

FIGURE 5.31 Set properties page for a new dataset: ADLS

FIGURE 5.32 Using the Azure Data Factory Studio to edit a CSV dataset

FIGURE 5.33 Copy Data Activity: Mapping tab

FIGURE 5.34 Using the Publish all button to save the pipeline and datasets....

FIGURE 5.35 Create an Azure Synapse Analytics workspace: Basics tab.

FIGURE 5.36 Create an Azure Synapse Analytics workspace: Security tab.

FIGURE 5.37 Azure Synapse workspace overview page

FIGURE 5.38 Synapse Studio home page

FIGURE 5.39 Synapse Studio Data page

FIGURE 5.40 Synapse Studio Develop page

FIGURE 5.41 Synapse Studio Integrate page

FIGURE 5.42 Synapse Studio Monitor page

FIGURE 5.43 Synapse Studio Manage page

FIGURE 5.44 New dedicated SQL pool: Basics tab

FIGURE 5.45 New dedicated SQL pool: Additional settings tab

FIGURE 5.46 Pausing a dedicated SQL pool

FIGURE 5.47 Synapse Studio SQL script window

FIGURE 5.48 Choosing the Built-in SQL pool

FIGURE 5.49 Executing Serverless SQL Pool Queries in Synapse Studio

Chapter 6

FIGURE 6.1 A blank report in the Power BI Desktop Report view. You can switc...

FIGURE 6.2 Get Data page

FIGURE 6.3 SQL Server database connection page

FIGURE 6.4 Select the tables or views that will be imported into the Power B...

FIGURE 6.5 Power Query Editor

FIGURE 6.6 M Code can be viewed and edited through the Power Query Advanced ...

FIGURE 6.7 Power BI Desktop Model view

FIGURE 6.8 Power BI Desktop Data view

FIGURE 6.9 Power BI Desktop Report view on the Order Quantity Sold Per Item ...

FIGURE 6.10 Power BI Service Workspace home page

FIGURE 6.11 Power BI Service Report view

FIGURE 6.12 The pin icon on a report visual

FIGURE 6.13 Power BI dashboard

FIGURE 6.14 Power BI Q&A output

FIGURE 6.15 Power BI Report Builder Getting Started window

FIGURE 6.16 A blank report in Power BI Report Builder



MC Microsoft Certified Azure Data Fundamentals

Study Guide: Exam DP-900

Jake Switzer

Copyright © 2022 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada and the United Kingdom.

978-1-119-85583-5

978-1-119-85585-9 (ebk.)

978-1-119-85584-2 (ebk.)

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware the Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Control Number: 2021950194

Trademarks: WILEY, the Wiley logo, and the Sybex logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Microsoft and Azure are registered trademarks of Microsoft Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book. MC Microsoft Certified Azure Data Fundamentals Study Guide is an independent publication and is neither affiliated with, nor authorized, sponsored, or approved by, Microsoft Corporation.

Cover image: ©Jeremy Woodhouse/Getty Images

Cover design: Wiley

Acknowledgments

While I have been able to work on several exciting opportunities in my professional career at Microsoft, including delivering live presentations and working with some of the biggest brand-name organizations in the world, this was my first time tackling a technical book. This project was both intense and incredibly rewarding, as it allowed me to share what I believe are the fundamental skills anyone will need to start a successful career with the Microsoft data stack. However, it would not have been possible without the support of the following people.

First and foremost, I would like to thank my wife, Kaiya, for her love and support during the writing of this book. It is from her that I gather inspiration to be my best self every day. Thanks to my mom and dad for their unrelenting support and helping me make the most of every opportunity.

I would also like to thank my colleague Susanne Tedrick, author of Women of Color in Tech: A Blueprint for Inspiring and Mentoring the Next Generation of Technology Innovators (Wiley, 2020), for reaching out to me when this opportunity became available, and Kenyon Brown, the acquisitions editor, for helping me get it off the ground. Many thanks to Ayman El-Ghazali, the technical editor for this book and a mentor of mine throughout my time at Microsoft. Special thanks to Jon Flynn and Tash Tahir, two of my colleagues at Microsoft, for taking time out of their busy schedules to review the content.

Finally, thank you to the entire team who made this book come together, including David Clark (project editor), Pete Gaughan (managing editor), Judy Flynn (copyeditor), and Barath Kumar Rajasekaran, who polished the rough content and made sure the project kept moving. Thanks also to all of the people who work behind the scenes with the production of this book.

About the Author

Jake Switzer has been using technology to build data-oriented solutions since his time as a student at the University of Alabama. He has held delivery and advisory roles at Microsoft for over nine years, including as a consultant and cloud solution architect. Jake has designed and developed data platform and advanced analytics solutions for an assortment of Microsoft enterprise customers to ensure that their specific business needs were met. Over the last few years, he has focused on advising Microsoft's sports customers on how to design and build modern data solutions in Azure. His responsibilities in this role include providing architecture guidance, building proofs of concept, aiding in production deployments, and troubleshooting support issues. He is well-versed in a variety of data engineering technologies and frameworks such as SQL Server, Apache Spark, Azure Data Factory, Azure Databricks, Azure Synapse Analytics, and Power BI. In his free time, he enjoys spending time outdoors hiking and can be found most weekends cooking and sharing a scotch with his wife.

About the Technical Editor

Ayman El-Ghazali is a seasoned data and analytics professional who has been in the industry since 2006. His passion for technology started when he was just a boy playing DOS games on his father's computer. From there, he pursued studies in computer science while attending high school in Egypt and continued his journey by earning both a bachelor of science and a master of science in Information Systems from Drexel University. On a personal note, Ayman enjoys playing and watching soccer, training in martial arts (mostly Brazilian Jiu-Jitsu), and spending time with his wife, kids, friends, and family. For more information about his background and his work, please visit his blog thesqlpro.com or linkedin.com/in/aymansqldba.

Introduction

Hello! I am Jake Switzer, and as a data and advanced analytics cloud solution architect at Microsoft, I work with several Microsoft customers on designing and implementing data solutions in Azure. The questions I field vary day to day, from very deep technical questions to questions like "What is the right data processing solution for a new data feed that I want to analyze?" or "Why should I move from my on-premises SQL Server solution to a cloud-based data solution?" While these questions vary in difficulty and specificity, they can all be traced back to one common topic: Azure data fundamentals.

If you are picking up this book for the first time, then I assume you are starting your journey as a data practitioner in Azure. The content in this book will not only prepare you for the DP-900 Microsoft Certified Azure Data Fundamentals exam but also give you a broad understanding of data solutions in Azure. This book is intended to help you understand the different approaches to storing data in Azure, as well as how you can turn raw data into information used to make valuable business decisions. While the exam will not dive deep into specific technical features of the products listed in this book, you will need a broad understanding of these technologies, which will serve as a starting point for becoming more technical with each one if you so choose.

Who Should Read This Book?

This book is appropriate for anyone who wants to understand Azure data fundamentals in a broad sense and prepare for the DP-900 exam. Technical individuals such as data engineers, data scientists, and DBAs who work with data can greatly benefit from Azure data fundamentals training. It will help them transition their existing skills, whether those skills are in on-premises data solutions or in other cloud platforms, to a career in Azure. Along with serving those in highly technical roles, this book can also help analysts and project managers understand how to use technologies such as Power BI and other Azure data services in their roles. Technical sellers will also find value in this book, as they will gain the knowledge necessary for sales discussions where Azure data services are critical to winning business with a potential customer.

What's Included in the Book?

This book consists of six chapters plus supplementary information: a glossary, this introduction, flashcards, and the assessment test after the introduction. The chapters are organized as follows:

Chapter 1, “Core Data Concepts,” covers the foundations of data storage and analysis techniques. It defines the different types of data, data processing patterns, and categories of data analytics.

Chapter 2, “Relational Databases in Azure,” covers the different relational database options in Azure and when to use which one. This includes IaaS and PaaS offerings such as SQL Server in a VM, Azure SQL Database, and Azure SQL Managed Instance. The chapter defines best practices for deploying, migrating to, securing, managing, and querying relational databases in Azure, and it also covers the open-source relational database PaaS options that are available in Azure.

Chapter 3, “Nonrelational Databases in Azure,” covers the different types of NoSQL databases and how to implement them with Azure Cosmos DB. This chapter defines the different Azure Cosmos DB APIs and explores how Azure Cosmos DB provides security, high availability, and consistency for NoSQL data.

Chapter 4, “File, Object, and Data Lake Storage,” explores the file and object storage options in Azure Storage, including Azure Files, Azure Blob storage, and Azure Data Lake Storage Gen2 (ADLS). This chapter covers deployment, security, and management options for Azure Storage services.

Chapter 5, “Modern Data Warehouses in Azure,” explores common data processing patterns and features used by analytical workloads. This chapter covers several common Azure services that are used to build modern data warehouses, such as Azure HDInsight, Azure Databricks, Azure Data Factory, and Azure Synapse Analytics.

Chapter 6, “Reporting with Power BI,” explores the different components of Power BI, such as Power BI Desktop, the Power BI service, and Power BI Report Builder. This chapter covers the common steps used in a Power BI workflow and the different aspects of interactive reports, paginated reports, and dashboards.

Each chapter begins with a list of the objectives that are covered in that chapter. The book does not cover the objectives in order, so don't be alarmed by the seemingly odd ordering of objectives within the chapters. At the end of each chapter, you will find the following elements that you can use to prepare for the exam:

Exam Essentials: This section summarizes the most important information that was covered in the chapter. You should be able to answer questions relevant to this information.

Review Questions: Each chapter concludes with review questions. You should answer these questions and check your answers against the ones provided after the questions. If you can't answer at least 80 percent of them correctly, go back and review the chapter, or at least the sections that seem to be giving you difficulty.

 The review questions, assessment test, and other testing elements included in this book are not derived from the exam questions, so do not memorize the answers to these questions and assume that doing so will enable you to pass the exam. You should learn the underlying topic, as described in the text of the book. This will let you answer the questions provided with this book and pass the exam. Learning the underlying topic is also the approach that will serve you best in the workplace.

To get the most out of this book, you should read each chapter from start to finish and then check your memory and understanding with the end-of-chapter elements. Even if you are already familiar with a topic, you should skim the chapter; Azure data services are complex enough that there are often multiple ways to accomplish a task, so you may learn something even if you are already competent in an area.

Recommended Home Lab Setup

There are multiple objectives in the DP-900 exam that will require you to download and install different desktop tools. These tools are described in their respective chapters, with instructions on where to download them and how to use them.

In addition to these tools, it is important to have access to a Microsoft Azure subscription. Because Microsoft Azure is a cloud-based offering, you only need a computer with a connection to the Internet to set up a free Azure subscription for experimentation. You can create a free Azure subscription by going to https://azure.microsoft.com/en-us/free and clicking Start Free. You will need to log in with a Microsoft account, such as a Hotmail, Live, or Outlook account. The Azure website will step you through the process of signing up for your free subscription. While you will need to provide contact information and a credit card number, Microsoft will not charge the credit card unless you upgrade to a paid subscription.

 Like all exams, the Azure Data Fundamentals certification exam from Microsoft is updated periodically and may eventually be retired or replaced. In the event Microsoft is no longer offering this exam, the old editions of our books and online tools may be retired. If you have purchased this book after the exam was retired or are attempting to register in the Sybex online learning environment after the exam was retired, please know that we make no guarantees that this exam's online Sybex tools will be available once the exam is no longer available.

Interactive Online Learning Environment and Test Bank

We've put together some really great online tools to help you pass the MC Microsoft Certified Azure Data Fundamentals exam. The interactive online learning environment that accompanies this study guide provides a test bank and study tools to help you prepare for the exam. By using these tools, you can dramatically increase your chances of passing the exam on your first try.

The test bank includes the following:

Sample Tests

 Many sample tests are provided throughout this book and online, including the assessment test, which you'll find at the end of this introduction, and the chapter review questions at the end of each chapter. In addition, there is a bonus practice exam. Use all of these practice questions to test your knowledge of the material. The online test bank runs on multiple devices.

Flashcards

 The online test bank includes more than 100 flashcards specifically written to hit you hard, so don't get discouraged if you don't ace your way through them at first! They're there to ensure that you're really ready for the exam. And no worries: armed with the assessment test, review questions, practice exam, and flashcards, you'll be more than prepared when exam day comes! Questions are provided in digital flashcard format (a question followed by a single correct answer). You can use the flashcards to reinforce your learning and for last-minute test prep before the exam.

Other Study Tools

 A glossary of key terms from this book and their definitions is available as a fully searchable PDF.

 Go to www.wiley.com/go/sybextestprep to register and gain access to this interactive online learning environment and test bank with study tools.

DP-900 Exam Objectives

MC Microsoft Certified Azure Data Fundamentals Study Guide: Exam DP-900 has been written to cover every exam objective at a level appropriate to its exam weighting. The following table provides a breakdown of this book's exam coverage, showing you the weight of each section and the chapter where each objective or subobjective is covered:

Subject Area (% of Exam)

Describe core data concepts: 15–20%

Describe how to work with relational data on Azure: 25–30%

Describe how to work with nonrelational data on Azure: 25–30%

Describe an analytics workload on Azure: 25–30%

Total: 100%

Domain 1: Describe Core Data Components

Subdomain 1a: Describe types of core data workloads

1-1 Describe batch data (Chapter 1)

1-2 Describe streaming data (Chapter 1)

1-3 Describe the difference between batch and streaming data (Chapter 1)

1-4 Describe the characteristics of relational data (Chapter 1)
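The batch-versus-streaming distinction tested in subdomain 1a can be illustrated with a toy sketch. This is plain Python with no Azure services involved, and the function names are invented purely for illustration: batch processing produces one result after all the data has been collected, while stream processing maintains a running result as each event arrives.

```python
# Batch: all records are collected first, then processed together.
def batch_total(readings):
    return sum(readings)

# Streaming: each record is processed as it arrives, keeping running state.
def stream_totals(readings):
    total = 0
    for value in readings:  # imagine values arriving one event at a time
        total += value
        yield total  # an up-to-date result after every event

readings = [3, 1, 4, 1, 5]
print(batch_total(readings))          # one result, after all data is in
print(list(stream_totals(readings)))  # an incremental result per event
```

Real Azure services such as Azure Stream Analytics apply the same idea at scale, with windowing and fault tolerance layered on top.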

Subdomain 1b: Describe data analytics core concepts

1-5 Describe data visualization (e.g., visualization, reporting, business intelligence (BI)) (Chapter 1)

1-6 Describe basic chart types such as bar charts and pie charts (Chapter 1)

1-7 Describe analytics techniques (e.g., descriptive, diagnostic, predictive, prescriptive, cognitive) (Chapter 1)

1-8 Describe ELT and ETL processing (Chapter 1)

1-9 Describe the concepts of data processing (Chapter 1)

Domain 2: Describe How to Work with Relational Data on Azure

Subdomain 2a: Describe relational data workloads

Exam Objective

Chapter

2-1 Identify the right data offering for a relational workload

2

2-2 Describe relational data structures (e.g., tables, index, views)

2

Subdomain 2b: Describe relational Azure data services

Exam Objective

Chapter

2-3 Describe and compare PaaS, IaaS, and SaaS solutions

2

2-4 Describe Azure SQL database services including Azure SQL Database, Azure SQL Managed Instance, and SQL Server on Azure Virtual Machine

2

2-5 Describe Azure Synapse Analytics

2

2-6 Describe Azure Database for PostgreSQL, Azure Database for MariaDB, and Azure Database for MySQL

2

Subdomain 2c: Identify basic management tasks for relational data

Exam Objective

Chapter

2-7 Describe provisioning and deployment of relational data services

2

2-8 Describe method for deployment including the Azure portal, Azure Resource Manager templates, Azure PowerShell, and the Azure command-line interface (CLI)

2

2-9 Identify data security components (e.g., firewall, authentication)

2

2-10 Identify basic connectivity issues (e.g., accessing from on-premises, access with Azure VNETs, access from Internet, authentication, firewalls)

2

2-11 Identify query tools (e.g., Azure Data Studio, SQL Server Management Studio, sqlcmd utility, etc.)

2

Subdomain 2d: Describe query techniques for data using SQL language

Exam Objective

Chapter

2-12 Compare Data Definition Language (DDL) versus Data Manipulation Language (DML)

2

2-13 Query relational data in Azure SQL Database, Azure Database for PostgreSQL, and Azure Database for MySQL

2

Domain 3: Describe How to Work with Nonrelational Data on Azure

Subdomain 3a: Describe nonrelational data workloads

Exam Objective

Chapter

3-1 Describe the characteristics of nonrelational data

3

3-2 Describe the types of nonrelational and NoSQL data

3

3-3 Recommend the correct data store

3

3-4 Determine when to use nonrelational data

3

Subdomain 3b: Describe nonrelational data offerings on Azure

Exam Objective

Chapter

3-5 Identify Azure data services for nonrelational workloads

3

3-6 Describe Azure Cosmos DB APIs

3

3-7 Describe Azure Table storage

3

3-8 Describe Azure Blob storage

4

3-9 Describe Azure File storage

4

Subdomain 3c: Identify basic management tasks for nonrelational data

Exam Objective

Chapter

3-10 Describe provisioning and deployment of nonrelational data services

3, 4

3-11 Describe method for deployment including the Azure portal, Azure Resource Manager templates, Azure PowerShell, and the Azure command-line interface (CLI)

3, 4

3-12 Identify data security components (e.g., firewall, authentication, encryption)

3, 4

3-13 Identify basic connectivity issues (e.g., accessing from on-premises, access with Azure VNETs, access from Internet, authentication, firewalls)

3, 4

3-14 Identify management tools for nonrelational data

3, 4

Domain 4: Describe an Analytics Workload on Azure

Subdomain 4a: Describe analytics workloads

Exam Objective

Chapter

4-1 Describe transactional workloads

5

4-2 Describe the difference between a transactional and an analytics workload

5

4-3 Describe the difference between batch and real time

5

4-4 Describe data warehousing workloads

5

4-5 Determine when a data warehouse solution is needed

5

Subdomain 4b: Describe the components of a modern data warehouse

Exam Objective

Chapter

4-6 Describe Azure data services for modern data warehousing such as Azure Data Lake, Azure Synapse Analytics, Azure Databricks, and Azure HDInsight

5

4-7 Describe modern data warehousing architecture and workload

5

Subdomain 4c: Describe data ingestion and processing on Azure

Exam Objective

Chapter

4-8 Describe common practices for data loading

5

4-9 Describe the components of Azure Data Factory (e.g., pipeline, activities, etc.)

5

4-10 Describe data processing options (e.g., Azure HDInsight, Azure Databricks, Azure Synapse Analytics, Azure Data Factory)

5

Subdomain 4d: Describe data visualization in Microsoft Power BI

Exam Objective

Chapter

4-11 Describe the role of paginated reporting

6

4-12 Describe the role of interactive reports

6

4-13 Describe the role of dashboards

6

4-14 Describe the workflow in Power BI

6

 Exam domains and objectives are subject to change at any time without prior notice and at Microsoft's sole discretion. Please visit Microsoft's website for the most current information.

Assessment Test

Which of the four Vs of big data is related to the speed at which data is processed?

Volume

Velocity

Value

Variety

Which of the following components is not included in the Lambda architecture design pattern?

Batch layer

Serving layer

Speed layer

Transactional layer

Which of the following transactional database properties ensures that once a transaction is committed, it will remain committed even if there is a system failure?

Consistency

Atomicity

Durability

Resilience

Which of the following technologies can be used to orchestrate the flow of data in a data processing pipeline?

Azure SQL Database

Azure Data Factory

Azure Data Lake Storage Gen2

Azure Synapse Analytics dedicated SQL pools

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below? Azure Synapse Analytics dedicated SQL pools is an example of a

relational

database.

Nonrelational

NoSQL

Object

No change needed

Which of the following is not a core component of a relational database?

Document

Index

Table

View

Which of the following is the most optimal solution for storing images, telemetry data, and data that is used for distributed analytics solutions?

Azure SQL Database

Azure Blob Storage

Azure Cosmos DB Gremlin API

Azure Files

What data processing approach is typically used to process data for traditional business intelligence solutions?

ELT

Batch

Streaming

ETL

Data that is transformed so that it meets the schema requirements of a destination table is an example of what type of data processing strategy?

Schema-on-upload

Schema-on-read

Schema-on-write

Analytical processing

What technology in Azure allows data engineers to build data processing pipelines with a graphical user interface?

Azure Data Factory mapping data flows

SSIS

Azure Databricks

Azure Logic Apps

Which of the following methods is used to manage the order in which data processing activities are executed?

Data flow

Management flow

Control flow

Orchestration flow

You have been tasked with taking data stored as parquet files in Azure Data Lake Storage Gen2 and loading the most recent three years of data into an Azure Synapse Analytics data warehouse. However, you must first query the parquet data to determine which rows fall within the last three years. Which of the following options will allow you to query the parquet data without requiring you to physically store the data in the data warehouse first?

Azure Synapse Analytics serverless SQL pools

Synapse Pipelines

Synapse Link

Linked Service

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below?

Prescriptive

analytics involves examining historical data to determine why certain events happened.

Predictive

Diagnostic

Cognitive

No change needed

You are a data analyst for a company that sells different types of bicycles. For an upcoming review of this past quarter's sales, you would like to build a report that shows how well different types of bikes have done in the company's various sales territories. One requirement for this report is that it includes a visualization that displays total sales for each bike subcategory. Which of the following visuals best serves this requirement?

Line chart

Column chart

Scatter plot

Map

What type of index is optimal for database tables that are used in queries that perform large aggregations of data?

Columnstore

Clustered

Nonclustered

Unique

Which Azure SQL option is an example of an IaaS offering?

Azure SQL Database

Azure SQL Managed Instance

SQL Server on an Azure Virtual Machine

Azure Synapse Analytics dedicated SQL pools

Which Azure SQL option requires the least amount of administrative effort and is typically used when building modern cloud applications?

Azure SQL Managed Instance

Azure SQL Database

Azure Synapse Analytics Serverless SQL Pools

SQL Server on an Azure Virtual Machine

You are developing a database platform that will serve an OLTP system and will need to store more than 10 TB of data. The database platform will need to minimize administrative effort as much as possible. Which of the following database and service tier options is the most appropriate for this use case?

Azure SQL Database Hyperscale

Azure SQL Database Elastic Pool

Azure SQL MI, Business Critical

Azure Synapse Analytics dedicated SQL pools

Which of the following options will give specific IP addresses access to an Azure SQL Database's logical server?

Virtual network firewall rules

Private Link

Server-level IP firewall rules

Database-level IP firewall rules

What free tool can be used to determine potential compatibility issues when planning a SQL Server database upgrade or a migration to Azure SQL?

Data Migration Planner

Data Migration Assistant

Database Migration Recommender

Database Migration Service

Which of the following tools can be used to automate Azure resource deployments?

Azure PowerShell

Azure CLI

Azure Resource Manager templates

All of the above

How often does Azure perform a full database backup of an Azure SQL Database?

Once a month

Once a week

Once a day

Once an hour

Which of the following commands is an example of a DML command?

SELECT

CREATE

ALTER

DROP

Which SQL Server feature can be used to obfuscate sensitive data in different columns?

Always Encrypted

Transparent Data Encryption

Dynamic data masking

Column-Level Security

Which of the following open-source databases is available as a PaaS offering in Azure?

PostgreSQL

MySQL

MariaDB

All of the above

Which of the following describes Read Committed isolation for SQL Server?

Transactions running with Read Committed isolation issue locks on involved data at the time of data modification to prevent other transactions from reading dirty data. This is the default isolation level for SQL Server–based database engines.

Transactions running with Read Committed isolation issue read and write locks on involved data until the end of the transaction.

Read Committed isolation is the lowest isolation level, only guaranteeing that physically corrupt data is not read.

Read Committed isolation is the highest isolation level, completely isolating transactions from one another.

When following a star schema design pattern for a data warehouse, which of the following table types is used to store metrics?

Measure table

Dimension table

Materialized table

Fact table

When configuring a SQL Server instance on an Azure VM, what is the recommended storage configuration for the disk, log, and tempdb files?

Place data and log files on the same disk and tempdb on a separate disk.

Place data, log, and tempdb files on separate disks.

Place log and tempdb files on the same disk and data files on a different disk.

Place data and tempdb files on the same disk and log files on a separate disk.

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below?

Nonrepeatable

reads occur when a transaction reads the same row several times and returns different data each time.

Phantom

Dirty

Inconsistent

No change needed

What type of join will retrieve all data from the left table of a join condition and only data that meets the join condition from the table on the right?

Full inner join

Left inner join

Left outer join

Right outer join

Which of the following nonrelational database types is optimal for storing the relationships between multiple entities?

Graph database

Document database

Key-value store

Columnar database

Which of the following statements is not true about a document in a document database?

Different schemas can be used across multiple documents.

Documents are typically stored as semi-structured data formats, such as JSON, BSON, and XML.

Queries performing specific lookups or filters can only search by a document's key and not by one of the data values.

Documents can easily be distributed across multiple storage devices.

You are designing a data storage solution that will store transactions made on an e-commerce site. The schema for these transactions is very fluid and is typically different for each transaction. There is also a requirement for the database to be able to scale globally, with some of the replicated regions being able to be written to. Which of the following is the most appropriate?

Azure SQL Database

Azure Cosmos DB API for MongoDB

Azure Cosmos DB Cassandra API

Azure Cosmos DB Core (SQL) API

Which of the following is a difference between Azure Table storage and the Azure Cosmos DB Table API?

Entities in Azure Table storage maintain a defined schema, while entities in the Azure Cosmos DB Table API have flexible schemas.

Azure Table storage offers single region replication, while the Azure Cosmos DB Table API offers multi-region replication.

Queries can only perform searches on keys when interacting with Azure Table storage, while the Azure Cosmos DB Table API allows queries to search on keys and values.

The maximum entity size in Azure Table storage is 2 MB, while the Azure Cosmos DB Table API has a maximum entity size of 4 MB.

What is the unit of measure used to represent the throughput required to read and write data stored in Azure Cosmos DB?

Database transaction units (DTUs)

Request Units (RUs)

Throughput units (TUs)

Cosmos DB transaction units (CDTUs)

What type of keys does an Azure Cosmos DB account generate to provide access to its resources? How many are created?

One read-write key and one read-only key

Two read-write keys and one read-only key

One read-write key and two read-only keys

Two read-write keys and two read-only keys

Which consistency level guarantees that all reads will return the most recent version of a document while potentially resulting in slower write performance due to application connections being paused while transactions are committed?

Session

Bounded staleness

Strong

Eventual

What is the name of the field that is used to distribute Azure Cosmos DB data across storage?

Partition key

Distribution key

Primary key

Foreign key

You have been asked to isolate an Azure Cosmos DB account by associating it with a subnet in a virtual network. Which of the following services can you use to attach a private IP address from the subnet to the account?

Private endpoint

Service endpoint

IP endpoint

Access endpoint

As the data architect for your company, you have been tasked with designing a storage solution that is optimized for storing videos, images, audio files, and each file's associated metadata. Which type of data store should you use?

Graph

Document

Object

Columnar

Which of the following storage services is used to replace existing on-premises file shares and is accessible via SMB or NFS protocols?

Azure Blob storage

Azure Files

Azure Data Lake Storage Gen2

Azure Cosmos DB File API

Which of the following access tiers is available for file shares that are hosted on a standard Azure storage account?

Transaction optimized

Hot

Cool

All of the above

What object is used to organize data in Azure Blob Storage?

Container

Directory

Blob

Table

What storage service is optimized to serve data to big data analytics environments such as Azure HDInsight, Azure Databricks, and Azure Synapse Analytics due to how it structures data and its integration with the Hadoop Distributed File System?

Azure Blob Storage

Azure Files

Azure Data Lake Storage Gen2

Azure Table storage

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below?

Azure Data Lake Storage Gen2

provides users with the ability to grant granular access to storage objects and data with the use of POSIX-like access control lists.

Azure Blob storage

Azure Files

Azure Table storage

No change needed

You are designing an Azure Storage solution that will be used to store log files. One of the solution requirements is that the data must be replicated to a secondary storage account in a different Azure region in case of a region outage. Which of the following options should you enable on the storage account?

Geo-redundant storage (GRS)

Geo-zone-redundant storage (GZRS)

Zone redundant storage (ZRS)

Both A and B

What is the minimum number of storage accounts you need to create to host two blob containers, one file share, and one table?

One

Two

Three

Four

Which of the following Azure RBAC roles will grant users read, write, and delete access to an Azure Blob Storage container but will not give them full management rights over the container?

Storage Blob Data Owner

Storage Blob Data Contributor

Storage Blob Data Reader

Storage Blob Data Writer

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below?

AzCopy

is a stand-alone desktop application that can be used to create and delete Azure Storage resources such as blob containers and file shares. Users can also upload, download, and delete Azure Storage data with

AzCopy

.

Azure Data Factory

Azure Data Box

Azure Storage Explorer

No change needed

Which of the following open-source frameworks can be deployed with Azure HDInsight?

Apache Hadoop

Apache Storm

Apache Kafka

All of the above

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below?

Spark drivers

are installed on every worker node in a Spark cluster and are used to execute job tasks.

Spark sessions

Spark executors

Cluster managers

No change needed

Which of the following statements regarding Azure Databricks is true?

Azure Databricks can be used for both batch and stream processing workflows.

The Databricks File System (DBFS) is a built-in distributed file system that Azure Databricks uses to persist data after a Databricks cluster is terminated so that it is not lost.

Azure Databricks provides an interactive development environment for data exploration.

All of the above.

The cost of an Azure Databricks cluster consists of what two components?

Azure VMs and Databricks Units (DBUs)

Azure Kubernetes Service (AKS) and Databricks Units (DBUs)

Azure Container Instance (ACI) and Databricks Units (DBUs)

Azure Kubernetes Service (AKS) and Databricks Cost Units (DCUs)

You are configuring a new Azure Databricks cluster that will be used for nightly batch processing jobs. The cluster will be responsible for processing very large datasets and will need to be able to scale out horizontally to finish processing data within a few hours. Which of the following cluster modes is the most optimal for this workload?

High concurrency

Standard

Single node

Compute

Which of the following is not a type of analytical pool that is available with Azure Synapse Analytics?

Serverless SQL pool

Dedicated SQL pool

Databricks pool

Apache Spark pool

You are designing a data warehouse with an Azure Synapse Analytics dedicated SQL pool that will serve business intelligence applications and analytical queries. To optimize query performance, which of the following table types should you consider adding a clustered columnstore index to?

Large fact tables with more than 60 million rows

Small reference tables

Medium-sized dimension tables

All of the above

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below? In Azure Data Factory,

linked services

represent data structures within data stores, such as a SQL Server table or a set of files in Azure Data Lake Storage Gen2.

Dataset

Activity

Pipeline

No change needed

What Azure Data Factory resource is used to power pipeline runs?

Compute resources

Integration runtimes

Spark clusters

Hadoop clusters

Azure Databricks notebooks and Azure HDInsight Hive queries are examples of what Azure Data Factory activity type?

Control

Data movement

Data transformation

Data manipulation

Which of the following data movement mechanisms that are native to Azure Synapse Analytics dedicated SQL pools provide the most flexibility when loading data from Azure Storage?

PolyBase

COPY command

BCP

OPENROWSET

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below?

External tables

are used by services such as Azure Synapse Analytics to read data from files in Azure Storage without having to create an additional copy of the data.

Materialized views

SQL tables

Virtual tables

No change needed

When using PolyBase, which of the following T-SQL commands are used to define external tables in a dedicated SQL pool?

CREATE EXTERNAL FILE FORMAT

CREATE EXTERNAL TABLE

CREATE EXTERNAL DATA SOURCE

All of the above

What service is used to create Power BI paginated reports?

Power BI Report Builder

Power BI service

Power BI Desktop

Power BI Report Server

Which of the following Power BI data connectivity types cannot be used to establish a connection with an Azure SQL Database?

Import.

Live connection.

DirectQuery.

All of the above can be used to connect to an Azure SQL Database.

Power BI supports what formula language for building custom calculations such as measures, custom columns, and custom tables?

M

DAX

F#

SQL

Is the italicized portion of the following statement true, or does it need to be replaced with one of the other fragments that appear below? A Power BI

dashboard

provides a summarized view that enables business decision makers to monitor their business through a single page.

Interactive report

Paginated report

Table

No change needed

Which of the following Power BI service components can be used to explore data with natural language queries?

Power BI Quick Insights

Power BI Q&A

Power BI Natural Language Query

Power BI Dataflows

Answers to the Assessment Test

B. Velocity refers to the speed at which data is processed: either in scheduled batches or streamed in real time. See

Chapter 1

for more information.

D. Lambda architectures have a batch and a serving layer for batch processed data and a speed layer for stream processed data. See

Chapter 1

for more information.

C. When adhering to ACID properties, transactional databases must ensure that transactions are durable and will be available for querying after the database is brought back online from a database failure. See

Chapter 1

for more information.

B. Azure Data Factory can be used to orchestrate the flow of data in a data processing pipeline. It can schedule the order of when different transformation activities need to occur, and allows users to incorporate error handling logic. See

Chapter 1

for more information.

D. Azure Synapse Analytics dedicated SQL pools is a relational database offering that follows a distributed, massively parallel processing (MPP) architecture. See

Chapter 1

for more information.

A. Documents represent user-defined content in a NoSQL database such as Azure Cosmos DB or MongoDB. Tables, indexes, and views are core components of a relational database. See

Chapter 1

for more information.

B. Azure Blob Storage is optimized for storing objects such as images and telemetry data. It is also an optimal data store for data that is used by distributed analytics platforms such as Azure Databricks and Azure HDInsight. See

Chapter 1

for more information.

D. The ETL, or Extract, Transform, and Load, approach has been used to build business intelligence solutions for years. This approach involves extracting data from source systems, transforming it to adhere to business rules, and loading it into a data model used for analysis. See

Chapter 1

for more information.

C. Conforming data to a predefined schema is known as schema-on-write. Schema-on-read, on the other hand, is the process of defining a schema as data is read from a storage location. See

Chapter 1

for more information.
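The contrast between the two strategies can be sketched in a few lines of Python. This is an illustrative example only (the table, columns, and JSON document are invented): a relational table enforces its schema at write time, while raw semi-structured data is stored as-is and given structure only when it is read.

```python
import json
import sqlite3

# Schema-on-write: the destination table enforces its schema at load time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER NOT NULL, amount REAL NOT NULL)")
conn.execute("INSERT INTO sales VALUES (1, 19.99)")      # conforms: accepted
try:
    conn.execute("INSERT INTO sales VALUES (2, NULL)")   # violates schema: rejected now
except sqlite3.IntegrityError as err:
    print("rejected at write time:", err)

# Schema-on-read: raw text is stored untouched; structure is imposed at query time.
raw = '{"id": 3, "amount": 5.25, "note": "any shape is accepted at storage time"}'
record = json.loads(raw)  # the schema is applied here, when the data is read
print(record["amount"])
```

The same trade-off appears at cloud scale: a data warehouse table behaves like the sqlite table above, while files in a data lake behave like the raw JSON string.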

A. Azure Data Factory mapping data flows is a graphical tool that gives data engineers the ability to extract data from one or more source systems, transform data through a series of different activities, and then load the data into a destination data store for reporting. See

Chapter 1

for more information.

C. Control flows are used to enforce the correct processing order of data movement and data transformation activities. See

Chapter 1

for more information.

A. Azure Synapse Analytics serverless SQL pools is an interactive service that allows developers to query data in Azure Data Lake Storage Gen2 (ADLS) or Azure Blob Storage. See

Chapter 1

for more information.

B. Diagnostic analytics uses historical data to answer questions about why different events have happened, whereas prescriptive analytics answers questions about what actions should be taken to achieve a particular goal. See

Chapter 1

for more information.

B. Column charts display aggregations for categorical data. See

Chapter 1

for more information.

A. Columnstore indexes compress data in a column-wise format that is ideal for large-scale scans of data that is done when performing aggregations. See

Chapter 2

for more information.

C. Virtual machines are an Infrastructure as a Service (IaaS) offering in Azure. These allow organizations to offload the management of their hardware infrastructure to Azure while providing a mirror image of how the service was hosted in their on-premises environment. The SQL Server on an Azure VM option gives organizations full control over the OS and database engine without needing to host any of the hardware. See

Chapter 2

for more information.

B. Azure SQL Database is a fully managed PaaS relational database in Azure. The hardware, OS, and database engine are completely managed by Microsoft, allowing developers to focus on application development instead of needing to spend time implementing database features such as backup management, high availability, disaster recovery, and advanced threat protection. See

Chapter 2

for more information.

A. Azure SQL Database Hyperscale is used for very large OLTP databases (>4 TB) and can automatically scale storage and compute. It uses a scale-out architecture to store data on filegroups across multiple nodes. See

Chapter 2

for more information.

C. Server-level IP firewall rules for Azure SQL Database open port 1433 for all databases on a logical server to a specified IP address. See

Chapter 2

for more information.

B. The Data Migration Assistant can be used to detect compatibility issues between versions of SQL Server and make recommendations on how to address them. The Azure Database Migration Service uses the Data Migration Assistant to assess an on-premises SQL Server database's compatibility with the different versions of Azure SQL. See

Chapter 2

for more information.

D. Resource deployments in Azure can be scripted out and automated with Azure PowerShell, Azure CLI, and Infrastructure as Code templates such as Azure Resource Manager templates. See

Chapter 2

for more information.

B. Azure creates a full database backup once a week, while creating differential backups every 12 to 24 hours and transaction log backups every 5 to 10 minutes. See

Chapter 2

for more information.

A. Data Manipulation Language (DML) commands are used to interact with data stored in a database. DML commands can be used to retrieve and aggregate data for analysis, insert new rows, or edit existing rows. See

Chapter 2

for more information.
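The DDL/DML split is easy to see in a small interactive session. The sketch below uses Python's built-in sqlite3 module purely for illustration (the table and column names are made up); the same command categories apply to Azure SQL Database and the other SQL Server–based engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: commands that define or change database objects themselves.
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")  # CREATE
conn.execute("ALTER TABLE products ADD COLUMN price REAL")                 # ALTER

# DML: commands that read and modify the data inside those objects.
conn.execute("INSERT INTO products VALUES (1, 'bike', 499.0)")             # INSERT
conn.execute("UPDATE products SET price = 449.0 WHERE id = 1")             # UPDATE
row = conn.execute(
    "SELECT name, price FROM products WHERE id = 1"                        # SELECT
).fetchone()
print(row)  # -> ('bike', 449.0)
```

A useful rule of thumb: DDL changes the shape of the database, DML changes (or reads) its contents.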

C. Dynamic data masking obfuscates sensitive data in a database table. It allows users to specify which columns to mask with one of several available masking patterns. See

Chapter 2

for more information.

D. PostgreSQL, MySQL, and MariaDB are available on Azure as PaaS offerings. See

Chapter 2

for more information.

A. Read Committed transactions issue locks on involved data at the time of data modification to prevent other transactions from reading dirty data. However, data can be modified by other transactions, which can result in non-repeatable or phantom reads. See

Chapter 2

for more information.

D. Fact tables store measurable observations or events such as sales totals and inventory. See

Chapter 2

for more information.
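A minimal star-schema query can make the fact/dimension roles concrete. The tables and sales figures below are invented for illustration (sketched with sqlite3 rather than a dedicated SQL pool): the fact table holds the metrics, and the dimension table supplies the attributes used to group them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dimension table: descriptive attributes used for slicing and grouping.
conn.execute("CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT)")
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Road"), (2, "Mountain")])

# Fact table: measurable events (the metrics), keyed to the dimensions.
conn.execute("CREATE TABLE fact_sales (product_key INTEGER, sales_amount REAL)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                 [(1, 100.0), (1, 250.0), (2, 75.0)])

# A typical analytical query: aggregate the fact's metric by a dimension attribute.
totals = conn.execute("""
    SELECT d.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON f.product_key = d.product_key
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(totals)  # -> [('Mountain', 75.0), ('Road', 350.0)]
```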

B. The recommended configuration for SQL Server storage is to place data, log, and tempdb files on separate drives. See

Chapter 2

for more information.

D. Nonrepeatable reads occur when a transaction reads the same row several times and returns different data each time. See

Chapter 2

for more information.

C. Left outer joins retrieve all data from the table on the left side of the join condition and data that meets the join condition from the table on the right. See

Chapter 2

for more information.
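The behavior of a LEFT OUTER JOIN is worth seeing once with actual rows. In this illustrative sketch (table names invented, run with sqlite3), every row from the left table survives the join, and rows with no match on the right come back with NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.execute("INSERT INTO orders VALUES (1, 20.0)")  # only Ana has placed an order

# LEFT OUTER JOIN: every customer appears; Ben's order columns come back as NULL.
rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name
""").fetchall()
print(rows)  # -> [('Ana', 20.0), ('Ben', None)]
```

An inner join on the same data would drop Ben entirely, since he has no matching row on the right.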

A. Graph databases are specialized databases that focus on storing the relationship between data entities. Applications reading data from graph databases traverse the network of entities, analyzing their relationships. See

Chapter 3

for more information.

C. Unlike key-value stores, documents can be queried by both their key and different data values. See

Chapter 3

for more information.
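The key-value versus document distinction can be sketched with a tiny in-memory store (the documents and field names below are invented for illustration): a key-value store can only fetch by key, while a document database can also filter on values inside the document body.

```python
import json

# A toy "document store": JSON documents indexed by key.
store = {
    "doc1": json.dumps({"type": "bike", "color": "red", "price": 499}),
    "doc2": json.dumps({"type": "bike", "color": "blue"}),  # schemas may differ
}

# Lookup by key: all that a plain key-value store supports.
doc = json.loads(store["doc1"])

# Lookup by a value inside the document: what a document database adds.
red_keys = [k for k, v in store.items()
            if json.loads(v).get("color") == "red"]
print(doc["price"], red_keys)  # -> 499 ['doc1']
```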

D. Azure Cosmos DB Core (SQL) API is the native document database API for Azure Cosmos DB. It stores data in a JSON format, allowing documents storing transactions to maintain different schemas. Azure Cosmos DB can be globally distributed to multiple regions around the world, even allowing users to set one or more of the replicated regions to allow write operations. See

Chapter 3

for more information.

B. Azure Table storage supports only one additional replica, which can optionally serve read-only workloads. The Azure Cosmos DB Table API supports multi-region replication with both read-only and read-write replicas. See

Chapter 3

for more information.

B. Request Units (RUs) are units of compute resources that are used to measure the throughput required to read and write data in Azure Cosmos DB. See

Chapter 3

for more information.

D. Azure Cosmos DB provides primary and secondary keys for read-write and read-only access. This allows users to regenerate and rotate keys without requiring any downtime. See

Chapter 3