E-Book
46,99 €

MCA Microsoft Certified Associate Azure Data Engineer Study Guide E-Book

Benjamin Perkins

Publisher: John Wiley & Sons
Category: Science and New Technologies
Series: Sybex Study Guide
Language: English

Description

Prepare for the Azure Data Engineering certification, and an exciting new career in analytics, with this must-have study aid.

In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech. In the book, you'll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. You'll learn to integrate, transform, and consolidate data from various structured and unstructured data systems into a structure that is suitable for building analytics solutions, and you'll get up to speed quickly and efficiently with Sybex's easy-to-use study aids and tools.

This Study Guide also offers:

* Career-ready advice for anyone hoping to ace their first data engineering job interview and excel on their first day in the field
* Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety
* Complimentary access to Sybex's expansive online study tools, accessible across multiple devices, and offering access to hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms

A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.

Details

Page count: 1502

Year of publication: 2023

Sample

Cover

Title Page

Acknowledgments

About the Author

About the Technical Editor

Table of Exercises

Introduction

Who This Book Is For

What This Book Covers

How This Book Is Structured

What You Need to Use This Book

Interactive Online Learning Environment and TestBank

DP‐203 Exam Objectives

Reader Support for This Book

Assessment Test

Answers to Assessment Test

PART I: Azure Data Engineer Certification and Azure Products

Chapter 1: Gaining the Azure Data Engineer Associate Certification

The Journey to Certification

How to Pass Exam DP‐203

Azure Product Name Recognition

Azure Data Analytics

Azure Storage Products

Azure Databases

Azure Security

Azure Networking

Azure Compute

Azure Management and Governance

Summary

Exam Essentials

Review Questions

Chapter 2: CREATE DATABASE dbName; GO

The Brainjammer

A Historical Look at Data

Data Structures, Types, and Concepts

Data Programming and Querying for Data Engineers

Understanding Big Data Processing

Summary

Exam Essentials

Review Questions

PART II: Design and Implement Data Storage

Chapter 3: Data Sources and Ingestion

Where Does Data Come From?

Design a Data Storage Structure

Design a Partition Strategy

Design the Serving/Data Exploration Layer

The Ingestion of Data into a Pipeline

Migrating and Moving Data

Summary

Exam Essentials

Review Questions

Chapter 4: The Storage of Data

Implement Physical Data Storage Structures

Implement Logical Data Structures

Implement a Partition Strategy

Design and Implement the Data Exploration Layer

Additional Data Storage Topics

Summary

Exam Essentials

Review Questions

PART III: Develop Data Processing

Chapter 5: Transform, Manage, and Prepare Data

Ingest and Transform Data

Transformation and Data Management Concepts

Data Modeling and Usage

Summary

Exam Essentials

Review Questions

Chapter 6: Create and Manage Batch Processing and Pipelines

Design and Develop a Batch Processing Solution

Manage Batches and Pipelines

Summary

Exam Essentials

Review Questions

Chapter 7: Design and Implement a Data Stream Processing Solution

Develop a Stream Processing Solution

Ingest and Transform Data

Monitor Data Storage and Data Processing

Summary

Exam Essentials

Review Questions

PART IV: Secure, Monitor, and Optimize Data Storage and Data Processing

Chapter 8: Keeping Data Safe and Secure

Design Security for Data Policies and Standards

Implement Data Security

Develop a Batch Processing Solution

Design and Implement the Data Exploration Layer

Summary

Exam Essentials

Review Questions

Chapter 9: Monitoring Azure Data Storage and Processing

Monitoring Data Storage and Data Processing

Develop a Batch Processing Solution

Develop a Stream Processing Solution

Azure Monitoring Overview

Summary

Exam Essentials

Review Questions

Chapter 10: Troubleshoot Data Storage Processing

Optimize and Troubleshoot Data Storage and Data Processing

Design and Develop a Batch Processing Solution

Monitor Batches and Pipelines

Design and Develop a Stream Processing Solution

Summary

Exam Essentials

Review Questions

Appendix: Answers to Review Questions

Chapter 1: Gaining the Azure Data Engineer Associate Certification

Chapter 2: CREATE DATABASE dbName; GO

Chapter 3: Data Sources and Ingestion

Chapter 4: The Storage of Data

Chapter 5: Transform, Manage, and Prepare Data

Chapter 6: Create and Manage Batch Processing and Pipelines

Chapter 7: Design and Implement a Data Stream Processing Solution

Chapter 8: Keeping Data Safe and Secure

Chapter 9: Monitoring Azure Data Storage and Processing

Chapter 10: Troubleshoot Data Storage Processing

Index

End User License Agreement

List of Tables

Chapter 1

TABLE 1.1 Azure certifications

TABLE 1.2 Popular cloud service offerings

TABLE 1.3 Technical terms and definitions

TABLE 1.4 ADLS‐supported platforms

TABLE 1.5 File/folder access permission levels

TABLE 1.6 Azure storage redundancy

TABLE 1.7 Azure Cosmos DB APIs

TABLE 1.8 NSG example

Chapter 2

TABLE 2.1 File comparison

TABLE 2.2 Common data types

TABLE 2.3 Table category distribution matrix

TABLE 2.4 Wildcard location examples

TABLE 2.5 Spark pool magic commands

TABLE 2.6 PySpark vs. Spark

TABLE 2.7 Azure SDKs packages

TABLE 2.8 Aggregate and mathematical functions

TABLE 2.9 JOIN types

TABLE 2.10 Big Data processing stages

TABLE 2.11 Azure products and Big Data stages

Chapter 3

TABLE 3.1 Types and tools for ingestion

TABLE 3.2 Analytical datastores

TABLE 3.3 File type use cases

TABLE 3.4 Data landing zones

TABLE 3.5 Slowly changing dimension types

TABLE 3.6 Hot path serving layer and MPP products

TABLE 3.7 Dedicated vs. serverless SQL pools

TABLE 3.8 Dedicated SQL pool performance level

TABLE 3.9 Spark pool node sizes

TABLE 3.10 Apache Spark components

TABLE 3.11 Data Explorer pool workload size

TABLE 3.12 Integration runtimes core count

TABLE 3.13 Azure Databricks cluster modes

TABLE 3.14 Databricks runtime versions

TABLE 3.15 Azure Databricks worker types

TABLE 3.16 Azure Databricks environments

TABLE 3.17 Azure Databricks job types

TABLE 3.18 Azure Databricks user entitlements

TABLE 3.19 Event Hubs vs. IoT Hub

TABLE 3.20 Azure Stream Analytics built‐in functions

TABLE 3.21 Azure Stream Analytics data types

TABLE 3.22 Apache Kafka vs. Event Hubs terminology

Chapter 4

TABLE 4.1 Supported codecs by file format

TABLE 4.2 Cross‐region replication pairings, paired datacenters

TABLE 4.3 ADLS archiving actions

TABLE 4.4 Data flow schema modifiers

TABLE 4.5 Data flow transformation features

TABLE 4.6 Slowly changing dimension types

TABLE 4.7 External location endpoints and protocols

Chapter 5

TABLE 5.1 Data file split recommendation

TABLE 5.2 Brainjammer brain wave values

Chapter 6

TABLE 6.1 Azure Batch resource components

TABLE 6.2 Azure Storage limits

TABLE 6.3 Exercise 6.6 pipeline parameters

TABLE 6.4 Types of pipeline triggers

TABLE 6.5 Copy Data activity—verification results

TABLE 6.6 Copy Data activity—inconsistent data results

TABLE 6.7 Azure DevOps components

Chapter 7

TABLE 7.1 Streaming product capabilities

TABLE 7.2 Additional streaming product capabilities

TABLE 7.3 Streaming scalability by product

TABLE 7.4 Azure streaming products' pricing units

TABLE 7.5 Azure Event Hubs tiers

TABLE 7.6 Stream Analytics input/output partitioning

TABLE 7.7 Data stream illustration

TABLE 7.8 Azure Stream Analytics exceptions

Chapter 8

TABLE 8.1 Azure data product security support

TABLE 8.2 Azure storage account authorization methods

TABLE 8.3 Managed identity types

Chapter 9

TABLE 9.1 Logging verbosity and severity

TABLE 9.2 Synapse platform system dynamic management views

TABLE 9.3 database_transaction_state column description

TABLE 9.4 DMVs for troubleshooting PolyBase

TABLE 9.5 Azure Synapse Analytics workspace metrics

TABLE 9.6 Dedicated SQL pool metrics

TABLE 9.7 Apache Spark pool metrics

TABLE 9.8 Different types of testing

TABLE 9.9 Azure Stream Analytics metrics

Chapter 10

TABLE 10.1 Performance and troubleshooting antipatterns

TABLE 10.2 Database partition analysis features

TABLE 10.3 Dedicated SQL pool indexes

TABLE 10.4 Index‐related Dynamic Management Views

TABLE 10.5 Query Store stored procedures

TABLE 10.6 Transaction and HTAP dynamic management views

TABLE 10.7 Data Flow Compute size

List of Illustrations

Chapter 1

FIGURE 1.1 Comparing the Azure data scientist and analyst roles

FIGURE 1.2 The Azure data engineer role

FIGURE 1.3 The Azure database administrator associate role

FIGURE 1.4 A path to the Azure Data Engineer Associate certification

FIGURE 1.5 The extract, transform, and load (ETL) approach

FIGURE 1.6 A data streaming pipeline

FIGURE 1.7 Azure portal security product and feature security hierarchy

FIGURE 1.8 An Azure data security diagram with products and features

FIGURE 1.9 Using Azure Key Vault and MI

FIGURE 1.10 Azure privacy and governance products

FIGURE 1.11 Azure health and monitoring products

FIGURE 1.12 Azure Synapse pools, performance, and debugging

FIGURE 1.13 Azure feature updates

FIGURE 1.14 Azure Analytics product documentation

FIGURE 1.15 Azure products in preview

FIGURE 1.16 Azure Synapse Analytics services

FIGURE 1.17 Azure Synapse Analytics Studio

FIGURE 1.18 Azure Databricks workspace

FIGURE 1.19 Azure HDInsight most popular supported open source frameworks

FIGURE 1.20 Azure Analysis Services

FIGURE 1.21 Azure Data Factory Studio

FIGURE 1.22 Azure Stream Analytics data flow

FIGURE 1.23 Azure Cosmos DB and supported APIs

FIGURE 1.24 Azure Active Directory portal

FIGURE 1.25 Role‐based access control scope

FIGURE 1.26 Azure App Service Managed Identity

FIGURE 1.27 Azure Managed Identity in Azure Active Directory

FIGURE 1.28 Azure Managed Identity in Azure Key Vault

FIGURE 1.29 Azure Monitor

FIGURE 1.30 Tags for Azure products

Chapter 2

FIGURE 2.1 Big Data characteristics

FIGURE 2.2 Tables in a relational database

FIGURE 2.3 The Select SQL Deployment Option blade

FIGURE 2.4 Azure Data Studio

FIGURE 2.5 A view of data tables in Azure Data Studio

FIGURE 2.6 Azure Cosmos DB APIs

FIGURE 2.7 Azure Cosmos Data Explorer

FIGURE 2.8 Azure Cosmos Data Explorer SQL query

FIGURE 2.9 Azure Synapse Analytics Sharding example

FIGURE 2.10 Azure Synapse Analytics hash table distribution

FIGURE 2.11 Azure Synapse Analytics replicated table distribution

FIGURE 2.12 Azure Synapse Analytics external tables

FIGURE 2.13 Azure Synapse Analytics external tables example

FIGURE 2.14 ADLS directory hierarchy example

FIGURE 2.15 Schemas, views, and users as seen in SSMS

FIGURE 2.16 A star schema example

FIGURE 2.17 A snowflake schema example

FIGURE 2.18 SQL Query from view table and new schema

FIGURE 2.19 A data skew example

FIGURE 2.20 Data streaming solution using Apache Kafka and Apache Spark

FIGURE 2.21 Visual Studio data workloads

FIGURE 2.22 Visual Studio C# code example

FIGURE 2.23 Where to implement and use an SDK

FIGURE 2.24 Output of running the PDW_SHOWSPACEUSED command

FIGURE 2.25 Output of running the SHOW_STATISTICS command

FIGURE 2.26 Using CONVERT and CAST SQL commands

FIGURE 2.27 SQL query output from FIRST_VALUE and LAST_VALUE

FIGURE 2.28 Representation of SQL JOINs

FIGURE 2.29 OPENJSON query

FIGURE 2.30 Big Data stages and Azure products

FIGURE 2.31 Pipeline data flow Azure Synapse transform stage

Chapter 3

FIGURE 3.1 Data producers and processing services

FIGURE 3.2 An Azure storage account Overview blade

FIGURE 3.3 The Upload folder in Azure Storage Explorer

FIGURE 3.4 Files uploaded to ADLS using Azure Storage Explorer

FIGURE 3.5 Storing Parquet files in an ADLS container

FIGURE 3.6 The brainjammer directory structure

FIGURE 3.7 The brainjammer raw‐files

MCA Microsoft Certified Associate Azure® Data Engineer

Study Guide Exam DP-203

Benjamin Perkins

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada and the United Kingdom.

ISBNs: 9781119885429 (paperback), 9781119885443 (ePDF), 9781119885436 (ePub)

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at www.wiley.com/go/permission.

Trademarks: WILEY, the Wiley logo, and the Sybex logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. Microsoft and Azure are registered trademarks of Microsoft Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Control Number: 2023941199

Acknowledgments

Creating a book starts first as an idea, which then iterates through many versions, until it takes the form of something consumable. Many people helped to progress this book from idea to final product. Here is a list of those who played a significant role in the creation of this book and the organization of its content:

Ken Brown, senior acquisitions editor

Robyn Alvarez, project manager

Heini Ilmarinen, technical editor

John Sleeva, copyeditor

Nancy Carrasco, proofreader

Writing this book—and writing in general—has become something I enjoy. Writing gives me the opportunity to share some of my technical knowledge and experiences so that others can gain some knowledge and insights. In addition to sharing my words, I gain an even greater understanding of the topic, as I structure the content, conduct research, and create hands‐on exercises. Writing a book requires a huge effort, but there are many reasons to do it. I'd like to thank my family for their support while I was writing this book. I know it took hours away from them. Thanks, Andrea, Lea, and Noa. You are the reason and my purpose.

About the Author

Benjamin Perkins is currently employed at Microsoft in Munich, Germany, as a Senior Escalation Engineer on the Azure team. He has been working professionally in the IT industry for close to three decades. He started computer programming with QBasic at the age of 11 on an Atari 1200XL desktop computer. He takes pleasure in the challenges that troubleshooting technical issues has to offer and savors the rewards of a well‐written program. After completing high school, he joined the United States Army. After successfully completing his military service, he attended Texas A&M University in College Station, Texas, where he received a Bachelor of Business Administration in Management Information Systems. He also received a Master of Business Administration from the European University.

His roles in the IT industry have spanned the entire spectrum, including programmer, system architect, technical support engineer, team leader, and mid‐level manager. While employed at Hewlett‐Packard and Compaq Computer Corporation, he received numerous awards, degrees, and certifications. He has a passion for technology and customer service and looks forward to troubleshooting and writing more world‐class technical solutions: “My approach is to write code with support in mind, and to write it once correctly and completely so we do not have to come back to it again, except to enhance it.”

Benjamin has written numerous magazine articles and training courses and is an active blogger. His catalog of books covers C# programming, IIS, NHibernate, and Microsoft Azure.

Benjamin is married to Andrea and has two wonderful children, Lea and Noa.

About the Technical Editor

Heini Ilmarinen is a data enthusiast with a passion for architecture and DevOps. Heini currently works as Azure Lead and DevOps Consultant at Polar Squad, helping customers bring their data platforms to life in Azure.

Heini initially studied to become a mathematics teacher, graduating from Helsinki University with a Master of Science. After graduating, she transitioned to the IT industry, leveraging her skills for problem‐solving and making complex topics easy to understand. In IT, Heini started her career working in infrastructure architecture development projects in hybrid environments. With architecture as a starting point, her career developed from working with Azure to getting deeper into data projects to topics related to DevOps.

Over the years, Heini has worked in a multitude of Azure projects, from application development to data projects, gaining a broad understanding of the requirements for creating functional, production‐ready solutions. For the past two years, she has also engaged in community events and public speaking, gaining the Data Platform MVP award.

Heini can often be found riding her snowboard and enjoying the fresh air, or riding up and down hills on her mountain bike.

Table of Exercises

Exercise 2.1 Create an Azure SQL DB

Exercise 2.2 Create an Azure Cosmos DB

Exercise 2.3 Create a Schema and a View in Azure SQL

Exercise 3.1 Create an Azure Data Lake Storage Container

Exercise 3.2 Upload Data to an ADLS Container

Exercise 3.3 Create an Azure Synapse Analytics Workspace

Exercise 3.4 Create an Azure Synapse Analytics Linked Service

Exercise 3.5 Configure an Azure Synapse Analytics Workspace Package

Exercise 3.6 Configure an Azure Synapse Analytics Workspace with GitHub

Exercise 3.7 Configure Azure Synapse Analytics Data Hub SQL Pool Staging Tables

Exercise 3.8 Configure Azure Synapse Analytics Data Hub with Azure Cosmos DB

Exercise 3.9 Configure an Azure Synapse Analytics Integrated Dataset

Exercise 3.10 Create an Azure Data Factory

Exercise 3.11 Create a Linked Service in Azure Data Factory

Exercise 3.12 Create a Dataset in Azure Data Factory

Exercise 3.13 Create a Pipeline to Convert XLSX to Parquet

Exercise 3.14 Create an Azure Databricks Workspace with an External Hive Metastore

Exercise 3.15 Configure Delta Lake

Exercise 3.16 Create an Azure Event Namespace and Hub

Exercise 3.17 Create an Azure Stream Analytics Job

Exercise 4.1 Implement Compression

Exercise 4.2 Implement Partitioning

Exercise 4.3 Implement Data Redundancy

Exercise 4.4 Implement Distributions

Exercise 4.5 Implement Data Archiving

Exercise 4.6 Azure Synapse Analytics Data Hub SQL Script

Exercise 4.7 Azure Synapse Analytics Develop Hub Notebook

Exercise 4.8 Azure Synapse Analytics Develop Hub Data Flow

Exercise 4.9 Build a Temporal Data Solution

Exercise 4.10 Azure Synapse Analytics Data Hub Data Flow

Exercise 4.11 Build External Tables on a Serverless SQL Pool

Exercise 4.12 Implement Efficient File and Folder Structures

Exercise 4.13 Implement a Serving Layer with a Star Schema

Exercise 4.14 Implement a Dimensional Hierarchy

Exercise 5.1 Transform Data Using Azure Synapse Pipeline

Exercise 5.2 Transform Data Using Azure Data Factory

Exercise 5.3 Transform Data Using Apache Spark—Azure Synapse Analytics

Exercise 5.4 Transform Data Using Apache Spark—Azure Databricks

Exercise 5.5 Cleanse Data

Exercise 5.6 Split Data

Exercise 5.7 Azure Cosmos DB—Shred JSON

Exercise 5.8 Flatten, Explode, and Shred JSON

Exercise 5.9 Encode and Decode Data

Exercise 5.10 Normalize and Denormalize Values

Exercise 5.11 Perform Exploratory Data Analysis—Transform

Exercise 5.12 Perform Exploratory Data Analysis—Visualize

Exercise 5.13 Transform and Enrich Data

Exercise 5.14 Transform Data by Using Apache Spark—Azure Databricks

Exercise 5.15 Predict Data Using Azure Machine Learning

Exercise 6.1 Create an Azure Batch Account and Pool

Exercise 6.2 Develop a Batch Processing Solution Using an Azure Synapse Analytics Pipeline

Exercise 6.3 Develop a Batch Processing Solution Using an Azure Synapse Analytics Apache Spark

Exercise 6.4 Develop a Batch Processing Solution Using Azure Databricks

Exercise 6.5 Develop a Batch Processing Solution Using an Azure Data Factory Pipeline

Exercise 6.6 Create Data Pipelines—Advanced

Exercise 6.7 Create a Scheduled Trigger

Exercise 6.8 Create and Schedule an Azure Databricks Workflow Job

Exercise 6.9 Handle Duplicate Data with a Data Flow

Exercise 6.10 Upsert Data

Exercise 6.11 Implement Incremental Data Loads

Exercise 6.12 Validate Batch Loads by Using a Validation Activity

Exercise 6.13 Validate Batch Loads by Using a Lookup Activity

Exercise 7.1 Add an Output ADLS Container to an Azure Stream Analytics Job

Exercise 7.2 Develop a Stream Processing Solution with Azure Stream Analytics—Testing the Data

Exercise 7.3 Develop a Stream Processing Solution with Azure Stream Analytics

Exercise 7.4 Use Reference Data with Azure Stream Analytics

Exercise 7.5 Stream Data to Power BI from Azure Stream Analytics

Exercise 7.6 Stream Data with Azure Databricks

Exercise 7.7 Develop and Create Windowed Aggregates

Exercise 7.8 Upsert Stream Processed Data in Azure Cosmos DB

Exercise 7.9 Handle Schema Drift in Azure Stream Analytics

Exercise 7.10 Replay an Archived Stream Data in Azure Stream Analytics

Exercise 8.1 Create an Azure Key Vault Resource

Exercise 8.2 Create a Microsoft Purview Account

Exercise 8.3 Configure and Perform a Data Asset Scan Using Microsoft Purview

Exercise 8.4 Audit an Azure Synapse Analytics Dedicated SQL Pool

Exercise 8.5 Apply Sensitivity Labels and Data Classifications Using Microsoft Purview and Data Discovery

Exercise 8.6 Implement a Data Retention Policy

Exercise 8.7 Implement Column-Level Security

Exercise 8.8 Implement Data Masking

Exercise 8.9 Create a User-Assigned Managed Identity

Exercise 8.10 Connect to an ADLS Container from Azure Databricks Cluster Using ABFSS

Exercise 8.11 Use an Azure Key Vault Secret to Store an Authentication Key for a Linked Service

Exercise 8.12 Implement Azure RBAC for ADLS

Exercise 8.13 Implement POSIX-Like ACLs for ADLS

Exercise 8.14 Create an Azure Storage Account and ADLS Container with a VNet

Exercise 8.15 Create an Azure Synapse Analytics Workspace with a VNET

Exercise 9.1 Create an Azure Monitor Workspace

Exercise 9.2 Create an Azure Synapse Analytics Alert

Exercise 9.3 Monitor and Manage Azure Synapse Analytics Logs

Exercise 10.1 Compact Small Files

Introduction

A long time ago, I was sitting at my desk happily coding my Active Server Page (ASP) and COM component, when someone approached me and asked if I knew anything about databases. Without even a pause, I answered with a confident yes. Most people in IT know "something" about databases, right? Well, it turned out that a big project was starting, and they needed someone to create and manage a database. I acquired a server, installed a relational database management system (RDBMS), and executed CREATE DATABASE dbName; GO. And the rest is history. I like to call that out because these days, most of the data storage architecture already exists when you start the job. You must learn what someone else created. You experience problems but do not know why, because a lot happened before you started.

The emerging technology called big data is providing a rare opportunity, kind of like the one I had. The opportunity is to build, or be involved in creating, an IT data analytics solution from the beginning. Being the person or the team who builds the framework and foundation of what could become a system that shapes the future of a company is career‐altering. The experience is a differentiator that stays with you for the rest of your career, as it has with me. But it could also be a catastrophe for numerous reasons, such as the solution not being able to scale, being too hard to change, or not being reliable.
