Sybex's Study Guide for Snowflake SnowPro Core Certification - Hamid Mahmood Qureshi - E-Book

Sybex's Study Guide for Snowflake SnowPro Core Certification E-Book

Hamid Mahmood Qureshi

0,0
38,99 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.

Mehr erfahren.
Beschreibung

Prepare smarter, faster, and better with the premier study guide for Snowflake SnowPro Core certification Snowflake, a cloud-based data warehousing platform, has steadily gained popularity since its 2014 launch. Snowflake offers several certification exams, of which the SnowPro Core certification is the foundational exam. The SnowPro Core Certification validates an individual's grasp of Snowflake as a cloud data warehouse, its architectural fundamentals, and the ability to design, implement, and maintain secure, scalable Snowflake systems. Sybex's Study Guide for Snowflake SnowPro Certification delivers comprehensive coverage of every relevant exam topic on the Snowflake SnowPro Core Certification test. Prepare efficiently and effectively for the exam with online practice tests and flashcards, a digital glossary, and concise and easy-to-follow instruction from the subject-matter experts at Sybex. You'll gain the necessary knowledge to help you succeed in the exam and will be able to apply the acquired practical skills to real-world Snowflake solutions. This Study Guide includes: * Comprehensive understanding of Snowflake's unique shared data, multi-cluster architecture * Guidance on loading structured and semi-structured data into Snowflake * Utilizing data sharing, cloning, and time travel features * Managing performance through clustering keys, scaling compute up, down & across * Steps to account management and security configuration including RBAC & MFA * All the info you need to obtain a highly valued credential for a rapidly growing new database software solution * Access to the Sybex online learning center, with chapter review questions, full-length practice exams, hundreds of electronic flashcards, and a glossary of key terms Perfect for anyone considering a new career in cloud-based data warehouse solutions and related fields, Sybex's Study Guide for Snowflake SnowPro Certification is also a must-read for veteran database professionals seeking an understanding of one of the newest and fastest-growing niches in data.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 374

Veröffentlichungsjahr: 2022

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Table of Contents

Cover

Title Page

Copyright

Dedication

Acknowledgments

About the Author

About the Technical Editor

Table of Exercises

Introduction

What Is Snowflake?

Who Should Buy This Book

How This Book Is Organized

Bonus Digital Contents

Conventions Used in This Book

SnowPro Study Guide Exam Objectives

Domain 1.0: Snowflake Cloud Data Platform Features and Architecture

Domain 2.0: Account Access and Security

Domain 3.0: Performance Concepts

Domain 4.0: Data Loading and Unloading

Domain 5.0: Data Transformation

Domain 6.0: Data Protection and Data Sharing

How to Contact the Publisher

Assessment Test

Answers to Assessment Test

Chapter 1: Introduction and Overview

Introducing Snowflake

Snowflake Certifications

Signing Up for a Snowflake Trial

Summary

Exam Essentials

Review Questions

Chapter 2: Snowflake Architecture

Traditional Database Architectures

Snowflake’s Hybrid Architecture

Summary

Exam Essentials

Review Questions

Chapter 3: Interfaces and Connectivity

Snowflake Web UI

Snowflake Partners

Snowflake Connectors and Drivers

New Snowflake Web Interface

Extending Snowflake Functionality

Summary

Exam Essentials

Review Questions

Chapter 4: Loading Data

Data Loading in Snowflake

What Is a Stage?

Data Loading via Internal Stages

Data Ingestion Using the Named External Stage

Loading Data via the Snowflake Web UI

Basic Data Transformations While Ingesting

External Tables

Semi‐Structured Data

Unstructured Data

Unloading Data from Snowflake

Load Near‐Real‐Time Streaming Data

Optimizing Data Loading and Unloading

Summary

Exam Essentials

Review Questions

Chapter 5: Data Pipelines

Introducing Tasks

Introducing Streams

Summary

Exam Essentials

Review Questions

Chapter 6: Continuous Data Protection

Components of Continuous Data Protection

Time Travel and Fail‐safe

Undrop Using Time Travel

Time Travel and Fail‐safe Storage Costs

Summary

Exam Essentials

Review Questions

Chapter 7: Cloning and Data Sharing

Zero‐Copy Cloning

Secure Data Sharing in Snowflake

Summary

Exam Essentials

Review Questions

Chapter 8: Performance

Snowflake Performance Considerations

Virtual Warehouse Configuration

Caching in Snowflake

Micro‐partition Pruning

Micro‐partitions and Data Clustering

Materialized Views

Summary

Exam Essentials

Review Questions

Chapter 9: Security

Data Encryption at Rest

Authentication

Authorization

Network

Compliance with Standards

Summary

Exam Essentials

Review Questions

Chapter 10: Account and Resource Management

Resource Monitors

System Usage and Billing

Snowflake Releases

Summary

Exam Essentials

Review Questions

Appendix: Answers to the Review Questions

Chapter 1: Introduction and Overview

Chapter 2: Snowflake Architecture

Chapter 3: Interfaces and Connectivity

Chapter 4: Loading Data

Chapter 5: Data Pipelines

Chapter 6: Continuous Data Protection

Chapter 7: Cloning and Data Sharing

Chapter 8: Performance

Chapter 9: Security

Chapter 10: Account and Resource Management

Index

End User License Agreement

List of Tables

Chapter 1

TABLE 1.1 Snowflake feature matrix

Chapter 6

TABLE 6.1 Time Travel durations for various Snowflake editions

TABLE 6.2 Time Travel comparison of permanent, temporary, and transient tabl...

Chapter 8

TABLE 8.1 Comparison between Standard and Economy scaling policies

Chapter 10

TABLE 10.1 ACCOUNT_USAGE historical views

TABLE 10.2 Differences between INFORMATION_SCHEMA and ACCOUNT_USAGE

List of Illustrations

Chapter 1

FIGURE 1.1 Snowflake certification map

Chapter 2

FIGURE 2.1 Shared‐disk and shared‐nothing architectures

FIGURE 2.2 Snowflake’s hybrid architecture

FIGURE 2.3 Cloud services layer

FIGURE 2.4 Micro‐partitions are transparent.

FIGURE 2.5 Micro‐partitioning in Snowflake

FIGURE 2.6 Metadata at the micro‐partition level

FIGURE 2.7 A single node in a virtual warehouse cluster

FIGURE 2.8 Multiple virtual warehouses of varying sizes

Chapter 3

FIGURE 3.1 Snowflake web UI

FIGURE 3.2 Navigation Bar

FIGURE 3.3 Worksheets view

FIGURE 3.4 Account page

FIGURE 3.5 Downloads page

FIGURE 3.6 User Preferences drop‐down

FIGURE 3.7 Link to the new web interface

FIGURE 3.8 New web interface navigation

Chapter 4

FIGURE 4.1 Data loading

FIGURE 4.2 Staging area outside a data warehouse

FIGURE 4.3 Staging area inside a data warehouse

FIGURE 4.4 Types of stages in Snowflake and their use

FIGURE 4.5 Loading on‐premises data using SnowSQL and a table stage

FIGURE 4.6 Loading on‐premises data via SnowSQL and a user stage

FIGURE 4.7 Loading on‐premises data via SnowSQL and a named internal stage

FIGURE 4.8 Loading data from cloud storage via the external stage

FIGURE 4.9 Accessing the Load Data wizard

FIGURE 4.10 Selecting a compute warehouse

FIGURE 4.11 Selecting source files or stage for loading

FIGURE 4.12 File Format screen

FIGURE 4.13 Loading unstructured data

FIGURE 4.14 Data unload mechanisms in Snowflake

FIGURE 4.15 Loading data through Snowpipe

Chapter 5

FIGURE 5.1 Multiple tasks tree

FIGURE 5.2 Streams and change data capture

Chapter 6

FIGURE 6.1 Snowflake stores data in micro‐partitions.

FIGURE 6.2 Snowflake stores data in micro‐partitions.

FIGURE 6.3 Time Travel and Fail‐safe durations

FIGURE 6.4 Time Travel and Fail‐safe storage for different updates

Chapter 7

FIGURE 7.1 Zero‐copy cloning and micro‐partitions

FIGURE 7.2 Updates to the source table in a cloning scenario

FIGURE 7.3 Updates to the cloned table in a cloning scenario

FIGURE 7.4 Data sharing and micro‐partitions

FIGURE 7.5 Direct Sharing

FIGURE 7.6 Process flow for direct data sharing

FIGURE 7.7 Sharing secure views

FIGURE 7.8 Secure Data Sharing with non‐Snowflake users

FIGURE 7.9 Snowflake Data Marketplace interface

FIGURE 7.10 Get Data button in the Data Marketplace

FIGURE 7.11 Data Exchange

Chapter 8

FIGURE 8.1 Query profile

FIGURE 8.2 Multi‐cluster warehouse in autoscaling mode

FIGURE 8.3 Types of caches in Snowflake

FIGURE 8.4 Query profile indicating metadata‐based result

FIGURE 8.5 Query profile indicating query result cache reuse

FIGURE 8.6 Query result cache generation and purge over time

FIGURE 8.7 Table data stored in micro‐partitions

FIGURE 8.8 Table reclustered on the Product column

FIGURE 8.9 A well‐clustered table

FIGURE 8.10 Clustering depth illustrated

FIGURE 8.11 Materialized View illustrated

Chapter 9

FIGURE 9.1 Security features in Snowflake

FIGURE 9.2 MFA flow in Snowflake

FIGURE 9.3 SCIM‐based user provisioning

FIGURE 9.4 Access control in Snowflake

FIGURE 9.5 Securable objects in Snowflake

FIGURE 9.6 Snowflake roles and role hierarchy

Chapter 10

FIGURE 10.1 Monitoring at virtual warehouse and account levels

FIGURE 10.2 Schemas within the Snowflake database

Guide

Cover

Table of Contents

Title Page

Copyright

Dedication

Acknowledgments

About the Author

About the Technical Editor

Table of Exercises

Introduction

Begin Reading

Appendix: Answers to the Review Questions

Index

End User License Agreement

Pages

iii

iv

v

vii

ix

xi

xx

xxi

xxii

xxiii

xxiv

xxv

xxvi

xxvii

xxviii

xxix

xxx

xxxi

xxxii

xxxiii

xxxiv

xxxv

xxxvi

xxxvii

xxxviii

xxxix

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

235

236

237

238

239

240

241

242

243

244

245

Sybex’sStudy Guide for Snowflake SnowPro CoreCertification

COF‐C02 EXAM

 

Hamid Mahmood Qureshi

 

 

Copyright © 2023 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

978-1-119-82444-2978-1-119-82445-9 (ebk.)978-1-119-82446-6 (ebk.)

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permission.

Trademarks: WILEY, the Wiley logo, and the Sybex logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Control Number: 2022939912

Cover image: © Jeremy Woodhouse/Getty ImagesCover design: Wiley

 

To my wife and my children, who have provided support and encouragement throughout the strenuous process of writing this book.

Acknowledgments

Although this book lists my name as author, it would not be in its current form without the absolutely amazing team that contributed to its creation. Acquisitions Editor Devon Cajas got the wheels rolling on this book. Janet Wehner served as project manager, supported by Managing Editor Pete Gaughan and Saravanan Dakshinamurthy, content refinement specialist. Special thanks to the technical editor, Hassaan Sajid, who reviewed each chapter for content, structure, and technical correctness; Nancy Carrasco, who proofread and corrected many of my writing shortcomings; and Christopher Marland, who reviewed the book technically for correctness. Special thanks to all the other people who contributed to this book.

About the Author

Hamid Mahmood Qureshi is a Senior Cloud and Datawarehouse Professional with two decades of experience architecting, designing, and leading the deployment of many data warehouses and business intelligence solutions. He has substantial experience and qualifications with various data analytics systems, including Teradata, Oracle, Hadoop, and modern cloud-based tools like Snowflake. Having worked extensively with traditional technologies. combined with his knowledge of modern platforms, he has accumulated substantial practical expertise in data warehousing and analytics, which he has captured in his publications.

About the Technical Editor

Hassaan Sajid has 13 years of experience in data warehousing and business intelligence (BI/DW) in the retail, telecommunications, banking, insurance, and government sectors. He is currently working at Coles Australia, where his role as a senior technical analyst enables the business to understand the power and value of data. In the past, he has worked with various clients in Australia, UAE, Pakistan, Saudi Arabia, and the United States in multiple BI/DW roles including BI architect, BI developer, ETL developer, data modeler, operations analyst, data analyst, and technical trainer. He holds a master's degree in business intelligence and is a Professional Scrum Master. He is also certified in Snowflake, MicroStrategy, Tableau, Power BI, and Teradata. His hobbies include reading, traveling, and photography.

Table of Exercises

Exercise  1.1

 

Signing Up for a Snowflake Instance

Exercise  2.1

 

Creating a New Virtual Warehouse Using the UI

Exercise  2.2

 

Creating a New Virtual Warehouse Using SQL

Exercise  3.1

 

Installing and Using SnowSQL

Exercise  4.1

 

Loading On-premises Data from the Table Stage

Exercise  4.2

 

Loading On-premises Data via the User Stage

Exercise  4.3

 

Loading On-premises Data via the Named Internal Stage

Exercise  4.4

 

Loading Cloud Data via a Named External Stage

Exercise  4.5

 

Create an External Table on Cloud Storage

Exercise  4.6

 

Loading NDJSON Data via an External Stage

Exercise  4.7

 

Loading JSON Data via an Eternal Stage

Exercise  4.8

 

Unloading Data via a Named Internal Stage

Exercise  5.1

 

Simple Task to Load Data into a Table

Exercise  5.2

 

Creating a Simple Task Tree

Exercise  5.3

 

Creating a Simple Serverless Task

Exercise  5.4

 

Creating a Stream to Track and Process Changes

Exercise  6.1

 

Traveling to a Point in Time

Exercise  6.2

 

Traveling to an Offset from the Current Time

Exercise  6.3

 

Retrieve Data as It Existed before a DML Query

Exercise  6.4

 

Undrop a Dropped Table

Exercise  7.1

 

Cloning a Table

Exercise  7.2

 

Cloning a Database

Exercise  7.3

 

Cloning a Table with Time Travel

Exercise  7.4

 

Sharing a Table with Another Snowflake Account

Exercise 10.1

 

Setting a Credit Quota at an Account Level

Exercise 10.2

 

Setting a Credit Quota at a Warehouse Level

Exercise 10.3

 

Viewing Credit Usage

Exercise 10.4

 

Viewing Storage Usage

Introduction

Why should you learn Snowflake? Over the last few years, Snowflake has become increasingly popular with data‐led organizations and data enthusiasts. Snowflake is a data warehousing and data analysis platform built for the cloud and has been built from scratch to take full advantage of cloud features. It provides several novel features that changes how you work with data platforms.

With its increasing popularity, learning Snowflake means you increase your standing in the job market. Even if you are already knowledgeable in one or more data warehousing platforms, understanding Snowflake will give you an edge when searching for a new job or aiming for a promotion. Snowflake does not have a steep learning curve because of its simplicity and its use of SQL as the primary language. Once you have grasped the foundations of Snowflake architecture, the rest of the learning naturally flows.

Snowflake has a range of certifications available. SnowPro Core certification is at the foundation level, while other role‐specific advanced certifications are also available for focused areas such as data engineering, administration, or data science on the Snowflake platform. The SnowPro Core certification is a prerequisite for the advanced certifications; therefore, it's a good way to kick‐start your Snowflake journey.

This book aims to help you pass the SnowPro Core exam. This exam covers all the foundational concepts, such as Snowflake architecture, virtual warehouses, storage, security, Time Travel, cloning, data sharing, and data loading. These topics are emphasized in this book to help you prepare for the exam. Even after you pass the SnowPro Core exam, this book should remain a useful reference for you.

What Is Snowflake?

Snowflake is a relatively new cloud‐optimized database, designed primarily for data warehousing solutions on the cloud. Snowflake Inc. was founded in 2012 by three data warehousing experts: Benoit Dageville, Thierry Cruanes, and Marcin Zukowski. Snowflake was designed from the ground up as a cloud‐only data warehouse platform.

A significant advantage of the Snowflake platform compared to other data warehouses is its differentiated architecture. Snowflake takes full advantage of the underlying cloud platform's novel features. The outcome is a hybrid architecture in which the computation or processing layer can be scaled independently of the storage layer. Snowflake's hybrid architecture also brings out a unique array of features not seen in many other databases, introducing concepts such as zero‐copy cloning, Time Travel, and secure data sharing.

Why Become SnowPro Core Certified?

There are several good reasons to get your SnowPro Core certification:

Professional Development

   Certifications are an easy pathway for continuous professional development and adding to your skillset. Because the SnowPro Core certification exam tests all the foundational concepts of Snowflake, even preparing for the exam will give you enough technical skills to develop Snowflake‐based solutions.

Increases Your Marketability

   SnowPro Core certification increases your marketability to prospective employers. SnowPro Core certified candidates may negotiate a better starting salary because hiring businesses don't need to spend on training and upskilling such candidates.

Takes You a Step Closer to Advanced Certifications

   The SnowPro Core exam is a steppingstone to several advanced Snowflake certifications, such as SnowPro Advanced Data Engineer. Gaining the SnowPro Core certification will allow you to undertake any of the advanced Snowflake certifications suited to your line of work.

Provides Proof of Professional Achievement

   Organizations recognize the importance and benefits of certifications such as SnowPro Core and are increasingly encouraging their employees to obtain certification. As a result, a growing number of individuals add the SnowPro Core certifications to their professional profiles every day.

Raises Employer Confidence

   Management is more confident to take on significant initiatives and work for new clients when they have faith in the team's skills. Certifications are a great way to instill the skills in your team and gain management's confidence. If you can prove you have the necessary skills, they are more likely to undertake new initiatives and likely to put you in charge with confidence in your skills.

How to Become SnowPro Core Certified

The SnowPro Core certification is available to anyone who wishes to take the exam. There are no prerequisites; however, it is recommended that you develop an understanding and experience of the Snowflake platform with self‐study and by performing hands‐on exercises and experimentation with the Snowflake platform.

Snowflake's SnowPro Certification exam is administered by Pearson Vue, and you can take it in any of the 1,000+ testing centers worldwide or at home with a virtual proctor.

To register for the exam with Pearson Vue, visit https://home.pearsonvue.com/snowflake. You will need to register for a new account. Once you do, you can schedule your SnowPro Core exam.

Finally, Snowflake exam policies are subject to change. Please be sure to check www.snowflake.com/certifications for the current policies before you register and take the exam.

 Like all exams, the SnowPro Core certification from Snowflake is updated periodically and may eventually be retired or replaced. At some point after Snowflake is no longer offering this exam, the old editions of our books and online tools will be retired. If you have purchased this book after the exam was retired, or are attempting to register in the Sybex online learning environment after the exam was retired, please know that we make no guarantees that this exam’s online Sybex tools will be available once the exam is no longer available.

Who Should Buy This Book

This book is useful for anyone who wishes to pass the SnowPro Core certification exam. If you're new to Snowflake, this book provides the basics you'll need to learn Snowflake from scratch. The book also provides in‐depth information you need to complete the exam.

How This Book Is Organized

This book consists of 10 chapters plus supplementary information, including a glossary and an assessment test, which follows this introduction. The chapters are organized as follows:

Chapter 1

, “Introduction and Overview,”

covers the history of Snowflake and discusses the various Snowflake certifications with a focus on the SnowPro Core certification.

Chapter 2

, “Snowflake Architecture,”

provides a detailed understanding of the unique Snowflake architecture, how the storage is decoupled from the compute, how data is stored in the form of micro‐partitions, and the advantages of this unique architecture. The chapter also covers the critical architectural layers in Snowflake.

Chapter 3

, “Interfaces and Connectivity,”

talks about various interfaces, focusing on Snowflake web interfaces and SnowSQL, through which a user can interact with a Snowflake instance. The chapter also describes various connectors and drivers that can be used to connect to Snowflake. Finally, the chapter discusses the partner ecosystem in Snowflake.

Chapter 4

, “Loading Data,”

discusses the variety of methods provided by Snowflake for loading data, including bulk data loading and processing data in a continuous manner. This chapter walks you through the concept of staging in Snowflake, describing the internal and external stage types. The chapter also discusses loading semi‐structured data, basic data transformations, and exporting data from Snowflake.

Chapter 5

, “Data Pipelines,”

discusses Snowflake's capability to schedule SQL statements and stored procedures through the concept of tasks. The chapter also touches on streams, which is Snowflake's method of providing change data capture.

Chapter 6

, “Continuous Data Protection,”

covers the components of continuous data protection and explores Time Travel, Fail‐Safe, and the undrop functionality, which are Snowflake features that help protect and recover data in the event of human error. The chapter also covers the concept of transient and temporary tables.

Chapter 7

, “Cloning and Data Sharing,”

describes zero‐copy cloning and how it works behind the scenes. The chapter then explores data sharing and the three approaches to data sharing in Snowflake.

Chapter 8

, “Performance,”

focuses on the performance optimization features and techniques that you can use to improve query performance and, at times, reduce costs. The chapter covers scaling up and down a virtual warehouse and autoscaling a virtual warehouse to accommodate increased concurrency. This chapter also touches on data clustering and materialized views, which can be used to optimize query execution.

Chapter 9

, “Security,”

explains several Snowflake features that enable security at various levels in the Snowflake software stack. The chapter discusses security implementation starting from the data storage layer to other levels such as authentication control, data access control or authorization, and network‐level controls to manage access to your Snowflake instance.

Chapter 10

, “Account and Resource Management,”

covers the prebuilt views and table functions that enable Snowflake customers to keep track of their Snowflake credit and storage usage. The chapter also discusses resource monitors used for tracking and managing credit usage. Finally, the chapter discusses Snowflake's release management processes.

Chapter Features

Each chapter begins with a list of the SnowPro Core exam objectives that are covered in that chapter. The book doesn't cover the objectives in order, so you shouldn't be alarmed at some of the odd ordering of the objectives within the book.

Each chapter has exercises that apply the newly taught material. At the conclusion of each chapter, you will find two exam preparation tools:

Exam Essentials

   This section provides a summary of the key information presented in the chapter. You should have a complete grasp of the information summarized in this section.

Review Questions

   Each chapter concludes with a set of review questions. You should answer these questions and compare your response against the provided answers. If you are unable to correctly answer at least 80 percent of these questions, you should revisit the chapter or at least the areas that you do not fully understand.

 

 The review questions, assessment test, and other testing elements in this book are not derived from the SnowPro Core exam questions; therefore, you should not memorize the answers and assume that doing so would enable you to pass the exam. You should study the fundamentals and understand the concepts, which will enable you to answer the exam questions and pass the SnowPro Core exam. Learning the underlying topics is also the approach that will serve you best in the workplace—the ultimate goal of certification exams.

Bonus Digital Contents

This book is accompanied by an online learning environment that provides several additional elements. Items available among these companion files include the following:

Practice Tests

   All of the questions in this book appear in our proprietary digital test engine—including the 30‐question assessment test at the end of this introduction and the questions that make up the review question sections at the end of each chapter. In addition, there is a 100‐question practice exam.

Electronic “Flashcards”

   The digital companion files include 170 questions in flashcard format (a question followed by a single correct answer). You can use these to review your knowledge of the SnowPro Core exam objectives.

Glossary

   The key terms from this book, and their definitions, are available as a fully searchable PDF.

 

Interactive Online Learning Environment and Test Bank

You can access all these resources at www.wiley.com/go/sybextestprep.

Conventions Used in This Book

This book uses certain typographic styles in order to help you quickly identify important information and to avoid confusion over the meaning of words such as on‐screen prompts. In particular, look for the following styles:

Italicized text

indicates key terms that are described at length for the first time in a chapter. (Italics are also used for emphasis.)

A monospaced font

indicates SQL code, contents of files, filenames, and Internet URLs.

In addition to these text conventions, which can apply to individual words or entire paragraphs, a few conventions highlight segments of text:

 A note indicates information that's useful or interesting but that's somewhat peripheral to the main text.

 A tip provides information that can save you time or frustration and that may not be entirely obvious. A tip might describe how to get around a limitation or how to use a feature to perform an unusual task.

Sidebars

A sidebar is like a note but longer. The information in a sidebar is useful, but it doesn't fit into the main flow of the text.

Real World ScenarioReal‐World Scenario

A real‐world scenario is a type of sidebar that describes a task or example that's particularly grounded in the real world. This may be a situation I or somebody I know has encountered, or it may be advice on how to work around problems that are common in real, working Snowflake environments.

EXERCISES

An exercise is a procedure you should try on your own to help you learn about the material in the chapter. Don't limit yourself to the procedures described in the exercises, though! Tinker with different Snowflake features to really learn about Snowflake.

SnowPro Study Guide Exam Objectives

Sybex's Study Guide for Snowflake SnowPro Core Certification has been written to cover every SnowPro exam objective at a level appropriate to its exam weighting. The following table provides a breakdown of this book's exam coverage, showing you the weight of each section and the chapter where each objective or subobjective is covered:

Subject Area

% of Exam

Snowflake Cloud Data Platform Features and Architecture

20–25%

Account Access and Security

20–25%

Performance Concepts

10–15%

Data Loading and Unloading

5–10%

Data Transformations

20–25%

Data Protection and Data Sharing

5–10%

Domain 1.0: Snowflake Cloud Data Platform Features and Architecture

Exam Objective

Chapter(s)

1.1 Outline key features of the Snowflake Cloud Data Platform

1

,

2

,

3

,

7

Elastic Storage

1

,

2

Elastic Compute

1

,

2

Snowflake's three distinct layers

2

Data Cloud/Data Exchange/Partner Network

3

,

7

Cloud partner categories

3

1.2 Outline key Snowflake tools and user interfaces

3

Snowflake User Interfaces (UI)

3

Snowsight

3

Snowflake connectors

3

Snowflake drivers

3

SQL scripting

3

Snowpark

3

1.3 Outline Snowflake's catalog and objects

3

,

4

,

5

,

6

,

7

,

8

,

9

Databases

4

Schemas

4

Tables Types

6

View Types

8

,

9

Data types

4

User‐Defined Functions (UDFs) and User Defined Table Functions (UDTFs)

3

Stored Procedures

3

Streams

5

Tasks

5

Pipes

4

Shares

7

Sequences

4

1.4 Outline Snowflake storage concepts

2

,

8

,

10

Micro partitions

2

Types of column metadata clustering

2

,

8

Data Storage Monitoring

10

Search Optimization Service

8

Domain 2.0: Account Access and Security

Exam Objective

Chapter(s)

2.1 Outline compute principles

9

Network security and policies

9

Multi‐Factor Authentication (MFA)

9

Federated authentication

9

Single Sign‐On (SSO)

9

2.2 Define the entities and roles that are used in Snowflake

9

Outline how privileges can be granted and revoked

9

Explain role hierarchy and privilege inheritance

9

2.3 Outline data governance capabilities in Snowflake

7

,

9

,

10

Accounts

9

Organizations

9

Databases

9

Secure views

7

,

9

Information schemas

10

Access history and read support

10

Domain 3.0: Performance Concepts

Exam Objective

Chapter(s)

3.1 Explain the use of the Query Profile

8

Explain plans

8

Data spilling

8

Use of the data cache

8

Micro‐partition pruning

8

Query history

8

3.2. Explain virtual warehouse configurations

2

,

8

Multi‐clustering

8

Warehouse sizing

2

,

8

Warehouse settings and access

2

,

8

3.3 Outline virtual warehouse performance tools

8

,

10

Monitoring warehouse loads

8

Query performance

8

Scaling up compared to scaling out

8

Resource monitors

10

3.4 Optimize query performance

8

Describe the use of materialized views

8

Use of specific SELECT commands

8

Domain 4.0: Data Loading and Unloading

Exam Objective

Chapter(s)

4.1 Define concepts and best practices that should be considered when loading data

4

Stages and stage types

4

File size

4

File formats

4

Folder structures

4

Adhoc/bulk loading using the Snowflake UI

4

4.2 Outline different commands used to load data and when they should be used

4

,

5

CREATE PIPE

4

COPY INTO

4

GET

4

INSERT/INSERT OVERWRITE

4

PUT

4

STREAM

5

TASK

5

VALIDATE

4

4.3 Define concepts and best practices that should be considered when unloading data

4

File formats

4

Empty strings and NULL values

4

Unloading to a single file

4

Unloading relational tables

4

4.4 Outline the different commands used to unload data and when they should be used

4

LIST

4

COPY INTO

4

CREATE FILE FORMAT

4

CREATE FILE FORMAT … CLONE

4

ALTER FILE FORMAT

4

DROP FILE FORMAT

4

DESCRIBE FILE FORMAT

4

SHOW FILE FORMAT

4

Domain 5.0: Data Transformation

Exam Objective

Chapter(s)

5.1 Explain how to work with standard data

3

,

4

Estimating functions

4

Sampling

4

Supported function types

4

User‐Defined Functions (UDFs) and stored procedures

3

5.2 Explain how to work with semi‐structured data

4

Supported file formats, data types, and sizes

4

VARIANT column

4

Flattening the nested structure

4

5.3 Explain how to work with unstructured data

4

Define and use directory tables

4

SQL file functions

4

Outline the purpose of User‐Defined Functions (UDFs) for data analysis

4

Domain 6.0: Data Protection and Data Sharing

Exam Objective

Chapter(s)

6.1 Outline Continuous Data Protection with Snowflake

6

,

7

Time Travel

6

Fail‐safe

6

Data Encryption

6

Cloning

7

Replication

6

,

7

6.2 Outline Snowflake data sharing capabilities

7

Account types

7

Data Marketplace and Data Exchange

7

Private data exchange

7

Access control options

7

Shares

7

How to Contact the Publisher

If you believe you have found a mistake in this book, please bring it to our attention. At John Wiley & Sons, we understand how important it is to provide our customers with accurate content, but even with our best efforts an error may occur.

In order to submit your possible errata, please email it to our Customer Service Team at [email protected] with the subject line “Possible Book Errata Submission.”

Assessment Test

Which is the term used for Snowflake's architecture composed of shared storage and multiple compute engines?

Multi‐cluster shared data

Hybrid architecture

Multicompute architecture

Shared Storage, multicompute

Which of the following are layers in Snowflake architecture? (Select all that apply.)

On‐Premises layer

Database Storage

Query Processing

Cloud Services

Which of the following is true regarding micro‐partitions? (Select all that apply.)

Each micro‐partition contains 50 MB to 500 MB of uncompressed data.

Micro‐partitions are immutable.

Snowflake automatically compresses data in a micro‐partition.

Each micro‐partition contains 1 MB to 5 MB of uncompressed data.

Which of the following is true regarding data clustering in Snowflake? (Select all that apply.)

Data in a Snowflake table cannot be reclustered.

Snowflake automatically clusters data in a table.

If required, clustering keys can be defined to recluster the data.

Snowflake does not automatically cluster data in a table.

Which of the following are Snowflake's Data Integration partners? (Select all that apply.)

Collibra

Matillion

Informatica

Power BI

What is the command‐line tool for connecting to Snowflake?

Snowpipe

SnowSQL

SnowCD

Snow Mobile

You are required to load data from a named internal stage into a Snowflake table. Which command should you use?

GET

VALIDATE

COPY

PUT

You need to upload data from an on‐premises system to a named internal stage. Which command should you use?

GET

PUT

COPY

VALIDATE

Which of the following can be used to load streaming or real‐time data in Snowflake?

GET

Snowpipe

COPY

PUT

Snowflake supports which semi‐structured file formats? (Select all that apply.)

DOM

Avro

ORC

XML

True or False: Data in an internal stage contributes to overall storage costs.

False

True

True or False: The

VARIANT

data type can store any type of data.

False

True

True or False: Snowpipe can load data from an internal or an external stage.

True

False

Snowflake tasks can execute which of the following? (Select all that apply.)

A single SQL statement

Snowpipe

A call to a stored procedure

Automatic clustering

Which of the following correctly describes streams in Snowflake? (Select all that apply.)

They are used to track data changes made to a table, including inserts, updates, and deletes.

They are used as a queuing mechanism for user queries.

They are used for change data capture (CDC).

They are used to load data in a real‐time manner.

What is the minimum Snowflake edition that allows up to 90 days of Time Travel?

Standard Edition

Virtual Private Snowflake

Enterprise Edition

Business Critical Edition

Which of the following commands will let you recover dropped tables, schemas, and databases?

UNDROP

RECOVER

CREATE

TIME TRAVEL

True or False: Cloning and Time Travel can be combined to create clones of data that exist at a certain time.

False

True

Which layer in Snowflake architecture is responsible for data sharing?

Query Processing

Data Sharing

Cloud Services

Cloud Storage

True or False: The compute charges are billed to the data consumer when sharing data with another Snowflake account.

True

False

True or False: Data Exchange is your own private data sharing hub where you can share data with an invite‐only group of people.

True

False

Which scaling policy ensures maximum performance?

Standard

Economy

Performance

Maximum

True or False: If a query has been run before and the underlying data hasn't changed, the query result cache is used to fulfill results for a semantically similar query.

True

False

What is the name of the service responsible for redistributing data according to the clustering key?

Redistribution Engine

Snowflake Optimizer

Automatic Clustering

Clustering Engine

True or False: All data at rest in Snowflake is encrypted using AES 256‐bit encryption.

True

False

What of the following roles are built into Snowflake roles? (Select all that apply.)

SECURITYADMIN

PUBLIC

SYSADMIN

SNOWFLAKEADMIN

True or False: Network policies can be used to allow or deny access to specific IP addresses.

True

False

Which of the following is true regarding resource monitors?

They help manage user workload on the system.

They help manage costs and avoid unexpected credit usage by virtual warehouses.

They help manage costs and avoid unexpected storage usage.

They are used to allow or deny users access to Snowflake.

Which of the following statements are correct regarding data retention and latency for the ACCOUNT_USAGE schema? (Select all that apply.)

It has no latency.

It has 365 days of retention.

It has 45 minutes to 3 hours of latency.

It has 30 days of retention.

How often does Snowflake release new software?

Fortnightly

Yearly

Weekly

Monthly

Answers to Assessment Test

A.  Snowflake uses a hybrid architecture approach in which the data is stored on a shared‐data storage layer, but multiple compute clusters can simultaneously perform processing on that data. The architecture is often referred to as multi‐cluster, shared‐data architecture. For more information, see

Chapter 2

.

B, C, D.  Snowflake's architecture comprises of database storage layer, query processing layer and cloud services layer. For more information, see

Chapter 2

.

A, B, C.  Micro‐partitions in Snowflake are immutable and contain 50 to 500 MB of uncompressed data. However, Snowflake compresses the data before storing it in a micro‐partition. For more information, see

Chapter 2

.

B, C.  Snowflake automatically clusters data in a table; however, clustering keys can be defined to change the clustering behavior if needed. For more information, see

Chapter 2

.

B, C.  Snowflake has several integration partners, Matillion and Informatica being two examples. For more information, see

Chapter 3

.

B.  SnowSQL is the command‐line tool through which you can connect to your Snowflake instance and run queries. For more information, see

Chapter 3

.

C.  The

COPY

command loads data from internal and external stages into a Snowflake table. The

COPY

command is also used to export data from a table. For more information, see

Chapter 4

.

B.  You will need to use the

PUT

command through SnowSQL or a similar client.

PUT

transfers data from an on‐premises system to a Snowflake internal stage. Similarly,

GET

is used to download data from a Snowflake internal stage to an on‐premises system. For more information, see

Chapter 4

.

B.  Snowpipe is the Snowflake service used for loading near‐realtime data into Snowflake. For more information, see

Chapter 4

.

B, C, D.  Snowflake supports several semi‐structured file formats. Avro, ORC & XML are some of the examples of the supported file formats. For more information, see

Chapter 4

.

B.  Data in an internal stage is stored and managed by Snowflake, therefore contributing to the overall storage costs. For more information, see

Chapter 4

.

B.  That is true. VARIANT is a versatile data type that can store any data. The VARIANT data type is used often for processing semi‐structured. For more information, see

Chapter 4

.

A.  That is correct. Snowpipe can load near real‐time data from an internal or external stage. For more information, see

Chapter 4

.

A, C.  Tasks in Snowflake can be used to execute a single SQL statement, call a stored procedure or procedural logic using Snowflake Scripting. For more information, see

Chapter 5

.

A, C.  Streams are used to track data changes made to a table and can be used for CDC. For more information, see

Chapter 5

.

C.  90 days of Time Travel is supported from the Enterprise edition and above. For more information, see

Chapter 6

.

A.  The

UNDROP

command is used to recover tables, schemas, and databases that have been dropped. For more information, see

Chapter 6

.

B.  Cloning and Time Travel can be combined to produce clones of tables, schemas, and databases as they existed at a point in time. For more information, see

Chapter 7

.

C.  Data sharing is a metadata‐only operation; therefore, the cloud services layer is responsible for data sharing. For more information, see

Chapter 7

.

A.  That is correct. When data is shared with another Snowflake account, the data consumer pays for the compute costs; however, the storage costs are billed to the data provider. For more information, see

Chapter 7

.

A.  Data Exchange is one of the ways data sharing can be achieved. Data Exchange provides private sharing capabilities. For more information, see

Chapter 7

.

A.  Of the two scaling policies, Standard and Economy, Standard ensures maximum performance. For more information, see

Chapter 8

.

A.  The query result cache returns results of queries if the underlying data hasn't changed and the query matches another query that has been executed before. For more information, see

Chapter 8

.

C.  Automatic clustering takes care of clustering behind the scenes. For more information, see

Chapter 8

.

A.  That is correct. All data at rest in Snowflake is encrypted using AES 256‐bit encryption. For more information, see

Chapter 9

.

A, B, C.  Snowflake has several built‐in roles, including PUBLIC, SYSADMIN, and SECURTYADMIN. For more information, see

Chapter 9

.

A.  That is correct. Network policies can be used to restrict or allow access to IP addresses. For more information, see

Chapter 9

.

B.  Resource monitors track and manage costs and help avoid unexpected costs. For more information, see

Chapter 10

.

B, C.  The views in ACCOUNT_USAGE schema have 365 days of data retention; however, they have 45 to 3 hours of latency, depending on the view being used. For more information, see

Chapter 10

.

C.  Snowflake releases new features, updates, and fixes every week. For more information, see

Chapter 10

.