Snowflake Cookbook - Hamid Mahmood Qureshi - E-Book

Description

Snowflake is a unique cloud-based data warehousing platform built from scratch to perform data management on the cloud. This book introduces you to Snowflake's unique architecture, which places it at the forefront of cloud data warehouses.
You'll explore the compute model available with Snowflake and find out how Snowflake scales extensively through virtual warehouses. You will then learn how to configure a virtual warehouse to optimize cost and performance. Moving on, you'll get to grips with the data ecosystem and discover how Snowflake integrates with other technologies for staging and loading data.
As you progress through the chapters, you will use Snowflake tasks to run a series of SQL statements as data pipelines, and find out how to create modern data solutions and pipelines designed for high performance and scalability. You will also learn to create role hierarchies, add custom roles, and set default roles for users before covering advanced topics such as data sharing, cloning, and performance optimization.
By the end of this Snowflake book, you will be well-versed in Snowflake's architecture for building modern analytical solutions and understand best practices for solving commonly faced problems using practical recipes.

Page count: 327

Year of publication: 2021




Snowflake Cookbook

Techniques for building modern cloud data warehousing solutions

Hamid Mahmood Qureshi

Hammad Sharif

BIRMINGHAM—MUMBAI

Snowflake Cookbook

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh

Publishing Product Manager: Ali Abidi

Commissioning Editor: Sunith Shetty

Acquisition Editor: Ali Abidi

Senior Editor: Roshan Kumar

Content Development Editors: Athikho Rishana, Sean Lobo

Technical Editor: Sonam Pandey

Copy Editor: Safis Editing

Project Coordinator: Aishwarya Mohan

Proofreader: Safis Editing

Indexer: Priyanka Dhadke

Production Designer: Vijay Kamble

First published: February 2021

Production reference: 1230221

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80056-061-1

www.packt.com

To my father, whose authoring of countless books was an inspiration.

To my mother, who dedicated her life to her children's education and well-being.

– Hamid Qureshi

To my dad and mom for unlimited prayers and (according to my siblings, a bit extra) love. I cannot thank and appreciate you enough.

To my wife and the mother of my children for her support and encouragement throughout this and other treks made by us.

– Hammad Sharif

Contributors

About the authors

Hamid Qureshi is a senior cloud and data warehouse professional with almost two decades of experience, having architected, designed, and led the implementation of several data warehouse and business intelligence solutions. He has extensive experience and certifications across various data analytics platforms, ranging from Teradata, Oracle, and Hadoop to modern, cloud-based tools such as Snowflake. By combining extensive work on traditional technologies with his knowledge of modern platforms, he has accumulated substantial practical expertise in data warehousing and analytics in Snowflake, which he has subsequently captured in his publications.

I want to thank the people who have helped me on this journey: my co-author Hammad, our technical reviewer, Hassaan, the Packt team, and my loving wife and children for their support throughout this journey.

Hammad Sharif is an experienced data architect with more than a decade of experience in the information domain, covering governance, warehousing, data lakes, streaming data, and machine learning.

He has worked with a leading data warehouse vendor for a decade as part of a professional services organization, advising customers in telco, retail, life sciences, and financial industries located in Asia, Europe, and Australia during presales and post-sales implementation cycles.

Hammad holds an MSc in computer science and has published conference papers in the domains of machine learning, sensor networks, software engineering, and remote sensing.

I would like to first and foremost thank my loving wife and children for their patience and encouragement throughout the long process of writing this book. I'd also like to thank Hamid for inviting me to be his partner in crime and for his patience, my publishing team for their guidance, and the reviewers for helping improve this work.

About the reviewers

Hassaan Sajid has around 12 years of experience in data warehousing and business intelligence in the retail, telecommunications, banking, insurance, and government sectors. He has worked with various clients in Australia, UAE, Pakistan, Saudi Arabia, and the USA in multiple BI/data warehousing roles, including BI architect, BI developer, ETL developer, data modeler, operations analyst, data analyst, and technical trainer. He holds a master's degree in BI and is a professional Scrum Master. He is also certified in Snowflake, MicroStrategy, Tableau, Power BI, and Teradata. His hobbies include reading, traveling, and photography.

Buvaneswaran Matheswaran has a bachelor's degree in electronics and communication engineering from the Government College of Technology, Coimbatore, India. He had the opportunity to work on Snowflake in its very early stages and has more than 4 years of Snowflake experience. He has done extensive work and research on Snowflake as an enterprise admin. He has worked mainly for Fortune 500 companies in the retail and consumer packaged goods (CPG) sectors. He is immensely passionate about cloud technologies, data security, performance tuning, and cost optimization. This is the first time he has done a technical review for a book, and he enjoyed the experience immensely. He has learned a lot as a user and also shared his experience as a veteran Snowflake admin.

Daan Bakboord is a self-employed data and analytics consultant from the Netherlands. His passion is collecting, processing, storing, and presenting data. He has a simple motto: a customer must be able to make decisions based on facts and within the right context. DaAnalytics is his personal (online) label. He provides data and analytics services, having been active in Oracle Analytics since the mid-2000s. Since the end of 2017, his primary focus has been cloud analytics. Focused on Snowflake and its ecosystem, he is Snowflake Core Pro certified and, thanks to his contributions to the community, has been recognized as a Snowflake Data Hero. He is also Managing Partner, Data and Analytics, at Pong, a professional services provider that focuses on data-related challenges.

Table of Contents

Preface

Chapter 1: Getting Started with Snowflake

Technical requirements

Creating a new Snowflake instance

Getting ready

How to do it…

How it works…

Creating a tailored multi-cluster virtual warehouse

Getting ready

How to do it…

How it works…

There's more…

Using the Snowflake WebUI and executing a query

Getting ready

How to do it…

How it works…

Using SnowSQL to connect to Snowflake

Getting ready

How to do it…

How it works…

There's more…

Connecting to Snowflake with JDBC

Getting ready

How to do it…

How it works…

There's more…

Creating a new account admin user and understanding built-in roles

How to do it…

How it works…

There's more…

Chapter 2: Managing the Data Life Cycle

Technical requirements

Managing a database

Getting ready

How to do it…

How it works…

There's more…

Managing a schema

Getting ready

How to do it…

How it works…

There's more…

Managing tables

Getting ready

How to do it…

How it works…

There's more…

Managing external tables and stages

Getting ready

How to do it…

How it works…

There's more…

Managing views in Snowflake

Getting ready

How to do it…

How it works…

There's more…

Chapter 3: Loading and Extracting Data into and out of Snowflake

Technical requirements

Configuring Snowflake access to private S3 buckets

Getting ready

How to do it…

How it works…

Loading delimited bulk data into Snowflake from cloud storage

Getting ready

How to do it…

How it works…

Loading delimited bulk data into Snowflake from your local machine

Getting ready

How to do it…

How it works…

Loading Parquet files into Snowflake

Getting ready

How to do it…

How it works…

Making sense of JSON semi-structured data and transforming to a relational view

Getting ready

How to do it…

How it works…

Processing newline-delimited JSON (or NDJSON) into a Snowflake table

Getting ready

How to do it…

How it works…

Processing near real-time data into a Snowflake table using Snowpipe

Getting ready

How to do it…

How it works…

Extracting data from Snowflake

Getting ready

How to do it…

How it works…

Chapter 4: Building Data Pipelines in Snowflake

Technical requirements

Creating and scheduling a task

Getting ready

How it works…

Conjugating pipelines through a task tree

Getting ready

How to do it…

How it works…

Querying and viewing the task history

Getting ready

How to do it…

How it works…

Exploring the concept of streams to capture table-level changes

Getting ready

How to do it…

How it works…

Combining the concept of streams and tasks to build pipelines that process changed data on a schedule

How to do it…

How it works…

Converting data types and Snowflake's failure management

How to do it…

How it works…

There's more…

Managing context using different utility functions

Getting ready

How to do it…

How it works…

There's more…

Chapter 5: Data Protection and Security in Snowflake

Technical requirements

Setting up custom roles and completing the role hierarchy

Getting ready

How to do it…

How it works…

There's more…

Configuring and assigning a default role to a user

Getting ready

How to do it…

How it works…

There's more…

Delineating user management from security and role management

Getting ready

How to do it…

How it works…

Configuring custom roles for managing access to highly secure data

Getting ready

How to do it…

How it works…

Setting up development, testing, pre-production, and production database hierarchies and roles

Getting ready

How to do it…

How it works…

Safeguarding the ACCOUNTADMIN role and users in the ACCOUNTADMIN role

Getting ready

How to do it…

How it works…

Chapter 6: Performance and Cost Optimization

Technical requirements

Examining table schemas and deriving an optimal structure for a table

Getting ready

How to do it…

How it works…

Identifying query plans and bottlenecks

Getting ready

How to do it…

How it works…

Weeding out inefficient queries through analysis

Getting ready

How to do it…

How it works…

Identifying and reducing unnecessary Fail-safe and Time Travel storage usage

Getting ready

How to do it…

How it works…

Projections in Snowflake for performance

Getting ready

How to do it…

How it works…

There's more…

Reviewing query plans to modify table clustering

Getting ready

How to do it…

How it works…

Optimizing virtual warehouse scale

Getting ready

How to do it…

How it works…

Chapter 7: Secure Data Sharing

Technical requirements

Sharing a table with another Snowflake account

Getting ready

How to do it…

How it works…

Sharing data through a view with another Snowflake account

Getting ready

How to do it…

How it works…

Sharing a complete database with another Snowflake account and setting up future objects to be shareable

Getting ready

How to do it…

How it works…

Creating reader accounts and configuring them for non-Snowflake sharing

Getting ready

How to do it…

How it works…

Getting ready

How to do it…

How it works…

Keeping costs in check when sharing data with non-Snowflake users

Getting ready

How to do it…

How it works…

Chapter 8: Back to the Future with Time Travel

Technical requirements

Using Time Travel to return to the state of data at a particular time

Getting ready

How to do it…

How it works…

Using Time Travel to recover from the accidental loss of table data

Getting ready

How to do it…

How it works…

Identifying dropped databases, tables, and other objects and restoring them using Time Travel

Getting ready

How to do it…

How it works…

Using Time Travel in conjunction with cloning to improve debugging

Getting ready

How to do it…

How it works…

Using cloning to set up new environments based on the production environment rapidly

Getting ready

How to do it…

How it works…

Chapter 9: Advanced SQL Techniques

Technical requirements

Managing timestamp data

Getting ready

How to do it…

How it works…

Shredding date data to extract Calendar information

Getting ready

How to do it…

How it works…

Unique counts and Snowflake

Getting ready

How to do it…

How it works…

Managing transactions in Snowflake

Getting ready

How to do it…

How it works…

Ordered analytics over window frames

Getting ready

How to do it…

How it works…

Generating sequences in Snowflake

Getting ready

How to do it…

How it works…

Chapter 10: Extending Snowflake Capabilities

Technical requirements

Creating a Scalar user-defined function using SQL

Getting ready

How to do it…

How it works…

Creating a Table user-defined function using SQL

Getting ready

How to do it…

How it works…

Creating a Scalar user-defined function using JavaScript

Getting ready

How to do it…

How it works…

Creating a Table user-defined function using JavaScript

Getting ready

How to do it…

How it works…

Connecting Snowflake with Apache Spark

Getting ready

How to do it…

How it works…

Using Apache Spark to prepare data for storage on Snowflake

Getting ready

How to do it…

How it works…

Why subscribe?

Other Books You May Enjoy