Struggling with rising Snowflake costs and constant tuning? Poorly aligned data models can lead to bloated expenses, inefficient queries, and time-consuming rework. Data Modeling with Snowflake helps you harness the Snowflake Data Cloud’s scalable, cloud-native architecture and expansive feature set to deliver data solutions faster than ever.
This book introduces simple, practical data modeling frameworks that accelerate agile design and evolve alongside your projects from concept to code. Rooted in decades of proven database design principles, these frameworks are paired, for the first time, with Snowflake-native objects and real-world examples, offering a two-in-one crash course in theory and direct application.
Through real-world examples designed to make learning easy, you’ll leverage Snowflake’s innovative features like Time Travel, Zero-Copy Cloning, and Change Data Capture (CDC) to create cost-efficient solutions. Whether you're just starting out or refining your architecture, this book will guide you in designing smarter, scaling faster, and cutting costs by aligning timeless modeling principles with the power of Snowflake.
You can read this e-book in Legimi apps or in any app that supports the following format:
Page count: 523
Year of publication: 2025
Data Modeling with Snowflake
Second Edition
A practical guide to accelerating Snowflake development using universal modeling techniques
Serge Gershkovich
Copyright © 2025 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Portfolio Director: Sunith Shetty
Relationship Lead: Jane D’Souza
Project Manager: Aniket Shetty
Content Engineer: Saba Umme Salma
Technical Editor: Aniket Shetty
Copy Editor: Safis Editing
Indexer: Manju Arasan
Proofreader: Safis Editing
Production Designer: Ganesh Bhadwalkar
First published: May 2023
Second edition: September 2025
Production reference: 1190825
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-83702-803-0
www.packtpub.com
I want to thank Anna, Ed, and Ajay for recognizing the potential that even I didn’t know I had. This book happened thanks to your guidance and encouragement. To my loving wife, Elena, thank you for your unwavering support throughout this process.
A few years ago, Serge Gershkovich and I were in Verbier, Switzerland, surrounded by the stunning Swiss Alps. Sadly, I barely saw him, as he was glued to his laptop, racing to finish the first edition of this book. Now, with the second edition in your hands, it’s clear that those long days paid off. Time flies!
And in that time, the data landscape has shifted dramatically. Data modeling today sits at a crossroads. All too often, the focus is on speed, scale, and quick wins. Proper data modeling is often an afterthought, if it’s even considered at all. Tactical execution happens at the expense of strategic design. AI dominates every conversation. Some claim that modern cloud data platforms, with their seemingly limitless resources and AI capabilities, have made data modeling obsolete.
Nothing could be further from the truth. Data modeling matters more than ever, and Serge’s book stands as a firm rebuttal to that view. At its heart, data modeling is thinking. In an era where AI thrives on high-quality data, the importance of modeling has only increased.
Serge begins by demystifying data modeling from first principles through to advanced, practical application in the Snowflake Data Cloud. He reminds us that modeling is not just a set of tactics, but also the art of selectively simplifying and mapping reality to data. From there, he guides readers through the four essential stages: conceptual, logical, physical, and transformational.
What sets this book apart is its deep integration with Snowflake’s architecture. Serge shows you how to design data models that work, not just in theory, but in the real world. Starting from first principles, he moves through conceptual, logical, physical, and transformational stages, all deeply grounded in Snowflake’s architecture. You’ll see how to harness features like streams, tasks, and zero-copy cloning to meet almost any use case.
Serge’s structured, business-first approach breaks the design process into manageable, collaborative stages, anchored in the idea that a data model must, first and foremost, be a business model. He emphasizes that models are not just technical artifacts, but consensus-driven blueprints that align technology with organizational reality.
This is more than a technical manual. It’s a strategic guide for anyone in the data profession, from aspiring engineers to seasoned architects. Serge arms you with the verbal, visual, and technical skills to articulate, design, and implement robust, scalable, and cost-effective systems.
I’m glad Serge continues to share his expertise so generously. This edition is another step in his mission to elevate the craft of data modeling. You now have the blueprint. Use it well. And maybe next time, Serge and I will get a full day on the slopes.
Joe Reis
Data engineer, best-selling author, and educator.
Serge Gershkovich is a seasoned data architect with decades of experience designing and maintaining enterprise-scale data warehouse platforms and reporting solutions. He is a leading subject matter expert, speaker, content creator, and Snowflake Data Superhero. Serge earned a bachelor of science degree in information systems from the State University of New York (SUNY) Stony Brook. Throughout his career, Serge has worked in model-driven development from SAP BW/HANA to dashboard design to cost-effective cloud analytics with Snowflake. He currently serves as Head of Product at SqlDBM, an online database modeling tool.
Jason Bemont, a father of five, has spent 28 years in IT, specializing in data modeling and governance. He has led enterprise data modeling projects for Fortune 500 companies across financial, healthcare, and insurance sectors. Outside of work, he and his wife share a passion for breeding ball pythons. Honored to review Unlocking Value for Enterprise Organizations, he holds deep respect for Serge and even purchased Data Modeling with Snowflake, 1st Edition upon its release—long before they met—recognizing it as a benchmark for Snowflake data modeling.
Jason is the Data Modeling Lead at Beazley Insurance Company, where he oversees the enterprise data modeling team and governs federated data modeling efforts. Beazley is undertaking a major data modernization initiative to enhance seamless data sharing across its domains, driving rapid business growth and decision making. Many of these projects leverage Snowflake’s capabilities.
Keith Belanger is a seasoned data strategist and architect with nearly three decades of experience designing and implementing modern data analytics platforms. He has held leadership roles in corporate America, including at a Fortune 100 company, as well as in consulting and, more recently, at product companies. Currently Field CTO at DataOps.live, he focuses on Snowflake-native applications that accelerate Data Product development. A recognized Snowflake Data Superhero, Keith is known for his thought leadership in data architecture, data modeling, and building data solutions aligned to business objectives. He is a frequent speaker at industry events.
Tim Spreitzer is a seasoned expert in cloud data management, specializing in Snowflake Cloud architectures and Azure technologies within the healthcare domain. They have a proven track record in designing secure RBAC frameworks and optimizing ELT processes, driving significant advancements in data operation performance and security. A dedicated mentor, they are committed to fostering team capabilities and establishing new industry benchmarks for efficient and secure healthcare data handling.
Preface
Who this book is for
What this book covers
To get the most out of this book
Get in touch
Your Book Comes with Exclusive Perks - Here’s How to Unlock Them
Unlock Your Book’s Exclusive Benefits
How to unlock these benefits in three easy steps
Step 1
Step 2
Step 3
Need help?
Part 1: Core Concepts in Data Modeling and Snowflake Architecture
Unlocking the Power of Modeling
Technical requirements
Modeling with purpose
Leveraging the modeling toolkit
The benefits of database modeling
Operational and analytical modeling scenarios
Is Snowflake limited to OLAP?
A look at relational and transformational modeling
What modeling looks like in operational systems
What modeling looks like in analytical systems
Summary
Further reading
References
An Introduction to the Four Modeling Types
Design and process
Ubiquitous modeling
Conceptual
What it is
What it looks like
Logical
What it is
What it looks like
Physical modeling
What it is
What about views?
What it looks like
Transformational
What it is
What it looks like
Summary
Further reading
Mastering Snowflake’s Architecture
Traditional architectures
Shared-disk architecture
Shared-nothing architecture
Snowflake’s solution
Snowflake’s three-tier architecture
The storage layer
The compute layer
The services layer
Snowflake’s features
Zero-copy cloning
Time Travel
Hybrid tables
Beyond structured data
Costs to consider
Storage costs
Virtual warehouse compute cost
Serverless compute cost
Services layer compute cost
Data transfer costs
Saving cash by using cache
The services layer
The metadata cache
The query results cache
The warehouse cache
The storage layer
Summary
Further reading
Mastering Basic Snowflake Objects
Stages
File formats
Tables
Physical tables
Permanent tables
Transient tables
Temporary tables
Table types summary
Stage metadata tables
Snowflake views
Caching
Security
Materialized views
Streams
Loading from streams
Change tracking
Tasks
Combining tasks and streams
Summary
References
From Logical Concepts to Snowflake Objects
Entities as tables
How Snowflake stores data
Clustering
Automatic clustering
Attributes as columns
Snowflake data types
Storing semi-structured data
Constraints and enforcement
Identifiers as primary keys
Benefits of a PK
Determining granularity
Ensuring correct join results
Avoiding duplicate values
Specifying a PK
Keys taxonomy
Business key
Surrogate key
Sequences
Alternate keys as unique constraints
Relationships as foreign keys
Benefits of an FK
Visualizing the data model
Informing joins
Automating functionality in BI tools
Mandatory columns as NOT NULL constraints
Summary
Mastering Advanced Snowflake Objects
Iceberg tables
Open table formats explained
ACID properties explained
How Iceberg tables work
Managed or unmanaged
Use cases
Dynamic tables
Declarative and imperative statements
Understanding target lag
Understanding refresh types
Monitoring, error handling, and failed refreshes
Tips, tricks, and cost considerations
Orchestrating dynamic tables with precision
Comparison and use cases
Hybrid tables
Referential integrity
Functionality and costs
Working with indexes
Costs
Working with hybrid tables and understanding limitations
Summary and comparison to standard tables
Event tables
Costs and benefits
Summary
Seeing Snowflake’s Architecture Through Modeling Notation
A history of relational modeling
RM versus entity-relationship diagram
Visual modeling conventions
Depicting entities
Depicting relationships
Crow’s foot
IDEF1X
Adding conceptual context to Snowflake architecture
Many-to-many
Representing subtypes and supertypes
The benefit of synchronized modeling
Summary
Part 2: Applied Modeling from Idea to Deployment
Putting Conceptual Modeling into Practice
Dimensional modeling
Understanding dimensional modeling
Setting the record straight on dimensional modeling
Starting a conceptual model in four easy steps
Define the business process
Determine the grain
Determine the dimensions
Identify the facts
From bus matrix to a conceptual model
Modeling in reverse
Identifying the facts and dimensions
Establishing the relationships
Proposing and validating the business processes
Summary
Further reading
Putting Logical Modeling into Practice
Expanding from conceptual to logical modeling
Adding attributes
Cementing the relationships
Many-to-many relationships
Weak entities
Inheritance
Summary
Database Normalization
An overview of database normalization
Data anomalies
Update anomaly
Insertion anomaly
Deletion anomaly
Domain anomaly
Database normalization through examples
1NF
2NF
3NF
BCNF
4NF
5NF
DKNF
6NF
Data models on a spectrum of normalization
Summary
Database Naming and Structure
Naming conventions
Case
Object naming
Tables
Primary key columns
Foreign key constraints
Suggested conventions
Organizing a Snowflake database
Organization of databases and schemas
Managed schemas
OLTP versus OLAP database structures
Source layer
Staging layer
Normalized self-service analytics
Denormalized reporting analytics
Departmental data
Database environments
Summary
Putting Physical Modeling into Practice
Technical requirements
Considerations before starting the implementation
Performance
Cost
Data quality and integrity
Data security
Backup and scalability considerations
Expanding from logical to physical modeling
Physicalizing the logical objects
Defining the tables
Naming
Table properties
Declaring constraints
Deploying a physical model
Creating an ERD from a physical model
Evolving from physical to semantic models
Working with a semantic model
Building a semantic model
Summary
Part 3: Solving Real-World Problems with Transformational Modeling
Putting Transformational Modeling into Practice
Technical requirements
Separating the model from the object
Shaping transformations through relationships
Join elimination using constraints
When to use RELY for join elimination
When to be careful using RELY
Joins and set operators
Performance considerations and monitoring
Common query problems
Additional query considerations
Putting transformational modeling into practice
Gathering the business requirements
Reviewing the relational model
Building the transformational model
Summary
Modeling Slowly Changing Dimensions
Technical requirements
Dimensions overview
SCD types
Example scenario
Type 0: Maintain original
Type 1: Overwrite
Type 2: Add a new row
Type 3: Add a new column
Type 4: Add a mini dimension
Type 5: Type 4 mini dimension + Type 1
Type 6: The Type 1, 2, and 3 hybrid
Type 7: Complete as-at flexibility
Overview of SCD types
Recipes for maintaining SCDs in Snowflake
Setting the stage
Type 1: Merge
Type 2: Type 1-like performance using streams
Type 2 with a dynamic twist
The anatomy of a window function
Creating a dynamic Type 2 dimension
Creating a Type 2.1 dimension
Performance considerations
Type 3: One-time update
Summary
Modeling Facts for Rapid Analysis
Technical requirements
Fact table types
Fact table measures
Getting the facts straight
The world’s most versatile transactional fact table
The leading method for recovering deleted records
Type 2 slowly changing facts
Maintaining fact tables using Snowflake features
Building a reverse balance fact table with Streams
Recovering deleted records with leading load dates
Handling time intervals in a Type 2 fact table
Summary
Modeling Semi-Structured Data
Technical requirements
The benefits of semi-structured data in Snowflake
Getting hands-on with semi-structured data
Schema-on-read != schema-no-need
Converting semi-structured data into relational data
Summary
Modeling Hierarchies
Technical requirements
Understanding and distinguishing between hierarchies
A fixed-depth hierarchy
A slightly ragged hierarchy
A ragged hierarchy
Maintaining hierarchies in Snowflake
Recursively navigating a ragged hierarchy
Handling changes
Summary
Part 4: Data Modeling for Enterprise Teams
Scaling Data Models through Modern Techniques
Technical requirements
Demystifying Data Vault 2.0
Building the Raw Vault
Hubs
Links
Satellites
Reference tables
Loading with multi-table inserts
Modeling the data marts
Star schema
Snowflake schema
Diving into data lakes
Discovering Data Mesh
Start with the business
Adopt governance guidelines
Emphasize data quality
Encourage a culture of data sharing
Data modeling in the dawn of AI
Summary
Unlocking Value for Enterprise Organizations
Misconceptions
Data modeling is just for the data team
Data modeling is a one-time effort
Data modeling is only for large projects and organizations
Data modeling slows down project timelines
Barriers
Demonstrating value with business metrics
Decentralized data ownership
Actions
Build momentum through quick wins
Link data modeling to business objectives
Data literacy discourages data siloes
Summary
Appendix
Technical requirements
The exceptional time traveler
The secret column type Snowflake refuses to document
Read the functional manual (RTFM)
Summary
Other Books You May Enjoy
Index
Snowflake is one of the leading cloud data platforms, gaining popularity among organizations that are looking to migrate their data to the cloud. With its game-changing features, Snowflake is unlocking new possibilities for self-service analytics and collaboration. However, Snowflake’s scalable consumption-based pricing model requires users to fully understand its revolutionary three-tier cloud architecture and pair it with universal modeling principles to ensure they are unlocking value and not letting money evaporate into the cloud.
Data modeling is essential for building scalable and cost-effective designs in data warehousing. Effective modeling techniques not only help businesses build efficient data models but also enable them to gain a deeper understanding of their business. Though modeling is largely database-agnostic, pairing modeling techniques with game-changing Snowflake features can help build Snowflake’s most performant and cost-effective solutions.
Since the first edition of this book, the data landscape has changed significantly, and Snowflake has continued to innovate at an extraordinary pace. This second edition reflects these developments and addresses the increasing complexity of modern data architectures. It includes expanded coverage of semantic models, which have become an essential link between technical data structures, business understanding, and AI. This enables organizations to develop self-service analytics capabilities that genuinely meet the needs of their users.
The introduction of advanced Snowflake objects has changed how we approach data modeling in the cloud. Hybrid tables present new opportunities for transactional workloads; Iceberg tables offer open-standard compatibility, which improves data portability and interoperability, while dynamic tables allow organizations to create self-maintaining data pipelines.
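To give a flavor of the self-maintaining pipelines mentioned above, here is a minimal sketch of a Snowflake dynamic table. The table, warehouse, and column names are illustrative placeholders, not examples from the book; Chapter 6 covers these objects in depth:

```sql
-- A dynamic table declares the desired result; Snowflake schedules
-- the refreshes needed to keep it within the stated target lag.
CREATE OR REPLACE DYNAMIC TABLE customer_order_totals
  TARGET_LAG = '5 minutes'        -- maximum staleness tolerated
  WAREHOUSE  = transform_wh       -- compute used for refreshes
AS
SELECT customer_id,
       COUNT(*)    AS order_count,
       SUM(amount) AS total_amount
FROM   orders
GROUP  BY customer_id;
```

Note the declarative style: unlike a stream-and-task pipeline, there is no explicit merge or scheduling logic to maintain.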
However, having technical capabilities alone does not guarantee success. One critical addition to this second edition addresses a challenge that goes beyond database design: helping engineers communicate effectively with business stakeholders. Data modeling initiatives often fail not due to technical limitations but because of communication barriers between data teams and business leaders. Throughout this updated edition, we explore ways to translate technical concepts into business value, highlighting the return on investment (ROI) of data modeling efforts in terms that resonate with decision-makers and budget holders.
Being able to communicate the ROI of data modeling is essential for ensuring organizational buy-in and securing the resources needed for successful implementation. We’ve learned that even the most sophisticated data models may go unused if business teams do not understand their value or feel disconnected from their development. This edition provides practical frameworks for fostering collaborative relationships between technical and business teams, transforming data modeling from a technical task into a strategic business initiative, and making it a “team sport.”
This book combines best practices in data modeling with Snowflake’s powerful features to provide you with the most efficient and effective approach to data modeling in Snowflake. Using these techniques, you can optimize your data warehousing processes, improve your organization’s data-driven decision-making capabilities, and save valuable time and resources. More importantly, you’ll learn how to build bridges between technical implementation and business value, ensuring that your data modeling efforts translate into organizational success and competitive advantage.
Database modeling is a simple yet foundational tool for enhancing communication and decision-making within enterprise teams and streamlining development. By pairing modeling-first principles with the specifics of Snowflake architecture, this book will serve as an effective tool for data engineers looking to build cost-effective Snowflake systems and for business users looking for an easy way to understand them.
The three main personas who are the target audience of this content are as follows:
Data engineers: This book takes a Snowflake-centered approach to designing data models. It pairs universal modeling principles with unique architectural facets of the data cloud to help build performant and cost-effective solutions.
Data architects: While familiar with modeling concepts, many architects may be new to the Snowflake platform and eager to learn and incorporate its best features into their designs for improved efficiency and maintenance.
Business analysts: Many analysts transition from business or functional roles and are cast into the world of data without a formal introduction to database best practices and modeling conventions. This book will give them the tools to navigate their data landscape and confidently create their own models and analyses.
Chapter 1, Unlocking the Power of Modeling, explores the role that models play in simplifying and guiding our everyday experience. This chapter unpacks the concept of modeling into its constituents: natural language, technical, and visual semantics. This chapter also gives you a glimpse into how modeling differs across various types of databases.
Chapter 2, An Introduction to the Four Modeling Types, looks at the four types of modeling covered in this book: conceptual, logical, physical, and transformational. This chapter gives an overview of where and how each type of modeling is used and what it looks like. This foundation gives you a taste of where the upcoming chapters will lead.
Chapter 3, Mastering Snowflake’s Architecture, provides a history of the evolution of database architectures and highlights the advances that make the data cloud a game changer in scalable computing. Understanding the underlying architecture will inform how Snowflake’s three-tier architecture unlocks unique capabilities in the models we design in later chapters.
Chapter 4, Mastering Basic Snowflake Objects, explores the various Snowflake objects we will use in our modeling exercises throughout the book. This chapter looks at the memory footprints of the different table types, change tracking through streams, and the use of tasks to automate data transformations, among many other topics.
Chapter 5, From Logical Concepts to Snowflake Objects, bridges universal modeling concepts such as entities and relationships with accompanying Snowflake architecture, storage, and handling. This chapter breaks down the fundamentals of Snowflake data storage, detailing micro partitions and clustering so that you can make informed and cost-effective design decisions.
Chapter 6, Mastering Advanced Snowflake Objects, dives deep into the fundamentals and use cases of Snowflake’s newer and more specialized table types. This chapter explores the mixed analytics/transactional potential of hybrid tables, building automated data pipelines with dynamic tables, and using Iceberg tables for interoperable workloads across cloud platforms.
Chapter 7, Seeing Snowflake’s Architecture through Modeling Notation, explores why there are so many competing and overlapping visual notations in modeling and how to use the ones that work. This chapter zeroes in on the most concise and intuitive notations you can use to plan and design database models and make them accessible to business users simultaneously.
Chapter 8, Putting Conceptual Modeling into Practice, starts the journey of creating a conceptual model by engaging with domain experts from the business and understanding the elements of the underlying business. This chapter uses Kimball’s dimensional modeling method to identify the facts and dimensions, establish the bus matrix, and launch the design process. We also explore how to work backward using the same technique to align a physical model to a business model.
Chapter 9, Putting Logical Modeling into Practice, continues the modeling journey by expanding the conceptual model with attributes and business nuance. This chapter explores how to resolve many-to-many relationships, expand weak entities, and tackle inheritance in modeling entities.
Chapter 10, Database Normalization, demonstrates that normal doesn’t necessarily mean better—there are trade-offs. While most database models fall within the first to third normal forms, this chapter takes you all the way to the sixth, with detailed examples to illustrate the differences. This chapter also explores the various data anomalies that normalization aims to mitigate.
Chapter 11, Database Naming and Structure, takes the ambiguity out of database object naming and proposes a clear and consistent standard. This chapter focuses on the conventions that will enable you to scale and adjust your model and avoid breaking downstream processes. By considering how Snowflake handles cases and uniqueness, you can make confident and consistent design decisions for your physical objects.
Chapter 12, Putting Physical Modeling into Practice, translates the logical model from the previous chapter into a fully deployable physical model. In this process, we handle the security and governance concerns accompanying a physical model and its deployment. This chapter also explores physicalizing logical inheritance and demonstrates how to go from DDL to generating a visual diagram. Finally, we jump from physical models to semantic models and learn to “talk to our data” using Cortex AI.
Chapter 13, Putting Transformational Modeling into Practice, demonstrates how to use the physical model to drive transformational design and improve performance gains through join elimination in Snowflake. The chapter discusses the types of joins and set operators available in Snowflake and provides guidance on monitoring Snowflake queries to identify common issues. Using these techniques, you will practice creating transformational designs from business requirements.
Chapter 14, Modeling Slowly Changing Dimensions, delves into the concept of slowly changing dimensions (SCDs) and provides you with recipes for maintaining SCDs efficiently using Snowflake features. You will learn about the challenges of keeping record counts in dimension tables in check and how mini dimensions can help address this issue. The chapter also discusses creating multifunctional surrogate keys and compares them with hashing techniques.
Chapter 15, Modeling Facts for Rapid Analysis, focuses on fact tables and explains the different types of fact tables and measures. You will discover versatile reporting structures such as the reverse balance and range-based factless facts, and learn how to recover deleted records. This chapter also provides related Snowflake recipes for building and maintaining all the operations mentioned.
Chapter 16, Modeling Semi-Structured Data, explores techniques required to use and model semi-structured data in Snowflake. This chapter demonstrates that, while Snowflake makes querying semi-structured data easy, there is effort involved in transforming it into a relational format that users can understand. We explore the benefits of converting semi-structured data to a relational schema and review a rule-based method for doing so.
Chapter 17, Modeling Hierarchies, provides you with an understanding of the different types of hierarchies and their uses in data warehouses. The chapter distinguishes between hierarchy types and discusses modeling techniques for maintaining each of them. You will also learn about Snowflake features for traversing a recursive tree structure and techniques for handling changes in hierarchy dimensions.
Chapter 18, Scaling Data Models through Modern Techniques, discusses the utility of Data Vault methodology in modern data platforms and how it addresses the challenges of managing large, complex, and rapidly changing data environments. This chapter also discusses the efficient loading of Data Vault with multi-table inserts and creating Star and Snowflake schema models for reporting information marts. Additionally, you will be introduced to Data Mesh and its application in managing data in large, complex organizations. Of course, the conversation would not be complete without mentioning AI and the role data modeling plays in modern data initiatives. Finally, the chapter reviews modeling best practices mentioned throughout the book.
Chapter 19, Unlocking Value for Enterprise Organizations, tackles the biggest challenge facing data modeling initiatives: getting organizational buy-in from business stakeholders who don’t understand the value. You will learn how to speak the language of business by demonstrating data modeling ROI using concrete metrics that show every dollar invested can generate nearly 30% returns. The chapter provides proven strategies for overcoming common misconceptions, building momentum through quick wins, and transforming data modeling from a technical burden into a strategic business advantage that drives collaboration and competitive edge.
Chapter 20, Appendix, collects all the fun and practical Snowflake recipes that couldn’t fit into the structure of the main chapters. This chapter showcases useful techniques such as the exceptional time traveler, exposes the (secret) virtual column type, and more!
This book will rely heavily on the design and use of visual modeling diagrams. While a diagram can be drawn by hand, maintained in Excel, or constructed in PowerPoint, a modeling tool with dedicated layouts and functions is recommended. As the exercises in this book will take you from conceptual database-agnostic diagrams to deployable and runnable Snowflake code, a tool that supports Snowflake syntax and can generate deployable DDL is recommended.
This book uses visual examples from SqlDBM, an online database modeling tool that supports Snowflake. A free trial is available on their website here: https://sqldbm.com/Home/.
Another popular online diagramming solution is Lucidchart (https://www.lucidchart.com/pages/). Although Lucidchart does not support Snowflake as of this writing, it offers a free tier for designing ER diagrams as well as other models such as Unified Modeling Language (UML) and network diagrams.
Software/hardware covered in the book | Operating system requirements
Snowflake Data Cloud                  | Windows, macOS, or Linux
SQL                                   | Windows, macOS, or Linux
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Data-Modeling-with-Snowflake-2E. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book.
You can download it here: https://packt.link/gbp/9781837028030.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: “Adding a discriminator between the CUSTOMER supertype and the LOYALTY_CUSTOMER subtype adds context that would otherwise be lost at the database level.”
A block of code is set as follows:
-- Query the change tracking metadata to observe
-- only inserts from the timestamp till now
select *
from myTable
  changes(information => append_only)
  at(timestamp => $cDts);
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Subtypes share common characteristics with a supertype entity but have additional attributes that make them distinct.”
Warnings or important notes appear like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book or have any general feedback, please email us at [email protected] and mention the book’s title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packt.com/submit-errata, click Submit Errata, and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packt.com/.
Scan this QR code or go to packtpub.com/unlock, then search for this book by name. Ensure it’s the correct edition.
Note: Keep your purchase invoice ready before you start.
Enhanced reading experience with our Next-gen Reader:
Multi-device progress sync: Learn from any device with seamless progress sync.
Highlighting and notetaking: Turn your reading into lasting knowledge.
Bookmarking: Revisit your most important learnings anytime.
Dark mode: Focus with minimal eye strain by switching to dark or sepia mode.
Learn smarter using our AI assistant (Beta):
Summarize it: Summarize key sections or an entire chapter.
AI code explainers: In the next-gen Packt Reader, click the Explain button above each code block for AI-powered code explanations.
Note: The AI assistant is part of next-gen Packt Reader and is still in beta.
Learn anytime, anywhere:
Access your content offline with DRM-free PDF and ePub versions—compatible with your favorite e-readers.
Your copy of this book comes with the following exclusive benefits:
Next-gen Packt Reader
AI assistant (beta)
DRM-free PDF/ePub downloads
Use the following guide to unlock them if you haven’t already. The process takes just a few minutes and needs to be done only once.
Keep your purchase invoice for this book ready, as you’ll need it in Step 3. If you received a physical invoice, scan it on your phone and have it ready as either a PDF, JPG, or PNG.
For more help on finding your invoice, visit https://www.packtpub.com/unlock-benefits/help.
Note: Did you buy this book directly from Packt? You don’t need an invoice. After completing Step 2, you can jump straight to your exclusive content.
Scan this QR code or go to packtpub.com/unlock.
On the page that opens (which will look similar to Figure X.1 if you’re on desktop), search for this book by name. Make sure you select the correct edition.
Figure X.1: Packt unlock landing page on desktop
Sign in to your Packt account or create a new one for free. Once you’re logged in, upload your invoice. It can be in PDF, PNG, or JPG format and must be no larger than 10 MB. Follow the rest of the instructions on the screen to complete the process.
If you get stuck and need help, visit https://www.packtpub.com/unlock-benefits/help for a detailed FAQ on how to find your invoices and more. The following QR code will take you to the help page directly:
Note: If you are still facing issues, reach out to [email protected].
Once you’ve read Data Modeling with Snowflake, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
This part provides you with a comprehensive overview of the power and potential of data modeling within the Snowflake cloud data platform. You will be introduced to the fundamental concepts and techniques that underpin effective modeling, including the importance of understanding data relationships and the role of modeling in driving better business outcomes. This part also includes a detailed examination of the four different types of modeling, highlighting their benefits and use cases. Finally, we focus specifically on Snowflake architecture and objects, exploring how to master this powerful platform and optimize it for maximum performance and value. Through a combination of theoretical insights and practical examples, you will gain a deep understanding of how to use modeling to unlock the full potential of Snowflake and transform your approach to data management and analysis.
This part has the following chapters:
Chapter 1, Unlocking the Power of Modeling
Chapter 2, An Introduction to the Four Modeling Types
Chapter 3, Mastering Snowflake’s Architecture
Chapter 4, Mastering Basic Snowflake Objects
Chapter 5, From Logical Concepts to Snowflake Objects
Chapter 6, Mastering Advanced Snowflake Objects
Chapter 7, Seeing Snowflake’s Architecture through Modeling Notation