Tableau Prep Cookbook - Hendrik Kleine - E-Book

Tableau Prep Cookbook E-Book

Hendrik Kleine

Description

Tableau Prep is a tool in the Tableau software suite, created specifically to develop data pipelines. This book will describe, in detail, a variety of scenarios that you can apply in your environment for developing, publishing, and maintaining complex Extract, Transform and Load (ETL) data pipelines.

The book starts by showing you how to set up Tableau Prep Builder. You’ll learn how to obtain data from various data sources, including files, databases, and Tableau Extracts. Next, the book demonstrates how to perform data cleaning and data aggregation in Tableau Prep Builder. You’ll also gain an understanding of Tableau Prep Builder and how you can leverage it to create data pipelines that prepare your data for downstream analytics processes, including reporting and dashboard creation in Tableau. As part of a Tableau Prep flow, you’ll also explore how to use R and Python to implement data science components inside a data pipeline. In the final chapter, you’ll apply the knowledge you’ve gained to build two use cases from scratch, including a data flow for a retail store to prepare a robust dataset using multiple disparate sources and a data flow for a call center to perform ad hoc data analysis.

By the end of this book, you’ll be able to create, run, and publish Tableau Prep flows and implement solutions to common problems in data pipelines.

You can read this e-book in the Legimi apps or in any app that supports one of the following formats:

EPUB
MOBI

Page count: 221

Year of publication: 2021

Tableau Prep Cookbook

Use Tableau Prep to clean, combine, and transform your data for analysis

Hendrik Kleine

BIRMINGHAM—MUMBAI

Tableau Prep Cookbook

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh

Publishing Product Manager: Sunith Shetty

Senior Editor: David Sugarman

Content Development Editor: Nathanya Dias

Technical Editor: Sonam Pandey

Copy Editor: Safis Editing

Project Coordinator: Aishwarya Mohan

Proofreader: Safis Editing

Indexer: Priyanka Dhadke

Production Designer: Aparna Bhagat

First published: March 2021

Production reference: 1170221

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80056-376-6

www.packt.com

Contributors

About the author

Hendrik Kleine is an advanced analytics leader with 15 years of experience in the analytics space, including in data architecture, engineering, and visualization. He specializes in translating vast amounts of data into easy-to-understand visual communications that provide actionable intelligence. He is an avid innovator and a listed author of multiple data-related inventions. Before COVID-19, he was a speaker at the most recent Tableau conference in San Francisco.

I want to thank the people who have motivated, supported, and inspired me, especially my wife, Kinga, and our beautiful children, Holden and Fallon.

About the reviewers

Fabian Peri's interest in decision analysis started after joining his first fantasy basketball league in 2006. His love for data analysis led him to pursue an MBA in information systems at the University of Tulsa, and then an MSc in predictive analytics from Northwestern University. Since graduating, he has primarily worked in risk analysis and management for companies such as Amazon, GE Capital, and Wells Fargo. He is currently focused on using visualization to explore and interpret vast quantities of data.

Vicente Ruben is a data professional with more than a decade of experience in big data analytics. His expertise comprises architecture design and the development and implementation of business intelligence and data warehouse environments at scale. Vicente Ruben has implemented big data solutions for several Fortune 20 companies and currently leads data engineering solutions at one of the world's largest healthcare companies. He has expertise in a wide range of technologies, ranging from relational databases such as SQL Server and business intelligence suites such as the Microsoft stack (Azure and SQL Server) to NoSQL databases such as MongoDB and CouchBase and cloud services such as Azure and AWS.

Table of Contents

Preface

Chapter 1: Getting Started with Tableau Prep

Technical requirements

Installing Tableau Prep Builder

Getting ready

How to do it…

How it works…

Checking out the user interface

Getting ready

How to do it…

How it works…

Using Tableau Prep for ad hoc data analysis

Getting ready

How to do it…

How it works…

Preparing data for generic BI tools

Getting ready

How to do it…

How it works…

Preparing data for Tableau Desktop ad hoc analysis

Getting ready

How to do it…

How it works…

Chapter 2: Extract and Load Processes

Technical requirements

Connecting to text and Excel files

Getting ready

How to do it…

How it works…

Connecting to PDF files

Getting ready

How to do it…

How it works…

Connecting to SAS, SPSS, and R files

Getting ready

How to do it…

How it works…

There's more…

Connecting to on-premises databases

Getting ready

How to do it…

How it works…

There's more…

Connecting to cloud databases

Getting ready

How to do it…

How it works…

There's more…

Connecting to Tableau extracts

Getting ready

How to do it…

How it works…

Connecting to JDBC or ODBC data sources

Getting ready

How to do it…

How it works…

Writing data to CSV and Hyper files

Getting ready

How to do it…

How it works…

Writing data to databases

Getting ready

How to do it…

How it works…

Setting up an incremental refresh

Getting ready

How to do it…

How it works…

Publishing a flow to Tableau Server

Getting ready

How to do it…

How it works…

Chapter 3: Cleaning Transformations

Technical requirements

Renaming columns

Getting ready

How to do it…

How it works…

Filtering your dataset

Getting ready

How to do it…

How it works…

Changing data types

Getting ready

How to do it…

How it works…

Auto-validating data

Getting ready

How to do it…

How it works…

Validating data with a custom reference list

Getting ready

How to do it…

How it works…

Splitting fields with multiple values

Getting ready

How to do it…

How it works…

Chapter 4: Data Aggregation

Technical requirements

Determining granularity

Getting ready

How to do it…

How it works…

Aggregating values

Getting ready

How to do it…

How it works…

Using fixed LOD calculations for grouping data

Getting ready

How to do it…

How it works…

There's more…

Grouping data

Getting ready

How to do it…

How it works…

There's more…

Chapter 5: Combining Data

Technical requirements

Combining data with Union

Getting ready

How to do it…

How it works…

Combining data ingest and Union actions

Getting ready

How to do it…

How it works…

Combining datasets using an inner join

Getting ready

How to do it…

How it works…

Combining datasets using a left or right join

Getting ready

How to do it…

How it works…

Expanding datasets using a full outer join

Getting ready

How to do it…

How it works…

Expanding datasets using a not inner join

Getting ready

How to do it…

How it works…

Chapter 6: Pivoting Data

Technical requirements

Pivoting columns to rows

Getting ready

How to do it…

How it works…

Pivoting columns to rows using wildcards

Getting ready

How to do it…

How it works…

Pivoting rows to columns

Getting ready

How to do it…

How it works…

Chapter 7: Creating Powerful Calculations

Technical requirements

Creating calculated fields

Getting ready

How to do it…

How it works…

Creating conditional calculations

Getting ready

How to do it…

How it works…

Extracting substrings

Getting ready

How to do it…

How it works…

Changing date formats with calculations

Getting ready

How to do it…

How it works…

Creating relative temporal calculations

Getting ready

How to do it…

How it works…

Creating regular expressions in calculations

Getting ready

How to do it…

How it works…

Chapter 8: Data Science in Tableau Prep Builder

Technical requirements

Preparing Tableau Prep to work with R

Getting ready

How to do it…

How it works…

Embedding R code in a Tableau Prep flow

Getting ready

How to do it…

How it works…

Forecasting time series using R

Getting ready

How to do it…

How it works…

Preparing Tableau Prep to work with Python

Getting ready

How to do it…

How it works…

Embedding Python code in a Tableau Prep flow

Getting ready

How to do it…

How it works…

Chapter 9: Creating Prep Flows in Various Business Scenarios

Technical requirements

Creating a flow for transaction analytics

Getting ready

How to do it…

How it works…

Creating a call center flow for instant analysis

Getting ready

How to do it…

How it works…

Why subscribe?

Other Books You May Enjoy

Chapter 1: Getting Started with Tableau Prep

Tableau Prep Builder is an exciting new platform to develop data pipelines to transform your data for reporting and analytics purposes.

In this chapter, you will see how Tableau Prep Builder approaches data preparation. You will learn about the different use cases for Tableau Prep: performing ad hoc data analysis, creating a dataset for a generic BI tool, and creating a dataset specifically for Tableau Desktop.

In this chapter, we will cover the following recipes:

Installing Tableau Prep Builder
Checking out the user interface
Using Tableau Prep for ad hoc data analysis
Preparing data for generic BI tools
Preparing data for Tableau Desktop ad hoc analysis

Technical requirements

To follow along with the recipes in this chapter, you need to have both Tableau Prep Builder and Tableau Desktop installed. In the first recipe, we'll walk through the details of installing Tableau Prep Builder.

Installing Tableau Prep Builder

In this recipe, you'll install Tableau Prep Builder. We'll download the software, perform the installation, and open Tableau Prep Builder for the first time.

Getting ready

To enjoy the many benefits of Tableau Prep Builder, you need a license key. Typically, this would be issued by your administrator. Alternatively, you may have purchased a license yourself at https://buy.tableau.com/.

If you do not have a license key, Tableau offers a free trial so you can start right away.

As with all recipes in this book, the installation is performed on an Apple MacBook running macOS Big Sur. The steps are nearly identical on Windows machines and you can follow along on either operating system.

How to do it…

Ensure you're connected to the internet and have your favorite browser open:

Navigate to https://www.tableau.com/products/prep/download, enter your business email, and click START FREE TRIAL:

Figure 1.1 – Tableau Prep Builder download site

The installer file should start downloading in a few seconds. Wait until the download has been completed, then proceed to open it. On the first step, click Continue.

Review the license agreement and when ready, select Continue.

Next, the installer will confirm the installation destination. In most cases, the default location should work. However, you may customize it at this time. When done, click Install to continue.

The installer may prompt you for your password. This is normal. Enter your password and click Install Software to continue.

When the installation has completed, click Close. You won't be needing the installer file after this, so you may safely delete it. Select Move to Bin to do so now.

With the installation completed, you may now open Tableau Prep Builder for the first time. Your trial will automatically be activated without the need for a product key. If you do have a product key, you can always add it via the Help menu under Manage Product Keys.

You're now ready to start using Tableau Prep Builder with the recipes in this book.

How it works…

Tableau Prep Builder is updated frequently, and you may expect new features, enhancements, and bug fixes at a regular cadence. Once you have installed a version, as in this recipe, Tableau Prep Builder will always notify you upon startup if a more recent version is available, along with a link to the download page.

Checking out the user interface

Tableau has taken great care in creating an interface that is intuitive and easy to understand. Perhaps best of all, it has quite a few similarities to the manner in which things are laid out in Tableau Desktop. So, if you are familiar with Tableau Desktop, you should feel right at home.

In this recipe, we will take a brief tour of the Tableau Prep user interface.

Getting ready

Tableau Prep provides what we need right out of the box. That includes data connectors, sample flows, training resources, and community updates. We'll walk through these step by step. This knowledge is foundational to all recipes.

How to do it…

Open Tableau Prep:

When you open Tableau Prep Builder, you're presented with the home screen. From here, you can take a number of actions, which we'll cover briefly:

Figure 1.2 – Tableau Prep Builder home page

In Tableau Prep, a flow is what we call a data pipeline. If you've used other software in the past, you may know the same concept as an Extract, Transform, and Load (ETL) process or a workflow.

It's easy to start a new flow, simply by creating a data connection. To get started, click the blue Connect to Data button to expand the data connection options:

Figure 1.3 – Starting a new flow

From here, select your connection type, and that will complete the creation of a new flow:

Figure 1.4 – Selecting a data connection type

In Chapter 2, Extract and Load Processes, we'll cover the configuration of various data connections in detail.

At the bottom of the home page, you'll notice two example flows provided by Tableau:

Figure 1.5 – Preinstalled sample flows

Both these flows use the sample Superstore and WorldIndicators data that is delivered with the Tableau Desktop application as well, so you may be familiar with this data already.

These example flows can be opened with one click and run locally. They're excellent for testing out quick actions and recipes learned in this book, without the need for you to create a new flow from scratch. Personally, I've become so accustomed to this that I typically try something out quickly in an example flow, and then implement the action in my own flow once I'm confident it'll work.

To the right side of the home screen, you'll find the Discover pane:

Figure 1.6 – Discover pane

The Discover pane has two sections that are always visible, Training and Resources. Training includes links to training materials authored by the Tableau team, while Resources includes links to Tableau blogs and user community forums. 

These links update over time, so it's great to glance at this pane every time you open Tableau Prep, to make sure you're up to date on the latest developments.

There are two ways to open flows. Firstly, you can use the Open a Flow button at the top of the home screen.

The second, one-click approach is to select a flow from the Recent Flows section. This section will automatically update based on your activity, with the latest flow accessed being the first one listed:

Figure 1.7 – Quick access to Recent Flows

Click the Superstore flow in the Recent Flows section to view a flow in the flow builder interface, which shows you the data input, transformations, and output steps in a single overview:

Figure 1.8 – Flow builder interface

A key feature of the flow interface is pausing data updates, which you can enable and disable with a single click in the top action bar:

Figure 1.9 – Pausing data updates

When data updates are paused, Tableau Prep Builder does not validate all the changes you are making instantly. As a result, you get increased performance. However, some features that require a data preview will be disabled until you resume data updates:

Figure 1.10 – Some features are disabled when data updates are paused

Next to the Pause Data Updates icon, you'll find the Data Refresh button. This comes in handy when you are actively working on a flow and you are expecting changes to your data inputs at the same time. For example, a column may have been added to a data input since you opened the flow. In that case, you'll need to refresh the input to ensure the column becomes visible to Tableau Prep:

Figure 1.11 – Refreshing all data inputs

You can click the button itself to simultaneously update all inputs. Alternatively, open the dropdown to select a single input to update:

Figure 1.12 – Refreshing specific data inputs

The play button in the action bar will run your flow and produce all outputs with a single click:

Figure 1.13 – Running an entire data flow

Alternatively, you can open the dropdown and select a single output to generate. This can significantly reduce the flow runtime, which is a great benefit during development and testing:

Figure 1.14 – Running a specific data output only

You're now familiar with the foundational elements that make up the Tableau Prep Builder user interface and can start building flows using your data.
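To make the idea of running a single output more concrete, here is a minimal Python sketch of a flow as a set of dependent steps. It is an illustration only and not how Tableau Prep Builder is implemented; the step names, the sample rows, and the run_step helper are all hypothetical. It shows why generating one output can be faster: only the steps upstream of that output need to run.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical flow: step name -> (upstream step names, function of their results).
# None of these names come from Tableau Prep; they only illustrate the idea.
Flow = Dict[str, Tuple[List[str], Callable[..., list]]]


def run_step(flow: Flow, name: str, executed: List[str], cache: Dict[str, list]) -> list:
    """Run one step after running whatever it depends on, recording the order."""
    if name in cache:
        return cache[name]
    upstream, func = flow[name]
    inputs = [run_step(flow, dep, executed, cache) for dep in upstream]
    result = func(*inputs)
    executed.append(name)
    cache[name] = result
    return result


flow: Flow = {
    "orders_input": ([], lambda: [("Furniture", 120), ("Technology", 300)]),
    "returns_input": ([], lambda: [("Furniture", 1)]),
    "clean_orders": (["orders_input"], lambda rows: [r for r in rows if r[1] > 0]),
    "orders_output": (["clean_orders"], lambda rows: rows),
    "returns_output": (["returns_input"], lambda rows: rows),
}

# Run a single output: only its upstream steps execute.
executed_single: List[str] = []
run_step(flow, "orders_output", executed_single, {})

# Run every output, sharing one cache so that no step runs twice.
executed_all: List[str] = []
shared_cache: Dict[str, list] = {}
for output in ("orders_output", "returns_output"):
    run_step(flow, output, executed_all, shared_cache)

print(executed_single)
print(executed_all)
```

In this sketch, running only orders_output never touches the returns branch, which is the saving you would hope for when you select a specific output during development.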

How it works…

Simply put, Tableau Prep Builder works by ingesting data from a source to your local machine and processing it there, in real time, whenever you make updates to a flow. To stay performant, Tableau Prep Builder automatically works with only a sample of your data inputs during this process. Only when you execute an entire flow is the full data input processed, so running a flow may take longer than previewing data in real time. In Chapter 2, Extract and Load Processes, there is a recipe to manage the sampling size and method used by Tableau Prep.
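The difference between previewing on a sample and running on the full input can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not Tableau Prep's actual sampling logic: the row counts, the random sampling method, and the clean function are all made up for the example.

```python
import random


def clean(rows):
    """Example transformation: drop rows with a missing amount."""
    return [row for row in rows if row["amount"] is not None]


# Hypothetical input: 100,000 rows, every tenth one missing its amount.
full_input = [{"id": i, "amount": None if i % 10 == 0 else i} for i in range(100_000)]

# While editing: apply the transformation to a small sample for responsiveness.
rng = random.Random(0)  # seeded so that the sketch is repeatable
sample = rng.sample(full_input, 1_000)
preview = clean(sample)

# When running the flow: apply the same transformation to every row.
full_result = clean(full_input)

print(len(preview), "rows in the preview")
print(len(full_result), "rows in the full result")
```

The preview is cheap to recompute after every change, but counts and distributions seen in it are estimates. Only the full run reflects every input row.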

Using Tableau Prep for ad hoc data analysis

In this recipe, you'll learn how to leverage Tableau Prep Builder to perform ad hoc data analysis. In most scenarios, getting insights from your data would involve the creation of a data pipeline and then connecting a data visualization tool to the output of that pipeline to perform your analysis. However, with Tableau Prep Builder, you can perform basic ad hoc analysis on your data from within the tool itself.

Getting ready

Open the Tableau Prep Superstore flow to follow along with the steps outlined.

How to do it…

Ad hoc analysis typically starts with a business question to be answered with the use of your data. Let's assume the question posed for the Superstore data is: Which is the top category of products that consumers order with same-day delivery?

Following these steps, you'll be able to use Tableau Prep to answer this question without the need for additional reporting tools:

In Tableau Prep Builder, select the All orders step:

Figure 1.15 – Selecting the All orders step

Whenever we select a step in Tableau Prep, the bottom pane will become visible. The pane will leverage data as it is at the time of the step being selected. In our case, this will be the state of the data after having passed through the All orders step:

Figure 1.16 – The bottom pane offers many additional options

First, let's reduce the dataset to consumers only. To do this, scroll horizontally through the columns in the results pane until you find the Segment column. From the three available values, Consumer, Corporate, and Home Office, right-click Consumer. From the context menu, select Keep Only:

Figure 1.17 – Selecting the All orders step will bring up the bottom pane

Tableau Prep will instantly apply the filter and show you the data preview excluding any segments that are not Consumer.

Next, locate the Delivery Mode field. We could filter in the same way as in the previous step. However, an alternate method ideally suited to quick exploratory analysis is using highlights. Highlights instantly mark data related to the selected value in the results pane, in a shade of blue. Go ahead and left-click the Same Day delivery mode.

Next, locate the Category column and sort its values in descending order:

Figure 1.18 – Instantly sort data in Tableau Prep Builder

Now, we can clearly see that the top category for consumer orders with same-day shipping is Office Supplies, which answers the business question posed. Hovering over the item reveals additional information: 707 rows, or 5% of consumer orders, fall into this category:

Figure 1.19 – Instant category details

With these steps, you've quickly performed ad hoc analysis on the Superstore data and identified the top product category for consumers who placed orders with the same-day shipping delivery mode.
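For readers who think in code, the same question can be expressed as a short Python sketch. The field names (Segment, Delivery Mode, Category) and the values Consumer, Corporate, and Same Day come from the recipe. The rows themselves, the Standard Class value, and the order helper are hypothetical, so the counts below do not match the real Superstore data.

```python
from collections import Counter


def order(segment, mode, category):
    """Build one hypothetical order row."""
    return {"Segment": segment, "Delivery Mode": mode, "Category": category}


# Made-up rows standing in for the Superstore data.
orders = (
    [order("Consumer", "Same Day", "Office Supplies")] * 3
    + [order("Consumer", "Same Day", "Furniture")]
    + [order("Consumer", "Standard Class", "Technology")] * 2
    + [order("Corporate", "Same Day", "Technology")] * 4
)

# Equivalent of Keep Only on the Consumer segment.
consumer_orders = [o for o in orders if o["Segment"] == "Consumer"]

# Equivalent of highlighting the Same Day delivery mode.
same_day = [o for o in consumer_orders if o["Delivery Mode"] == "Same Day"]

# Equivalent of sorting the Category values in descending order of row count.
ranked = Counter(o["Category"] for o in same_day).most_common()
top_category, top_rows = ranked[0]

# Share of all consumer orders that fall into the top category.
share = top_rows / len(consumer_orders)

print(top_category, top_rows, f"{share:.0%}")
```

Note that the Corporate rows would have made Technology the top same-day category had the segment filter not been applied first, which is why the order of the steps matters.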

How it works…

Using Tableau Prep Builder, we've quickly performed exploratory data analysis without the need to run our flow or create new outputs. Doing so provides great value not only in terms of a fast turnaround but also in keeping your data landscape clean by avoiding the creation of new data sources (outputs) for simple analysis.

When you perform analysis in this fashion, Tableau Prep instantly runs the required actions in the background to give you the results. In the Superstore example flow, this is fairly quick. However, on large datasets, this may take more time. Tableau Prep will show a progress indicator in the top-right corner when performing such background actions:

Figure 1.20 – Background actions

In this recipe, you've learned how to quickly perform ad hoc data analysis in Tableau Prep without the need to export your data for analysis in a downstream application.

Preparing data for generic BI tools

In this recipe, you'll learn how to use Tableau Prep to generate outputs for consumption by a variety of Business Intelligence (BI) tools. Specifically, we'll take a flow with multiple outputs and write a single one of them to a CSV file. At the time of writing, CSV is the only output format supported by Tableau Prep that is not proprietary to Tableau. Future releases of Tableau Prep will see the introduction of output to databases.
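CSV works well as a hand-off format because almost any tool can read it. The short Python sketch below, which uses only the standard library, writes two hypothetical rows as CSV text and parses them back the way a downstream tool might. The column names and values are invented for illustration, and an in-memory buffer stands in for the output file.

```python
import csv
import io

# Hypothetical output rows; in practice these would come from the flow.
rows = [
    {"Category": "Office Supplies", "Orders": 3},
    {"Category": "Furniture", "Orders": 1},
]

# Write the rows as CSV text. The buffer stands in for a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Category", "Orders"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Any consumer can parse the text back; every value returns as a string.
read_back = list(csv.DictReader(io.StringIO(csv_text)))

print(csv_text)
print(read_back)
```

One trade-off is visible here: CSV carries no data type information, so the consuming tool has to infer or reassign types such as numbers and dates.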

Getting ready

Open the Tableau Prep Superstore flow to follow along with the steps outlined.

How to do it…