Data is everywhere and everything is data!
Visualization of data allows us to bring out the underlying trends and patterns inherent in the data and gain insights that enable faster and smarter decision making.
Tableau is one of the fastest growing and industry leading Business Intelligence platforms that empowers business users to easily visualize their data and discover insights at the speed of thought. Tableau is a self-service BI platform designed to make data visualization and analysis as intuitive as possible.
Creating visualizations with simple drag-and-drop, you can be up and running on Tableau in no time.
Starting with fundamentals such as getting familiar with Tableau Desktop, connecting to common data sources, and building standard charts, you will walk through the nitty-gritty of Tableau, such as creating dynamic analytics with parameters, blended data sources, and advanced calculations. You will also learn to group members into higher levels, sort the data in a specific order, and filter out unnecessary information. You will then create calculations in Tableau and understand the flexibility and power they have, and go on to build storyboards and share your insights with others.
Whether you are just getting started or need a quick reference on a “how-to” question, this book is the perfect companion for you.
Number of pages: 341
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: December 2016
Production reference: 1161216
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78439-551-3
www.packtpub.com
Author
Shweta Sankhe-Savale
Reviewer
Sally Zhang
Commissioning Editor
Veena Pagare
Acquisition Editor
Chaitanya Nair
Content Development Editor
Sumeet Sawant
Technical Editor
Akash Patel
Copy Editor
Safis Editing
Project Coordinator
Shweta H Birwatkar
Proofreader
Safis Editing
Indexer
Aishwarya Gangawane
Graphics
Disha Haria
Production Coordinator
Nilesh Mohite
Cover Work
Nilesh Mohite
Shweta Sankhe-Savale is the Co-founder and Head of Client Engagements at Syvylyze Analytics (pronounced as "civilize"), a boutique business analytics firm specializing in visual analytics. Shweta is a Tableau Desktop Qualified Associate and a Tableau Accredited Trainer. Being one of the leading experts on Tableau in India, Shweta has translated her experience and expertise into successfully rendering analytics and data visualization services for numerous clients across a wide range of industry verticals. She has taken up numerous training as well as consulting assignments for customers across various sectors like BFSI, FMCG, Retail, E-commerce, Consulting & Professional Services, Manufacturing, Healthcare & Pharma, ITeS etc. She even had the privilege of working with some of the renowned Government and UN agencies as well.
Combining her ability to break down complex concepts with her expertise in Tableau's visual analytics platforms, Shweta has successfully trained 1300+ participants from 85+ companies.
LinkedIn: https://in.linkedin.com/in/shwetasavale
Sally Zhang is an aspiring data scientist and software engineer. She graduated with a BS in Statistics and Computer Science in May 2014 and an MS in Computer Science in Dec 2015 from University of Illinois Urbana-Champaign. Sally is currently working at Apple, Inc. as a Machine Learning Software Engineer. Before this, she worked at Neustar, Bazaarvoice, and Groupon, where she worked on data analytics, data engineering and infrastructures with tools and languages such as Tableau, Hadoop, Splunk, Python/Bash/R, Java, and SQL.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Thank you for purchasing this Packt book. We take our commitment to improving our content and products to meet your needs seriously—that's why your feedback is so valuable. Whatever your feelings about your purchase, please consider leaving a review on this book's Amazon page. Not only will this help us, more importantly it will also help others in the community to make an informed decision about the resources that they invest in to learn.
You can also review for us on a regular basis by joining our reviewers' club. If you're interested in joining, or would like to learn more about the benefits we offer, please contact us: <[email protected]>
Tableau is a suite of business analytics and data visualization tools that allows people to explore and analyze data quickly and easily with simple drag-and-drop operations.
Tableau Software, Inc. (http://www.tableau.com/) was founded in 2003 by Chris Stolte, Christian Chabot, and Pat Hanrahan. What began as a research project at Stanford University between 1999 and 2002 soon changed the way people see and interact with their data. Through the development of a database visualization language called VizQL (Visual Query Language), which combines a structured query language for databases with a descriptive language for rendering graphics, Tableau gave great power to end users and allowed them to visualize and interact with their data through simple drag-and-drop operations.
Giving people the ability to analyze their data and interact with it at the speed of thought, Tableau software empowered end users to ask questions on the fly by using their self-service analysis products suite.
The Tableau product suite – differences between the products
Before we actually get started on any visualization, it would be useful to understand the overall range and purpose of the various products offered by Tableau.
The overall suite of products can be broadly divided into two categories: those built for the creation of dashboards and visualizations, and those built for collaboration, sharing, and management of these dashboards and visualizations.
Tableau Desktop
Tableau Desktop is the primary tool most of us will spend most of our time on. It is where we actually create the visualizations, analytics, and dashboards. It is the tool in which all our development will be done.
Tableau Desktop comes in two editions: the Desktop Professional edition, which most of us will typically use, and the Desktop Personal edition, which is typically used by people with limited data connectivity needs. To explain this a bit further, the Desktop Professional edition is a full-feature version that can connect to a wide range of data sources, including flat files as well as large database formats (which we will cover in more detail a bit later). The Desktop Personal edition, by contrast, is a limited version in the sense that it can connect only to flat file formats as a data source (Excel, Access, statistical files, and so on) and does not give the option to connect to any database formats. However, in terms of all other features, the Desktop Professional and the Desktop Personal editions are essentially identical.
Tableau Public
Tableau Public is a free edition that is very similar to Desktop Personal in most ways. It has virtually the full range of features available in the Desktop editions and connects only to flat file formats and not database formats, but with one key distinction. The Tableau Public edition is meant for anyone wanting to post their dashboards and visualizations on the Web, typically for bloggers, journalists, researchers, and the like dealing with public or open data. Hence, the Tableau Public edition does not allow you to save your work offline to your laptop, but publishes your visualizations directly to the Web, on your Tableau Public account on the Tableau public cloud server. The Tableau Public edition is a great tool for anyone wanting to build great visualizations for public consumption, but is not recommended for anyone working with confidential data.
Tableau Server
Tableau Server is an on-premise hosted browser and mobile-based collaboration platform used to publish dashboards created in Tableau Desktop and share them throughout your organization. It allows you to share, to some extent edit, and publish dashboards within your organization while managing access rights and making your visualizations accessible securely over the Web. It also allows you to maintain live data connectivity to backend data sources, which in turn allows users to view up-to-date dashboards online from anywhere. The Tableau Server also allows you to view your dashboards on a mobile tablet through an app available on iOS as well as Android.
Tableau Online
Tableau Online is a cloud-hosted, SaaS version of Tableau Server. It brings Server's capabilities to the cloud without the infrastructure cost.
Tableau Reader
Tableau Reader is a free desktop application that you can use to open, view, and interact with dashboards and visualizations built in Tableau Desktop. Since the dashboards built in Tableau Desktop can package the data within the workbook itself when you save it, Tableau Reader allows you to filter, drill down, view the details of the data, and interact with the dashboards to the full extent of what the author has intended. That said, it being a reader, you cannot make any changes or edit the dashboard in any way beyond what has already been built in by the author.
With this brief introduction to Tableau's suite of products, you will notice that the entire process of creating a dashboard or visualization is done within Tableau Desktop, and thereby this is the product we will be focusing on for the purposes of this book.
In this book, we will go through a bunch of recipes and create a Tableau workbook. The idea is that we follow the recipes and create them from scratch; however, a final copy of the Tableau workbook has been uploaded on the following link.
https://1drv.ms/u/s!Av5QCoyLTBpnhlRBwZcWGGJKpasC.
Chapter 1, Keep Calm and Say Hello to Tableau, covers the fundamentals of Tableau. We learn how to connect to data, get acquainted with the Tableau workspace and terminologies, and finally see how to save the workbook as a Tableau workbook.
Chapter 2, Ready to Build Some Charts? Show Me!, focuses on the data visualization part. We learn to create some basic charts such as text table, highlight table, heat map, bar chart, stacked bar, pie chart, line chart, area chart, tree map, packed bubble chart, and word cloud.
Chapter 3, Hungry for More Charts? Dig In!, focuses on the advanced chart types in Tableau. We learn how to create charts to compare multiple measures by creating a blended axes chart, dual axes chart, combination chart, scatter plot, and so on. We will also understand how we can create a Gantt chart, build maps and use background images.
Chapter 4, Slice and Dice – Grouping, Sorting, and Filtering Data, teaches you how to do various analyses on the data such as grouping members into higher levels, sorting, filtering unnecessary information, and creating custom hierarchies.
Chapter 5, Adding Flavor – Create Calculated Fields, looks at various calculations in Tableau. The idea is that not every single field will come from the database, and hence one needs to create some calculations in the tool. We will look at creating custom calculations, level-of-detail calculations, and the use of table calculations and parameters.
Chapter 6, Serve It on a Dashboard!, is about building one holistic view for end users and giving them a consolidated snapshot of the business. We will look at building dashboards, the use of actions to link multiple sheets on the dashboard, the use of images on the dashboard, formatting dashboards, and using cross-data-source filters.
Chapter 7, The Right MIX – Blending Multiple Data Sources, walks us through the options of connecting to data from multiple data sources. We will look at concepts such as data blending, multiple table joins, cross-database joins, unions, custom SQL, and working with Tableau extracts.
Chapter 8, Garnish with Reference Lines, Trends, Forecasting, and Clustering, focuses on some specific analytics in terms of computing and understanding trends in the data. We create a forecast using the built-in forecasting model and, lastly, understand the use of reference lines as benchmarks. We will look at topics such as trend lines, forecasts, reference lines, bullet charts, and clustering.
Chapter 9, Bon Appétit! Tell a Story and Share It with Others, covers the storytelling feature in Tableau. We also look at the various ways in which one can save and share their work with others.
Chapter 10, Formatting in Tableau for Desserts, focuses on the various formatting options in Tableau.
You'll need any one of the following for the book:
Tableau Desktop Professional version 10.1 or higher
Tableau Desktop Personal version 10.1 or higher
Tableau Public version 10.1 or higher
Please note that it is ideal to use Tableau Desktop Professional. If you are using Tableau Desktop Personal or Tableau Public, some of the functionality described in the book may differ.
This book is for anyone who wishes to use Tableau. It will be of use both to beginners who want to learn Tableau from scratch and to more seasoned users who simply want a quick reference guide. It is a ready reckoner: new and existing Tableau users who don't know, or can't recall, how to perform a particular Tableau task can look it up here and benefit from it.
In this book, you will find several headings that appear frequently (Getting ready, How to do it…, How it works…, There's more…, and See also).
To give clear instructions on how to complete a recipe, we use these sections as follows:
Getting ready: This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.
How to do it…: This section contains the steps required to follow the recipe.
How it works…: This section usually consists of a detailed explanation of what happened in the previous section.
There's more…: This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.
See also: This section provides helpful links to other useful information for the recipe.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can include other contexts through the use of the include directive."
A block of code is set as follows:
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
Any command-line input or output is written as follows:
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Clicking the Next button moves you to the next screen."
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Tableau-Cookbook-Recipes-for-Data-Visualization. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
If you are using Tableau Public, you'll need to locate the workbooks that have been published to Tableau Public. These may be found at the following link: http://goo.gl/wJzfDO.
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/TableauCookbookRecipesforDataVisualization_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
In this chapter, we will cover the following recipes:
Earlier, in the preface, we saw the various products that are available under Tableau's product suite. The focus of this book will be on Tableau Desktop, the UI authoring tool of Tableau that will help us analyze and visualize our data. So, let's get started by familiarizing ourselves with Tableau Desktop.
To get started creating our visualizations, we will be using Tableau Desktop Professional Edition Version 10.1. This is the latest version of Tableau products and is compatible with both Windows and Mac OS. However, for the sake of a wider audience, we will be using the Windows-compatible version in this book. That said, those using version 10.1 on Mac OS should not see much of a difference, since the user interface and functionality are essentially the same.
Before we get started, we need to make sure that Tableau Desktop Professional Edition Version 10.1 is downloaded and installed on our machines.
Note that if we use Tableau Public, then some of the steps for saving our work will change, since Tableau Public doesn't allow us to save our work locally. Also, if we use Tableau Desktop Personal Edition, then apart from certain limitations, such as not being able to connect to databases, the rest of the functionality will mostly remain the same.
If you are an existing licensed user of Tableau Desktop, then make sure that you are using the latest version. If the version is anything below 10.1, then it is recommended that you upgrade it to the latest version, as there are significant changes in terms of the User Interface as well as features. Nonetheless, even if we decide to continue with the earlier versions, the majority of the fundamental concepts and features outlined in this book would remain the same, although their implementation may differ.
If you do not have a license, the 14-day full-feature trial of Tableau Desktop Professional Version 10.1 can be downloaded and installed from Tableau's website (http://www.tableau.com/).
Tableau, being a plug and play software, simply needs to be downloaded and installed just like any .exe or .dmg installable.
The Start Page of Tableau is divided into three parts, namely Connect, Open, and Discover; refer to the numbers 1, 2, and 3 respectively in the preceding screenshot.
The top-left corner, which says Connect (number 1), consists of three sections called To a File, To a Server, and Saved Data Sources. In each of these sections, we will see the list of data sources that Tableau can connect to.
The To a File section consists of flat file data sources such as Excel files, Access files, text files, JSON files, and statistical files, including SAS, SPSS, and R. Clicking on the More… option under this section also allows us to connect to Tableau's data extract files, which we will cover in detail in later chapters.
The To a Server section consists of data sources such as Microsoft SQL Server, MySQL, Oracle, and so on. To see a more detailed list of data sources that Tableau can connect to, we need to click on the More… option and expand the section by clicking on the arrow. Refer to the following image that illustrates this:
Just below the To a Server section, we will see the Saved Data Sources section. This section essentially points to those data sources that have been previously worked on and then saved for later purposes. Currently, we will not use this option and we will start by connecting to the raw data and not use any of the existing saved data sources.
Adjacent to the Connect header is an empty section that is covered under the header Open (number 2). This is the section where we will see the thumbnails of all our recently opened workbooks. This section is blank to begin with; however, as we create and save new workbooks, this section will display the thumbnails of the nine most recently opened workbooks.
This section also gives us access to some sample workbooks that are provided by Tableau for our reference. We can open these workbooks in Tableau Desktop to see how certain functionalities are used or how a particular visualization is created in Tableau.
A workbook in Tableau is similar to that of Excel. Just as an Excel workbook consists of multiple sheets, the workbook in Tableau contains multiple worksheets, and/or dashboards, and/or stories. The worksheet in Tableau consists of one view/visualization/report, whereas a dashboard is a combination of multiple worksheets which when viewed will provide the viewer with a holistic view.
The next section is the Discover section (number 3). This section basically provides links to resources that are available on the Tableau website. Apart from the training videos, this section also shows the blogs, forums, latest news about Tableau, as well as the views that are selected as the Viz of the week on Tableau Public.
Viz is short for visualization/visual. In the Tableau community, this word is used very regularly to describe the visualizations done in Tableau.
Tableau is a very versatile tool and is being used across various industries, businesses, and organizations. These include government and non-profit organizations, the BFSI sector, consulting, construction, education, healthcare, manufacturing, retail, FMCG, software and technology, telecommunications, and many more. The good thing about Tableau is that it is industry- and business-vertical-agnostic, and hence, as long as we have data, we can analyze and visualize it.
Tableau can connect to a wide variety of data sources, and many of the data sources are implemented as native connections in Tableau. This ensures that the connections are as robust as possible.
To view the comprehensive list of data sources that Tableau connects to, we can visit the technical specification page on the Tableau website by clicking on http://www.tableau.com/products/techspecs.
Tableau provides some sample datasets with the Desktop edition. In this book, we will frequently use these sample datasets. We can find them in the Datasources folder within the My Tableau Repository folder, which gets created in the Documents folder when Tableau Desktop is installed on the machine. We can look for these data sources in the repository or quickly download them from https://1drv.ms/f/s!Av5QCoyLTBpnhj06IKTNX0S9hK48. Once downloaded, save them in a new folder called Tableau Cookbook data under Documents\My Tableau Repository\Datasources.
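To double-check that the sample files ended up where we expect, a few lines of Python can list what is inside the Datasources folder. This is only a rough sketch: the helper names datasources_dir and list_sample_files are made up for illustration, and the script assumes that the Documents folder sits directly under the user's home folder, which may not hold on every machine.

```python
from pathlib import Path


def datasources_dir(documents: Path) -> Path:
    """Build the path of the Datasources folder inside My Tableau Repository."""
    return documents / "My Tableau Repository" / "Datasources"


def list_sample_files(folder: Path) -> list[str]:
    """Return the sorted names of all files found under folder (subfolders included)."""
    if not folder.is_dir():
        # The repository is missing or lives elsewhere; report nothing.
        return []
    return sorted(p.name for p in folder.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Assumption: Documents lives directly under the home folder.
    repo = datasources_dir(Path.home() / "Documents")
    for name in list_sample_files(repo):
        print(name)
```

If the list comes back empty, the repository is most likely in a different location, or the files have not been copied yet.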
There are three files that have been uploaded, and these are the ones that we will primarily use throughout the book. They are as follows:
In the following section, we will see how to connect to the sample data source. We will be connecting to the Excel data called Sample - Superstore.xls.
This Excel file contains transactional data for a retail store. There are three worksheets in this Excel workbook. The first sheet, called the Orders sheet, contains the transaction details. The Returns sheet contains the status of returned orders, and the People sheet contains the region names and the names of the managers associated with those regions. Refer to the following image to get a glimpse of how the Excel data is structured:
Now that we have looked at the Excel data, let's see how to connect to this data in the following recipe. To begin with, we will work on the Orders sheet of the Sample - Superstore.xls data. This worksheet contains the order details in terms of the products purchased, the name of the customer, sales, profits, discounts offered, day of purchase, and the order shipment date, among many other transactional details.
Before we connect to any data, we need to make sure that our data is clean and in the right format. The Excel file that we connected to is stored in a tabular format, where the first row of the sheet contains the column headers and every other row is a single transaction. This is the ideal data structure for making the best use of Tableau. Typically, when we connect to databases, we get columnar/tabular data. However, flat files such as Excel can also hold data in cross-tab formats. Although Tableau can read cross-tab data, we may face limitations in creating certain chart types, as well as in aggregating and slicing and dicing our data in Tableau.
Having said that, there may be situations where we have to deal with such cross-tab or preformatted Excel files. These files will need cleaning up before being pulled into Tableau. Refer to http://kb.tableausoftware.com/articles/knowledgebase/preparing-excel-files-analysis to understand more about how we can clean up these files and make them Tableau-ready. To see how we can quickly pivot the data in Excel, refer to http://kb.tableau.com/articles/knowledgebase/addin-reshaping-data-excel.
If it is a cross-tab file, then we will have to pivot it into normalized columns either at the data level or at the Tableau level on the fly. We can do so by selecting multiple columns that we wish to pivot and then selecting the Pivot option from the drop-down menu that appears when we hover over any of the columns. Refer to the following image:
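To make the idea of pivoting concrete, the following Python sketch shows what happens to a small cross-tab when its month columns are turned into rows. It is an illustration of the reshaping only, not of how Tableau implements it; the unpivot function, the sample figures, and the column names Month and Sales are all made up for this example.

```python
def unpivot(rows, id_fields, pivot_fields, name_col, value_col):
    """Turn each wide row into one long row per pivoted column."""
    long_rows = []
    for row in rows:
        for field in pivot_fields:
            # Carry over the identifying fields unchanged.
            long_row = {key: row[key] for key in id_fields}
            # The old column header becomes a value; its cell becomes the measure.
            long_row[name_col] = field
            long_row[value_col] = row[field]
            long_rows.append(long_row)
    return long_rows


# Hypothetical cross-tab: one row per region, one column per month.
crosstab = [
    {"Region": "East", "Jan": 100, "Feb": 120},
    {"Region": "West", "Jan": 90, "Feb": 95},
]

long_rows = unpivot(crosstab, ["Region"], ["Jan", "Feb"], "Month", "Sales")
for row in long_rows:
    print(row)
```

After the reshape, every row holds a single Region, Month, and Sales value, which is the tabular structure described earlier as the ideal input for Tableau.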
Further, if the format of the data in our Excel file is not suitable for analysis in Tableau, then we can turn on the Data Interpreter option, which becomes available when Tableau detects any unique formatting or any extra information in our Excel file. For example, the Excel data may include some empty rows and columns or extra headers and footers. Refer to the following image:
Data Interpreter can remove that extra bit of information to help prepare our Tableau data source for analysis. Refer to the following image:
When we enable Data Interpreter, the preceding view will change to what is shown in the following image:
This is how Data Interpreter works in Tableau.
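The kind of cleanup described here can be imitated in a few lines of Python. The sketch below is not Tableau's Data Interpreter and makes no claim about how that feature works internally; it only applies two simple heuristics that we assume for illustration: rows with fewer than two filled cells are treated as blank lines, titles, or footers, and columns that are empty in every remaining row are dropped. The function name strip_extras and the sample grid are hypothetical.

```python
def is_blank(cell):
    """A cell counts as blank if it is None or contains only whitespace."""
    return cell is None or str(cell).strip() == ""


def strip_extras(grid):
    """Return (headers, data_rows) after dropping title, footer, and empty rows and columns."""
    width = max((len(row) for row in grid), default=0)
    # Pad ragged rows so that every row has the same number of cells.
    padded = [list(row) + [None] * (width - len(row)) for row in grid]
    # Heuristic: in a multi-column sheet, a real row has at least two filled cells.
    min_cells = 2 if width > 1 else 1
    body = [row for row in padded if sum(not is_blank(c) for c in row) >= min_cells]
    if not body:
        return [], []
    # Keep only the columns that hold a value in at least one remaining row.
    keep = [i for i in range(width) if any(not is_blank(row[i]) for row in body)]
    cleaned = [[row[i] for i in keep] for row in body]
    return cleaned[0], cleaned[1:]


# Hypothetical sheet with a title, blank lines, an empty first column, and a footer.
grid = [
    ["Quarterly sales report", "", "", ""],
    ["", "", "", ""],
    ["", "Region", "Sales", "Profit"],
    ["", "East", 100, 20],
    ["", "West", 90, 15],
    ["", "", "", ""],
    ["Source: finance team", "", "", ""],
]

headers, data = strip_extras(grid)
print(headers)
print(data)
```

Heuristics like these are easy to fool, for instance by a genuine data row with a single filled cell, so the result should always be checked against the original sheet.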
Many times, we may also find that several data fields are combined in a single column. Refer to the following image:
In the preceding image, the highlighted column is basically a concatenated field that has the Country, City, and State fields. For our analysis, we may want to break these and analyze each geographic level separately. To do so, we simply need to use the Split or Custom Split option in Tableau. Refer to the following image:
Once we do this, our view would be as shown in the following image:
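The effect of splitting such a compound field can be sketched in Python as follows. This illustrates the idea rather than Tableau's own Split logic: the split_field function is hypothetical, the sample values are invented, and we assume that the parts are separated by commas, which may differ from the delimiter in the actual data.

```python
def split_field(value, delimiter, names):
    """Split a compound value and label the parts with the given field names."""
    parts = [part.strip() for part in value.split(delimiter)]
    # Pad with None when the value has fewer parts than expected;
    # any surplus parts are ignored by zip.
    parts += [None] * (len(names) - len(parts))
    return dict(zip(names, parts))


fields = ["Country", "City", "State"]

# Invented sample values in Country, City, State order.
full = split_field("United States, Seattle, Washington", ",", fields)
short = split_field("India, Mumbai", ",", fields)

print(full)
print(short)
```

Each geographic level ends up in its own field, so it can be analyzed separately, and a value with a missing part simply leaves the last field empty.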
Further, when preparing data for analysis, a list of fields may at times be easier to consume than the current preview of our data. The Metadata grid in Tableau provides exactly that, along with many other quick functions such as renaming fields, hiding columns, changing data types, changing aliases, creating calculations, splitting fields, merging fields, and pivoting the data. Refer to the following image:
After having established the initial connectivity by pointing to the right data source, we need to specify how we wish to maintain that connectivity. We can choose between the Live option and the Extract option.
The Live option helps us connect to our data directly and maintains a live connection with the data source. Using this option allows Tableau to leverage the capabilities of our data source, and in this case, the speed of our data source will determine the performance of our analysis.
The
