Master the intricacies of Tableau to create effective data visualizations
If you are a business analyst without developer-level programming skills, then this book is for you. You are expected to have at least a fundamental understanding of Tableau and basic knowledge of joins; however, SQL knowledge is not assumed. You should have basic computer skills, including at least moderate Excel proficiency.
Tableau has emerged as one of the most popular Business Intelligence solutions in recent times, thanks to its powerful and interactive data visualization capabilities. This book will empower you to become a master in Tableau by exploiting the many new features introduced in Tableau 10.0.
You will embark on this exciting journey by getting to know the valuable methods of utilizing advanced calculations to solve complex problems. These techniques include creative use of different types of calculations such as row-level, aggregate-level, and more. You will discover how almost any data visualization challenge can be met in Tableau by getting a proper understanding of the tool's inner workings and creatively exploring possibilities.
You'll be armed with an arsenal of advanced chart types and techniques that enable you to present information clearly and engagingly to a variety of audiences through well-designed dashboards. Explanations and examples of efficient and inefficient visualization techniques, well-designed and poorly designed dashboards, and compromise options for when Tableau consumers will not embrace data visualization will build on your understanding of Tableau and how to use it efficiently.
By the end of the book, you will be equipped with all the information you need to create effective dashboards and data visualization solutions using Tableau.
This book takes a direct approach, progressing systematically to more involved functionality such as advanced calculations, parameters and sets, data blending, and R integration. It will help you gain skill in building visualizations previously beyond your capacity.
Page count: 433
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: November 2016
Production reference: 1231116
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78439-769-2
www.packtpub.com
Author: David Baldwin
Copy Editor: Vikrant Phadke
Reviewer: Tamas Foldi
Project Coordinator: Shweta H Birwatkar
Commissioning Editor: Veena Pagare
Proofreader: Safis Editing
Acquisition Editor: Vinay Argekar
Indexer: Aishwarya Gangawane
Content Development Editor: Sumeet Sawant
Graphics: Disha Haria
Technical Editor: Akash Patel
Production Coordinator: Nilesh Mohite
David Baldwin has provided consulting in the business intelligence sector for 17 years. His experience includes Tableau training and consulting, developing BI solutions, project management, technical writing, and web and graphic design. His vertical experience includes the financial, healthcare, human resource, aerospace, energy, education, government, and entertainment industries. As a Tableau trainer and consultant, David enjoys serving a variety of clients throughout the USA. Tableau provides David a platform that collates his broad experience into a skill set that can service a diverse client base.
Many people provided invaluable support in the writing of this book. Although I cannot name everyone, there are those to whom I would like to draw special attention: My wife, Kara, was an unfailing encourager, supporter, and cheerleader throughout the writing journey. My children, Brent and Brooke, were very understanding of their dad’s many long hours in front of a laptop at the dining room table. My mother, Bettye, was my first and best writing instructor and thus provided a foundation for clear communication. My father, Larry, taught me the importance of precise technical and mathematical thinking. My sister, Chelsea, modeled perseverance as she pursued and achieved advanced degrees. Also I’d like to thank my colleagues at Teknion for being ever willing to entertain questions, provide valuable feedback, and read rough drafts, particularly Bridget Cogley, Matthew Agee, Preston Howell, and especially Joshua Milligan.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
So what is this book about? The title certainly points in the right direction: Mastering Tableau. The word Mastering implies a journey to a level of competency beyond mere familiarity or superficial knowledge. The word Tableau, of course, limits the scope to a particular software package. Let’s extend the title by one word in order to hone the focus: Mastering Tableau Desktop. The word Desktop further narrows consideration by communicating that this book is not focused on Tableau Server, although there is a chapter dedicated to interacting with Server. Nor does this book dive deep into topics beyond the realm of Tableau, though other technologies such as R and SQL are discussed as they pertain to Tableau. Furthermore, this book is not focused on data visualization or architectural theory per se, though these topics are explored and every attempt is made to adhere to sound methodology as technical problems are discussed. Instead, this book attempts to build on an existing basic understanding of Tableau Desktop so as to provide a theoretical and practical basis for solving real-world challenges in an efficient and elegant manner. Along the way, many tips and tricks for use in everyday work are discussed, and exercises with careful step-by-step instructions and commentary are provided.
Chapter 1, Getting Up to Speed - a Review of the Basics, provides a quick on-ramp for those new to Tableau and a useful review for those with experience. For a more thorough consideration of fundamental topics, see Learning Tableau, written by Joshua Milligan and published by Packt Publishing.
Chapter 2, All about Data - Getting Your Data Ready, commences a series of three "All about Data" chapters. The chapter begins with a theoretical discussion of the Tableau data paradigm and data mining topics and then moves on to practical ways to use Tableau to survey and cleanse data.
Chapter 3, All about Data - Joins, Blends, and Data Structures, explores complex joins, data blending, and pivoting.
Chapter 4, All about Data - Data Densification, Cubes, and Big Data, ends the series of "All about Data" chapters by surveying a variety of data topics, including the undocumented world of data densification, working with cubes, and big data considerations.
Chapter 5, Table Calculations, focuses on two questions: “What is the function?” and “How is the function applied?” These questions provide a framework for discussing directional and non-directional table calculations as well as partitioning and addressing.
Chapter 6, Level of Detail Calculations, begins with two playground environments created in Tableau designed to provide a foundation for understanding level-of-detail calculations and then moves on to practical application.
Chapter 7, Beyond the Basic Chart Types, looks at improving some popular visualization types and then considers the largely underexplored topic of using background images in Tableau. The workbook provided with this chapter also provides many additional visualization types.
Chapter 8, Mapping, begins by considering how to expand Tableau’s native mapping capabilities without leaving the interface, and then explores extending Tableau mapping via other technologies, including connecting to WMS servers and MapBox. Lastly, the chapter demonstrates how to provide the end user options for choosing different maps and ends with a discussion on custom polygons.
Chapter 9, Tableau for Presentations, discusses techniques for integrating Tableau with PowerPoint as well as how to use Tableau as a standalone presentation tool via animation and story points.
Chapter 10, Visualization Best Practices and Dashboard Design, begins by considering design topics such as formatting, color, and visualization types and then addresses dashboard layout options. The chapter ends by exploring sheet swapping in some depth.
Chapter 11, Improving Performance, is the longest chapter of the book and attempts to systematically (though not exhaustively) cover options for optimizing Tableau performance.
Chapter 12, Interacting with Tableau Server, explores how to optimize Tableau Server architecture for best performance and easiest maintenance. The chapter also considers the web authoring environment, user filters, and accessing the Performance Recording dashboard via Tableau Server.
Chapter 13, R Integration, begins by considering how to install and integrate R with Tableau and then explores R and Tableau integration via a series of exercises. The chapter ends with a troubleshooting section.
In order to make use of this book, an installation of Tableau 10 is required. The following technologies are mentioned and lightly utilized in this book but are not strictly required:
Mastering Tableau targets persons with 5+ months of experience using Tableau. Although not strictly required, a thorough reading of the predecessor to this book, Learning Tableau, is helpful. Alternatively, the Desktop I and II training provided by Tableau provides a helpful foundation. A basic knowledge of SQL is helpful in a few sections. A basic knowledge of Excel is assumed.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to [email protected], and mention the book title via the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on https://www.packtpub.com/books/info/packt/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Tableau. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide you a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/MasteringTableau_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at [email protected] with a link to the suspected pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
You can contact us at [email protected] if you are having a problem with any aspect of the book, and we will do our best to address it.
The goal of this book is to empower you to become a Tableau master; in Tableau-speak, the term is Jedi. Yes, that is official Tableau terminology. Attend the yearly Tableau conference and you can sit in on Jedi classes. Of course, simply attending a class will not automatically bestow Jedi powers on you - nor will simply reading this book. Diligent work on real-world problems is absolutely essential. Couple this diligent work with industrious study and you will make it. You will become a Tableau Jedi. My hope is that this book will prove useful to you on your journey to mastery.
If you are a seasoned Tableau author, you may find this initial chapter elementary. (A person who creates Tableau workbooks is referred to as an author, not a developer.) For such persons, I recommend a quick, inspectional read. If after a few minutes you are satisfied you already possess a solid understanding of the concepts discussed, feel free to proceed to subsequent chapters. If, however, you find some of the content unfamiliar, it may be wise to read with greater attention.
Those who are fairly new to Tableau should find this chapter helpful in getting up to speed quickly; however, since this book targets advanced topics, relatively little time is spent considering the basics. For a more thorough consideration of fundamental topics, consider Learning Tableau, written by Joshua Milligan and published by Packt Publishing.
In this chapter, we will discuss the following:
Tableau Software has a focused vision resulting in a small product line. The main product (and hence the center of the Tableau universe) is Tableau Desktop. Assuming you are a Tableau author, that's where almost all your time will be spent when working with Tableau. But of course you must be able to connect to data and output the results. Thus, as shown in the following figure, the Tableau universe encompasses data sources, Tableau Desktop, and output channels, which include the Tableau Server family and Tableau Reader:
Data Sources
Tableau connects to many data sources. Those data sources will be discussed in more detail in the following section.
Tableau Desktop
Tableau Desktop is where visualizations are created. Although, as of Tableau 8.0, some authoring capabilities were introduced into the Tableau Server environment, that environment is limited. (See Chapter 12, Interacting with Tableau Server to learn more about authoring in the Tableau Server environment.) Thus, the heavy lifting is still done in Tableau Desktop.
Tableau Server Family
Once completed in Tableau Desktop, a workbook can be uploaded to Tableau Server for end-user access. Tableau Server provides a secure, web-based environment where end users can access visualizations created in Desktop either through a browser or via the Tableau Mobile app for Android and iPhone.
Tableau Online is a cloud-based version of Tableau Server hosted by Tableau Software. It's an ideal solution for smaller organizations that need the security and flexibility of Server without the associated overheads.
Tableau Public is, in reality, split into two products: the Tableau Public client, and a cloud-based, public-facing version of Tableau Server. The client has the capabilities of Desktop, with a few major exceptions:
Tableau Reader
The relationship between Tableau Desktop and Tableau Reader is analogous to that between Adobe Acrobat and Adobe Reader. Desktop is used for authoring; Reader is used for viewing. Desktop has an associated cost; Reader is free.
A few brief notes regarding Reader:
To begin our survey of the basics, let's consider the terminology and associated definitions of assets that make up the Tableau workspace. Lingering a little on these terms should prove helpful, since each is used throughout the book:
Go to Start: Click to toggle between the start page and the workspace.
Data Pane: Provides access to the data source, all fields, sets, and parameters, and displays the underlying data when the View Data icon is clicked.
Data Source: Lists all data sources and provides access to edit data sources, create data extracts, publish data sources, and more.
Dimensions: Lists all fields classified as Dimensions.
Measures: Lists all fields classified as Measures.
Sets: Lists all sets created by the Tableau author.
Parameters: Lists all parameters created by the Tableau author.
Data Source Tab: Provides access to the Data Source page.
Shelves: Areas where fields are placed to create views.
Legend: Serves the dual purpose of communicating Color, Size, and Shape information that exists in the view, as well as providing sorting, filtering, and highlighting capabilities.
Status Bar: Displays information about the current view.
Fit: Determines how a view is sized on the screen.
View: That which the end user sees via Tableau Server or Reader. Includes the visualization, legends, displayed filters, parameters, captions, and so on.
Pills: Fields, sets, or parameters that have been placed on one or more shelves. So named because the shape resembles a pill.
Show Me: Tool used to automatically create visualizations based on selected fields and any fields already placed on shelves.
Displayed Filter: A field exposed to the end user providing the ability to display/hide a numeric or date range of the field or display/hide members of the field. In earlier versions of Tableau this was referred to as a Quick Filter.
At the time of writing, Tableau's Data Connection menu includes 50 different connection types, and even that is something of an understatement, since some of those types contain multiple options. For example, the Other Files selection includes 21 options. Of course, we won't cover the details of every connection type, but we will cover the basics.
Upon opening a new instance of Tableau Desktop, you will note a link in the upper left-hand corner of the workspace. Clicking on that link will enable you to connect to data. Alternatively, you can simply click on the New Data Source icon on the toolbar:
Although subsequent chapters will consider connecting to other data sources, here we will limit the discussion to considerations when connecting to Excel and text files.
Upon choosing to connect to an Excel or text file, the Tableau author is presented with two choices. Note that those choices are somewhat hidden. As shown in the following screenshot, you will need to click on the arrow next to the Open button to access them:
The Open option uses a native Tableau driver. The Open with Legacy Connection option uses the Microsoft JET driver. Let's compare some of the differences between these two drivers.
Native Tableau Driver | MS Jet Driver
More set capabilities such as in/out and combined sets | Limited set capabilities
Count Distinct is allowed | Count Distinct is disallowed
Allows more than 255 columns | Columns are capped at 255
Special characters, such as brackets and quotation marks, are allowed in file and field names | Special characters are disallowed in file and field names
When connecting to Excel, the data type is determined by 95% of the first 10,000 rows | When connecting to Excel, the data type is determined by the first eight rows
Cannot connect to .xlsb files | Can connect to .xlsb files
File names can be any length | File names are limited to 64 characters
Custom SQL is not allowed | Custom SQL is allowed
Left and inner joins are allowed | Left, inner, and right joins are allowed
Pivot data from rows to columns | No pivoting feature
Improved header auto-detection | -
Note that the preceding table is not complete. There are many other differences between the functionality of Native Tableau Driver and MS Jet Driver. Most of those, however, are less consequential.
So, when should you use Native Tableau Driver versus MS JET Driver? In short, use the native Tableau driver! In almost every case it will provide better performance and more functionality. One exception is when custom SQL is required. Tableau Software does not recommend using custom SQL in most cases because Tableau-generated SQL will run more efficiently; however, in some cases it may be necessary.
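To see why the number of rows sampled for data type detection matters, consider the following minimal Python sketch. It is not the algorithm used by either driver; the classify and infer_type functions, the threshold logic, and the sample column are all assumptions made purely for illustration. It only shows how a column whose first eight rows look numeric can be typed differently once a much larger sample is examined.

```python
from collections import Counter


def classify(value: str) -> str:
    """Label a raw cell as 'number' or 'string' (a deliberately crude rule)."""
    try:
        float(value)
        return "number"
    except ValueError:
        return "string"


def infer_type(column, sample_size, threshold=1.0):
    """Return the type covering at least `threshold` of the first `sample_size` cells.

    Hypothetical helper: falls back to 'string' when no type reaches the threshold.
    """
    sample = column[:sample_size]
    if not sample:
        return "string"
    label, count = Counter(classify(v) for v in sample).most_common(1)[0]
    return label if count / len(sample) >= threshold else "string"


# Made-up column: eight numeric IDs followed by many alphanumeric ones.
column = [str(n) for n in range(8)] + [f"A{n}" for n in range(992)]

# Looking only at the first eight rows, the column appears numeric.
small_sample = infer_type(column, sample_size=8)

# Looking at up to 10,000 rows with a 95% threshold, it is treated as text.
large_sample = infer_type(column, sample_size=10_000, threshold=0.95)

print(small_sample, large_sample)
```

With a sample of eight rows, the alphanumeric values further down the column are never seen, which is the kind of surprise a larger sample helps to avoid.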
Connecting to Tableau Server is perhaps the single most important server connection type to consider, since it is frequently used to provide better performance than may otherwise be possible. Additionally, connecting to Tableau Server enables the author to receive not only data, but information regarding how that data is to be interpreted, for example, whether a given field should be considered a measure or a dimension. Let's explore this further via two exercises.
As a precursor to connecting to Tableau Server, let's compare and contrast the instance of the Superstore data source represented in the workbook associated with this chapter (that is, the Chapter 1 workbook) with a new connection to the same data.
In order to complete this exercise, access to an instance of Tableau Server is necessary. If you do not have access to Tableau Server, consider installing a trial version on your local computer:
Having completed the previous two exercises, let's discuss the most germane point; that is, metadata. Metadata is often defined as data about the data. In the preceding case, the data source name, default aggregation, default number formatting, and hierarchy are all examples of Tableau remembering changes made to the metadata. This is important because publishing a data connection allows for consistency across multiple Tableau authors. For example, if your company has a policy regarding the use of decimal points when displaying currency, that policy will be easily adhered to if all Tableau authors start building workbooks by pointing to data sources where all formatting has been predefined.
In the last exercise, the fact that the Profit Ratio calculated field was not directly editable when accessed via connecting to Tableau Server as a data source has important implications. Imagine the problems that would ensue if different Tableau authors defined Profit Ratio differently. End users would have no way of understanding what Profit Ratio truly means. However, by creating a workbook based on a published data source, the issue is alleviated. One version of Profit Ratio is defined and it can only be altered by changing the data source. This functionality can greatly improve consistency across the enterprise.
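The following Python sketch illustrates how two authors could arrive at different numbers for a field with the same name. It assumes two plausible definitions of Profit Ratio (total profit divided by total sales, and the average of each row's profit divided by its sales); neither formula is taken from the exercise, and the two rows of data are invented.

```python
# Made-up order rows; the values are for illustration only.
rows = [
    {"sales": 100.0, "profit": 50.0},
    {"sales": 900.0, "profit": 90.0},
]


def ratio_of_totals(rows):
    """Author A (assumed definition): total profit divided by total sales."""
    return sum(r["profit"] for r in rows) / sum(r["sales"] for r in rows)


def mean_of_row_ratios(rows):
    """Author B (assumed definition): average of each row's profit/sales."""
    return sum(r["profit"] / r["sales"] for r in rows) / len(rows)


author_a = ratio_of_totals(rows)     # 140 / 1000
author_b = mean_of_row_ratios(rows)  # (0.5 + 0.1) / 2

print(f"Author A: {author_a:.2f}  Author B: {author_b:.2f}")
```

Both results could legitimately be labeled Profit Ratio, yet they differ by more than a factor of two on the same data. Defining the field once in a published data source removes that ambiguity.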
Connecting to a saved data source on a local machine is much like connecting to a data source published on Tableau Server. Metadata definitions associated with the local data source are preserved, just like they are on Tableau Server. Of course, since the data source is local instead of remote, the publication process is different. Let's explore this via an exercise.
Note that you can save a local data source that points to a published data source on Tableau Server. First, connect to a published data source on Tableau Server. Next, right-click on the data source in your workspace and choose Add to Saved Data Sources. Now you can connect to Tableau Server directly from your Start page!
I've observed the following scenario frequently. A new Tableau author creates a worksheet and drags a measure to the Text shelf. They would like to create another row to display a second measure but do not know how. They drag the second measure to various places on the view and get results that seem entirely unpredictable. The experience is very frustrating for the author, since it's so easy to accomplish this in Excel! The good news is that it's also easy to accomplish in Tableau. It just requires a different approach. Let's explore the solution via an exercise.
Measure Names and Measure Values are generated fields in Tableau. They do not exist in the underlying data, but they are indispensable for creating many kinds of views. As may be guessed from its placement in the Data pane and its name, Measure Names is a dimension whose members are made up of the names of each measure in the underlying dataset. Measure Values contains the numbers or values of each measure in the dataset. Watch what happens below when measure names and measure values are used independently. Afterward observe how they work elegantly together to create a view.
Perhaps the interrelationship between Measure Names and Measure Values is best explained by an analogy. Consider several pairs of socks and a partitioned sock drawer. Step 2 is the equivalent of throwing the socks into a pile. The results are, well, disorganized. Step 4 is the equivalent of an empty sock drawer with partitions. The partitions are all in place but where are the socks? Step 5 is a partitioned drawer full of nicely organized socks. Measure Names is like the partitioned sock drawer. Measure Values is like the socks. Independent of one another they are not of much use. Used together, they can be applied in many different ways.
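For readers who think in terms of data structures, the sock drawer can also be sketched in a few lines of Python. This is only an analogy and not a description of how Tableau implements its generated fields; the to_long function and the sample records are hypothetical. Measure Names supplies the labels (the partitions), Measure Values supplies the numbers (the socks), and pairing them yields one organized row per measure.

```python
# Made-up records with one dimension (Region) and two measures.
records = [
    {"Region": "East", "Sales": 200, "Profit": 40},
    {"Region": "West", "Sales": 300, "Profit": 75},
]
measures = ["Sales", "Profit"]


def to_long(records, dimension, measures):
    """Pair each measure's name with its value, one output row per pair."""
    return [
        {dimension: rec[dimension], "Measure Names": name, "Measure Values": rec[name]}
        for rec in records
        for name in measures
    ]


# The socks in a pile: values with no labels attached.
values_only = [rec[name] for rec in records for name in measures]

# The empty partitioned drawer: labels with no values.
names_only = list(measures)

# The organized drawer: every value sits in its labeled partition.
long_rows = to_long(records, "Region", measures)

for row in long_rows:
    print(row)
```

Neither values_only nor names_only is of much use by itself, whereas long_rows can be grouped, filtered, or laid out by measure name.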
Tableau provides various shortcuts to quickly create a desired visualization. If you are new to the software, this shortcut behavior may not seem intuitive, but with a little practice and a few pointers, you will quickly gain understanding. Let's use the following exercise to explore how you can use a shortcut to rapidly deploy Measure Names and Measure Values.
Several things happened in step 2. After placing Sales on top of the Profit number in the view, Tableau did the following:
