Advanced Analytics with R and Tableau - Jen Stirrup - E-Book

Jen Stirrup

Description

Leverage the power of advanced analytics and predictive modeling in Tableau using the statistical powers of R

About This Book

  • A comprehensive guide that will bring out the creativity in you to visualize the results of complex calculations using Tableau and R
  • Combine Tableau analytics and visualization with the power of R using this step-by-step guide
  • Wondering how R can be used with Tableau? This book is your one-stop solution.

Who This Book Is For

This book will appeal to Tableau users who want to go beyond the Tableau interface and deploy the full potential of Tableau, by using R to perform advanced analytics with Tableau.

A basic familiarity with R is useful but not compulsory, as the book will start off with concrete examples of R and will move quickly into more advanced spheres of analytics using online data sources to support hands-on learning. Those R developers who want to integrate R in Tableau will also benefit from this book.

What You Will Learn

  • Integrate Tableau's analytics with the industry-standard, statistical prowess of R.
  • Make R function calls in Tableau, and visualize R functions with Tableau using RServe.
  • Use the CRISP-DM methodology to create a roadmap for analytics investigations.
  • Implement various supervised and unsupervised learning algorithms in R to return values to Tableau.
  • Make quick, cogent, and data-driven decisions for your business using advanced analytical techniques such as forecasting, predictions, association rules, clustering, classification, and other advanced Tableau/R calculated field functions.

In Detail

Tableau and R offer accessible analytics by allowing a combination of easy-to-use data visualization along with industry-standard, robust statistical computation.

Moving from data visualization into deeper, more advanced analytics? This book will intensify data skills for data viz-savvy users who want to move into analytics and data science in order to enhance their businesses by harnessing the analytical power of R and the stunning visualization capabilities of Tableau. Readers will come across a wide range of machine learning algorithms and learn how descriptive, prescriptive, predictive, and visually appealing analytical solutions can be designed with R and Tableau. In order to maximize learning, hands-on examples will ease the transition from being a data-savvy user to a data analyst using sound statistical tools to perform advanced analytics.

By the end of this book, you will get to grips with advanced calculations in R and Tableau for analytics and prediction with the help of use cases and hands-on examples.

Style and approach

Tableau (uniquely) offers excellent visualization combined with advanced analytics; R is at the pinnacle of statistical computational languages. When you want to move from one view of data to another, backed up by complex computations, the combination of R and Tableau makes the perfect solution. This example-rich guide will teach you how to combine these two to perform advanced analytics by integrating Tableau with R and create beautiful data visualizations.

Page count: 175

Publication year: 2017

Table of Contents

Advanced Analytics with R and Tableau
Credits
About the Authors
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Advanced Analytics with R and Tableau
Installing R for Windows
RStudio
Prerequisites for RStudio installation
Implementing the scripts for the book
Testing the scripting
Tableau and R connectivity using Rserve
Installing Rserve
Configuring an Rserve Connection
Summary
2. The Power of R
Core essentials of R programming
Variables
Creating variables
Working with variables
Data structures in R
Vector
Lists
Matrices
Factors
Data frames
Control structures in R
Assignment operators
Logical operators
For loops and vectorization in R
For loops
Functions
Creating your own function
Making R run more efficiently in Tableau
Summary
3. A Methodology for Advanced Analytics Using Tableau and R
Industry standard methodologies for analytics
CRISP-DM
Business understanding/data understanding
CRISP-DM model — data preparation
CRISP-DM — modeling phase
CRISP-DM — evaluation
CRISP-DM — deployment
CRISP-DM — process restarted
CRISP-DM summary
Team Data Science Process
Business understanding
Data acquisition and understanding
Modeling
Deployment
TDSP Summary
Working with dirty data
Introduction to dplyr
Summarizing the data with dplyr
Summary
4. Prediction with R and Tableau Using Regression
Getting started with regression
Simple linear regression
Using lm() to conduct a simple linear regression
Coefficients
Residual standard error
Comparing actual values with predicted results
Investigating relationships in the data
Replicating our results using R and Tableau together
Getting started with multiple regression?
Building our multiple regression model
Confusion matrix
Prerequisites
Instructions
Solving the business question
What do the terms mean?
Understanding the performance of the result
Next steps
Sharing our data analysis using Tableau
Interpreting the results
Summary
5. Classifying Data with Tableau
Business understanding
Understanding the data
Data preparation
Describing the data
Data exploration
Modeling in R
Analyzing the results of the decision tree
Model deployment
Decision trees in Tableau using R
Bayesian methods
Graphs
Terminology and representations
Graph implementations
Summary
6. Advanced Analytics Using Clustering
What is Clustering?
Finding clusters in data
Why can't I drag my Clusters to the Analytics pane?
Clustering in Tableau
How does k-means work?
How to do Clustering in Tableau
Creating Clusters
Clustering example in Tableau
Creating a Tableau group from cluster results
Constraints on saving Clusters
Interpreting your results
How Clustering Works in Tableau
The clustering algorithm
Scaling
Clustering without using k-means
Hierarchical modeling
Statistics for Clustering
Describing Clusters – Summary tab
Testing your Clustering
Describing Clusters – Models Tab
Introduction to R
Summary
7. Advanced Analytics with Unsupervised Learning
What are neural networks?
Different types of neural networks
Backpropagation and Feedforward neural networks
Evaluating a neural network model
Neural network performance measures
Receiver Operating Characteristic curve
Precision and Recall curve
Lift scores
Visualizing neural network results
Neural network in R
Modeling and evaluating data in Tableau
Using Tableau to evaluate data
Summary
8. Interpreting Your Results for Your Audience
Introduction to decision system and machine learning
Decision system-based Bayesian
Decision system-based fuzzy logic
Bayesian Theory
Fuzzy logic
Building a simple decision system-based Bayesian theory
Integrating a decision system and IoT project
Building your own decision system-based IoT
Wiring
Writing the program
Testing
Enhancement
Summary
References
Index

Advanced Analytics with R and Tableau

Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: August 2017

Production reference: 1180817

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78646-011-0

www.packtpub.com

Credits

Authors

Jen Stirrup

Ruben Oliva Ramos

Reviewers

Kyle Johnson

Radovan Kavicky

Juan Tomás Oliva Ramos

Lourdes Bolaños Pérez

Commissioning Editor

Veena Pagare

Acquisition Editor

Vinay Argekar

Content Development Editor

Aishwarya Pandere

Technical Editor

Karan Thakkar

Copy Editor

Safis Editing

Project Coordinator

Nidhi Joshi

Proofreader

Safis Editing

Indexer

Tejal Daruwale Soni

Graphics

Tania Dutta

Production Coordinator

Arvindkumar Gupta

Cover Work

Arvindkumar Gupta

About the Authors

Jen Stirrup, recently named one of the top 9 most influential female business intelligence experts in the world by Solutions Review, is a Microsoft Data Platform MVP and PASS Director-At-Large. She is a well-known business intelligence and data visualization expert, author, data strategist, and community advocate who has been peer-recognized as one of the top 100 most influential global tweeters on big data and analytics topics.

Specialties: business intelligence, Microsoft SQL Server, Tableau, architecture, data, R, Hadoop, and Hive. Jen is passionate about all things data and business intelligence, helping leaders derive value from data. For two decades, Jen has worked in artificial intelligence and business intelligence consultancy, architecting, delivering, and supporting complex enterprise solutions for customers all over the world.

I would like to thank the reviewers of this book for their valuable comments and suggestions. I would also like to thank the wonderful team at Packt for publishing the book and helping me all along.

I'd like to thank my son Matthew for his unending patience, and my Coton de Tuléar puppy Archie for his long walks. I'd also like to thank my parents, Margaret and Drew, for their incredible support for this globe-trotting single mother who isn't always the best daughter that they deserve. They are the parents that I want to be.

I'd like to thank the Microsoft teams for their patience and support; they deserve special recognition here. I am grateful for their love and support, and for generally humouring me when I go off and do another community venture focused on my passions for their technology and diversity in the tech community.

I'd like to thank Tableau: Bora Beran who kindly got in touch, Andy Cotgreave who keeps the Tableau community fun and engaging as well as educational, and the Tableau UK team for humouring me, too. I am seeing a pattern here.

Ruben Oliva Ramos is a computer systems engineer from Tecnologico of León Institute with a master's degree in computer and electronic systems engineering, tele informatics, and networking specialization from University of Salle Bajio in Leon, Guanajuato, Mexico. He has more than five years' experience in developing web applications to control and monitor devices connected with the Arduino and Raspberry Pi using web frameworks and cloud services to build Internet of Things applications.

He is a mechatronics teacher at University of Salle Bajio and teaches students studying for their master's degree in Design and Engineering of Mechatronics Systems. He also works at Centro de Bachillerato Tecnologico Industrial 225 in Leon, Guanajuato, Mexico, teaching electronics, robotics and control, automation, and microcontrollers at Mechatronics Technician Career.

He has worked on consultant and developer projects in areas such as monitoring systems and datalogger data using technologies such as Android, iOS, Windows Phone, Visual Studio .NET, HTML5, PHP, CSS, Ajax, JavaScript, Angular, ASP.NET, databases (SQLite, MongoDB, and MySQL), and web servers (Node.js and IIS). Ruben has done hardware programming on the Arduino, Raspberry Pi, Ethernet Shield, GPS, GSM/GPRS, and ESP8266, and has built control and monitoring systems for data acquisition and programming.

He is the author of the Packt Publishing book Internet of Things Programming with JavaScript.

About the Reviewers

Kyle Johnson is a data scientist based in Pittsburgh, Pennsylvania. He has a master's degree in Information Systems Management from Carnegie Mellon University and a bachelor's degree in Psychology from Grove City College. He is an adjunct data science professor at Carnegie Mellon, and his applied work focuses on the healthcare and life sciences domain. See his LinkedIn page for an updated resume and contact information: https://www.linkedin.com/in/kljohnson721.

I would like to thank Nancy, George and Helena.

Radovan Kavický is the principal data scientist and president at GapData Institute based in Bratislava, Slovakia, where he harnesses the power of data & wisdom of economics for public good. With an academic background in macroeconomics, he is a consultant and analyst by profession, with more than eight years of experience in consulting for clients from public and private sectors along with strong mathematical and analytical skills and the ability to deliver top-level research and analytical work.

He switched to Python, R, and Tableau from MATLAB, SAS, and Stata. Besides being a member of the Slovak Economic Association (SEA) and an evangelist of Open Data, the Open Budget Initiative, and the Open Government Partnership, he is also the founder of PyData Bratislava, R <- Slovakia, and the SK/CZ Tableau User Group (skczTUG). He has been a speaker at TechSummit (Bratislava, 2017) and at PyData (Berlin, 2017). He is also a member of the global Tableau #DataLeader network (2017).

You can follow him on Twitter at @radovankavicky, @GapDataInst, or @PyDataBA

For his full profile and experience, visit https://www.linkedin.com/in/radovankavicky/, https://github.com/radovankavicky, and GapData Institute, https://www.gapdata.org.

Juan Tomás Oliva Ramos is an environmental engineer from the University of Guanajuato, with a master's degree in administrative engineering and quality. He has more than five years of experience in the management and development of patents, technological innovation projects, and the development of technological solutions through the statistical control of processes.

He has taught statistics, entrepreneurship, and technological development of projects since 2011. He has always maintained an interest in improving and innovating processes through technology. He became an entrepreneur mentor and technology management consultant, and started a new department of technology management and entrepreneurship at Instituto Tecnologico Superior de Purisima del Rincon.

He has worked on the book Wearable designs for Smart watches, Smart TV's and Android mobile devices.

He has developed prototypes through programming and automation technologies for the improvement of operations, which have been registered to apply for his patent.

I want to thank God for giving me the wisdom and humility to review this book. I want to thank Rubén for inviting me to collaborate on this adventure. I want to thank my wife, Brenda, our two magic princesses (Regina and Renata), and our next member (Tadeo). All of you are my strength, my happiness, and my desire to look for the best for you.

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1786460114.

If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Preface

Moving from data visualization into deeper, more advanced analytics, this book will intensify data skills for data-savvy users who want to move into analytics and data science in order to enhance their businesses by harnessing the analytical power of R and the stunning visualization capabilities of Tableau.

Together, Tableau and R offer accessible analytics by allowing a combination of easy-to-use data visualization along with industry-standard, robust statistical computation. Readers will come across a wide range of machine learning algorithms and learn how descriptive, prescriptive, predictive, and visually appealing analytical solutions can be designed with R and Tableau.

In order to maximize learning, hands-on examples will ease the transition from being a data-savvy user to a data analyst using sound statistical tools to perform advanced analytics.

Tableau (uniquely) offers excellent visualization combined with advanced analytics; R is at the pinnacle of statistical computational languages. When you want to move from one view of data to another, backed up by complex computations, the combination of R and Tableau is the perfect solution. This example-rich guide will teach you how to combine these two to perform advanced analytics by integrating Tableau with R to create beautiful data visualizations.

What this book covers

Chapter 1, Getting Ready for Tableau and R, shows how to connect Tableau Desktop with R through calculated fields and take advantage of R functions, libraries, packages, and even saved models. We'll also cover Tableau Server configuration with R through an instance of Rserve (through the tabadmin utility), allowing anyone to view a dashboard containing R functionality. Combining R with Tableau gives you the ability to bring deep statistical analysis into a drag-and-drop visual analytics environment.

Chapter 2, The Power of R, builds on the integration of the two platforms in the previous chapter and walks through different ways in which readers can use R to combine and compare data for analysis. We will cover, with examples, the core essentials of R programming such as variables, data structures in R, control mechanisms in R, and how to execute these commands in R before proceeding to later chapters that rely heavily on these concepts to script complex analytical operations.

Chapter 3, A Methodology for Advanced Analytics using Tableau and R, creates a roadmap for our analytics investigation. You'll learn how to assess the performance of both supervised and unsupervised learning algorithms, and the importance of testing. Using R and Tableau, we will explore why and how you should split your data into a training set and a test set. In order to understand how to display the data accurately as well as beautifully in Tableau, the concepts of bias and variance are explained.

Chapter 4, Prediction with R and Tableau Using Regression, considers regression from an analytics point of view. In this chapter, we look at the predictive capabilities and performance of regression algorithms. At the end of this chapter, you'll have experience in simple linear regression, multi-linear regression, and k-nearest neighbors regression using a business-oriented understanding of the actual use cases of regression techniques.

Chapter 5, Classifying Data with Tableau, shows ways to perform classification using R and visualize the results in Tableau. Classification is one of the most important tasks in analytics today. By the end of this chapter, you'll build a decision tree and classify unseen observations with k-nearest neighbors, with a focus on a business-oriented understanding of the business question using classification algorithms.

Chapter 6, Advanced Analytics Using Clustering, gives a business-oriented understanding of the business questions using clustering algorithms and applying visualization techniques that best suit the scenario.

Chapter 7, Advanced Analytics with Unsupervised Learning, teaches k-means clustering and hierarchical clustering. It builds a business-oriented understanding of the business question using unsupervised learning algorithms.

Chapter 8, Interpreting Your Results for Your Audience, asks how to interpret the results and the numbers once you have them. What does a p-value mean? Analytical investigations will reveal a variety of relationships in data, but the audience may have problems understanding the results. Statistical tests state a null and an alternative hypothesis, and then calculate a test statistic and report an associated p-value. In this chapter, we will look at ways in which we can answer "what if?" questions and applicable customer scenarios using cohort analysis, with a focus on how we can display the results so that the audience can draw a conclusion from the tests.
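
The chapter summaries above mention three recurring ideas: splitting data into a training set and a test set, fitting a simple linear regression, and reading the p-value that a statistical test reports. As a rough illustration only, the following sketch strings them together in base R. It uses the built-in mtcars dataset, the mpg ~ wt model, and a 70/30 split, all of which are assumptions made for this example and not taken from the book's own exercises.

```r
# Illustrative sketch only: mtcars, the 70/30 split, and the mpg ~ wt model
# are assumptions for this example, not the book's datasets or code.
set.seed(42)                                  # make the random split reproducible

n         <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.7 * n))
train     <- mtcars[train_idx, ]              # rows used to fit the model
test      <- mtcars[-train_idx, ]             # held-out rows used to evaluate it

# Simple linear regression: fuel economy (mpg) explained by weight (wt)
fit <- lm(mpg ~ wt, data = train)

# Coefficient table; the "Pr(>|t|)" column holds the p-value for each term.
# The null hypothesis for the wt row is that the true slope is zero.
coefs   <- summary(fit)$coefficients
p_value <- coefs["wt", "Pr(>|t|)"]

# Predict on the held-out rows and measure the error (root mean squared error)
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - pred)^2))

cat("Slope for wt:  ", round(coefs["wt", "Estimate"], 3), "\n")
cat("p-value for wt:", signif(p_value, 3), "\n")
cat("Test RMSE:     ", round(rmse, 3), "\n")
```

A small p-value for the slope suggests the relationship is unlikely to be due to chance alone, while the error on the held-out rows indicates how well the model generalizes. These are two different questions, which is one reason to keep a test set aside.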

What you need for this book

You'll need the following software:

  • R version 3.4.1
  • RStudio for Windows
  • Plugins for RStudio
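
The requirements above can be checked from an R session before starting. The following is a minimal sketch. The variable names are hypothetical, and treating 3.4.1 as a minimum version (rather than an exact one) is an assumption. The check for the Rserve package is included because the table of contents lists its installation in Chapter 1.

```r
# Minimal environment check (a sketch; variable names are hypothetical).
# Assumes R 3.4.1 is a minimum requirement, which the source does not state.
required_version <- "3.4.1"

cat("Running:", R.version.string, "\n")

# getRversion() returns a version object that compares correctly with a string
version_ok <- getRversion() >= required_version

# requireNamespace() reports whether a package is installed without attaching it
has_rserve <- requireNamespace("Rserve", quietly = TRUE)

if (!version_ok) {
  message("R is older than ", required_version, "; consider upgrading.")
}
if (!has_rserve) {
  message('Rserve is not installed; install.packages("Rserve") would add it.')
}
```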

Who this book is for

This book will appeal to Tableau users who want to go beyond the Tableau interface and deploy the full potential of Tableau, by using R to perform advanced analytics with Tableau.

A basic familiarity with R is useful but not compulsory, as the book starts off with concrete examples of R and will move on quickly to more advanced spheres of analytics using online data sources to support hands-on learning. Those R developers who want to integrate R with Tableau will also benefit from this book.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

  1. Log in or register to our website using your e-mail address and password.
  2. Hover the mouse pointer on the SUPPORT tab at the top.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box.
  5. Select the book for which you're looking to download the code files.
  6. Choose from the drop-down menu where you purchased this book from.
  7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows
  • Zipeg / iZip / UnRarX for Mac
  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Advanced-Analytics-with-R-and-Tableau