Apache Superset Quick Start Guide - Shashank Shekhar - E-Book

Description

Integrate open source data analytics and build business intelligence on SQL databases with Apache Superset. Its quick, intuitive approach to data visualization in a web application makes it easy to create interactive dashboards.




Key Features



  • Work with Apache Superset's rich set of data visualizations


  • Create interactive dashboards and data storytelling


  • Easily explore data





Book Description



Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases like Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real-time data visualizations and dashboards on modern web browsers for your organization using Superset.







First, we look at the fundamentals of Superset, and then get it up and running. You'll go through the requisite installation, configuration, and deployment. Then, we will discuss different columnar data types, analytics, and the visualizations available. You'll also see the security tools available to the administrator to keep your data safe.







You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help you when you upload your own entity relationship dataset and analyze the dataset in new, different ways. You will also see how to analyze geographical regions by working with location data.







Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.




What you will learn



  • Get to grips with the fundamentals of data exploration using Superset


  • Set up a working instance of Superset on cloud services like Google Compute Engine


  • Integrate Superset with SQL databases


  • Build dashboards with Superset


  • Calculate statistics in Superset for numerical, categorical, or text data


  • Understand visualization techniques, filtering, and grouping by aggregation


  • Manage user roles and permissions in Superset


  • Work with SQL Lab



Who this book is for



This book is for data analysts, BI professionals, and developers who want to learn Apache Superset. If you want to create interactive dashboards from SQL databases, this book is what you need. A working knowledge of Python will be an advantage but is not necessary to understand this book.

You can read this e-book in Legimi apps or in any app that supports the following format:

EPUB

Page count: 114

Publication year: 2018




Apache Superset Quick Start Guide 

 

Develop interactive visualizations by creating user-friendly dashboards

Shashank Shekhar

BIRMINGHAM - MUMBAI

Apache Superset Quick Start Guide

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Sunith Shetty
Acquisition Editor: Siddharth Mandal
Content Development Editor: Kirk Dsouza
Technical Editor: Suwarna Patil
Copy Editor: Safis Editing
Project Coordinator: Hardik Bhinde
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Alishon Mendonsa
Production Coordinator: Tom Scaria

First published: December 2018

Production reference: 1141218

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78899-224-4

www.packtpub.com

 
mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

  • Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

  • Improve your learning with Skill Plans built especially for you

  • Get a free eBook or video every month

  • Mapt is fully searchable

  • Copy and paste, print, and bookmark content

Packt.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks. 

Foreword

I've always thought good problem-solving has a certain sound—like that of a key turning in a lock, with a firm but smooth turn. And then there's this click and a soft ting. You can hear it, right? It has just the right balance of standing out but not making a fuss. You can hear it, but it's also understated.

As much as we would love to hear this sound more often, the locks that we encounter every day are not that simple. Executing a project successfully is not as common a phenomenon as it sounds. Doing business is tough. It always has been. And the more complex the operations, the tougher it gets. And it is tough because, sometimes, we make it so by connecting the dots in slightly different ways.

Dashboards give you a sense of the situation, or a scan of how everything went down, giving us a feedback loop to what happened and what resulted—which can help us make better decisions, every day, with visualized data on demand.

Whether you are a business owner, contributing to a business, or just someone trying to make a system work in your everyday life, having a dashboard is almost an intuitive, mental need of seeing a bunch of dots in front of us, and being able to connect them.

And so, being able to pull data easily and arrange it nicely on a table for everyone to see is an ability that has the power to change the entire process of doing work. Increasingly, we have been seeing how, similar to coding, interest in this particular skill has touched programmers, managers, designers, marketers, CXOs, and even just individuals buying groceries.

To me, the essence of this book lies in its attempt to democratize a skill that empowers people in a way that has never happened before, using technologies that anyone can access and build on. The reliability and power to create dashboards allows anyone to uncover insights, seek truth, and use intelligence to make better decisions. So, if you are someone who has been sitting on the sidelines, this book could be the perfect wingman to enter the playground with, and change the game.

Sumit Saurav

Contributors

About the author

Shashank Shekhar is a data analyst and open source enthusiast. He has contributed to Superset and pymc3 (the Python Bayesian machine learning library), and maintains several public repositories on machine learning and data analysis projects of his own on GitHub. He heads up the data science team at HyperTrack, where he designs and implements machine learning algorithms to obtain insights from movement data. Previously, he worked at Amino on claims data. He has worked as a data scientist in Silicon Valley for 5 years. His background is in systems engineering and optimization theory, and he carries that perspective when thinking about data science, biology, culture, and history.

I would like to take this opportunity to thank my mentors, Ashish Anand, Prashant Ishwar, Prashant Kukde, and Sameera Podduri, for guiding me and setting me up on a path where I could learn about my interests and improve the skills that I required while doing so. I was lucky enough to meet them over the course of the last six years. I owe my success in writing this book, and the many goals that I am motivated to achieve in the future, to them.

 

 

About the reviewer

Stephan Adams is a software engineer working in adtech in New York City. An electrical engineer by training, he graduated with an MS from UC Berkeley in 2013. He currently works on building systems for serving massive datasets in real time to heavy traffic loads. In his spare time, Stephan enjoys playing volleyball.

 

 

 

 

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits
  Apache Superset Quick Start Guide

About Packt
  Why subscribe?
  Packt.com

Foreword

Contributors
  About the author
  About the reviewer
  Packt is searching for authors like you

Preface
  Who this book is for
  What this book covers
  To get the most out of this book
  Download the example code files
  Download the color images
  Conventions used
  Get in touch
  Reviews

Getting Started with Data Exploration
  Datasets
  Installing Superset
  Sharing Superset
  Configuring Superset
  Adding a database
  Adding a table
  Creating a visualization
  Uploading a CSV
  Configuring the table schema
  Customizing the visualization
  Making a dashboard
  Summary

Configuring Superset and Using SQL Lab
  Setting the web server
  Creating the metadata database
  Migrating data from SQLite to PostgreSQL
  Web server
  Gunicorn
  Setting up an NGINX reverse proxy
  Setting up HTTPS or SSL certification
  Flask-AppBuilder permissions
  Securing session data
  Caching queries
  Mapbox access token
  Long-running queries
  Main configuration file
  SQL Lab
  Summary

User Authentication and Permissions
  Security features
  Setting up OAuth Google sign-in
  List Users page
  List Base Permissions page
  Views/Menus page
  List Permissions on Views/Menus pages
  Alpha and gamma – building blocks for custom roles
  Alpha
  Gamma
  Public
  User Statistics page
  Action log
  Summary

Visualizing Data in a Column
  Dataset
  Distribution – histogram
  Comparison – relationship between feature values
  Comparison – box plots for groups of feature values
  Comparison – side-by-side visualization of two feature values
  Summary statistics – headline
  Summary

Comparing Feature Values
  Dataset
  Comparing multiple time series
  Comparing two time series
  Identifying differences in trends for two feature values
  Summary

Drawing Connections between Entity Columns
  Datasets
  Directed force networks
  Chord diagrams
  Sunburst chart
  Sankey's diagram
  Partitioning
  Summary

Mapping Data That Has Location Information
  Data
  Scatter point
  Scatter grid
  Arcs
  Path
  Summary

Building Dashboards
  Charts
  Getting started with Superset
  Visualizing data in a column
  Comparing feature values
  Drawing connections between entity columns
  Mapping data that has location information
  Dashboards
  Making a dashboard
  Selecting charts
  Separating charts into tabs
  Headlining sections using titles
  Inserting markdown
  Organizing charts in the dashboard layout
  Separating sections using dividers
  Summary

Other Books You May Enjoy
  Leave a review - let other readers know what you think

Preface

Apache Superset is a modern, open source, enterprise-ready Business Intelligence (BI) web application. Connecting it to a SQL database allows you to create real-time data visualizations and BI dashboards. This book gets you started with making the best use of Superset for your organization.

Who this book is for

This book is for data analysts, BI professionals, and developers who want to learn Apache Superset. If you want to create interactive dashboards from SQL databases, this book is what you need. 

What this book covers   

Chapter 1, Getting Started with Data Exploration, teaches you how to install Superset, add a database, create a dashboard, and share a dashboard with users. We will train ourselves to be ready to add additional databases and tables, as well as to create new visualizations and dashboards.

Chapter 2, Configuring Superset and Using SQL Lab, shows you how to configure a Superset web server for your runtime environment needs using the superset_config.py file. We will look at the configuration parameters that make Superset secure and scalable, and at the trade-offs involved in setting them. We will replace the SQLite metadata database with a PostgreSQL database and configure the web app to use it.

Chapter 3, User Authentication and Permissions, looks at how to allow new users to register on the Superset web app with their Google accounts. We will explore the security tools available to the administrator, such as activity logs and user statistics.

Chapter 4, Visualizing Data in a Column, helps you understand columnar data through distribution plots, point-wise comparison with a reference column, and charts.

Chapter 5, Comparing Feature Values, involves two datasets that you will use to compare prices of food commodities. We will make use of five chart types that will give us a better understanding of how the two sets of data correlate.

Chapter 6, Drawing Connections between Entity Columns, looks at visualizing relationships as graphs instead of coordinates on orthogonal axes. We will learn about approaches for visualizing and analyzing a dataset that contains entities and a value quantifying some type of relationship between them.

Chapter 7, Mapping Data That Has Location Information, continues the trend of analyzing geographical regions by working with location data. We will visualize location data as scatter plots on maps, and then we will plot arcs and lines on a map.

Chapter 8, Building Dashboards, is where we will make some beautiful dashboards and complete our Superset quick start journey. We will try to organize the charts so that the dashboard coherently communicates the answers to the analytical questions it is built around.
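
To make the Chapter 2 summary more concrete, here is a minimal sketch of what a superset_config.py file might look like when the metadata database is moved from SQLite to PostgreSQL. The setting names SQLALCHEMY_DATABASE_URI, SECRET_KEY, and MAPBOX_API_KEY are Superset configuration settings as far as we are aware, but check them against the version you install. The environment variable names and default values below are hypothetical placeholders, not taken from the book.

```python
# superset_config.py
# A hedged sketch, not the book's own configuration file.
# All environment variable names and default values are hypothetical.
import os

# Connection details for the PostgreSQL metadata database (placeholders).
PG_USER = os.environ.get("SUPERSET_PG_USER", "superset")
PG_PASSWORD = os.environ.get("SUPERSET_PG_PASSWORD", "change-me")
PG_HOST = os.environ.get("SUPERSET_PG_HOST", "localhost")
PG_PORT = os.environ.get("SUPERSET_PG_PORT", "5432")
PG_DB = os.environ.get("SUPERSET_PG_DB", "superset")

# SQLAlchemy URI for the metadata database, used instead of the
# default SQLite file.
SQLALCHEMY_DATABASE_URI = (
    f"postgresql://{PG_USER}:{PG_PASSWORD}@{PG_HOST}:{PG_PORT}/{PG_DB}"
)

# Key used to sign session data; replace the placeholder with a long
# random value and keep it out of version control.
SECRET_KEY = os.environ.get(
    "SUPERSET_SECRET_KEY", "replace-with-a-long-random-string"
)

# Access token for map-based charts; left empty when not set.
MAPBOX_API_KEY = os.environ.get("MAPBOX_API_KEY", "")
```

Reading the values from environment variables keeps credentials out of the file itself, which is one common design choice; Chapter 2 of the book covers the configuration the author actually uses.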

To get the most out of this book

This book will make a great choice for collaborative data analysis work within a cross-functional team of data analysts, business professionals, and software engineers.

Many common analytical questions on data can be addressed using the charts, which are easy to use. 

A working knowledge of Python will be an advantage but is not necessary to understand this book.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packt.com.

2. Select the SUPPORT tab.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows

  • Zipeg/iZip/UnRarX for Mac

  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Superset-Quick-Start-Guide. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!