Integrate open source data analytics and build business intelligence on SQL databases with Apache Superset. Its quick, intuitive approach to data visualization in a web application makes it easy to create interactive dashboards.
Key Features
Book Description
Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases like Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real-time data visualizations and dashboards in modern web browsers for your organization using Superset.
First, we look at the fundamentals of Superset, and then get it up and running. You'll go through the requisite installation, configuration, and deployment. Then, we will discuss different columnar data types, analytics, and the visualizations available. You'll also see the security tools available to the administrator to keep your data safe.
You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help you when you upload your own entity relationship dataset and analyze the dataset in new, different ways. You will also see how to analyze geographical regions by working with location data.
Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.
What you will learn
Who this book is for
This book is for data analysts, BI professionals, and developers who want to learn Apache Superset. If you want to create interactive dashboards from SQL databases, this book is what you need. A working knowledge of Python will be an advantage but is not necessary to understand this book.
Page count: 114
Year of publication: 2018
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Siddharth Mandal
Content Development Editor: Kirk Dsouza
Technical Editor: Suwarna Patil
Copy Editor: Safis Editing
Project Coordinator: Hardik Bhinde
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Alishon Mendonsa
Production Coordinator: Tom Scaria
First published: December 2018
Production reference: 1141218
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78899-224-4
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
I've always thought good problem-solving has a certain sound—like that of a key turning in a lock, with a firm but smooth turn. And then there's this click and a soft ting. You can hear it, right? It has just the right balance of standing out but not making a fuss. You can hear it, but it's also understated.
As much as we would love to hear this sound more often, the locks that we encounter every day are not that simple. Executing a project successfully is not as common a phenomenon as it sounds. Doing business is tough. It always has been. And the more complex the operations, the tougher it gets. And it is tough because, sometimes, we make it so by connecting the dots in slightly different ways.
Dashboards give you a sense of the situation, or a scan of how everything went down, giving us a feedback loop to what happened and what resulted—which can help us make better decisions, every day, with visualized data on demand.
Whether you are a business owner, contributing to a business, or just someone trying to make a system work in your everyday life, a dashboard answers an almost intuitive mental need: to see a bunch of dots in front of us and be able to connect them.
And so, being able to pull data easily and arrange it nicely on a table for everyone to see is an ability that has the power to change the entire process of doing work. Increasingly, we have been seeing how, similar to coding, interest in this particular skill has touched programmers, managers, designers, marketers, CXOs, and even just individuals buying groceries.
To me, the essence of this book lies in its attempt to democratize a skill that empowers people in a way that has never happened before, using technologies that anyone can access and build on. The reliability and power to create dashboards allows anyone to uncover insights, seek truth, and use intelligence to make better decisions. So, if you are someone who has been sitting on the sidelines, this book could be the perfect wingman to enter the playground with, and change the game.
Sumit Saurav
Shashank Shekhar is a data analyst and open source enthusiast. He has contributed to Superset and pymc3 (the Python Bayesian machine learning library), and maintains several public repositories on machine learning and data analysis projects of his own on GitHub. He heads up the data science team at HyperTrack, where he designs and implements machine learning algorithms to obtain insights from movement data. Previously, he worked at Amino on claims data. He has worked as a data scientist in Silicon Valley for 5 years. His background is in systems engineering and optimization theory, and he carries that perspective when thinking about data science, biology, culture, and history.
Stephan Adams is a software engineer working in adtech in New York City. An electrical engineer by training, he graduated with an MS from UC Berkeley in 2013. He currently works on building systems for serving massive datasets in real time to heavy traffic loads. In his spare time, Stephan enjoys playing volleyball.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Apache Superset Quick Start Guide
About Packt
Why subscribe?
Packt.com
Foreword
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers   
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Getting Started with Data Exploration
Datasets
Installing Superset
Sharing Superset
Configuring Superset
Adding a database
Adding a table
Creating a visualization
Uploading a CSV
Configuring the table schema
Customizing the visualization
Making a dashboard
Summary
Configuring Superset and Using SQL Lab
Setting the web server
Creating the metadata database
Migrating data from SQLite to PostgreSQL
Web server
Gunicorn
Setting up an NGINX reverse proxy
Setting up HTTPS or SSL certification
Flask-AppBuilder permissions
Securing session data
Caching queries
Mapbox access token
Long-running queries
Main configuration file
SQL Lab
Summary
User Authentication and Permissions
Security features
Setting up OAuth Google sign-in
List Users page
List Base Permissions page
Views/Menus page
List Permissions on Views/Menus pages
Alpha and gamma – building blocks for custom roles
Alpha
Gamma
Public
User Statistics page
Action log
Summary
Visualizing Data in a Column
Dataset
Distribution – histogram
Comparison – relationship between feature values
Comparison – box plots for groups of feature values
Comparison – side-by-side visualization of two feature values
Summary statistics – headline
Summary
Comparing Feature Values
Dataset
Comparing multiple time series
Comparing two time series
Identifying differences in trends for two feature values
Summary
Drawing Connections between Entity Columns
Datasets
Directed force networks
Chord diagrams
Sunburst chart
Sankey's diagram
Partitioning
Summary
Mapping Data That Has Location Information
Data
Scatter point
Scatter grid
Arcs
Path
Summary
Building Dashboards
Charts
Getting started with Superset
Visualizing data in a column
Comparing feature values
Drawing connections between entity columns
Mapping data that has location information
Dashboards
Making a dashboard
Selecting charts
Separating charts into tabs
Headlining sections using titles
Inserting markdown
Organizing charts in the dashboard layout
Separating sections using dividers
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
Apache Superset is a modern, open source, enterprise-ready Business Intelligence (BI) web application. Connecting it to a SQL database allows you to create real-time data visualizations and BI dashboards. This book gets you started with making the best use of Superset for your organization.
This book is for data analysts, BI professionals, and developers who want to learn Apache Superset. If you want to create interactive dashboards from SQL databases, this book is what you need.
Chapter 1, Getting Started with Data Exploration, teaches you how to install Superset, add a database, create a dashboard, and share a dashboard with users. We will train ourselves to be ready to add additional databases and tables, as well as to create new visualizations and dashboards.
Chapter 2, Configuring Superset and Using SQL Lab, shows you how to configure a Superset web server for your runtime environment needs using the superset_config.py file. We will look at the configuration parameters that make Superset secure and scalable, and at the trade-offs involved in setting them. We will replace the SQLite metadata database with a PostgreSQL database and configure the web app to use it.
Chapter 3, User Authentication and Permissions, looks at how to allow new users to register on the Superset web app with their Google accounts. We will explore the security tools available to the administrator, such as activity logs and user statistics.
Chapter 4, Visualizing Data in a Column, helps you understand columnar data through distribution plots, point-wise comparison with a reference column, and charts.
Chapter 5, Comparing Feature Values, involves two datasets that you will use to compare prices of food commodities. We will make use of five chart types that will give us a better understanding of how to correlate the two sets of data.
Chapter 6, Drawing Connections between Entity Columns, looks at visualizing relationships as graphs instead of coordinates on orthogonal axes. We will learn about approaches for visualizing and analyzing a dataset made up of entities and a value quantifying some type of relationship between them.
Chapter 7, Mapping Data That Has Location Information, continues the trend of analyzing geographical regions by working with location data. We will visualize location data as scatter plots on maps, and then we will plot arcs and lines on a map.
Chapter 8, Building Dashboards, is where we will make some beautiful dashboards and complete our Superset quick start journey. We will organize the charts so that the dashboard coherently communicates the answers they provide.
This book will make a great choice for collaborative data analysis work within a cross-functional team of data analysts, business professionals, and software engineers.
Many common analytical questions on data can be addressed using the charts, which are easy to use.
A working knowledge of Python will be an advantage but is not necessary to understand this book.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Superset-Quick-Start-Guide. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
