Create innovative informatics solutions with TIBCO Spotfire
Key Features
Book Description
The need for agile business intelligence (BI) is growing daily, and TIBCO Spotfire® combines self-service features with essential enterprise governance and scaling capabilities to provide best-practice analytics solutions. Spotfire is easy and intuitive to use and is a rewarding environment for all BI users and analytics developers.
Starting with data and visualization concepts, this book takes you on a journey through increasingly advanced topics to help you work toward becoming a professional analytics solution provider. Examples of analyzing real-world data are used to illustrate how to work with Spotfire. Once you've covered the AI-driven recommendations engine, you'll move on to understanding Spotfire's rich suite of visualizations and when, why, and how you should use each of them. In later chapters, you'll work with location analytics and advanced analytics using TIBCO Enterprise Runtime for R®, learn how to decide whether to use in-database or in-memory analytics, and see how to work with streaming (live) data in Spotfire. You'll also explore key product integrations that significantly enhance Spotfire's capabilities. This book will enable you to exploit the advantages of the Spotfire server topology and learn how to make practical use of scheduling and routing rules.
By the end of this book, you will have learned how to build and use powerful analytics dashboards and applications, perform spatial analytics, and administer your Spotfire environment efficiently.
What you will learn
Who this book is for
If you are a business intelligence or data professional, this book will give you a solid grounding in the use of TIBCO Spotfire. It requires no prior knowledge of Spotfire or of basic data and visualization concepts.
Page count: 462
Year of publication: 2019
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Meeta Rajani
Content Development Editor: Princia Dsouza
Technical Editor: Sayali Thanekar
Copy Editor: Safis Editing
Project Coordinator: Nusaiba Ansari
Proofreader: Safis Editing
Indexer: Manju Arasan
Graphics: Jisha Chirayil
Production Coordinator: Arvindkumar Gupta
First published: February 2015
Second edition: April 2019
Production reference: 1300419
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78712-132-4
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Andrew Berridge is a data scientist at TIBCO Software Ltd. He has 10 years' experience with TIBCO Spotfire—first as an end user and latterly (for nearly 8 years) as a TIBCO employee. He is a world-renowned Spotfire expert and has assisted with selling Spotfire and implementing it for many customers. He has also developed parts of the latest versions of Spotfire, being on the team that implemented the AI-Driven Recommendations feature in Spotfire X. Prior to his role at TIBCO, Andrew was an internal consultant at one of the world's largest pharmaceutical companies.
In his spare time, Andrew restores classic cars. He is also an orchestral horn player. He is dedicated to his family and enjoys spending time with his wife and children.
Michael Phillips is the original author of the first edition of TIBCO Spotfire: A Comprehensive Primer. He is an eClinical product innovator specializing in informatics solutions that support the drug development process generally, and clinical risk management in particular. He is a creative analyst with over 14 years' experience in IT and business intelligence and 5 years' experience in clinical informatics. He has a background in medicine and general science publishing, and a PhD in biochemistry (drug metabolism).
Colin Gray has worked in industries such as pharmaceuticals, environmental, and IT, and has over 15 years of experience in data analysis and informatics. Throughout this time, he has led data analysis and informatics projects, has worked on developing methods to make better use of data, and has had a key focus on how to communicate data better to others. To this aim, he has heavily employed web-based technologies and statistical packages. Having worked with TIBCO Spotfire for many years, he is a keen enthusiast in using Spotfire to further data science in all areas of business.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
TIBCO Spotfire: A Comprehensive Primer Second Edition
Dedication
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Introducing Spotfire
Welcome to Spotfire
Getting started with TIBCO Spotfire
Launching Spotfire Analyst
Logging in to TIBCO Cloud Spotfire
Downloading Spotfire Cloud Analyst
Installing the mobile apps and logging in to TIBCO Cloud
TIBCO Spotfire for macOS
Logging in to the Spotfire web clients (on-premises)
Spotfire licenses
Getting started with loading data
Importing Excel spreadsheets into Spotfire
Introduction to the data panel
The Spotfire recommendations engine
Saving Spotfire files
Saving a file in analyst clients
Saving a file in Spotfire Cloud or Business Author
Producing a useful interactive dashboard
Coloring
Proportionality with bar charts and pie charts
Drilling in to the data – details visualizations
Insights from details visualizations
Using filters
Trellising
Summary
It's All About the Data
Technical requirements
Understanding the basic row/column structure of a data table
Exploring key data types
Building a scatter plot with real-world data
Gathering population data
Gathering country and region data
Combining the data
What is a scatter plot?
Building the scatter plot using population growth versus child mortality
Working with natural hierarchies in data
Gaining insight from the scatter plot
Displaying information quickly in tabular form
Enriching visualizations with color categorization
Visualizing hierarchical data using treemaps
Summary
Impactful Dashboards!
KPI charts
Constructing a KPI chart
Coloring, sorting, and other customization of the KPI chart
Enabling end users to configure the KPI chart interactively
Framing your analysis using text areas
Spotfire custom expressions
Spotfire document properties
Spotfire property controls
Bringing it all together – interactively configuring the KPI chart
Drilling in to the KPI chart
Marking in Spotfire
Building a details visualization using marking
Deep dive – what insights can the dashboard reveal?
Publishing the dashboard to Spotfire Web
Summary
Sharing Insights and Collaborating with Others
Bookmarks
Advanced topic – key columns
Annotations
Conversations
Summary
Section 2: Spotfire In Depth
Practical Applications of Spotfire Visualizations
Bar charts
Things to watch out for with bar charts
Bar chart summary
Combination charts
Cross tables
Cross table grand totals
Grand totals – underlying values versus sum of cell values
Underlying values
Sum of cell values
Cross table summary
Scatter plots
Line ordering
Density plots
Other types of visualization with scatter plots
Scatter plot summary
Line charts
Pie charts
Box plots
How to further interpret a box plot
More box plot options
Box plot summary
Treemaps
Waterfall charts
Graphical tables
Graphical table visualization types
Graphical table summary
Taking action from KPI charts or graphical tables
Other visualization types
Summary
The Big Wide World of Spotfire
An overview of the Spotfire platform
Spotfire server
Spotfire database
Nodes and services
Spotfire Web Player(s) and Business Author
Spotfire Automation Services
Spotfire Analyst clients and mobile clients
Spotfire Statistics Services
Spotfire Web Player scalability
A quick guide to administration manager
Users
Groups and licenses
Special user groups
Preferences
Using the library administration interface
Folder permissions
Import and export
Automating tasks using Automation Services
Running Automation Services jobs
The Spotfire administration console
Analytics
Library
Users and groups
Scheduling and routing
Nodes and services
Deployments and packages
Monitoring and diagnostics
Automation Services
Summary
Source Data is Never Enough
Technical requirements
Creating metrics using calculated columns
Basic metric
Dynamic metric
Categorizing continuous numerical data using binning functions
Slicing and dicing data using hierarchy nodes
LastPeriods
PreviousPeriod
ParallelPeriod
NavigatePeriod
AllPrevious and AllNext
Previous and next
Intersect
Over methods in calculated columns versus axis expressions
Over method summary
Other calculations in Spotfire
Data manipulation – where, why, and how?
Merging (joining) data from multiple sources
Tall tables versus wide tables
Tall tables
Wide tables
Transforming data structure through pivots and unpivots
Unpivot
Pivot
Other transformations
Summary
The World is Your Visualization
Data relations (between tables)
Setting up relationships between data tables
Configuring marking and filtering between related data tables
Mashing up data from different tables in a single visualization 
Comparing subsets of data
Showing/hiding items of data
Annotating visualizations with reference lines, fitted curves, and error bars
Fitted curves
Forecasting
Error bars
Visualizing categorical information and trends together in combination charts
Visualizing complex multidimensional data using heat maps
Heat maps
Dendrograms
Profiling your data using parallel coordinate plots
Summary
What's Your Location?
Map chart layers
Getting started with map charts
Coordinate reference systems
Using geocoding to position data on a map
Geocoding using a zip code
Case study – when geocoding doesn't work
Feature layers
Using a data function to assist with geocoding
Adding Web Map Service data to a map chart
Creating custom maps using Tile Map Service layers
Using the map chart for non-geographic spatial analysis
Mapping an airport
Process mapping using a map chart
TIBCO GeoAnalytics
Summary
Section 3: Databases, Scripting, and Scaling Spotfire
Information Links and Data Connectors
In-memory versus in-database analytics
Information links
Data sources
Column
Joins
Filters
Procedures
Information links
Loading data from an information link
Writing back to a database using an information link
Data connectors
Loading data from a connector
Using data on demand to retrieve raw data
Troubleshooting data on demand
Custom expressions with in-database data
Important points to note with in-database (external) data
Connection credentials for Web Player users and Automation Services jobs
Saving connections to the Spotfire library
Streaming data
Summary
Scripting, Advanced Analytics, and Extensions
Scripting in Spotfire – the why, how, and what
Automating actions using IronPython
Customizing the Spotfire user interface using JavaScript and HTML in text areas
Spotfire Developer Tools
Script trust
TIBCO Enterprise Runtime for R and open source R
Python in Spotfire
Statistica and Spotfire
Extending Spotfire
The JavaScript API
Spotfire Server APIs
Summary
Scaling the Infrastructure; Keeping Data up to Date
Context-aware load management with scheduling and routing
Some definitions
Server topology
Scaling the Spotfire Server using clustering
Spotfire Sites
Spotfire Web Player scaling
Nodes, services, and resource pools
Spotfire Web Player scalability – Scheduling & Routing
Routing rules – definitions
File
Group
User
Routing rules – practical examples
Scenario 1 – manufacturing sensor data analysis
Scenario 2 – separating critical functions and everyday operations
Verifying the example rules and schedules
Checking the schedule
Checking the routing rules for the scenarios
Triggering an update from an external system
Notes on scheduling and routing
Summary
Beyond the Horizon
Natural language search
Canvas styling and theming
Exporting from Spotfire
Visualization export
Exporting data to a file or a library
PDF export
JavaScript visualizations
Alerting in Spotfire
TIBCO Data Virtualization
TIBCO Data Science
KNIME and Spotfire
TIBCO StreamBase and Spotfire Data Streams
TIBCO Spotfire Cloud Enterprise
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
Welcome to TIBCO Spotfire: A Comprehensive Primer—Second Edition—Building enterprise-grade data analytics and visualization solutions. This book will introduce you to Spotfire and how to build analysis and visualization solutions using Spotfire. You'll learn how to determine what is going on in your data and then how to drill into it to determine why something is happening! You'll find out how to get to data-driven insights fast, with the help of real-world examples and visualizations. The book doesn't just cover the basics—if you're a seasoned user of Spotfire, there are many advanced topics that will extend and enhance your knowledge.
This book is for those who wish to learn about how to use Spotfire and how to work with analytics in general. It is also for those who would like to evaluate Spotfire to help determine whether it's suitable for their own or their organization's analytics needs. Finally, advanced users will benefit from reading the book—there are many topics, technical hints, and tips that experienced Spotfire users will find very useful. You don't need any prior experience of Spotfire or analytics to get the best out of the book, although some background in how to work with data will be helpful.
Chapter 1, Welcome to Spotfire, introduces Spotfire, showing the different Spotfire clients and how to get started with them. It jumps right into analytics with a real-world example of analyzing data from the Titanic disaster.
Chapter 2, It's All About the Data, covers how to work with data in Spotfire by way of building a scatter plot showing world population growth. Along the way, the chapter discusses how to build detailed visualizations to gain additional insights.
Chapter 3, Impactful Dashboards!, shows how to construct attention-grabbing dashboards using KPI charts.
Chapter 4, Sharing Insights and Collaborating with Others, discusses how to share insights with others and how to use Spotfire's collaboration features.
Chapter 5, Practical Applications of Spotfire Visualizations, is a tour through some of the most frequently used visualizations in Spotfire's toolbox. It gives useful hints and tips as to how to use visualizations to their best effect and details common pitfalls to avoid.
Chapter 6, The Big Wide World of Spotfire, introduces the Spotfire platform. It covers scaling in brief and shows some of Spotfire's administration tools. It also introduces Automation Services.
Chapter 7, Source Data is Never Enough, covers data manipulation in Spotfire—creating calculated columns, working with custom expressions, and transforming data.
Chapter 8, The World is Your Visualization, rounds off what has been learned so far. It covers data relationships, adding smoothing and forecasting lines, error bars, and working with Spotfire's subsets and show/hide features. It finishes off by covering the final two visualization types that haven't yet been discussed.
Chapter 9, What's Your Location?, covers map charts and geo-analytics in Spotfire. It also introduces the concept of data functions and TIBCO Enterprise Runtime for R.
Chapter 10, Information Links and Data Connectors, shows you how to work with information links and data connectors in Spotfire. These features are important if you're working with anything other than flat files. In-database versus in-memory data is discussed, along with all the tools you'll need to work with both. The chapter also covers streaming (live) data.
Chapter 11, Scripting, Advanced Analytics, and Extensions, introduces how to work with IronPython and JavaScript in Spotfire. It covers how to work with the many and varied data science platforms that integrate with Spotfire and explains how to get started with developing custom extensions.
Chapter 12, Scaling the Infrastructure; Keeping Data up to Date, is designed primarily for Spotfire administrators. It shows how to scale up the Spotfire infrastructure to support as many users as you need. It also details the various mechanisms for keeping data in Spotfire files up to date, showcasing schedules and rules.
Chapter 13, Beyond the Horizon, covers searching in Spotfire, styling and theming, exporting, conditional alerting, JavaScript visualizations, and some additional products that work nicely with Spotfire.
You can follow a large number of the examples in this book by just using a web browser, since you can sign up for a trial account for TIBCO Cloud Spotfire and use Spotfire in the cloud. However, the more in-depth examples require you to use Spotfire Analyst (more on this in Chapter 1, Welcome to Spotfire). You can download a version of Spotfire Analyst once you have signed up for a Cloud account. You'll need a reasonably modern PC running Microsoft Windows 7 or later. It should be a 64-bit computer system, as 32-bit systems are only suitable for analyzing small datasets or working with in-database analytics.
You can use Spotfire on a Mac computer too, but the Mac client isn't as fully-featured as the PC-based clients.
This book is specifically targeted toward Spotfire version 10.0.0 and later. You can still take advantage of the book, even if you're using Spotfire version 7, but just be aware that the menu selections won't correspond exactly, and some new functionality was released with Spotfire X (Spotfire 10) that wasn't available in previous versions. Specifically, AI-driven recommendations, natural language search, and streaming data are all new to Spotfire X.
If you're a new user of Spotfire, I recommend you start at the beginning of the book and work your way through it, sequentially, to Chapter 4, Sharing Insights and Collaborating with Others, or Chapter 5, Practical Applications of Spotfire Visualizations. Then, you should dip in and out of the rest of the book as you see fit. Chapter 10, Information Links and Data Connectors, is a must-read when you want to connect Spotfire to any data source other than flat files. Chapter 9, What's Your Location?, is essential reading if you want to perform any kind of geoanalytics or other location-based analytics. Chapter 13, Beyond the Horizon, covers several interesting topics, from theming, through to export, search, and more.
A lot of the URLs provided in the book are quite long, so I used https://bitly.com/ for link shortening (in most cases). https://bitly.com/ does track how many clicks each of the links has received, but not who clicked them. Of course, you can always just use a search engine to find each of the references, but I have included the links so you know you're always looking at exactly the right topic on the web!
Although the author, Andrew Berridge, works for TIBCO Software Ltd., it must be stressed that any views or opinions expressed within the book are solely Andrew's. They are not sanctioned in any way by TIBCO Software Ltd. and do not represent the views or opinions of TIBCO Software. Any apparently forward-facing statements or opinions should not be used for purchasing decisions for Spotfire or any other TIBCO product.
All product names, trademarks, and registered trademarks are property of their respective owners. All company, product and service names used in this book are for identification purposes only. Use of these names and trademarks does not imply endorsement.
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/9781787121324_ColorImages.pdf.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
The first section of this book is a general introduction to Spotfire—it shows how to load data and how to get started with Spotfire visualizations. Here, you will see how quick and easy it is to get started with producing insightful and impactful visualizations and how you can use Spotfire to collaborate with others.
In this section, the following chapters will be covered:
Chapter 1, Welcome to Spotfire
Chapter 2, It's All About the Data
Chapter 3, Impactful Dashboards!
Chapter 4, Sharing Insights and Collaborating with Others
Welcome to the world of TIBCO Spotfire®! This book will take you on a journey through data discovery and visualization, advanced analysis, and beyond. It will show you how to get started really easily, then how to progress on to more advanced topics.
When you start Spotfire for the first time, your first task is to load some data. This data can be loaded from a wide variety of sources, from files through to database and big data repositories. It's then really easy to visualize and explore the data, gaining fresh insight and understanding all the time.
The following topics are covered in this chapter:
Introduction to TIBCO Spotfire
Getting started with TIBCO Spotfire
The different Spotfire clients
Importing and loading data into Spotfire
The Spotfire recommendations engine
Simple visualization types
Building useful visualizations
Details visualizations
Introduction to marking and filtering
Gaining insights from your data
In this section, we are going to explore loading and visualizing data using the Spotfire rich desktop client and the Spotfire web authoring clients.
There are a few different Spotfire client applications. As of Version X (pronounced 10), these are as follows:
Desktop clients (rich, installed):
Spotfire Analyst: This fully featured application connects to an on-premises Spotfire server that's installed at your organization and is sometimes called Spotfire Professional.
Spotfire Cloud Analyst: A fully featured desktop application that connects to a cloud-based Spotfire server. You can download it from your Spotfire Cloud account if you have one.
Spotfire for macOS: This is a hybrid application. It's essentially a wrapper around the Spotfire web clients (detailed later), but it is installed on desktop machines.
Web clients (thin, accessible via a web browser):
Spotfire Consumer: This is a standard, read-only Spotfire web client, usually available on-premises. You can consume existing visualizations and data. You cannot create new visualizations, load data, or change the configuration of visualizations (unless explicitly enabled by the author of a Spotfire file).
Spotfire Business Author: This is an advanced web client. In addition to the features of Consumer, it allows loading of data, configuration of visualizations, and many other Spotfire authoring capabilities. You can check whether it's available to you by logging in to Spotfire web (Consumer) and checking whether there's an option to edit an existing analysis or create a new one (more on this later).
TIBCO Cloud Spotfire: This has all the features of Business Author and is accessible via https://cloud.tibco.com/. You can even sign up for a free trial account. I recommend you do this if you just want to explore Spotfire before you purchase it.
Mobile clients:
Spotfire Analytics for iOS: Provides Spotfire Consumer-type functionality for iPhone® and iPad® devices.
Spotfire Analytics for Android: Provides Spotfire Consumer-type functionality for Android phones and tablet computers.
Important: You cannot create new analyses or modify existing analysis files using the mobile clients.
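The client list above can be condensed into a small lookup table. The following Python sketch is purely illustrative and is not part of Spotfire: the `Access` enum and the `can_author` helper are hypothetical names, and the table only restates what the list says. Spotfire for macOS is left out because it wraps the web clients, so what it can do depends on which web client it connects to.

```python
from enum import Enum


class Access(Enum):
    """Hypothetical labels for what a client lets you do."""
    READ_ONLY = "read-only"
    AUTHORING = "authoring"


# Capabilities as described in the client list (Spotfire X).
# Spotfire for macOS is omitted on purpose: it wraps the web clients.
CLIENT_ACCESS = {
    "Spotfire Analyst": Access.AUTHORING,
    "Spotfire Cloud Analyst": Access.AUTHORING,
    "Spotfire Consumer": Access.READ_ONLY,
    "Spotfire Business Author": Access.AUTHORING,
    "TIBCO Cloud Spotfire": Access.AUTHORING,
    "Spotfire Analytics for iOS": Access.READ_ONLY,
    "Spotfire Analytics for Android": Access.READ_ONLY,
}


def can_author(client: str) -> bool:
    """Return True if the named client can create or modify analyses."""
    return CLIENT_ACCESS[client] is Access.AUTHORING


if __name__ == "__main__":
    for name, access in CLIENT_ACCESS.items():
        print(f"{name}: {access.value}")
```

A table like this is handy when deciding which client you need for a given exercise: anything that involves building or changing an analysis needs one of the authoring entries.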
Spotfire Analyst is a rich desktop client and can be launched like any other Windows application. There are usually two shortcuts to Spotfire once it's installed. The first is as follows:
The other is as follows:
The first shortcut is the one you will use the most often. The (show login dialog) shortcut is useful if you've previously asked Spotfire to remember your server details and login information and you want to update these for any reason. By clicking on Manage servers..., you will be able to select which server to connect to (or, potentially, work offline, by selecting the corresponding option at the bottom-left of the page):
When you launch the Spotfire client, you will be presented with a home page that has shortcuts to various frequently used features, such as loading recently used analyses, loading recently used data, or looking at sample Spotfire files. This is the Files and Data... view, which can be shown or hidden by pressing the + sign on the top-left of the page. This page is the starting point for connecting to data and analyses, whether this be on the local system or in the Spotfire library:
Feel free to explore some of the sample analysis files. I hope they inspire you to see what you can do with visualizations in Spotfire. Of course, you can always use the search box—it's really useful for finding files or connections to databases, and so on.
If you're interested, view the sample video! The link opens a YouTube video—while you're there, there are lots of other Spotfire videos that you can browse. Many of them cover some advanced topics, but hopefully they will inspire you to learn more and you'll come back repeatedly as there is always new content being published.
Visit https://cloud.tibco.com/ and sign up for an account, if you don't have one already. Once you have an account, you can log in the usual way:
You will now be able to select which application(s) you would like to work with. In this case, choose Analytics:
Click Spotfire:
Spotfire will open. The web-based client looks very similar to the desktop (analyst) client, but has less functionality when it comes to building data workflows and various other authoring functions:
If you have a Spotfire Cloud account you can download Spotfire Cloud Analyst—this is a full-featured version of Spotfire that works on your desktop. I recommend that you download it as soon as you can—it's more powerful than the web-based clients, and some of the exercises in this book can only be performed using an analyst client.
To download and install Spotfire Cloud Analyst, follow these steps:
Make sure you're logged in to your TIBCO Cloud account and Spotfire has been launched (as we did previously).
Dismiss the left-hand panel by clicking anywhere on the right-hand side of the window.
You should be left with a blank Spotfire screen. From the File menu, choose View library:
Click Downloads under Resources:
Click DOWNLOAD FOR WINDOWS.
Once the download has finished, install the application as you would any other Windows application.
The Spotfire Analytics app is available for Android and iOS phones and tablets from the Google Play Store and the iTunes App Store, respectively.
Just search for Spotfire Analytics and install the app!
I have an Android phone, so I am going to use that to demonstrate this functionality. iOS is broadly similar.
Once you have opened the app, you should get something that looks a bit like this:
You can start off by looking at some of the examples—in fact, it's a great way to get started without even needing to log in anywhere!
However, let's get started by logging in to a Cloud account:
Tap the Get Started button.
Tap the TIBCO Cloud button (or Sign up here if you don't have an account yet):
Log in to your TIBCO Cloud account:
Once you have logged in, you should be able to see Spotfire's library browser. This is what I get:
Feel free to browse the existing analyses. It's important to remember that you cannot create new analysis files or edit existing ones using the mobile apps.
You can also log in to your on-site Spotfire server if you have the requisite login and permissions (and potentially VPN access).
As we mentioned previously, this is a lightweight wrapper around Spotfire's web clients, so you will need access to a Spotfire web server. This can either be in the cloud or on-premises.
Install Spotfire for macOS by searching for it on the App Store. Once the application is installed, the following window is shown:
You can sign in to your organization's Spotfire web server by clicking Add Library or log in to your Cloud account by clicking on TIBCO Cloud. In my case, I have logged in to my TIBCO Cloud account. If I select Samples, this is what I see:
Logging in to the Spotfire web clients is straightforward. Just navigate to the URL provided by your server administrator. Once you are logged in, you will be presented with a blank Spotfire client, a view of the Spotfire library, or with a blank analysis, depending on your access levels.
If you are using Spotfire Consumer, the ability to create a new analysis will not be available to you. You'll need to get access to Spotfire Business Author or one of the analyst clients (for example, Cloud Analyst) in order to be able to author Spotfire analyses, as per most of the examples in this book.
Before we get started with loading data and doing some analysis, I'd just like to briefly cover the very important topic of Spotfire licenses. Spotfire licenses are not software licenses in the normal sense (where you have to buy a license in order to use the software). You can think of Spotfire licenses as permissions to perform certain functions in the application. Almost every function in Spotfire has an associated license. Licenses are assigned via user groups. Assigning licenses is an administrative function and, as such, will be controlled via your Spotfire administrator.
A TIBCO Cloud account will probably have the most licenses assigned to it, so you should be able to do most things with Cloud Analyst or the web (business) author.
However, if you struggle to follow along with the examples in this book (the options don't seem to be there, or you can't even get started), it's possible that you don't have the required license(s) to perform the suggested functions. Please contact your Spotfire administrator to get this fixed.
The simplest way to get started with loading data into Spotfire is to import some data from a file such as an Excel spreadsheet, so that's what this tutorial will cover.
To switch to Editing mode, follow these steps:
In the top right-hand corner of the application, click the dropdown.
Choose Editing:
If Editing mode is not available to you, it means that you do not have the correct permissions (license) for editing, or that the Spotfire client itself does not support it (for example, the Android or iOS clients).
The procedures for importing comma-separated values (CSV) files or Microsoft Excel spreadsheets in Spotfire are essentially identical:
From the Spotfire home page (shown initially when launching the Analyst or web clients), select Browse local file...:
You will be presented with a standard Open File dialog, allowing you to navigate to the file that you want to load. For this example, let's use some publicly available data on the Titanic disaster. It can be downloaded from the following link:
http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls
or:
http://bit.ly/2HStQ7R
In the Analyst client, Spotfire will open a dialog, which will allow you to define the import settings:
The first thing to notice is the Worksheet selection dropdown at the very top of the dialog window. Spotfire can only import one worksheet at a time. There is only one sheet in our file, so we don't need to do anything with this option.
The next thing to notice is the preview of the data and its structure. Spotfire will automatically detect and assign column headers and data types, but you can change any of these settings. You can also tell Spotfire not to import specific columns or rows.
We want to open the file with all defaults, so we're just going to click OK, but please do explore the drop-down options for columns and rows and experiment with the settings. The core philosophy of Spotfire is discovery, so start as you mean to continue and explore some of the options.
Once you click OK, Spotfire will show the Add data to analysis display:
Click OK to load the data.
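To give a feel for what "detecting data types" involves, here is a minimal Python sketch. It is not Spotfire's detection logic, which isn't described here. It simply picks the narrowest type that fits every value in a column. The sample rows and the function name infer_type are made up for illustration.

```python
import csv
import io

def infer_type(values):
    """Pick the narrowest of Integer, Real, String that fits every non-empty value."""
    present = [v for v in values if v != ""]
    for name, parse in (("Integer", int), ("Real", float)):
        try:
            for v in present:
                parse(v)
            return name
        except ValueError:
            continue
    return "String"

# A tiny made-up file; the first line is treated as the column headers.
sample = io.StringIO("pclass,age,name\n1,29,Alice\n3,0.9167,Bob\n2,,Carol\n")
reader = csv.DictReader(sample)
rows = list(reader)

types = {col: infer_type([row[col] for row in rows]) for col in reader.fieldnames}
print(types)
```

Note that pclass comes out as an integer here, which is technically correct even though the column really represents a category.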
You'll find that a lot of work is done in Spotfire via the data panel. You can show the data panel by clicking on the big icon in the middle of the Spotfire window, or by pulling it out by clicking on the data icon on the left-hand side of Spotfire:
The data panel shows all the data tables and columns that are available in the analysis. Spotfire has already classified the columns into different groups of numerical and categorical columns:
In this particular dataset, some categorical columns have been loaded as numeric columns. It's not Spotfire's fault; some of the columns are stored as integers in the data but actually represent categories. Take the column called survived: it is a 1 or 0, indicating whether the passenger survived or died. Similarly, passenger class (pclass) should be categorical, since it is either 1, 2, or 3, and taking any kind of aggregation of it (average, max, and so on) probably doesn't make much sense. You can read more about the dataset and its data dictionary here: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3info.txt
Or:
http://bit.ly/2uDg6oX
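To see why aggregating a category code is misleading, here is a minimal Python sketch. The pclass values are made up for illustration and are not taken from the Titanic file.

```python
from collections import Counter
from statistics import mean

# Hypothetical pclass values for six passengers (illustrative only).
pclass = [1, 3, 3, 2, 3, 1]

# Treated as a number, pclass yields an "average class" that describes nobody.
average_class = mean(pclass)

# Treated as a category, pclass yields a count per class, which is what a bar chart needs.
class_counts = Counter(pclass)

print(average_class)
print(sorted(class_counts.items()))
```

The average lands between second and third class, which is not a class anyone traveled in. The counts, on the other hand, are directly useful.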
Next, we are going to use Spotfire's recommendations engine to build a visualization. To get the best results from it, it can be a good idea to change the categorization of some columns first, to give Spotfire some hints about how to display or analyze the data. So, let's do this first:
Right-click the pclass column and change its categorization to Categories:
Do the same with the survived column.
Now, we can get started with building visualizations!
The recommendations engine gives you instant insight into your data and some suggestions of which visualizations to use. Spotfire's analyst clients have an advanced feature called AI-Powered Suggestions. This is Spotfire's new way of helping make sense of any type of data, regardless of its size or shape. The basic premise is that you should select a "target" column in the data panel and Spotfire will do the rest. Spotfire runs a specialized algorithm over all the columns in the data and selects those that most strongly drive or influence the target. Those columns are called "predictors." It then produces suggested visualizations for the target and selected predictors.
The recommender is available in the web clients too, but (at the time of writing) the web clients do not have the AI element, where predictor columns are automatically selected. I hope that the feature will be made available at some point. In the meantime, if you're using the web clients, you can follow a slightly different path to create visualizations. I'll point out how to do that along the way.
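If you are curious about what "selecting predictors" might mean in practice, here is a rough Python sketch. To be clear, this is not Spotfire's algorithm, which isn't described here. It is a stand-in heuristic that ranks columns by the gap between the highest and lowest survival rate across each column's values. The rows and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows in the spirit of the Titanic data (illustrative only).
rows = [
    {"sex": "female", "pclass": 1, "embarked": "S", "survived": 1},
    {"sex": "female", "pclass": 2, "embarked": "C", "survived": 1},
    {"sex": "female", "pclass": 3, "embarked": "C", "survived": 1},
    {"sex": "female", "pclass": 3, "embarked": "S", "survived": 0},
    {"sex": "male", "pclass": 1, "embarked": "S", "survived": 1},
    {"sex": "male", "pclass": 2, "embarked": "C", "survived": 0},
    {"sex": "male", "pclass": 3, "embarked": "C", "survived": 0},
    {"sex": "male", "pclass": 3, "embarked": "S", "survived": 0},
]

def rate_spread(rows, column, target="survived"):
    """Gap between the highest and lowest target rate across the column's values."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        hits[row[column]] += row[target]
    rates = [hits[value] / totals[value] for value in totals]
    return max(rates) - min(rates)

def rank_predictors(rows, candidates, target="survived"):
    """Order candidate columns from the largest rate gap to the smallest."""
    return sorted(candidates, key=lambda c: rate_spread(rows, c, target), reverse=True)

ranking = rank_predictors(rows, ["sex", "pclass", "embarked"])
print(ranking)
```

A simple gap like this tends to favor columns with many distinct values, so treat it only as an intuition aid.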
Let's get started! In the case of the Titanic data, the most obvious target column is survived. In other words, we'd like to know which columns best predict, influence, or explain whether passengers survived the Titanic disaster or not:
In the data panel, select the survived column. In analyst clients, Spotfire will produce something that looks like this:
Interesting! Immediately, we can see that the strongest predictors of survival are pclass and sex. The very first visualization is always just the row count of each of the values in the target column, so the second visualization is the one that begins to explain the target.
To add the visualization to your analysis, just click on the visualization that shows the relationship with pclass and sex. Your Spotfire session should now look something like the following screenshot:
You can produce the same effect in Spotfire web clients by selecting the survived column, the pclass column, and the sex column. Hold down the Ctrl key while clicking to select multiple columns:
I think it's also interesting to explore the male/female ratio on board the Titanic, so we need to add a bar chart visualization that shows just survived and sex. The AI recommender will choose these columns; just scroll down the panel a bit to find the visualization that shows this relationship (or manually select the columns in the web clients), then click the visualization to add it to your analysis. Now, let's pause and look at what we've created. In a few clicks, we have loaded some data and created two visualizations that explain a lot of findings (insights) all in one go! Here is the analysis without the data panel (collapse it by clicking the double arrow toward the top-right corner of the panel):
Here are my notes on interpreting these visualizations. Of course, what I say is only a matter of opinion, so feel free to draw your own conclusions:
Notice how many more men there were on board than women?
Look at how many more men died than survived! The survival rate of women was much higher, which is consistent with the "women and children first" policy of filling the lifeboats.
Look at the left-hand visualization. It is trellised by survived, which means that Spotfire has split it into panels, one for each data value.
First-class female passengers had the greatest chance of survival—suggesting that class and socioeconomic factors played a role in the survival rates.
More third-class males survived than first- or second-class ones. Might they have had stronger fighting instincts or been pushier? Were they more willing to cram into overcrowded lifeboats? All are potential explanations.
It's a good idea to get into the habit of saving your Spotfire analysis files. There's no auto-save capability at the time of writing, so save regularly!
When saving a file in Spotfire Analyst, you have a lot more options than in the web clients. Here's how to save a file in Spotfire Analyst, with a couple of tips along the way:
You'll notice that when you save the file (File | Save as | File) for the first time, Spotfire will show this prompt:
This is an important dialog to discuss. If you want to share your Spotfire file with another user, they will not be able to view it if they don't have access to the original data file. We can fix this now.
Click Show details. You'll see a dialog that allows you to choose what happens with the data. In this case, it's most appropriate to select New data when possible. This means that Spotfire will load new data if it's available (for example, if you updated the original Titanic data file); otherwise, it will use the data stored (embedded) in the Spotfire file:
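The "fresh data if available, embedded copy otherwise" behavior can be sketched in a few lines of Python. This only models the idea described above; it says nothing about how Spotfire stores or loads data internally, and the function names and the demo file name are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def read_csv_rows(path):
    """Read a CSV file into a list of dictionaries, one per row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))

def load_table(source_path, embedded_rows):
    """Prefer fresh data from source_path; fall back to the embedded copy."""
    path = Path(source_path)
    if path.is_file():
        return read_csv_rows(path)
    return embedded_rows

embedded = [{"name": "embedded copy"}]

with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "titanic_demo.csv"
    missing = load_table(source, embedded)    # the source file does not exist yet
    source.write_text("name\nfresh copy\n")
    fresh = load_table(source, embedded)      # the source file is now available

print(missing)
print(fresh)
```

This is why a colleague without access to the original file can still open the analysis: the embedded copy is always there as a fallback.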
Saving a file in Spotfire Cloud or Business Author is really straightforward. Saving a file will save it into the Spotfire library, where you and others can access the file (if folder permissions are set correctly—there is more on this later in this book):
Click the Save menu at the top of the screen and select Save As | Library item...:
The Spotfire web client will prompt you to choose the destination folder in the Spotfire library:
You can create a new folder using the + icon if you want.
Name the file as you see fit. Spotfire won't prompt you for the data saving settings (as it did in the analyst client) because the data is always embedded in the analysis.
When you're finished, click Save.
Now that we have produced some visualizations from the data, let's turn them into a useful interactive dashboard that we can use to gain more insight from the data.
In our example, the right-hand bar chart is colored by sex. The colors assigned by Spotfire are not immediately indicative of the sex of a Titanic passenger, so let's fix that:
Locate the legend for the bar chart.
Click on the dot for the female data and choose a more appropriate color. I suggest a pale pink or similar:
Do the same for male: click on the dot and choose a suitable color. I suggest a pale blue. For those viewing this in black and white, I apologize; you'll have to take my word for it...
It's all very well looking at the absolute numbers of female and male survivors, but this doesn't tell us the relative proportion of female and male passengers that survived.
Let's compare the use of bar charts and pie charts:
Open the data panel again by clicking the Data button on the left-hand side of the Spotfire window.
If the recommendations panel isn't shown, click the double arrow (>>) to display it. Now let's select the sex column as the target and scroll to see the relationship between sex and survived (analyst client). Choose the bar chart and add it to the analysis:
If you are using the web clients, you'll need to select both sex and survived and click MORE LIKE THIS on the bar chart for survived and sex to get to the bar chart for sex and survived. Note that the order is important here, as it determines which column goes on each axis of the bar chart. We need the sex column to be on the x (categorical) axis.
Now, apply some coloring to the resultant bar chart to indicate that surviving is good and not surviving is bad! You should end up with something that looks roughly like this:
That visual tells us the exact numbers of males and females that survived. How about proportionality? Right-click the visual and select 100% Stacked Bars (or, in web clients, right-click to get access to the visualization Properties dialog and change the setting there):
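What a 100% stacked bar shows is simply each outcome's share of its own category total. Here is a minimal Python sketch of that arithmetic. The passenger list is made up and does not reflect the real Titanic counts, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical (sex, survived) pairs; not the real Titanic counts.
passengers = [
    ("female", 1), ("female", 1), ("female", 1), ("female", 0),
    ("male", 1), ("male", 0), ("male", 0), ("male", 0), ("male", 0),
]

def stacked_percentages(pairs):
    """Per category, return each outcome's share of that category's total, in percent."""
    counts = Counter(pairs)                      # (sex, survived) -> number of rows
    totals = Counter(sex for sex, _ in pairs)    # sex -> number of rows
    return {
        sex: {outcome: 100 * counts[(sex, outcome)] / totals[sex] for outcome in (0, 1)}
        for sex in totals
    }

shares = stacked_percentages(passengers)
print(shares)
```

Every bar adds up to 100, so bars of very different sizes become directly comparable. The price is that you lose sight of the absolute numbers.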
Let's return to the data panel once more to add a pie chart that shows the same relationship. In analyst clients, click MORE LIKE THIS to get to other representations of the relationship between sex and survived. In web clients, make sure sex and survived are selected in the data panel:
Color the pie chart using the same color scheme as the last bar chart. Doing that ties the visualizations together visually.
Your analysis should now look something like this. I have moved my visualizations around a bit by dragging their title bars:
The bar chart tells us a lot more than the pie chart and is indicative of several dimensions of data. You can see the total number of passengers of each sex and the proportion of each that survived at a glance.
Experiment with the settings of the bar chart by right-clicking on it and selecting the various options. For example:
Change the bars to horizontal, stacked bars (the default), 100% stacked bars (as we just did), or side-by-side bars
Change the Sort bars by value setting of the bar chart
The visualizations we've explored so far allow us to understand what is happening; in our case, we've understood the proportions of male and female survivors. That's great, but what about drilling into the details of the data? Drilling in can help us explore why something is happening in the main dataset.
Spotfire makes it really easy to drill in to the data:
Right-click on any of the visualizations you created earlier and highlight Create Details Visualization. A second menu will pop up:
For the purposes of this exercise, let's use a bar chart (again!), so click Bar chart... from the menu. Bar charts are some of the most often used visualizations in Spotfire, as they can represent data in so many different ways. I find them to be very useful!
Spotfire will create an auto-configured visualization that in itself isn't useful. It's also empty:
Never fear: we can fix both these issues with a few clicks! Note that a new item, Data limiting: Marking, has been added to the legend. This means that, by default, no data will be shown on the visualization unless some data is marked in another visualization. To show some data, click and drag over another visualization to draw a rectangle around the data you want to mark:
The details bar chart will now update to show the selected data. It's still not terribly useful, as it's currently just showing the same thing as the original visualization. To configure the x-axis (the bottom one), hover over the visualization, then click the down arrow on the x-axis selector (it appears when you hover over the visualization):
From the resultant dropdown, choose age. That's more like it!
However, it's not yet quite as useful as it should be. Notice that the overall shape of the graph hints at the distribution of the data (more on this shortly), but the tall bars are often interspersed with very short bars. This is an example of a real-world data issue that prevents us from seeing the trend in the data properly. The cause is that some people have been recorded with fractional ages. Babies under one year old have ages recorded as a fraction of a year; there are also some adults recorded as being x.5 years old. Why? I don't know, but let's fix it!
If you're using an analyst client, right-click on the x-axis selector (showing age) and choose Auto-bin Column:
In a web-based client, follow these steps:
Right-click on the x-axis column selector and choose Custom Expression... from the menu.
Enter the following custom expression:
AutoBinNumeric([age],80)
You'll notice that the visualization changes to look more blocky. What's happening is that Spotfire is binning the data, or grouping close values together to reduce the number of categories on the x-axis. In analyst clients, you can drag the slider on the axis selector to change the number of bins (this affects the granularity of the x-axis), or you can edit the custom expression, just like we did in the web clients. I suggest that 80 bins is a good number because that gives one bin per year of age in our data:
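To make the idea of binning concrete, here is a minimal Python sketch of equal-width binning. I am assuming an age range of 0 to 80 and equal-width bins; the exact boundaries that Spotfire chooses are not confirmed here, and the function name bin_index is hypothetical.

```python
def bin_index(value, low, high, bin_count):
    """Map a value in [low, high] to one of bin_count equal-width bins (0-based)."""
    width = (high - low) / bin_count
    index = int((value - low) / width)
    # The maximum value would otherwise land just past the last bin.
    return min(index, bin_count - 1)

# Made-up ages, including the fractional ones that caused the spiky chart.
ages = [0.5, 0.9, 22.0, 22.5, 30.0, 80.0]
bins = [bin_index(age, low=0.0, high=80.0, bin_count=80) for age in ages]
print(bins)
```

With 80 bins over 80 years, each bin is one year wide, so 22.0 and 22.5 fall into the same bar instead of producing a tall bar next to a very short one.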
I have also rearranged the visualizations on the page slightly in order to give more room to this bar chart—you can do that by dragging the title bars of the visualizations and dragging the dividing lines between them.
Experiment with marking (selecting) different parts of the rest of the visualizations on the page in order to drill in to different parts of the data. Try selecting all male passengers, all female passengers, all females that survived, and so on.
As we described previously, the default behavior of a details visualization is for it to be empty if no data is marked elsewhere. That might not be what you want; you might want all data to be shown if nothing is selected. You can't change this behavior in the Spotfire web clients, but you can do it using Analyst. Right-click on the visualization and choose Properties, or click the cog wheel in the top right-hand corner of the visualization. The cog wheel isn't shown by default, so you'll need to hover over the corner of the visualization to make it visible.
Select the Data property page and open the setting under If no items are marked in the master visualizations, show:
Change the setting to All data.
Now, if you go back to the analysis and unmark any marked data by clicking outside of the marked items, you'll see that all data will be shown on the details visualization.
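The behavior of a details visualization, with and without the All data setting, can be modeled in a few lines of Python. This is only a sketch of the logic described above, not Spotfire's API; the row structure and the names rows_for_details and show_all_when_empty are hypothetical.

```python
def rows_for_details(all_rows, marked_ids, show_all_when_empty=False):
    """Return the rows a details view would display for the current marking."""
    if not marked_ids:
        # Nothing is marked: show nothing (the default) or everything (All data).
        return list(all_rows) if show_all_when_empty else []
    return [row for row in all_rows if row["id"] in marked_ids]

# Made-up rows standing in for passengers.
table = [
    {"id": 1, "age": 29},
    {"id": 2, "age": 2},
    {"id": 3, "age": 30},
]

default_empty = rows_for_details(table, marked_ids=set())
all_data = rows_for_details(table, marked_ids=set(), show_all_when_empty=True)
marked_only = rows_for_details(table, marked_ids={2, 3})
print(default_empty, all_data, marked_only)
```

Whichever setting you choose, marking some data always limits the details view to the marked rows. The setting only decides what happens when nothing is marked.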
Now that we have created a useful details visualization, what conclusions, insights, and findings can be drawn from it? I'll share some of mine with you—see if you can find more of your own:
Here, I have selected all the data. The age of the passengers is mostly normally distributed, but with a peak at the lower age range (there seemed to be a lot of babies on board):
