Learning Kibana 7 - Anurag Srivastava - E-Book

Description

A beginner's guide to analyzing and visualizing your Elasticsearch data using Kibana 7 and Timelion

Key Features

  • Gain a fundamental understanding of how Kibana operates within the Elastic Stack
  • Explore your data with Elastic Graph and create rich dashboards in Kibana
  • Learn scalable data visualization techniques in Kibana 7

Book Description

Kibana is a window into the Elastic Stack that enables the visual exploration and real-time analysis of your data in Elasticsearch. This book will help you understand the core concepts of using Kibana 7 for rich analytics and data visualization.

If you’re new to the tool or want to get to grips with the latest features introduced in Kibana 7, this book is the perfect beginner's guide. You’ll learn how to set up and configure the Elastic Stack and understand where Kibana sits within the architecture. As you advance, you’ll learn how to ingest data from different sources using Beats or Logstash into Elasticsearch, followed by exploring and visualizing data in Kibana. Whether working with time series data to create complex graphs using Timelion or embedding visualizations created in Kibana into your web applications, this book covers it all. It also covers topics that every Elastic developer needs to be aware of, such as installing and configuring Application Performance Monitoring (APM) servers and agents. Finally, you’ll also learn how to create effective machine learning jobs in Kibana to find anomalies in your data.

By the end of this book, you’ll have a solid understanding of Kibana, and be able to create your own visual analytics solutions from scratch.

What you will learn

  • Explore the data-driven architecture of the Elastic Stack
  • Install and set up Kibana 7 and other Elastic Stack components
  • Use Beats and Logstash to get input from different data sources
  • Create different visualizations using Kibana
  • Build enterprise-grade Elastic dashboards from scratch
  • Use Timelion to play with time series data
  • Install and configure APM servers and APM agents
  • Work with Dev Tools, Spaces, Graph, and other important tools

Who this book is for

If you’re an aspiring Elastic developer or data analyst, this book is for you. You’ll also find it useful if you want to get up to speed with the new features of Kibana 7 and perform data visualization on enterprise data. No prior knowledge of Kibana is expected, but some experience with Elasticsearch will be helpful.

Anurag Srivastava is a senior technical lead in a multinational software company. He has more than 12 years' experience in web-based application development. He is proficient in designing architecture for scalable and highly available applications. He has handled dev teams and several clients from all over the globe over the last 10 years of his professional career. He has significant experience with the Elastic Stack (Elasticsearch, Logstash, and Kibana) for creating dashboards using system metrics data, log data, application data, or relational databases. He has authored two books – Mastering Kibana 6.x, and Kibana 7 Quick Start Guide, published previously by Packt.

Bahaaldine Azarmi, or Baha for short, is the head of solutions architecture in EMEA South at Elastic. Prior to this position, Baha co-founded ReachFive, a marketing data platform focused on user behavior and social analytics. He also worked for a number of different software vendors, including Talend and Oracle, where he held positions as a solutions architect and architect. Prior to Machine Learning with the Elastic Stack, Baha authored books including Learning Kibana 5.0, Scalable Big Data Architecture, and Talend for Big Data. He is based in Paris and holds an MSc in computer science from Polytech'Paris.

Page count: 256

Year of publication: 2019

Learning Kibana 7 Second Edition

Build powerful Elastic dashboards with Kibana's data visualization capabilities

Anurag Srivastava
Bahaaldine Azarmi

BIRMINGHAM - MUMBAI

Learning Kibana 7  Second Edition

Copyright © 2019 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Nelson Morris
Content Development Editors: Pratik Andrade, Anugraha Arunagiri
Senior Editor: Ayaan Hoda
Technical Editor: Snehal Dalmet, Dinesh Pawar
Copy Editor: Safis Editing
Project Coordinator: Vaidehi Sawant
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Deepika Naik

First published: February 2017
Second edition: July 2019

Production reference: 1190719

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-83855-036-3

www.packtpub.com

To my mom; my dad; my wife, Chanchal; and my son, Anvit.
– Anurag Srivastava

Packt.com

Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Fully searchable for easy access to vital information

Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks. 

Contributors

About the authors

Anurag Srivastava is a senior technical lead in a multinational software company. He has more than 12 years' experience in web-based application development. He is proficient in designing architecture for scalable and highly available applications. He has handled dev teams and multiple clients from all over the globe over the past 10 years of his professional career. He has significant experience with the Elastic Stack (Elasticsearch, Logstash, and Kibana) for creating dashboards using system metrics data, log data, application data, or relational databases. He has authored two other books, Mastering Kibana 6.x and Kibana 7 Quick Start Guide, both published by Packt.

Bahaaldine Azarmi, or Baha for short, is the head of solutions architecture in the EMEA South region at Elastic. Prior to this position, Baha co-founded ReachFive, a marketing data platform focused on user behavior and social analytics. He has also worked for a number of different software vendors, including Talend and Oracle, where he held positions as a solutions architect and architect. Prior to Machine Learning with the Elastic Stack, Baha authored books including Learning Kibana 5.0, Scalable Big Data Architecture, and Talend for Big Data. He is based in Paris and holds an MSc in computer science from Polytech'Paris.

About the reviewer

Giacomo Veneri graduated in computer science from the University of Siena. He holds a PhD in neuroscience and has various scientific publications to his name. He is Predix IoT-certified and an influencer, as well as being certified in SCRUM and Oracle Java. He has 20 years' experience as an IT architect and team leader. He has been an expert on IoT in the fields of oil and gas and transportation since 2013. He lives in Tuscany, where he loves cycling. He is also the author of Hands-On Industrial Internet of Things and Maven Build Customization, both published by Packt.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Learning Kibana 7  Second Edition

Dedication

About Packt

Why subscribe?

Contributors

About the authors

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1: Understanding Kibana 7

Understanding Your Data for Kibana

Industry challenges

Use cases to explain industry issues

Understanding your data for analysis in Kibana

Data shipping

Data ingestion

Storing data at scale

Visualizing data

Technology limitations

Relational databases

Hadoop

NoSQL

Components of the Elastic Stack

Elasticsearch

Beats

Logstash

Kibana

X-Pack

Security

Monitoring

Alerting

Reporting

Summary

Installing and Setting Up Kibana

Installing Elasticsearch

Elasticsearch installation using the .zip or .tar.gz archives

Downloading and installing using the .zip archive

Downloading and installing using the .tar.gz archive

Running Elasticsearch

Elasticsearch installation on Windows using the .zip package

Downloading and installing the .zip package

Running Elasticsearch

Installing Elasticsearch as a service

Elasticsearch installation using the Debian package

Installing Elasticsearch using the apt repository

Manually installing using the Debian package

Elasticsearch installation using RPM

Installing using the apt repository

Manually installing using RPM

Running Elasticsearch

Running Elasticsearch with SysV

Running Elasticsearch with systemd

Checking whether Elasticsearch is running

Installing Kibana

Kibana installation using the .zip or .tar.gz archives

Downloading and installing using the .tar.gz archive

Running Kibana

Downloading and installing using the .zip archive

Running Kibana

Kibana installation using the Debian package

Installing using the apt repository

Manually installing Kibana using the Debian package

Running Kibana

Running Kibana with SysV

Running Kibana with systemd

Kibana installation using RPM

Installing using the apt repository

Manually installing using RPM

Running Kibana

Running Kibana with SysV

Running Kibana with systemd

Installing Logstash

Installing Logstash using the downloaded binary

Installing Logstash from the package repositories

Installing Logstash using the apt package

Installing Logstash using the yum package

Running Logstash as a service

Running Logstash using systemd

Running Logstash using upstart

Running Logstash using SysV

Installing Beats

Installing Filebeat

deb

rpm

macOS

Linux

win

Installing Metricbeat

deb

rpm

macOS

Linux

win

Installing Packetbeat

deb

rpm

macOS

Linux

win

Installing Heartbeat

deb

rpm

macOS

Linux

win

Installing Winlogbeat

Summary

Section 2: Exploring the Data

Business Analytics with Kibana

Understanding logs

Data modeling

Importing data

Beats

Configuring Filebeat to import data we need to enable the following command in the input section of the filebeat.yml file

Reading log files using Filebeat

Logstash

Reading CSV data using Logstash

Reading MongoDB data using Logstash

Reading MySQL data using Logstash

Creating an index pattern

Summary

Visualizing Data Using Kibana

Creating visualizations in Kibana

Identifying the data to visualize

Creating an area chart, a line chart, and a bar chart

Creating a pie chart

Creating the heatmap

Creating the data table

Creating the metric visualization

Creating the tag cloud

Inspecting the visualization

Sharing the visualization

Creating dashboards in Kibana

Sharing the dashboard

Generating reports

Summary

Section 3: Tools for Playing with Your Data

Dev Tools and Timelion

Introducing Dev Tools

Console

Search profiler

Aggregation profile

Grok Debugger

Timelion

.es()

.label()

.color()

.static()

.bars()

.points()

.derivative()

.holt()

.trend()

.mvavg()

A use case of Timelion

Summary

Space and Graph Exploration in Kibana

Kibana spaces

Creating a space

Editing a space

Deleting a space

Switching between spaces

Moving saved objects between spaces

Restricting space access

Creating a role to provide access to a space

Creating a user and assigning the space access role

Checking the user space access

Kibana graphs

Differences with industry graph databases

Creating a Kibana graph

Advanced graph exploration

Summary

Section 4: Advanced Kibana Options

Elastic Stack Features

Security

Roles

Users

Monitoring

Elasticsearch Monitoring

Kibana Monitoring

Alerting

Creating a threshold alert

Reporting

CSV reports

PDF and PNG reports

Summary

Kibana Canvas and Plugins

Kibana Canvas

Introduction to Canvas

Customizing the workpad

Managing assets

Adding elements

Data tables

Designing the data table

Pie charts

Images

Creating a presentation in Canvas

Kibana plugins

Installing plugins

Removing plugins

Available plugins

Summary

Application Performance Monitoring

APM components

APM agents

The APM Server

Installing the APM Server

APT

YUM

APM Server installation on Windows

Running the APM Server

Configuring the APM Server

Elasticsearch

Kibana

Configuring an application with APM

Configuring the APM agent for the Django application

Running the Django application

Monitoring the APM data

Summary

Machine Learning with Kibana

What is Elastic machine learning?

Machine learning features

Creating machine learning jobs

Data visualizer

Single metric jobs

Practical use case to explain machine learning

Forecasting using machine learning

Multi-metric jobs

Population jobs

Job management

Job settings

Job config

Datafeed

Counts

JSON

Job messages

Datafeed preview

Forecasts

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

This book is here to help you understand the core concepts and the practical implementation of Kibana in different use cases. It covers how to ingest data from different sources into Elasticsearch using Beats or Logstash. It then shows how to explore, analyze, and visualize the data in Kibana. This book covers how to play with time series data to create complex graphs using Timelion and show them along with other visualizations on your dashboard, then how to embed your dashboard or visualization on a web page. You will also learn how to use APM to monitor your application by installing and configuring the APM server and APM agents. We will explore how Canvas can be used to create awesome visualizations. We will also cover different X-Pack features such as user and role management in security, alerting, monitoring, and machine learning. This book will also explain how to create machine learning jobs to find anomalies in your data.

Who this book is for

Aspiring Elastic developers, data analysts, and those interested in learning about the new features of Kibana 7 will find this book very useful. No prior knowledge of Kibana is expected. Previous experience with Elasticsearch will help, but is not mandatory.

What this book covers

Chapter 1, Understanding Your Data for Kibana, introduces the notion of data-driven architecture by explaining the main challenges in the industry, how the Elastic Stack is structured, and what data we'll use to implement some of the use cases in Kibana.

Chapter 2, Installing and Setting Up Kibana, walks the reader through the installation of the Elastic Stack on different platforms.

Chapter 3, Business Analytics with Kibana, describes what a business analytics use case is through a real-life example, and then walks the reader through the process of data ingestion.

Chapter 4, Visualizing Data Using Kibana, describes visualization and dashboarding. Readers will learn how to create different visualizations before moving on to building a dashboard from these visualizations.

Chapter 5, Dev Tools and Timelion, focuses on Dev Tools and Timelion in Kibana. Readers will learn the different options of Dev Tools, such as using Console to run Elasticsearch queries right from the Kibana interface. We will then cover using Search Profiler to profile Elasticsearch queries, and using Grok Debugger to create a Grok pattern with which we can convert unstructured data into structured data through Logstash. After that, we will cover Timelion, which provides functions that can be chained together to build complex time series visualizations for specific use cases that can't be created using the Kibana Visualize option.

Chapter 6, Space and Graph Exploration in Kibana, describes the Elastic Stack Graph plugin, which provides graph analytics. The reader will be walked through the main use cases that the Graph plugin tries to solve, and will see how to interact with the data. After that, we will cover how to create different Spaces and associate them with different roles and users.

Chapter 7, Elastic Stack Features, describes the importance of Elastic features. We will cover security using user and role management, and will then cover reporting, with which we can export CSV and PDF reports. After that, we will explore how to use monitoring to monitor the complete Elastic Stack, and with Watcher, we will configure the alerting system to send an email whenever a value crosses a specified threshold.

Chapter 8, Kibana Canvas and Plugins, describes the Kibana Canvas and explains how we can create custom dashboards with it.

Chapter 9, Application Performance Monitoring, describes Application Performance Monitoring (APM) and how it can be configured to monitor an application. We will cover the installation of APM Server and configure it to receive data from APM agents. Then, we will cover the installation and configuration of APM agents with the application in order to fetch the application data. Lastly, we will explain how to explore data with the built-in APM UI or Kibana Dashboard.

Chapter 10, Machine Learning with Kibana, introduces machine learning and explores how to find data anomalies and predict future trends.

To get the most out of this book

For this book, you will need to download and install the Elastic Stack, specifically Elasticsearch, Kibana, Beats, Logstash, and APM. All the software is available from http://www.elastic.co/downloads. The Elastic Stack can be run in various environments, on different machines and setups. The support matrix is available at https://www.elastic.co/support/matrix.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packt.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Learning-Kibana-7-Second-Edition. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838550363_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "For CentOS and older Red Hat-based distributions, we can use the yum command".

A block of code is set as follows:

input {
  file {
    path => "/home/user/Downloads/Popular_Baby_Names.csv"
    start_position => beginning
  }
}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

elasticsearch {
  action => "index"
  hosts => ["127.0.0.1:9200"]
  index => "popular_baby_names"
}

Any command-line input or output is written as follows:

unzip elasticsearch-7.1.0-windows-x86_64.zip

cd elasticsearch-7.1.0/

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Now, we need to click on the Next step button."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Section 1: Understanding Kibana 7

In this section, we will start with a basic introduction to the Elastic Stack and then discuss what's new in Elastic Stack 7. We will then cover the installation process of the Elastic Stack. By the end of this section, we will know how we can create an index pattern in Kibana.

The following chapters will be covered in this section:

  • Chapter 1, Understanding Your Data for Kibana
  • Chapter 2, Installing and Setting Up Kibana

Understanding Your Data for Kibana

We are living in a digital world in which data is growing at an exponential rate; every digital device sends data on a regular basis, and that data is continuously being stored. Storing huge amounts of data is no longer a problem, because we can use cheap hard drives to store as much data as we want. The most important thing that we can do with that data is to get the information that we need out of it. Once we understand our data, we can analyze or visualize it. This data can come from any domain, such as accounting, infrastructure, healthcare, business, or the Internet of Things (IoT), and it can be structured or unstructured. The main challenge for any organization is to first understand the data it is storing, analyze it to get the information it needs, and create visualizations that give insight into the data in a format that is easy to understand and that enables people in management roles to make quick decisions.

However, it can be difficult to fetch information from data due to the following reasons:

  • Data brings complexity: It is not easy to get to the root cause of any issue; for example, let's say that we want to find out why the traffic system of a city behaves badly on certain days of a month. This issue could be dependent on another set of data that we may not be monitoring. In this case, we could get a better understanding by checking the weather report data for the month. We can then try and find any correlations between the data and discover a pattern.

  • Data comes from different sources: As I have already mentioned, one dataset can depend on another dataset and they can come from two different sources. Now, there may be instances where we cannot get access to all the data sources that are dependent on each other and, for these situations, it is important to understand and gather data from other sources and not just the one that you are interested in.

  • Data is growing at a faster pace: As we move toward a digital era, we are capturing more and more data. As data grows at a quicker pace, it also creates issues in terms of what to keep, how to keep it, and how to process such huge amounts of data to get the relevant information that we need from it.
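To make the traffic and weather example above concrete, the following is a minimal sketch of checking whether two datasets move together. It uses only the Python standard library, the daily values are invented for illustration, and the pearson helper is our own name rather than part of any Elastic tool.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily values for one week (made up for illustration).
avg_delay_minutes = [12, 15, 11, 34, 38, 14, 13]
rainfall_mm = [0, 2, 0, 21, 25, 1, 0]

r = pearson(avg_delay_minutes, rainfall_mm)
print(f"correlation between delay and rainfall: {r:.2f}")
```

A coefficient close to 1 only suggests that the two series rise and fall together; it does not prove that the weather causes the delays, so a result like this is a starting point for further exploration rather than a conclusion.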

We can solve these issues by using the Elastic Stack, as we can store data from different sources by pushing it to Elasticsearch and then analyze and visualize it in Kibana. Kibana addresses many data analysis issues because it provides features for exploring, analyzing, and visualizing data. In this book, we will cover these features along with their practical implementation.

In this chapter, we will cover the following topics:

  • Data analysis and visualization challenges for industries
  • Understanding your data for analysis in Kibana
  • Limitations with existing tools
  • Components of the Elastic Stack

Industry challenges

Depending on the industry, the use cases can be very different in terms of data usage. In any given industry, data is used in different ways and for different purposes, whether it's for security analytics or order management. Data comes in various formats and at different scales of volume. In the telecommunications industry, for example, it's very common to see quality-of-service projects where data is taken from 100,000 network devices.

The challenge for these industries is to handle huge quantities of data and to get real-time visualizations from which decisions can be taken. Data capture is usually performed for applications, but utilizing this data to create a real-time dashboard is a challenge. For that, Beats and Logstash can be used to push data from different sources, Elasticsearch can be used to store that data, and, finally, Kibana can be used to analyze and visualize it. In summary, the industry faces the same two canonical issues:

  • How to handle huge quantities of data, as this comes with a lot of complexity
  • How to visualize data effectively and in a real-time fashion so that we can get data insights easily

Once this is achieved, we can easily recognize visual patterns in the data and, based on them, derive the information we need without the burden of exploring tons of data. Let me now explain a real scenario that will help you to understand the actual challenge of data capture. I will take a simple use case to explain the issues and will then explain the technologies that can be used to solve them.

Use cases to explain industry issues

If we consider the ways in which we receive huge amounts of data, we will note that there are many different sources from which we can get structured or unstructured data. In this digital world, we use many devices that keep on generating and sending data to a central server where the data is then stored. For instance, the applications that we access generate data, the smartphones or smartwatches we use generate data, and even the cab services, railways, and air travel systems we use for transportation all generate data.

A system and its running processes also generate data, so there are many different ways in which we can get data. We get this data at regular intervals, and it either accumulates on the physical drive of a computer or, more frequently, is hidden within data centers, where it is hard to fetch and explore. In order to explore and analyze this data, we need to extract (ship) it from different locations (such as log files, databases, or applications), convert it from an unstructured data format into a structured data format (transform), and then push the transformed data into a central place (store) where we can access it for analysis. This flow of data requires a proper architecture so that data can be shipped, transformed, stored, and accessed in a scalable and distributed way.
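The ship, transform, and store steps described above can be sketched in a few lines of Python. This is only an illustration of the flow, not how the Elastic Stack implements it: the log line format, the function names, and the in-memory list that stands in for a central store are all assumptions made for this example.

```python
import json
import re

# Hypothetical log line format: "<ISO timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def ship(raw_text):
    """Ship: yield the non-empty lines of a raw source."""
    for line in raw_text.splitlines():
        if line.strip():
            yield line.strip()


def transform(line):
    """Transform: turn an unstructured line into a structured record."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Keep unparseable lines instead of silently dropping them.
        return {"message": line, "parse_error": True}
    return match.groupdict()


def store(records, index):
    """Store: append records to a list standing in for a central store."""
    index.extend(records)
    return index


raw = """2019-07-19T10:15:00Z ERROR payment service timed out
2019-07-19T10:15:02Z INFO retry succeeded
"""

central_store = store((transform(line) for line in ship(raw)), [])
print(json.dumps(central_store, indent=2))
```

Keeping the three steps as separate functions mirrors the separation of concerns in the text: each stage can be replaced or scaled independently, which is the property a distributed architecture needs.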

End users, driven by the need to process increasingly higher volumes of data while maintaining real-time query responses, have turned away from more traditional, relational database or data warehousing solutions, due to poor scalability or performance. The solution is increasingly found in highly distributed, clustered data stores that can easily be monitored. Let's take the example of application monitoring, which is one of the most common use cases we meet across industries. Each application logs data, sometimes in a centralized way (for example, by using syslog), and sometimes all the logs are spread out across the infrastructure, which makes it hard to have a single point of access to the data stream.

The majority of large organizations don't retain logged data for longer than the duration of a log file rotation (that is, a few hours or even minutes). This means that, by the time an issue has occurred, the data that could provide the answers is lost.

So, when you actually have the data, what do you do? Well, there are different ways to extract the gist of logs. A lot of people start by using a simple string pattern search (grep). Essentially, they try to find matching patterns in logs using a regular expression. That might work for a single log file but, if you want to search for something across different log files, then you need to open the individual log file for each date and apply the regular expression to it.
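As a rough illustration of this pattern-search approach, the following sketch applies one regular expression to every log file in a directory. The file names and log contents are invented for the example, and grep_logs is a hypothetical helper of our own, not a standard tool.

```python
import re
import tempfile
from pathlib import Path


def grep_logs(directory, pattern, glob="*.log"):
    """Return (file name, line number, line) for every line matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(directory).glob(glob)):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((path.name, number, line.rstrip("\n")))
    return hits


# Demo data: two hypothetical daily log files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "app-2019-07-18.log").write_text(
        "INFO started\nERROR disk full\n", encoding="utf-8"
    )
    Path(tmp, "app-2019-07-19.log").write_text(
        "ERROR timeout\nINFO recovered\n", encoding="utf-8"
    )
    results = grep_logs(tmp, r"\bERROR\b")
    for name, number, line in results:
        print(f"{name}:{number}: {line}")
```

Even in this small form, the limitation is visible: every search rereads every file from start to finish, so the cost grows with the total volume of logs rather than with the number of matches.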

Although grep is convenient, it clearly doesn't fit our need to react quickly to failure in order to reduce the Mean Time To Recovery (MTTR). Think about it: what if we were talking about a major issue in the purchasing API of an e-commerce website? What if the users experience a high latency on this page or, worse, can't go to the end of the purchase process? The time you spend trying to recover your application by digging through gigabytes of logs is money you could potentially lose. Another potential issue could be a lack of security analytics, such as not being able to blacklist the IPs that try to brute-force your application.
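To make the MTTR figure tangible, here is a small sketch that computes it as the average time between a failure starting and the service being restored. That definition is an assumption made for this example (teams measure MTTR in slightly different ways), and the incident timestamps are invented.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average the (recovered - failed) durations of a list of incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (recovered - failed for failed, recovered in incidents), timedelta()
    )
    return total / len(incidents)


# Hypothetical incidents: (time the failure started, time service was restored).
incidents = [
    (datetime(2019, 7, 1, 9, 0), datetime(2019, 7, 1, 9, 45)),
    (datetime(2019, 7, 8, 14, 30), datetime(2019, 7, 8, 16, 0)),
    (datetime(2019, 7, 15, 2, 10), datetime(2019, 7, 15, 2, 25)),
]

mttr = mean_time_to_recovery(incidents)
print(f"MTTR: {mttr}")
```

With recovery times of 45, 90, and 15 minutes, the average is 50 minutes; every minute saved while searching the logs reduces this number directly.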

In the same context, I've seen use cases where people didn't know that every night there was a group of IPs attempting to get into their system, simply because they were not able to visualize the IPs on a map and trigger alerts based on their value. A simple yet very effective pattern to protect the system would have been to limit access to resources or services to the internal system only. The ability to whitelist access to a known set of IP addresses is essential. The consequences can be dramatic if a proper data-driven architecture with a solid visualization layer is not serving those needs: a lack of visibility and control, an increase in the MTTR, customer dissatisfaction, financial impact, security leaks, and bad response times and user experiences.
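The two ideas in this paragraph, spotting IPs that repeatedly fail to log in and restricting access to a known set of addresses, can be sketched as follows. The network ranges, the event list, and the threshold are all hypothetical values chosen for illustration; a real setup would read the events from the stored log data.

```python
from collections import Counter
from ipaddress import ip_address, ip_network

# Hypothetical internal ranges that are allowed to reach the service.
ALLOWED_NETWORKS = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]


def is_allowed(address):
    """Return True if the address falls inside one of the allowed networks."""
    ip = ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


def suspicious_ips(failed_logins, threshold):
    """Return non-allowed IPs with at least `threshold` failed logins."""
    counts = Counter(failed_logins)
    return sorted(
        ip
        for ip, count in counts.items()
        if count >= threshold and not is_allowed(ip)
    )


# Hypothetical source IPs of failed login events seen overnight.
events = ["203.0.113.7"] * 5 + ["10.1.2.3"] * 6 + ["198.51.100.20"] * 2

print(suspicious_ips(events, threshold=5))
```

Here the internal address is ignored despite its six failures because it is on the allowed list, while the external address that reaches the threshold is reported. The same counts, aggregated over time and plotted on a map, are what a visualization layer would surface automatically.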

Understanding your data for analysis in Kibana

Here, we will discuss different aspects of data analysis, such as data shipping, data ingestion, data storage, and data visualization. These are all very important aspects of data analysis and visualization, and we need to understand each of them in detail. The objective is to understand how to build an architecture that serves each of the following aspects.

Data shipping

A data-shipping architecture should support the transport of any sort of data or event, whether structured or unstructured. The primary goal of data shipping is to send data from remote machines to a centralized location in order to make it available for further exploration. For data shipping, we generally deploy lightweight agents that sit on the same server as the data we want to collect. These shippers fetch the data and keep sending it to the centralized server. For data shipping, we need to consider the following: