Description

Leverage Splunk's operational intelligence capabilities to unlock new hidden business insights and drive success

Key Features

  • Tackle any problems related to searching and analyzing your data with Splunk
  • Get the latest information and business insights on Splunk 7.x
  • Explore the all new machine learning toolkit in Splunk 7.x

Book Description

Splunk makes it easy for you to take control of your data, and with Splunk Operational Intelligence Cookbook, you can be confident that you are taking advantage of the Big Data revolution and driving your business with the cutting edge of operational intelligence and business analytics.

With more than 80 recipes that demonstrate all of Splunk’s features, not only will you find quick solutions to common problems, but you’ll also learn a wide range of strategies and uncover new ideas that will make you rethink what operational intelligence means to you and your organization.

You’ll discover recipes on data processing, searching and reporting, dashboards, and visualizations to make data shareable, communicable, and most importantly meaningful. You’ll also find step-by-step demonstrations that walk you through building an operational intelligence application containing vital features essential to understanding data and to help you successfully integrate a data-driven way of thinking in your organization.

Throughout the book, you’ll dive deeper into Splunk, explore data models and pivots to extend your intelligence capabilities, and perform advanced searching with machine learning to explore your data in even more sophisticated ways. Splunk is changing the business landscape, so make sure you’re taking advantage of it.

What you will learn

  • Learn how to use Splunk to gather, analyze, and report on data
  • Create dashboards and visualizations that make data meaningful
  • Build an intelligent application with extensive functionalities
  • Enrich operational data with lookups and workflows
  • Model and accelerate data and perform pivot-based reporting
  • Apply ML algorithms for forecasting and anomaly detection
  • Summarize data for long term trending, reporting, and analysis
  • Integrate advanced JavaScript charts and leverage Splunk's API

Who this book is for

This book is intended for data professionals who are looking to leverage the Splunk Enterprise platform as a valuable operational intelligence tool. The recipes provided in this book will appeal to individuals from all facets of business, IT, security, product, marketing, and many more! Existing users of Splunk who want to upgrade and get up and running with Splunk 7.x will also find this book to be of great value.

Josh Diakun is an IT operations and security specialist with a focus on creating data-driven operational processes. He has over 10 years of experience managing and architecting enterprise-grade IT environments. For the past 7 years, he has been architecting, deploying, and developing on Splunk as the core platform for organizations to gain security and operational intelligence. Josh is a founding partner at Discovered Intelligence, a company specializing in data intelligence services and solutions. He is also a co-founder of the Splunk Toronto User Group.

Paul R Johnson has over 10 years of data intelligence experience in the areas of information security, operations, and compliance. He is a partner at Discovered Intelligence, a company specializing in data intelligence services and solutions. Paul previously worked for a Fortune 10 company, leading IT risk intelligence initiatives and managing a global Splunk deployment. Paul co-founded the Splunk Toronto User Group and lives and works in Toronto, Canada.

Derek Mock is a software developer and big data architect who specializes in IT operations, information security, and cloud technologies. He has 15 years' experience developing and operating large enterprise-grade deployments and SaaS applications. He is a founding partner at Discovered Intelligence, a company specializing in data intelligence services and solutions. For the past 6 years, he has been leveraging Splunk as the core tool to deliver key operational intelligence. Derek is based in Toronto, Canada, and is a co-founder of the Splunk Toronto User Group.




Splunk Operational Intelligence Cookbook
Third Edition

Over 80 recipes for transforming your data into business-critical insights using Splunk

Josh Diakun
Paul R Johnson
Derek Mock

BIRMINGHAM - MUMBAI

Splunk Operational Intelligence Cookbook Third Edition

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Veena Pagare
Acquisition Editor: Vinay Argekar
Content Development Editor: Aaryaman Singh
Technical Editor: Danish Shaikh
Copy Editor: Safis Editing
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Tania Dutta
Production Coordinator: Shraddha Falebhai

First published: October 2014
Second edition: June 2016
Third edition: May 2018

Production reference: 1220518

Published by Packt Publishing Ltd.
Livery Place, 35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78883-523-7

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

PacktPub.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the authors

Josh Diakun is an IT operations and information security specialist with over 15 years of experience managing and architecting enterprise-grade IT environments and security programs. He has spent the last 10 years specializing in the Splunk platform, being recognized locally and globally for his expertise. Josh is a founding partner of Discovered Intelligence, a multi-award-winning Splunk services company. Through Discovered Intelligence, Josh works with the most recognizable businesses worldwide, helping them to achieve their IT operations, security, and compliance goals.

Thank you to my co-authors, Derek Mock and Paul Johnson, for their endless effort and creativity in writing this book. Thank you to Jordan Crombie and Peter Maoloni, the path I am on stems from their support and enablement. My wife, Rachel, thank you for supporting me and making sure I slept. To Denyce and Jessika, your continued support means so much, and to my late father, John, my efforts will always be in your memory.

Paul R Johnson has over 15 years' data intelligence experience in the areas of information security, operations, and compliance. He is passionate about helping businesses gain intelligence and insight from their data at scale. Paul has award-winning Splunk expertise and is a founding partner of Discovered Intelligence, a company known for the quality of its Splunk service delivery. He previously worked for a Fortune 10 company, leading global IT risk intelligence initiatives.

I would like to thank my fellow authors, Josh Diakun and Derek Mock, for their support and collaborative efforts in writing this third edition of the book. Thanks guys for drowning in screenshots and giving up nights, days, and weekends to get it completed! I would also like to thank my wife, Stacey, for her continued support.

Derek Mock is a software developer and architect with expertise in IT operations and cloud technologies. He has over 20 years' experience developing, integrating, and operating large enterprise-grade deployments and SaaS applications. Derek is a founding partner of Discovered Intelligence and previously leveraged Splunk in a managed services company as a core tool for delivering key operational intelligence. Derek is a co-founder of the Splunk Toronto User Group and lives and works in Toronto, Canada.

I could not have asked for better co-authors than Josh Diakun and Paul Johnson, whose tireless efforts over many late nights brought this book into being. I would also like to thank Dave Penny, for all his support in my professional life. Finally, thanks to my partner Alison, and my children, Sarah and James, for cheering me on as I wrote and for always making sure I had enough coffee.

About the reviewer

Yogesh Raheja is a certified DevOps and cloud expert with a decade of IT experience. He has expertise in technologies such as operating systems, source code management, build and release tools, continuous integration/deployment/delivery tools, containers, configuration management tools, monitoring and logging tools, and public and private clouds. He loves to share his technical expertise with audiences worldwide at various forums, conferences, webinars, blogs, and on LinkedIn (https://in.linkedin.com/in/yogesh-raheja-b7503714). He has also reviewed Implementing Splunk 7, Third Edition, written by James D. Miller, has published online courses on Udemy, and has written Automation with Puppet 5 and Automation with Ansible.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Splunk Operational Intelligence Cookbook Third Edition

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the authors

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Sections

Getting ready

How to do it...

How it works...

There's more...

See also

Get in touch

Reviews

Play Time – Getting Data In

Introduction

Indexing files and directories

Getting ready

How to do it...

How it works...

There's more...

Adding a file or directory data input using the CLI

Adding a file or directory input using inputs.conf

One-time indexing of data files using the Splunk CLI

Indexing the Windows event logs

See also

Getting data through network ports

Getting ready

How to do it...

How it works...

There's more...

Adding a network input using the CLI

Adding a network input using inputs.conf

See also

Using scripted inputs

Getting ready

How to do it...

How it works...

See also

Using modular inputs

Getting ready

How to do it...

How it works...

There's more...

See also

Using the Universal Forwarder to gather data

Getting ready

How to do it...

How it works...

There's more...

Adding the receiving indexer via outputs.conf

Receiving data using the HTTP Event Collector

Getting ready

How to do it...

How it works...

Getting data from databases using DB Connect

Getting ready

How to do it...

How it works...

Loading the sample data for this book

Getting ready

How to do it...

How it works...

See also

Data onboarding – defining field extractions

Getting ready

How to do it...

How it works...

See also

Data onboarding – defining event types and tags

Getting ready

How to do it...

How it works...

There's more...

Adding event types and tags using eventtypes.conf and tags.conf

See also

Installing the Machine Learning Toolkit

Getting ready

How to do it...

How it works...

Diving into Data – Search and Report

Introduction

The Search Processing Language 

Searching in Splunk

Boolean operators

Common commands

Time modifiers

Working with fields

Saving searches in Splunk

Making raw event data readable

Getting ready

How to do it...

How it works...

There's more...

Tabulating every field

Removing fields, then tabulating everything else

Finding the most accessed web pages

Getting ready

How to do it...

How it works...

There's more...

Searching for the top 10 accessed web pages

Searching for the most accessed pages by user

See also

Finding the most used web browsers

Getting ready

How to do it...

How it works...

There's more...

Searching for the web browser data for the most used OS types

See also

Identifying the top-referring websites

Getting ready

How to do it...

How it works...

There's more...

Searching for the top 10 using stats instead of top

See also

Charting web page response codes

Getting ready

How to do it...

How it works...

There's more...

Totaling success and error web page response codes

See also

Displaying web page response time statistics

Getting ready

How to do it...

How it works...

There's more...

Displaying web page response time by action

See also

Listing the top-viewed products

Getting ready

How to do it...

How it works...

There's more...

Searching for the percentage of cart additions from product views

See also

Charting the application's functional performance

Getting ready

How to do it...

How it works...

There's more...

See also

Charting the application's memory usage

Getting ready

How to do it...

How it works...

See also

Counting the total number of database connections

Getting ready

How to do it...

How it works...

See also

Dashboards and Visualizations - Make Data Shine

Introduction

About Splunk dashboards

Using dashboards for Operational Intelligence

Enriching data with visualizations

Available visualizations

Trellis layout

Best practices for visualizations

Creating an Operational Intelligence dashboard

Getting ready

How to do it...

How it works...

There's more...

Changing dashboard permissions

Using a pie chart to show the most accessed web pages

Getting ready

How to do it...

How it works...

There's more...

Searching for the top ten accessed web pages

See also

Displaying the unique number of visitors

Getting ready

How to do it...

How it works...

There's more...

Adding labels to a single value panel

Coloring the value based on ranges

Adding trends and sparklines to the values

See also

Using a gauge to display the number of errors

Getting ready

How to do it...

How it works...

There's more...

See also

Charting the number of method requests by type and host

Getting ready

How to do it...

How it works...

See also

Creating a timechart of method requests, views, and response times

Getting ready

How to do it...

How it works...

There's more...

Method requests, views, and response times by host

See also

Using a scatter chart to identify discrete requests by size and response time

Getting ready

How to do it...

How it works...

There's more...

Using time series data points with a scatter chart

See also

Creating an area chart of the application's functional statistics

Getting ready

How to do it...

How it works...

See also

Using metrics data and a trellis layout to monitor physical environment operating conditions

Getting ready

How to do it...

How it works...

See also

Using a bar chart to show the average amount spent by category

Getting ready

How to do it...

How it works...

See also

Creating a line chart of item views and purchases over time

Getting ready

How to do it...

How it works...

See also

Building an Operational Intelligence Application

Introduction

Creating an Operational Intelligence application

Getting ready

How to do it...

How it works...

There's more...

Creating an application from another application

Downloading and installing a Splunk app

See also

Adding dashboards and reports

Getting ready

How to do it...

How it works...

There's more...

Changing permissions of saved reports

See also

Organizing the dashboards more efficiently

Getting ready

How to do it...

How it works...

There's more...

Modifying the Simple XML directly

See also

Dynamically drilling down on activity reports

Getting ready

How to do it...

How it works...

There's more...

Disabling the drilldown feature in tables and charts

See also

Creating a form for searching web activity

Getting ready

How to do it...

How it works...

There's more...

Adding a Submit button to your form

See also

Linking web page activity reports to the form

Getting ready

How to do it...

How it works...

There's more...

Adding an overlay to the Sessions Over Time chart

See also

Displaying a geographical map of visitors

Getting ready

How to do it...

How it works...

There's more...

Adding a map panel using Simple XML

Mapping different distributions by area

See also

Highlighting average product price

Getting ready

How to do it...

How it works...

See also

Scheduling the PDF delivery of a dashboard

Getting ready

How to do it...

How it works...

See also

Extending Intelligence – Datasets, Modeling and Pivoting

Introduction

Creating a data model for web access logs

Getting ready

How to do it...

How it works...

There's more...

Viewing datasets using the dataset listing page

Searching datasets using the search interface

See also

Creating a data model for application logs

Getting ready

How to do it...

How it works...

See also

Accelerating data models

Getting ready

How to do it...

How it works...

There's more...

Viewing data model and acceleration summary information

Advanced configuration of data model acceleration

See also

Pivoting total sales transactions

Getting ready

How to do it...

How it works...

There's more...

Searching datasets using the pivot command

Searching accelerated datasets using the tstats command

See also

Pivoting purchases by geographic location

Getting ready

How to do it...

How it works...

See also

Pivoting slowest responding web pages

Getting ready

How to do it...

How it works...

See also

Pivot charting top error codes

Getting ready

How to do it...

How it works...

See also

Diving Deeper – Advanced Searching, Machine Learning and Predictive Analytics

Introduction

Identifying and grouping transactions

Converging data sources

Identifying relationships between fields

Predicting future values

Discovering anomalous values

Leveraging machine learning

Calculating the average session time on a website

Getting ready

How to do it...

How it works...

There's more...

Starts with a website visit, ends with a checkout

Defining maximum pause, span, and events in a transaction

See also

Calculating the average execution time for multi-tier web requests

Getting ready

How to do it...

How it works...

There's more...

Calculating the average execution time without using a join

See also

Displaying the maximum concurrent checkouts

Getting ready

How to do it...

How it works...

See also

Analyzing the relationship of web requests

Getting ready

How to do it...

How it works...

There's more...

Analyzing relationships of DB actions to memory utilization

See also

Predicting website traffic volumes

Getting ready

How to do it...

How it works...

There's more...

Create and apply a machine learning model of traffic over time

Predicting the total number of items purchased

Predicting the average response time of function calls

See also

Finding abnormally-sized web requests

Getting ready

How to do it...

How it works...

There's more...

The anomalies command

The anomalousvalue command

The anomalydetection command

The cluster command

See also

Identifying potential session spoofing

Getting ready

How to do it...

How it works...

There's more...

Creating logic for urgency

See also

Detecting outliers in server response times

Getting ready

How to do it...

How it works...

Forecasting weekly sales

Getting ready

How to do it...

How it works...

Summary

Enriching Data – Lookups and Workflows

Introduction

Lookups

Workflows

DB Connect

Looking up product code descriptions

Getting ready

How to do it...

How it works...

There's more...

Manually adding the lookup to Splunk

See also

Flagging suspect IP addresses

Getting ready

How to do it...

How it works...

There's more...

Modifying an existing saved search to populate a lookup table

See also

Creating a session state table

Getting ready

How to do it...

How it works...

There's more...

Use the Splunk KV store to maintain the session state table

See also

Adding hostnames to IP addresses

Getting ready

How to do it...

How it works...

There's more...

Enabling automatic external field lookups

See also

Searching ARIN for a given IP address

Getting ready

How to do it...

How it works...

There's more...

Limiting workflow actions by event types

See also

Triggering a Google search for a given error

Getting ready

How to do it...

How it works...

There's more...

Triggering a Google search from the chart drilldown options

See also

Generating a chat notification for application errors

Getting ready

How to do it...

How it works...

There's more...

Adding a workflow action manually in Splunk

See also

Looking up inventory from an external database

Getting ready

How to do it...

How it works...

There's more...

Using DB Connect for direct external DB lookups

See also

Being Proactive – Creating Alerts

Introduction

About Splunk alerts

Types of alert

Alert trigger conditions

Alert trigger actions

Alerting on abnormal web page response times

Getting ready

How to do it...

How it works...

There's more...

Viewing alerts in Splunk's Triggered Alert view

See also

Alerting on errors during checkout in real time

Getting ready

How to do it...

How it works...

There's more...

Building alerts via a configuration file

Editing alert configuration attributes using Advanced edit

Identify the real-time searches that are running

See also

Alerting on abnormal user behavior

Getting ready

How to do it...

How it works...

There's more...

Alerting on abnormal user purchases without checkouts

See also

Alerting on failure and triggering a chat notification

Getting ready

How to do it...

How it works...

There's more...

See also

Alerting when predicted sales exceed inventory

Getting ready

How to do it...

How it works...

See also

Generating alert events for high sensor readings

Getting ready

How to do it...

How it works...

There's more...

Speeding Up Intelligence – Data Summarization

Introduction

Data summarization

Data summarization methods

About summary indexing

How summary indexing helps

About report acceleration

The simplicity of report acceleration

Calculating an hourly count of sessions versus completed transactions

Getting ready

How to do it...

How it works...

There's more...

Generating the summary more frequently

Avoiding summary index overlaps and gaps

See also

Backfilling the number of purchases by city

Getting ready

How to do it...

How it works...

There's more...

Backfilling a summary index from within a search directly

See also

Displaying the maximum number of concurrent sessions over time

Getting ready

How to do it...

How it works...

There's more...

Viewing the status of an accelerated report

See also

Above and Beyond – Customization, Web Framework, HTTP Event Collector, REST API, and SDKs

Introduction

Web framework

REST API

Software development kits (SDKs)

HTTP Event Collector (HEC)

Customizing the application navigation

Getting ready

How to do it...

How it works...

There's more...

Adding a Sankey diagram of web hits

Getting ready

How to do it...

How it works...

There's more...

Changing the Sankey diagram options

See also

Developing a tag cloud of purchases by country

Getting ready

How to do it...

How it works...

There's more...

See also

Adding cell icons to highlight average product price

Getting ready

How to do it...

How it works...

See also

Remotely querying Splunk's REST API for unique page views

Getting ready

How to do it...

How it works...

There's more...

Authenticating with a session token

See also

Creating a Python application to return unique IP addresses

Getting ready

How to do it...

How it works...

There's more...

Paginating the results of your search

See also

Creating a custom search command to format product names

Getting ready

How to do it...

How it works...

See also

Collecting data from remote scanning devices

Getting ready

How to do it...

How it works...

See also

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

Splunk makes it easy for you to take control of your data, and with Splunk Operational Intelligence Cookbook, you can be confident that you are taking advantage of the Big Data revolution and driving your business with the cutting edge of operational intelligence and business analytics.

With more than 80 recipes that demonstrate all of Splunk's features, not only will you find quick solutions to common problems, but you'll also learn a wide range of strategies and uncover new ideas that will make you rethink what operational intelligence means to you and your organization.

Who this book is for

This book is intended for users of all levels who are looking to leverage the Splunk Enterprise platform as a valuable operational intelligence tool. The recipes provided in this book will appeal to individuals from all facets of business, IT, security, product, marketing, and many more!

Also, existing users of Splunk who want to upgrade and get up and running with the latest release of Splunk will find this book invaluable.

What this book covers

Chapter 1, Play Time – Getting Data In, introduces you to the many ways in which you can get data into Splunk, whether it is collecting data locally from files and directories, receiving it through TCP/UDP port inputs, directly from a Universal Forwarder, or simply utilizing Scripted and Modular Inputs. Regardless of how Operational Intelligence is approached, the right data at the right time is pivotal to success; this chapter will play a key role in highlighting what data to consider and how to efficiently and effectively get that data into Splunk. It will also introduce the data sets that will be used throughout this book and where to obtain samples that can be used to follow each of the recipes as they are written.

Chapter 2, Diving into Data – Search and Report, introduces you to the first set of recipes in the book. Leveraging the data now available as a result of the previous chapter, the information and recipes will guide you through searching event data using Splunk's SPL (Search Processing Language); applying field extractions; grouping common events based on field values; and then building basic reports using the table, top, chart, and stats commands.

Chapter 3, Dashboards and Visualizations – Make Data Shine, guides you through building visualizations based on the reports created in the previous chapter. The information and recipes provided in this chapter will empower you to take your data and reports and bring them to life through the powerful visualizations provided by Splunk. The visualizations introduced include single values, charts (bar, pie, line, and area), scatter charts, and gauges.

Chapter 4, Building an Operational Intelligence Application, builds on the understanding of visualizations that you gained in the previous chapter to introduce the concept of dashboards. Dashboards provide a powerful way to bring visualizations together and provide the holistic visibility required to fully capture the operational intelligence that is most important. The information and recipes provided in this chapter will outline the purpose of dashboards, how to properly utilize them, how to use the dashboard editor to build a dashboard, how to build a form for searching event data, and much more.

Chapter 5, Extending Intelligence – Datasets, Modeling and Pivoting, covers two powerful features of Splunk Enterprise: the ability to create datasets and the Pivot tool. The information and recipes provided in this chapter will introduce you to the concept of Splunk datasets. You will build data models, use the Pivot tool, and write accelerated searches to quickly create intelligence-driven reports and visualizations.

Chapter 6, Diving Deeper – Advanced Searching, Machine Learning and Predictive Analytics, helps you harness the ability to converge data from different sources and understand or build relationships between the events. By now you will have an understanding of how to derive operational intelligence from data by using some of Splunk's most common features. The information and recipes provided in this chapter will take you deeper into the data by introducing the Machine Learning Toolkit, transactions, subsearching, concurrency, associations, and more advanced search commands.

Chapter 7, Enriching Data – Lookups and Workflows, enables you to apply this functionality to further enhance your understanding of the data being analyzed. As illustrated in the preceding chapters, event data, whether from a single-tier or multi-tier web application stack, can provide a wealth of operational intelligence and awareness. That intelligence can be further enriched through the use of lookups and workflow actions. The information and recipes provided in this chapter will introduce the concepts of lookups and workflow actions for the purpose of augmenting the data being analyzed.

Chapter 8, Being Proactive – Creating Alerts, guides you through creating alerts based on the knowledge gained from previous chapters. A key asset for complete operational intelligence and awareness is the ability to be proactive through scheduled or real-time alerts. The information and recipes provided in this chapter will introduce you to this concept and the benefits of proactive alerts, and provide context about when alerts are best applied.

Chapter 9, Speeding Up Intelligence – Data Summarization, provides you with a short introduction to common situations where summary indexing can be leveraged to speed up reports or preserve focused statistics over long periods of time. With big data being just that, big, it can sometimes be very time-consuming to search massive sets of data and costly to store that data for long periods of time. The information and recipes provided in this chapter will introduce you to the concept of summary indexing for the purposes of accelerating reports and speeding up the time it takes to unlock business insight.

Chapter 10, Above and Beyond – Customization, Web Framework, HTTP Event Collector, REST API, and SDKs, introduces you to several very powerful features of Splunk. These features provide the ability to create a very rich and powerful interactive experience with Splunk. This chapter will open you up to the possibilities beyond core Splunk Enterprise and show you how to create your own operational intelligence application, including powerful visualizations. It will also provide a recipe for querying Splunk's REST API and a basic Python application leveraging Splunk's SDK to execute a search.

To get the most out of this book

You'll need the Splunk Enterprise 7.1 (or greater) software.

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

1. Log in or register at www.packtpub.com.

2. Select the SUPPORT tab.

3. Click on Code Downloads & Errata.

4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Splunk-Operational-Intelligence-Cookbook-Third-Edition. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "However, in addition to using the GUI, you can also specify time ranges directly in your search string using the earliest and latest time modifiers. When a time modifier is used in this way, it will automatically override any time range that might be set in the GUI time range picker."

A block of code is set as follows:

index=main sourcetype=access_combined | eval browser=useragent | replace *Firefox* with Firefox, *Chrome* with Chrome, *MSIE* with "Internet Explorer", *Version*Safari* with Safari, *Opera* with Opera in browser | top limit=5 useother=t browser

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select the Search & Reporting application."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it..., How it works..., There's more..., and See also).

To give you clear instructions on how to complete a recipe, these sections are used as follows:

Getting ready

This section tells you what to expect in the recipe and describes how to set up any software or any preliminary settings required for the recipe.

How to do it...

This section contains the steps required to follow the recipe.

How it works...

This section usually consists of a detailed explanation of what happened in the previous section.

There's more...

This section consists of additional information about the recipe in order to make you more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.

Play Time – Getting Data In

In this chapter, we will cover the basic ways to get data into Splunk, in addition to some other recipes that will help prepare you for later chapters. You will learn about the following recipes:

Indexing files and directories

Getting data through network ports

Using scripted inputs

Using modular inputs

Using the Universal Forwarder to gather data

Receiving data using the HTTP Event Collector

Getting data from databases using DB Connect

Loading the sample data for this book

Data onboarding – defining field extractions

Data onboarding – defining event types and tags

Installing the Machine Learning Toolkit

Introduction

The machine data that facilitates operational intelligence comes in many different forms and from many different sources. Splunk can collect and index data from several sources, including log files written by web servers or business applications, syslog data streaming in from network devices, or the output of custom developed scripts. Even data that looks complex at first can be easily collected, indexed, transformed, and presented back to you in real time.

This chapter will walk you through the basic recipes that will act as the building blocks to get the data you want into Splunk. The chapter will further serve as an introduction to the sample data sets that we will use to build our own operational intelligence Splunk app. The datasets will be coming from a hypothetical three-tier e-commerce web application and will contain web server logs, application logs, and database logs.

Splunk Enterprise can index any type of data; however, it works best with time-series data (data with timestamps). When Splunk Enterprise indexes data, it breaks it into events, based on timestamps and/or event size, and puts them into indexes. Indexes are data stores that Splunk has engineered to be very fast, searchable, and scalable across a distributed server environment.

All data indexed into Splunk is assigned a source type. The source type helps identify the data format type of the event and where it has come from. Splunk has several preconfigured source types, but you can also specify your own. Example source types include access_combined, cisco_syslog, and linux_secure. The source type is added to the data when the indexer indexes it into Splunk. It is a key field that is used when performing field extractions and when conducting many searches to filter the data being searched.
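As a simple illustration of filtering by source type, and assuming some Linux authentication data has been indexed with the linux_secure source type, a search along these lines would report failed logins by host:

sourcetype=linux_secure "Failed password" | stats count by host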

The Splunk community plays a big part in making it easy to get data into Splunk. The ability to extend Splunk has provided the opportunity for the development of inputs, commands, and applications that can be easily shared. If there is a particular system or application you are looking to index data from, there is most likely someone who has developed and published relevant configurations and tools that can be easily leveraged by your own Splunk Enterprise deployment.

Splunk Enterprise is designed to make the collection of data very easy, and it will not take long before you are asked, or try yourself, to get as much data into Splunk as possible, or at least as much as your license will allow for!

Indexing files and directories

File- and directory-based inputs are the most commonly used ways of getting data into Splunk. The primary need for these types of input will be to index logfiles. Almost every application or system produces a logfile, and it is generally full of data that you want to be able to search and report on.

Splunk can continuously monitor for new data being written to existing files or new files being added to a directory, and it is able to index this data in real time. Depending on the type of application that creates the logfiles, you would set up Splunk to either monitor an individual file based on its location, or scan an entire directory and monitor all the files that exist within it. The latter configuration is more commonly used when the logfiles being produced have unique filenames, such as filenames containing a timestamp.

This recipe will show you how to configure Splunk to continuously monitor and index the contents of a rolling logfile located on the Splunk server. The recipe specifically shows how to monitor and index a Red Hat Linux system's messages logfile (/var/log/messages). However, the same principle can be applied to a logfile on a Windows system, and a sample file is provided. Do not attempt to index the Windows event logs this way, as Splunk has specific Windows event inputs for this.

Getting ready

To step through this recipe, you will need a running Splunk Enterprise server and access to read the /var/log/messages file on Linux. No other prerequisites are required. If you are not using Linux and/or do not have access to the /var/log/messages location on your Splunk server, use the cp01_messages.log file that is provided and upload it to an accessible directory on your Splunk server.

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you.

How to do it...

Follow these steps to monitor and index the contents of a file:

1. Log in to your Splunk server.

2. From the menu in the top right-hand corner, click on the Settings menu and then click on the Add Data link.

3. If you are prompted to take a quick tour, click on Skip.

4. In the How do you want to add data section, click on Monitor.

5. Click on the Files & Directories section.

6. In the File or Directory section, enter the path to the logfile (/var/log/messages or the location of the cp01_messages.log file), ensure Continuously Monitor is selected, and click on Next.

If you are just looking to do a one-time upload of a file, you can select Index Once instead. This can be useful to index a set of data that you would like to put into Splunk, either to backfill some missing or incomplete data or just to take advantage of its searching and reporting tools.

7. If you are using the provided file or the native /var/log/messages file, the data preview will show the correct line breaking of events and timestamp recognition. Click on the Next button.

8. A Save Source Type box will pop up. Enter linux_messages as the Name and then click on Save.

9. On the Input Settings page, leave all the default settings and click Review.

10. Review the settings and if everything is correct, click Submit.

11. If everything was successful, you should see a File input has been created successfully message.

12. Click on the Start searching button. The Search & Reporting app will open with the search already populated based on the settings supplied earlier in the recipe.

In this recipe, we could have simply used the common syslog source type or let Splunk choose a source type name for us; however, starting a new source type is often a better choice. The syslog format can look completely different depending on the data source. As knowledge objects, such as field extractions, are built on top of source types, using a single syslog source type for everything can make it challenging to search for the data you need.

How it works...

When you add a new file or directory data input, you are basically adding a new configuration stanza into an inputs.conf file behind the scenes. The Splunk server can contain one or more inputs.conf files, and these files are either located in $SPLUNK_HOME/etc/system/local or in the local directory of a Splunk app.

Splunk uses the monitor input type, which is set to point to either a file or a directory. If you set the monitor to a directory, all the files within that directory will be monitored. When Splunk monitors files, it initially starts by indexing all the data that it can read from the beginning. Once complete, Splunk maintains a record of where it last read the data from, and if any new data comes into the file, it reads this data and advances the record. The process is nearly identical to using the tail command in Unix-based operating systems. If you are monitoring a directory, Splunk also provides many additional configuration options, such as blacklisting files you don't want Splunk to index.
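As a sketch of what such a stanza can look like (the attribute values here are illustrative, not the exact ones Splunk writes for you), a monitor input for this recipe might be:

[monitor:///var/log/messages]
sourcetype = linux_messages
disabled = false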

For more information on Splunk's configuration files, visit https://docs.splunk.com/Documentation/Splunk/latest/Admin/Aboutconfigurationfiles.

There's more...

While adding inputs to monitor files and directories can be done through the web interface of Splunk, as outlined in this recipe, there are other approaches to add multiple inputs quickly. These allow for customization of the many configuration options that Splunk provides.

Adding a file or directory data input using the CLI

Instead of using the GUI, you can add a file or directory input through the Splunk command-line interface (CLI). Navigate to your $SPLUNK_HOME/bin directory and execute the following command (replacing the file or directory to be monitored with your own):

For Unix, we will be using the following code to add a file or directory input:

./splunk add monitor /var/log/messages -sourcetype linux_messages

For Windows, we will be using the following code to add a file or directory input:

splunk add monitor c:/filelocation/cp01_messages.log -sourcetype linux_messages

There are a number of different parameters that can be passed along with the file location to monitor.
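For example, assuming you also want to direct the data to the main index, the -index parameter can be added:

./splunk add monitor /var/log/messages -sourcetype linux_messages -index main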

See the Splunk documentation for more on data inputs using the CLI (https://docs.splunk.com/Documentation/Splunk/latest/Data/MonitorfilesanddirectoriesusingtheCLI).

One-time indexing of data files using the Splunk CLI

Although you can select Upload and Index a file from the Splunk GUI to upload and index a file, there are a couple of CLI functions that can be used to perform one-time bulk loads of data.

Use the oneshot command to tell Splunk where the file is located and which parameters to use, such as the source type:

./splunk add oneshot XXXXXXX
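For example, assuming the provided sample file has been copied to /tmp, a one-time load with an explicit source type might look like this:

./splunk add oneshot /tmp/cp01_messages.log -sourcetype linux_messages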

Another way is to place the file you wish to index into the Splunk spool directory, $SPLUNK_HOME/var/spool/splunk, and then add the file using the spool command, as shown in the following code:

./splunk spool XXXXXXX

If using Windows, omit the dot and slash (./) that is in front of the Splunk commands mentioned earlier.

See also

  • The Getting data through network ports recipe
  • The Using scripted inputs recipe
  • The Using modular inputs recipe

Getting data through network ports

Not every machine has the luxury of being able to write logfiles. Sending data over network ports and protocols is still very common. For instance, sending logs through syslog is still the primary method of capturing data from network devices such as firewalls, routers, and switches.

Sending data to Splunk over network ports doesn't need to be limited to network devices. Applications and scripts can use socket communication to the network ports that Splunk is listening on. This can be a very useful tool in your back pocket, as there can be scenarios where you need to get data into Splunk but don't necessarily have the ability to write to a file.

This recipe will show you how to configure Splunk to receive syslog data on a UDP network port, but it is also applicable to the TCP port configuration.

Getting ready

To step through this recipe, you will need a running Splunk Enterprise server. No other prerequisites are required.

How to do it...

Follow these steps to configure Splunk to receive network UDP data:

1. Log in to your Splunk server.

2. From the menu in the top right-hand corner, click on the Settings menu and then click on the Add Data link.

3. If you are prompted to take a quick tour, click on Skip.

4. In the How do you want to add data section, click on Monitor.

5. Click on the TCP / UDP section.

6. Ensure the UDP option is selected and, in the Port section, enter 514. On Unix/Linux, Splunk must be running as root to access privileged ports such as 514. An alternative would be to specify a higher port, such as port 1514, or route data from 514 to another port using routing rules in iptables. Then, click on Next.

7. In the Source type section, select Select, then select syslog from the Select Source Type drop-down list and click Review.

8. Review the settings and if everything is correct, click Submit.

9. If everything was successful, you should see a UDP input has been created successfully message.

10. Click on the Start Searching button. The Search & Reporting app will open with the search already populated based on the settings supplied earlier in the recipe. Splunk is now configured to listen on UDP port 514. Any data sent to this port will now be assigned the syslog source type. To search for the syslog source type, you can run the following search:

source="udp:514" sourcetype="syslog"

Understandably, you will not see any data unless you happen to be sending data to your Splunk server IP on UDP port 514.
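If you want to generate a test event yourself, the logger utility shipped with most Linux distributions (util-linux) can send a syslog message over UDP; a sketch, assuming your Splunk server is reachable as splunkserver:

logger -n splunkserver -P 514 -d "Test syslog message"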

How it works...

When you add a new network port input, you basically add a new configuration stanza into an inputs.conf file behind the scenes. The Splunk server can contain one or more inputs.conf files, and these files are either located in the $SPLUNK_HOME/etc/system/local or the local directory of a Splunk app.

To collect data on a network port, Splunk will set up a socket to listen on the specified TCP or UDP port and will index any data it receives on that port. For example, in this recipe, you configured Splunk to listen on port 514 for UDP data. If data was received on that port, then Splunk would index it and assign a syslog source type to it.

Splunk also provides many configuration options that can be used with network inputs, such as how to resolve the host value to be used on the collected data.
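For illustration, a UDP input stanza equivalent to this recipe might look like the following in inputs.conf, where connection_host is one of the host-resolution options just mentioned (ip stamps events with the sender's IP address):

[udp://514]
sourcetype = syslog
connection_host = ip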

For more information on Splunk's configuration files, visit https://docs.splunk.com/Documentation/Splunk/latest/Admin/Aboutconfigurationfiles.

There's more...

While adding inputs to receive data from network ports can be done through the web interface of Splunk, as outlined in this recipe, there are other approaches to add multiple inputs quickly; these inputs allow for customization of the many configuration options that Splunk provides.

Adding a network input using the CLI

You can also add a network input via the Splunk command-line interface (CLI). Navigate to your $SPLUNK_HOME/bin directory and execute the following command (just replace the protocol, port, and source type with the ones you wish to use):

We will use the following code for Unix:

./splunk add udp 514 -sourcetype syslog

We will use the following code for Windows:

splunk add udp 514 -sourcetype syslog

There are a number of different parameters that can be passed along with the port. See the Splunk documentation for more on data inputs using the CLI (https://docs.splunk.com/Documentation/Splunk/latest/Data/MonitorfilesanddirectoriesusingtheCLI).

See also

  • The Indexing files and directories recipe
  • The Using scripted inputs recipe
  • The Using modular inputs recipe

Using scripted inputs

Not all data that is useful for operational intelligence comes from logfiles or network ports. Splunk will happily take the output of a command or script and index it along with all your other data.

Scripted inputs are a very helpful way to get that hard-to-reach data. For example, if you have third-party-supplied command-line programs that can output data you would like to collect, Splunk can run the command periodically and index the results. Scripted inputs are typically used to pull data from a source, whereas network inputs await a push of data from a source.

This recipe will show you how to configure Splunk to execute a command on an interval and direct the output into Splunk.

Getting ready

To step through this recipe, you will need a running Splunk server and the provided scripted input script suited to the environment you are using. For example, if you are using Windows, use the cp01_scripted_input.bat file. This script should be placed in the $SPLUNK_HOME/bin/scripts directory. No other prerequisites are required.

How to do it...

Follow these steps to configure a scripted input:

1. Log in to your Splunk server.

2. From the menu in the top right-hand corner, click on the Settings menu and then click on the Add Data link.

3. If you are prompted to take a quick tour, click on Skip.

4. In the How do you want to add data section, click on Monitor.

5. Click on the Scripts section.

6. A form will be displayed with a number of input fields. In the Script Path drop-down, select the location of the script. All scripts must be located in a Splunk bin directory, either in $SPLUNK_HOME/bin/scripts or an appropriate bin directory within a Splunk app, such as $SPLUNK_HOME/etc/apps/search/bin.

7. In the Script Name drop-down, select the name of the script. In the Commands field, add any command-line arguments to the auto-populated script name.

8. In the Interval field, enter the interval (in seconds) at which the script is to be run (the default value is 60.0 seconds) and then click Next.

9. In the Source Type section, you have the option to either select a predefined source type or select New and enter your desired value. For the purpose of this recipe, select New as the source type and enter cp01_scripted_input as the value for the source type. Then click Review.

By default, data will be indexed into the main index. To change this destination index, select your desired index from the drop-down list in the Index section of the form.

10. Review the settings. If everything is correct, click Submit.

11. If everything was successful, you should see a Script input has been created successfully message.

12. Click on the Start searching button. The Search & Reporting app will open with the search already populated based on the settings supplied earlier in the recipe. Splunk is now configured to execute the scripted input you provided every 60 seconds, in accordance with the specified interval. You can search for the data returned by the scripted input using the following search:

sourcetype=cp01_scripted_input

How it works...

When adding a new scripted input, you are directing Splunk to add a new configuration stanza into an inputs.conf file behind the scenes. The Splunk server can contain one or more inputs.conf files, located either in $SPLUNK_HOME/etc/system/local or the local directory of a Splunk app.

After creating a scripted input, Splunk sets up an internal timer and executes the command that you have specified, in accordance with the defined interval. It is important to note that Splunk will only run one instance of the script at a time, so if the script blocks for any reason, it will not be executed again until it has been unblocked.
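As a sketch, the resulting inputs.conf stanza might look similar to the following (the script filename assumes a Linux variant of the provided script; adjust the path for your environment):

[script://$SPLUNK_HOME/bin/scripts/cp01_scripted_input.sh]
interval = 60
sourcetype = cp01_scripted_input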

Since Splunk 4.2, any output that a scripted input writes to stderr (for example, when an error occurs) is captured in the splunkd.log file, which can be useful when attempting to debug the execution of a script. As Splunk indexes its own data by default, you can search for that data and put an alert on it if necessary.
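Because splunkd.log is indexed into Splunk's internal index, a search along the following lines can help surface script execution errors (the exact component and message text can vary by version):

index=_internal sourcetype=splunkd ExecProcessor error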

For security reasons, Splunk does not execute scripts located outside of the bin directories mentioned earlier. To overcome this limitation, you can use a wrapper script (such as a shell script in Linux or batch file in Windows) to call any other script located on your machine.
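A minimal Linux wrapper might look like the following, where /opt/acme/bin/get_stats is a hypothetical command located outside of Splunk's bin directories:

#!/bin/sh
# Stored in $SPLUNK_HOME/bin/scripts so that Splunk is permitted to run it;
# it simply executes the external command and passes its output through.
exec /opt/acme/bin/get_stats --format=csv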

See also

  • The Indexing files and directories recipe
  • The Getting data through network ports recipe
  • The Using modular inputs recipe

Using modular inputs

Since Splunk 5.0, it has been possible to extend data input functionality so that custom input types can be created and shared, while still allowing users to customize them to meet their needs.

Modular inputs build further upon the scripted input model. Originally, any additional functionality required by the user had to be contained within a script. However, this presented a challenge, as no customization of this script could occur from within Splunk itself. For example, pulling data from a source for two different usernames needed two copies of a script or meant playing around with command-line arguments within your scripted input configuration.

By leveraging the modular input capabilities, developers are now able to encapsulate their code into a reusable app that exposes parameters in Splunk and allows for configuration through processes familiar to Splunk administrators.

This recipe will walk you through how to install the Command Modular Input, which allows for the periodic execution of commands and the subsequent indexing of the command output. You will configure the input to collect the data output by the vmstat command in Linux and the systeminfo command in Windows.