Master the art of getting the maximum out of your machine data using Splunk
This book is for Splunk developers looking to learn advanced strategies to deal with big data from an enterprise architectural perspective. It is expected that readers have a basic understanding and knowledge of using Splunk Enterprise.
Master the power of Splunk and learn the advanced strategies to get the most out of your machine data with this practical, advanced guide. Make sense of your organization's hidden data: the insights buried in your servers, devices, logs, traffic, and clouds. Advanced Splunk shows you how.
Dive deep into Splunk to find the most efficient solution to your data problems. Create the robust Splunk solutions you need to make informed decisions in big data machine analytics. From visualizations to enterprise integration, this well-organized, high-level guide has everything you need for Splunk mastery.
Start with a complete overview of all the new features and advantages of the latest version of Splunk and the Splunk environment. Go hands-on with uploading data, search commands for basic and advanced analytics, advanced visualization techniques, and dashboard customization. Discover how to tweak Splunk to your needs, and get a complete overview of enterprise integration of Splunk with various analytics and visualization tools. Finally, discover how to set up and use all the new features of the latest version of Splunk.
This book follows a step-by-step approach. Every new concept builds on the previous chapter, and the book is full of examples and practical scenarios to help readers experiment as they read.
Page count: 385
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors, will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: June 2016
Production reference: 1030616
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-435-1
www.packtpub.com
Author
Ashish Kumar Tulsiram Yadav
Reviewer
Randy Rosshirt
Commissioning Editor
Veena Pagare
Acquisition Editor
Manish Nainani
Content Development Editor
Viranchi Shetty
Technical Editor
Ravikiran Pise
Copy Editors
Karuna Narayanan
Neha Vyas
Project Coordinator
Izzat Contractor
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Abhinash Sahu
Production Coordinator
Manu Joseph
Cover Work
Manu Joseph
Ashish Kumar Tulsiram Yadav holds a BE in computers and has around four and a half years of experience in software development, data analytics, and information security, and around four years of experience in Splunk application development and administration. He has experience in creating Splunk applications and add-ons, managing Splunk deployments, machine learning using R and Python, and analytics and visualization using various tools, such as Tableau and QlikView.
He is currently working with the information security operations team, handling Splunk Enterprise security and the cyber security of the organization. He has worked as a senior software engineer at Larsen & Toubro Technology Services in the telecom, consumer electronics, and semicon unit, providing data analytics across a wide variety of domains, such as mobile devices, telecom infrastructure, embedded devices, Internet of Things (IoT), Machine-to-Machine (M2M), entertainment devices, and network and storage devices.
He has also worked in the area of information, network, and cyber security in his previous organization. He has experience in OMA LWM2M for device management and remote monitoring of IoT and M2M devices and is well versed in big data and the Hadoop ecosystem. He is a passionate ethical hacker, security enthusiast, and Linux expert and has knowledge of Python, R, .NET, HTML5, CSS, and the C language.
He is an avid blogger and writes about ethical hacking and cyber security on his blogs in his free time. He is a gadget freak and keeps writing reviews of the various gadgets he owns. He has participated in and won hackathons, and has presented technical papers and white papers.
I would like to take this opportunity to thank my wonderful mom and dad for their blessings and for everything. I would sincerely like to thank Karishma Jain and Apurv Srivastav for helping me with examples, test data, and various other required material that enabled me to complete this book on time. I would also like to thank my friends, team, and colleagues at L&T TS for their support and encouragement. Special thanks to Nate Mckervey and Mitesh Vohra for guiding and helping me in various stages of writing this book. Last, but not least, a big thanks to Manish, Viranchi, Ravikiran, and the entire Packt Publishing team for their timely support and help.
Randy Rosshirt has had a 25-year career in technology, specializing in enterprise software and big data challenges. Much of his background has been in the healthcare industry. Since he started working with Splunk in 2012, his focus has been to introduce Splunk into the healthcare informatics community. While working at Splunk, Randy was involved with creating Splunk solutions for HIPAA privacy, clinical quality indicators, and adverse events data. He also spoke on behalf of Splunk at the 2014 HIMSS event on the topic Mining Big Data for Quality Indicators. He continues to provide private consulting to solve healthcare problems with Splunk.
For more information, visit www.rrosshirt.com.
I would like to thank Packt Publishing, especially the project coordinator and the author for inviting me to participate in this project.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Get notified! Find out when new books are published by following @PacktEnterprise on Twitter or the Packt Enterprise Facebook page.
Big data: the term itself suggests a large amount of data. Big data can be defined as high-volume, high-velocity, and high-variety information. This data often takes the form of logs generated by machines, which can be used for operations, engineering, business insight, analytics, prediction, and so on, as the case may be.
Now that we have a large amount of data, there is a need for a platform or tool that can be used to create visualizations and derive insights and patterns so that informed business decisions can be made ahead of time. Splunk came into the picture to overcome these challenges of big data. Splunk is a big data tool that generates insights and reveals patterns, trends, and associations from machine data. It is a powerful and robust big data tool used to derive real-time or near real-time insights, and it enables you to take informed corrective measures.
Splunk can be put to use for data generated from any source and available in a human-readable format. As Splunk is a feature-rich tool, it can be difficult for a Splunk user to start making the best use of it right away. This book takes the reader through a complete understanding of how to make the best and most efficient use of Splunk for machine data analytics and visualization. The book covers everything from which types of data can be uploaded to how to upload them efficiently. It also covers creating applications and add-ons on Splunk, learning analytics commands, and learning visualizations and customizations as per one's requirements. The book also describes how Splunk can be tweaked to get the best out of it, along with how it can be integrated with R for analytics and Tableau for visualization.
This step-by-step, comprehensive guide to Splunk will help readers understand Splunk's capabilities, enabling them to make the most efficient use of Splunk for big data.
Chapter 1, What's New in Splunk 6.3?, explains in detail how Splunk works behind the scenes and describes the backbone of Splunk, which allows it to process big data in real time. We will also go through all the new techniques and architectural changes that have been introduced in Splunk 6.3 to make Splunk faster and better, and to provide near real-time results.
Chapter 2, Developing an Application on Splunk, talks about creating and managing an application and an add-on on Splunk Enterprise. You will also learn how to use the different applications available on the Splunk app store to minimize work by reusing existing applications for similar requirements.
Chapter 3, On-boarding Data in Splunk, details the various methods by which data can be indexed on Splunk. We will also have a look at various customization options available while uploading data onto Splunk in order to index the data in such a way that trends, pattern detection, and other important features can be used efficiently and easily.
Chapter 4, Data Analytics, helps the reader learn the usage of Splunk commands related to searching, data manipulation, field extraction, subsearches, and so on, thus enabling them to create analytics out of the data.
Chapter 5, Advanced Data Analytics, teaches the reader to generate reports and become well versed in commands related to geography and location. This chapter also covers advanced commands such as those for anomaly detection, correlation, prediction, and machine learning.
Chapter 6, Visualization, goes through the basic visualization options available in Splunk to represent data in an easier-to-understand format. Along with visualization, we will also discuss tweaking visualizations to make them easier to read and understand.
Chapter 7, Advanced Visualization, teaches the reader to use custom plugins and extensions to implement advanced visualizations in Splunk. These advanced visualizations can even be used by a nontechnical audience to generate useful insights and inform business decisions.
Chapter 8, Dashboard Customization, teaches the reader to create basic custom dashboards with the visualizations and analytics learned so far. We will go through the various dashboard customization techniques that can be implemented to make the most out of the data on Splunk.
Chapter 9, Advanced Dashboard Customization, instructs the reader about the techniques that will help in developing a highly dynamic, customizable, and useful dashboard over the data on Splunk.
Chapter 10, Tweaking Splunk, talks about how to make the best use of Splunk's features so that we get the maximum value out of Splunk efficiently. You will also learn various management and customization techniques to use Splunk in the best possible way.
Chapter 11, Enterprise Integration with Splunk, teaches the reader to set up and use the Splunk SDK along with the integration of Splunk with R for analytics and Tableau for visualization.
Chapter 12, What Next? Splunk 6.4, discusses the features introduced in Splunk 6.4, along with how they can be put to use to maximize the benefit of Splunk for analytics and visualizations.
The following are the requirements for working through the tasks in this book:
This book is for anyone who wants to learn Splunk and understand its advanced capabilities and doesn't want to get lost in loads of online documentation. This book will help readers understand how Splunk can be put to use to derive valuable insights from machine data in no time. This book covers Splunk from end to end, along with examples and illustrations, to make the reader a "master" of Splunk.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/AdvancedSplunk_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Splunk's latest version, 6.3, introduces a data integrity management feature. It provides a way to verify the integrity of the data indexed in Splunk. When this feature is enabled, Splunk computes hashes on every slice of uploaded data and stores those hashes so that they can be used to verify the integrity of the data later. This is a very useful feature when the logs come from sources such as bank transactions and other critical data for which an integrity check is necessary.
With this feature enabled, Splunk computes hashes on every slice of newly indexed raw data and writes them to an l1Hashes file. When a bucket rolls from one stage to another, say from hot to warm, Splunk computes the hash of the contents of the l1Hashes file and stores it in the l2Hash file.
Hash validation can be done on Splunk's data by running the following CLI command:
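The exact syntax can vary between Splunk releases, but as a sketch (run from $SPLUNK_HOME/bin, with the bucket path and index name used here as placeholders), the integrity check takes the following general form:
  ./splunk check-integrity -bucketPath [bucket path] [-verbose]
  ./splunk check-integrity -index [index name] [-verbose]
The first form verifies a single bucket, while the second verifies every bucket of the given index.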
In case hashes are lost, they can be regenerated using the following commands:
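Again as a sketch, assuming the same placeholders as before, the hash files can be regenerated with the generate-hash-files command, either for a single bucket or for an entire index:
  ./splunk generate-hash-files -bucketPath [bucket path] [-verbose]
  ./splunk generate-hash-files -index [index name] [-verbose]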
Let's now configure data integrity control. To do so, modify the indexes.conf file located at $SPLUNK_HOME/etc/system/local as follows:
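As a minimal sketch, assuming an index named main (any index stanza can be used in its place), the relevant setting is enableDataIntegrityControl, which is disabled by default:
  [main]
  enableDataIntegrityControl = true
After editing indexes.conf, restart Splunk for the change to take effect; as noted earlier, hashes are computed only for data indexed after the setting is enabled.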
In a clustered environment, the cluster master and all the peer nodes should run Splunk 6.3 to enable accurate data integrity control.
