Elasticsearch 7 Quick Start Guide - Anurag Srivastava - E-Book

Description

Get the most out of Elasticsearch 7's new features to build, deploy, and manage efficient applications




Key Features



  • Discover the new features introduced in Elasticsearch 7


  • Explore techniques for distributed search, indexing, and clustering


  • Gain hands-on knowledge of implementing Elasticsearch for your enterprise



Book Description



Elasticsearch is one of the most popular tools for distributed search and analytics. This Elasticsearch book highlights the latest features of Elasticsearch 7 and helps you understand how you can use them to build your own search applications with ease.






Starting with an introduction to the Elastic Stack, this book will help you quickly get up to speed with using Elasticsearch. You'll learn how to install, configure, manage, secure, and deploy Elasticsearch clusters, as well as how to use your deployment to develop powerful search and analytics solutions. As you progress, you'll also understand how to troubleshoot any issues that you may encounter along the way. Finally, the book will help you explore the inner workings of Elasticsearch and gain insights into queries, analyzers, mappings, and aggregations as you learn to work with search results.






By the end of this book, you'll have a basic understanding of how to build and deploy effective search and analytics solutions using Elasticsearch.




What you will learn



  • Install Elasticsearch and use it to safely store data and retrieve it when needed


  • Work with a variety of analyzers and filters


  • Discover techniques to improve search results in Elasticsearch


  • Understand how to perform metric and bucket aggregations


  • Implement best practices for moving clusters and applications to production


  • Explore various techniques to secure your Elasticsearch clusters



Who this book is for



This book is for software developers, engineers, data architects, system administrators, and anyone who wants to get up and running with Elasticsearch 7. No prior experience with Elasticsearch is required.

Format: EPUB

Pages: 171

Publication year: 2019




Elasticsearch 7 Quick Start Guide

Get up and running with the distributed search and analytics capabilities of Elasticsearch

Anurag Srivastava
Douglas Miller

BIRMINGHAM - MUMBAI

Elasticsearch 7 Quick Start Guide

Copyright © 2019 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Reshma Raman
Content Development Editor: Roshan Kumar
Senior Editor: Jack Cummings
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Project Coordinator: Kirti Pisat
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Production Designer: Shraddha Falebhai

First published: October 2019

Production reference: 1231019

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78980-332-7

www.packt.com

 

Packt.com

Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Fully searchable for easy access to vital information

Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks. 

Contributors

About the authors

Anurag Srivastava is a senior technical lead in a multinational software company. He has more than 12 years' experience in web-based application development. He is proficient in designing architecture for scalable and highly available applications. He has managed development teams and worked with clients from all over the globe over the past 10 years of his professional career. He has significant experience with the Elastic Stack (Elasticsearch, Logstash, and Kibana) for creating dashboards using system metrics data, log data, application data, and relational databases. He has authored three other books: Mastering Kibana 6.x, Kibana 7 Quick Start Guide, and Learning Kibana 7 - Second Edition, all published by Packt.

Douglas Miller is an expert in helping fast-growing companies to improve performance and stability, and in building search platforms using Elasticsearch. Clients (including Walgreens, Nike, Boeing, and Dish Networks) have seen increased sales, faster performance, and a lower total cost of ownership for their Elasticsearch clusters.

 

About the reviewer

Craig Brown is an independent consultant, offering services for Elasticsearch and other big data software. He is a core Java developer with 25+ years' experience and more than 10 years of Elasticsearch experience. He has also practiced with machine learning, Hadoop, and Apache Spark, is a co-founder of the Big Mountain Data user group in Utah, and is a speaker on Elasticsearch and other big data topics.

Craig has founded NosqlRevolution LLC, focusing on Elasticsearch and big data services, and PicoCluster LLC, a desktop data center designed for learning and prototyping cluster computing and big data frameworks.

 

 

 

 

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Elasticsearch 7 Quick Start Guide

About Packt

Why subscribe?

Contributors

About the authors

About the reviewer

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Introduction to Elastic Stack

Brief history and background

Why use Elasticsearch?

What is log analysis?

Elastic Stack architecture

Elasticsearch

Kibana

Logstash

Beats

Filebeat

Metricbeat

Packetbeat

Auditbeat

Winlogbeat

Heartbeat

Use cases of the Elastic Stack

System monitoring

Log management

Application performance monitoring

Data visualization

Summary

Installing Elasticsearch

Installation of Elasticsearch

Installing Elasticsearch on Linux

Installing Elasticsearch using the Debian package

Installing Elasticsearch using the rpm package

Installing rpm manually

SysV

systemd

Installing Elasticsearch using MSI Windows Installer

Elasticsearch upgrade on Windows

Uninstall Elasticsearch on Windows

Installing Elasticsearch on macOS

Checking whether Elasticsearch is running

Summary

Many as One – the Distributed Model

API conventions

Handling multiple indices

Common options for the API response

Cluster state and statistics

Cluster health status

Cluster state

Cluster stats

Cluster administration

Node state and statistics

Operating system information

Process information

Plugin information

Index APIs

Document APIs

Single-document APIs

Creating a document

Viewing a document

Deleting a document

Delete by query

Updating a document

Multi-document APIs

Summary

Prepping Your Data – Text Analysis and Mapping

What is an analyzer?

Anatomy of an analyzer

How to use an analyzer

The custom analyzer

The standard analyzer

The simple analyzer

The whitespace analyzer

The stop analyzer

The keyword analyzer

The pattern analyzer

The language analyzer

The fingerprint analyzer

Normalizers

Tokenizers

The standard tokenizer

The letter tokenizer

The lowercase tokenizer

The whitespace tokenizer

The keyword tokenizer

The pattern tokenizer

The simple pattern tokenizer

Token filters

Character filters

The HTML strip character filter

The mapping character filter

The pattern replace character filter

Mapping

Datatypes

The simple datatype

The complex datatype

The specialized datatype

Multi-field mapping

Dynamic mapping

Explicit mapping

Summary

Let's Do a Search!

Introduction to data search

Search API

URI search

Request body search

Query

From/size

Sort

Source filtering

Fields

Script fields

Doc value fields

Post filter

Highlighting

Rescoring

Search type

Scroll

Preference

Explanation

Version

min_score

Named queries

Inner hits

Field collapsing

Search template

Multi search template

Search shards API

Suggesters

Multi search API

Count API

Validate API

Explain API

Profile API

Profiling queries

Profiling aggregations

Profiling considerations

Field capabilities API

Summary

Performance Tuning

Data sparsity

Solutions to common problems

Mixing exact search with stemming

Inconsistent scoring

How to tune for indexing speed

Bulk requests

Smart use of the Elasticsearch cluster

Increasing the refresh interval

Disabling refresh and replicas

Allocating memory to the filesystem cache

Using auto generated IDs

Using faster hardware

Indexing buffer size

How to tune for search speed

Allocating memory to the filesystem cache

Using faster hardware

Document modeling

Searching as few fields as possible

Pre-index data

Mapping identifiers as keywords

Avoiding scripts

Searching with rounded dates

Force-merging read-only indices

Prepping global ordinals

Prepping the filesystem cache

Using index sorting for conjunctions

Using preferences to optimize cache utilization

Balancing replicas

How to tune search queries with the Profile API

Faster phrase queries

Faster prefix queries

How to tune for disk usage

Disabling unused features

Do not use default dynamic string mappings

Monitoring shard size

Disabling source

Using compression

Force merge

Shrink indices

Using the smallest numeric type needed

Putting fields in order

Summary

Aggregating Datasets

What is an aggregation framework?

Advantages of aggregations

Structure of aggregations

Metrics aggregations

Avg aggregation

Weighted avg aggregation

Cardinality aggregation

Extended stats aggregation

Max aggregation

Min aggregation

Percentiles aggregation

Scripted metric aggregation

Stats aggregation

Sum aggregation

Bucket aggregations

Adjacency matrix aggregation

Auto-interval date histogram aggregation

Intervals

Composite aggregation

Date histogram aggregation

Date range aggregation

Filter/filters aggregation

Geo distance aggregation

Geohash grid aggregation

Geotile grid aggregation

Histogram aggregation

Significant terms aggregation

Significant text aggregation

Terms aggregation

Pipeline aggregations

Avg bucket aggregation

Derivative aggregation

Max bucket aggregation

Min bucket aggregation

Sum bucket aggregation

Stats bucket aggregation

Extended stats bucket aggregation

Percentiles bucket aggregation

Moving function aggregation

Cumulative sum aggregation

Bucket script aggregation

Bucket selector aggregation

Bucket sort aggregation

Matrix aggregations

Matrix stats

Summary

Best Practices

Failure to obtain the required data

Incorrectly processed text

Gazillion shards problem

Elasticsearch as a generic key-value store

Scripting and halting problem

The best cluster configuration approaches

Cloud configuration

On-site configuration

Data-ingestion patterns

Index aliases to simplify workflow

Why use aliases?

Using index templates to save time

Using _msearch for e-commerce applications

Using the Scroll API to read large datasets

Data backup and snapshots

Monitoring snapshot status

Managing snapshots

Deleting a snapshot

Restoring a snapshot

Renaming indices

Restoring to another cluster

Data Analytics using Elasticsearch

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Preface

Elasticsearch is one of the most popular open source tools for distributed search and analytics. This book helps you understand the new features of Elasticsearch and how to use them to search, aggregate, and index data quickly and accurately, so that you can build your own search applications with ease. You will also acquire a basic understanding of how to build and deploy effective search and analytics solutions using Elasticsearch.

Starting with an introduction to the Elastic Stack, this book will help you quickly get up to speed with using Elasticsearch. Next, you'll learn how to deploy and manage Elasticsearch clusters, as well as how to use your deployment to develop powerful search and analytics solutions. As you progress, you'll also discover how to install, configure, manage, and secure Elasticsearch clusters, in addition to understanding how to troubleshoot any issues you may encounter along the way. Finally, the book helps you explore the inner workings of Elasticsearch and gain insights into queries, analyzers, mappings, and aggregations as you learn to work with search results.

Who this book is for

This book is for software developers, engineers, data architects, system administrators, or anyone who wants to get up and running with Elasticsearch 7. 

What this book covers

Chapter 1, Introduction to Elastic Stack, will give you a brief history of Elasticsearch and its background. We will also introduce log analysis and cover some of the core components of the Elastic Stack architecture.

Chapter 2, Installing Elasticsearch, will cover the installation process of Elasticsearch in different environments. We will also look into installation using the Debian and rpm packages, followed by installation on Windows using the MSI installer of Elasticsearch.

Chapter 3, Many as One – the Distributed Model, will cover how to interact with Elasticsearch using REST calls to perform different operations. We will also look at how to handle multiple indices and at some of the common options for the API response. Finally, we will learn how to create, delete, and retrieve indices.

Chapter 4, Prepping Your Data – Text Analysis and Mapping, will walk through the details of how full text is analyzed and indexed in Elasticsearch, followed by looking into some of the various analyzers and filters and how they can be configured. We will also learn how Elasticsearch mappings are used for defining documents and fields and storing and indexing them, including how to define multi-fields and custom analyzers.

Chapter 5, Let's Do a Search!, will go into further detail regarding data searches, where we will cover URI search and body search. We will also cover some query examples using term, from/size, sort, and source filtering. Following that, we will also cover highlighting, rescoring, search type, and named queries.

Chapter 6, Performance Tuning, will cover data sparsity and how to improve the performance of Elasticsearch. We will also cover how to tune search speed by allocating memory to the filesystem cache, using faster hardware, modeling documents, pre-indexing data, balancing replicas, and so on.

Chapter 7, Aggregating Datasets, will cover how to aggregate datasets and will explain the different types of aggregations that Elasticsearch supports.

Chapter 8, Best Practices, will cover the best practices we can follow in order to manage an Elasticsearch cluster.

To get the most out of this book

No prior experience with the Elastic Stack is required. The steps for installing and running Elasticsearch are covered in the book.

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

Log in or register at www.packt.com.

Select the Support tab.

Click on Code Downloads.

Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Elasticsearch-7-Quick-Start-Guide. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789803327_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Let's take the example of kibana_sample_data_flight data to understand how we can prettify the results using the pretty keyword."

A block of code is set as follows:

PUT index_name
{
  "settings": { "number_of_shards": 1 },
  "mappings": {
    "properties": {
      "field_number_1": { "type": "text" }
    }
  }
}

Any command-line input or output is written as follows:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.1.1-linux-x86_64.tar.gz

Bold: Indicates a new term, an important word, or words that you see on screen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "A manual uninstall must be performed through Add or remove programs."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Introduction to Elastic Stack

The core of the Elastic Stack consists of Elasticsearch, Logstash, and Kibana, which together form the ELK Stack. Elasticsearch is an open source search engine developed by Shay Banon. It offers an easy-to-use web interface and is highly flexible, because plugins can extend its functionality for a wide range of applications. Because it is open source, it is accessible to everyone, and user feedback drives ongoing improvement of the product. Elasticsearch can be used for everything from simple to complex searches. For example, a search over a collection of old maps could involve counting the number of cartographers, studying the works of individual cartographers, or analyzing map contents. Many criteria can be used for searches, for a wide range of purposes.

Elasticsearch supports multi-tenancy, meaning it can store multiple indices on a server, and information can be retrieved from multiple indices using a single query. It uses JSON documents for requests, responses, and data transfer, and documents are indexed automatically. In this chapter, we are going to cover the following topics:

Brief history and background

Why use Elasticsearch?

What is log analysis?

Elastic Stack architecture

Use cases of the Elastic Stack
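As a rough illustration of querying several indices with one JSON request, the following Python sketch only builds the request URL and body; it does not contact a cluster. The helper name build_multi_index_search, the index names maps_1800s and maps_1900s, the field name cartographer, and the address http://localhost:9200 are all assumptions made for this example and do not come from the book.

```python
import json
from urllib.parse import quote


def build_multi_index_search(base_url, indices, field, text):
    """Return the (url, body) pair for one search across several indices.

    Hypothetical helper for illustration only; it builds the request
    but does not send it.
    """
    # Elasticsearch accepts a comma-separated list of index names in the path.
    index_path = ",".join(quote(name, safe="") for name in indices)
    url = f"{base_url.rstrip('/')}/{index_path}/_search"
    # The query is a JSON document, the same format used for responses.
    body = json.dumps({"query": {"match": {field: text}}})
    return url, body


if __name__ == "__main__":
    # Index names, field name, and host below are assumed example values.
    url, body = build_multi_index_search(
        "http://localhost:9200",
        ["maps_1800s", "maps_1900s"],
        "cartographer",
        "Mercator",
    )
    print(url)
    print(body)
```

The returned URL and body could then be sent with any HTTP client as a GET or POST request to a running cluster; the sketch stops short of that so it can be run without an Elasticsearch installation.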

 

Brief history and background

Founded in 2012, Elastic is a company that develops a distributed open source search engine based on Lucene. The history of Elastic starts with its main founder, Shay Banon, who wanted to make searching easier. In 2004, he released his first open source search product, called Compass. This first iteration of open source search tools served as an inspiration, and search has improved steadily from Compass onward.

A small community grew around Elasticsearch, which later led to important partnerships that expanded the company's capabilities. Jordan Sissel was working on a plugin ingestion tool named Logstash