Elastic Stack 8.x Cookbook

Huage Chen
Description

Learn how to make the most of the Elastic Stack (ELK Stack) products—including Elasticsearch, Kibana, Elastic Agent, and Logstash—to take data reliably and securely from any source, in any format, and then search, analyze, and visualize it in real time. This cookbook takes a practical approach to unlocking the full potential of the Elastic Stack through detailed, step-by-step recipes.
Starting with installing and ingesting data using Elastic Agent and Beats, this book guides you through data transformation and enrichment with various Elastic components and explores the latest advancements in search applications, including semantic search and Generative AI. You'll then visualize and explore your data and create dashboards using Kibana. As you progress, you'll advance your skills with machine learning for data science, get to grips with natural language processing, and discover the power of vector search. The book covers Elastic Observability use cases for log, infrastructure, and synthetics monitoring, along with essential strategies for securing the Elastic Stack. Finally, you'll gain expertise in Elastic Stack operations to effectively monitor and manage your system.




Elastic Stack 8.x Cookbook

Over 80 recipes to perform ingestion, search, visualization, and monitoring for actionable insights

Huage Chen

Yazid Akadiri

Elastic Stack 8.x Cookbook

Copyright © 2024 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

The authors acknowledge the use of cutting-edge AI, such as ChatGPT, with the sole aim of enhancing the language and clarity within the book, thereby ensuring a smooth reading experience for readers. It’s important to note that the content itself has been crafted by the authors and edited by a professional publishing team.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kaustubh Manglurkar

Publishing Product Manager: Deepesh Patel

Book Project Manager: Aparna Ravikumar Nair

Senior Editor: Tazeen Shaikh

Technical Editor: Seemanjay Ameriya

Copy Editor: Safis Editing

Proofreader: Tazeen Shaikh

Indexer: Rekha Nair

Production Designer: Prashant Ghare

Senior DevRel Marketing Executive: Nivedita Singh

First published: June 2024

Production reference: 1070624

Published by Packt Publishing Ltd.

Grosvenor House

11 St Paul’s Square

Birmingham

B3 1RB, UK.

ISBN 978-1-83763-429-3

www.packtpub.com

To my parents, Ying and Shinlang, for their love and unconditional support.

To Noël Jaffré, who has always been an inspiration throughout my career.

– Huage Chen

To my beloved parents, who have tirelessly shaped a world where I could chase my dreams—your efforts have been my foundation. To my dear wife and my children, Safaa, Adam, and Zaki, whose love, patience, and incredible support have been my strength on this incredible journey. Thank you.

– Yazid Akadiri

Foreword

As we explore the growing world of data, the ability to understand it and use it to its full potential becomes a key challenge for data practitioners, architects, search specialists, DevOps engineers, SREs, and others.

Since the inception of Elasticsearch and the progressive addition of key components to form what is currently known as the Elastic Stack, we’ve wanted to help people make sense of their data through the power of search and analytics. The launch of Elastic Stack 8 marks a big milestone in our journey. It is a version enriched with new capabilities, optimized performance, and an ever-stronger foundation for machine learning and AI.

This book serves as a practical resource for anyone who interacts with data and wants to learn how to exploit the power of the Elastic Stack, including Elasticsearch, Kibana, and various integrations, to make data-driven decisions and gain richer insights from their data environments.

As you turn the pages of this cookbook, you will uncover the innovations introduced in version 8.x. Our goal has always been to simplify the complex, and this book aligns perfectly with that ethos—breaking down advanced concepts into easy-to-follow, step-by-step instructions. Whether you are taking your initial steps in Elasticsearch and the Elastic Stack or looking to expand your expertise, the cookbook format provides a unique opportunity to build your skills progressively and systematically.

I am excited about the endless possibilities that Elastic Stack 8.x unlocks, and I look forward to hearing about the innovative ways in which you employ these recipes.

Shay Banon

Creator of Elasticsearch and CTO of Elastic

Contributors

About the authors

Huage Chen is a member of Elastic’s customer engineering team and has been with Elastic for over five years, helping users throughout Europe to innovate and implement cloud-based solutions for search, data analysis, observability, and security. Before joining Elastic, he worked for 10 years in web content management, web portals, and digital experience platforms.

Yazid Akadiri has been a solutions architect at Elastic for over four years, helping organizations and users solve their data and most critical business issues by harnessing the power of the Elastic Stack. At Elastic, he works with a broad range of customers, with a particular focus on Elastic observability and security solutions. He previously worked in web services-oriented architecture, focusing on API management and helping organizations build modern applications.

About the reviewers

Evelien Schellekens is a senior solutions architect at Elastic. Evelien enjoys sharing knowledge through public speaking and interacting with the technical community. She’s passionate about observability and open source technologies such as Kubernetes.

Giuseppe Santoro is a senior software engineer at Elastic. With deep expertise in Kubernetes, the cloud, and observability, Giuseppe contributes to the tech community through mentoring and technical writing.

Acknowledgments

We would like to express our gratitude to Evelien Schellekens and Giuseppe Santoro for their invaluable contributions and meticulous review of this book. Their expertise and thoughtful feedback have been instrumental in refining our work. We also extend our thanks to our fellow Elasticians for their contributions: Amanda Branch, Bahaaldine Azarmi, Carson Ip, Nicholas Drost, Sean Collin, Yan Savitski, Yannick Fhima, and the entire Elastic South EMEA Solutions Architect and Customer Architect teams.

Table of Contents

Preface

1

Getting Started – Installing the Elastic Stack

Deploying the Elastic Stack on Elastic Cloud

How to do it…

How it works…

There’s more…

Installing the Elastic Stack with ECK

Technical requirements

Getting ready

How to do it…

How it works…

There’s more…

See also

Installing a self-managed Elastic Stack

Getting ready

How to do it…

How it works…

There’s more…

Creating and setting up data tiering

Getting ready

How to do it on your local machine…

How it works (on self-managed)…

How to do it on Elastic Cloud…

How to do it on ECK…

There’s more…

See also

Creating and setting up additional Elasticsearch nodes

Getting ready

How to do it...

How it works…

How to do it on Elastic Cloud...

How to do it on ECK…

There’s more…

See also

Creating and setting up Fleet Server

Getting ready

How to do it on a self-managed Elastic Stack…

How it works…

Setting up on Elastic Cloud

See also

Setting up snapshot repository

Getting ready

How to do it…

How it works…

There’s more…

2

Ingesting General Content Data

Introducing the Wikipedia Movie Plots dataset

Technical requirements

Adding data from the Elasticsearch client

Getting ready

How to do it…

How it works...

There’s more…

Updating data in Elasticsearch

Getting ready

How to do it…

How it works...

There’s more…

Deleting data in Elasticsearch

Getting ready

How to do it…

How it works...

There’s more…

See also

Using an analyzer

Getting ready

How to do it…

How it works...

There’s more…

Defining index mapping

Getting ready

How to do it…

How it works...

There’s more…

See also

Using dynamic templates in document mapping

Getting ready

How to do it…

How it works...

There’s more…

See also

Creating an index template

Getting ready

How to do it…

How it works...

There’s more…

Indexing multiple documents using Bulk API

Getting ready

How to do it…

How it works...

There’s more…

See also

3

Building Search Applications

Technical requirements

Searching with Query DSL

Getting ready

How to do it...

How it works...

There’s more...

Building advanced search queries with Query DSL

Getting ready

How to do it...

How it works...

There’s more...

See also

Using search templates to pre-render search requests

Getting ready

How to do it...

How it works...

There’s more...

See also

Getting started with Search Applications for your Elasticsearch index

Getting ready

How to do it...

How it works...

Building a search experience with the Search Application client

Getting ready

How to do it...

How it works...

There’s more...

See also

Measuring the performance of your Search Applications with Behavioral Analytics

Getting ready

How to do it...

How it works...

There’s more...

See also

4

Timestamped Data Ingestion

Technical requirements

Deploying Elastic Agent with Fleet

Getting ready

How to do it...

How it works...

There’s more...

See also

Monitoring Apache HTTP logs and metrics using the Apache integration

Getting ready

How to do it...

How it works...

There’s more...

See also

Deploying standalone Elastic Agent

Getting ready

How to do it...

How it works...

There’s more...

See also

Adding data using Beats

Getting ready

How to do it...

How it works...

There’s more...

See also

Setting up a data stream manually

Dataset

Getting ready

How to do it...

How it works...

There’s more...

See also

Setting up a time series data stream manually

Getting ready

How to do it...

How it works…

There’s more...

See also

5

Transform Data

Technical requirements

Creating an ingest pipeline

Getting ready

How to do it...

How it works...

There’s more...

See also

Enriching data with a custom ingest pipeline for an existing Elastic Agent integration

Getting ready

How to do it...

How it works...

There’s more...

Using a processor to enrich your data before ingesting with Elastic Agent

Getting ready

How to do it...

How it works...

There’s more...

See also

Installing self-managed Logstash

Getting ready

How to do it...

How it works...

There’s more...

See also

Creating a Logstash pipeline

Getting ready

How to do it...

How it works...

There’s more...

See also

Setting up pivot data transform

Getting ready

How to do it...

How it works...

There’s more...

See also

Setting up the latest data transform

Getting ready

How to do it...

How it works...

There’s more...

See also

Downsampling your time series data

Getting ready

How to do it...

How it works...

There’s more...

See also

6

Visualize and Explore Data

Technical requirements

Exploring your data in Discover

Getting ready

How to do it...

How it works...

There’s more...

See also

Exploring your data with ES|QL

Getting ready

How to do it...

How it works...

There’s more...

See also

Creating visualizations with Kibana Lens

Getting ready

How to do it...

How it works...

There’s more...

See also

Creating visualizations from runtime fields

Getting ready

How to do it...

How it works...

There’s more...

See also

Creating Kibana maps

Getting ready

How to do it...

How it works...

There’s more…

See also

Creating and using Kibana dashboards

Getting ready

How to do it...

How it works...

There’s more...

See also

Creating Canvas workpads

Getting ready

How to do it...

How it works...

There’s more…

See also

7

Alerting and Anomaly Detection

Technical requirements

Creating alerts in Kibana

Getting ready

How to do it...

How it works...

There’s more…

See also

Monitoring alert rules

Getting ready

How to do it...

How it works...

There’s more...

See also

Investigating data with log rate analysis

Getting ready

How to do it...

How it works...

There’s more…

See also

Investigating data with log pattern analysis

Getting ready

How to do it...

How it works...

There’s more...

Investigating data with change point detection

Getting ready

How to do it...

How it works...

There’s more...

See also

Detecting anomalies in your data with unsupervised machine learning jobs

Getting ready

How to do it...

How it works...

There’s more...

See also

Creating anomaly detection jobs from a Lens visualization

Getting ready

How to do it...

How it works...

There’s more...

8

Advanced Data Analysis and Processing

Technical requirements

Finding deviations in your data with outlier detection

Getting ready

How to do it…

How it works…

See also

Building a model to perform regression analysis

Getting ready

How to do it…

How it works…

There’s more…

See also

Building a model for classification

Getting ready

How to do it…

How it works…

There’s more…

See also

Using a trained model for inference

Getting ready

How to do it…

How it works…

There’s more…

See also

Deploying third-party NLP models and testing via the UI

Getting ready

How to do it…

How it works…

There’s more…

See also

Running advanced data processing with trained models

Getting ready

How to do it…

How it works…

There’s more…

See also

9

Vector Search and Generative AI Integration

Technical requirements

Implementing semantic search with dense vectors

Getting ready

How to do it…

How it works…

There’s more…

See also

Implementing semantic search with sparse vectors

Getting ready

How to do it...

How it works...

There’s more...

See also

Using hybrid search to build advanced search applications

Getting ready

How to do it...

How it works...

There’s more...

See also

Developing question-answering applications with Generative AI

Getting ready

How to do it...

How it works...

There’s more...

See also

Using advanced techniques for RAG applications

Getting ready

How to do it...

How it works...

There’s more...

See also

10

Elastic Observability Solution

Technical requirements

Instrumenting your application with Elastic APM

Getting ready

How to do it…

How it works…

There’s more…

See also

Setting up RUM

Getting ready

How to do it…

How it works…

There’s more…

See also

Instrumenting and monitoring with OpenTelemetry

Getting ready

How to do it…

How it works…

There’s more…

See also

Monitoring Kubernetes environments with Elastic Agent

Getting ready

How to do it…

How it works…

There’s more…

See also

Managing synthetics monitoring

Getting ready

How to do it…

How it works…

There’s more…

See also

Gaining comprehensive system visibility with Elastic Universal Profiling

Getting ready

How to do it…

How it works…

There’s more…

See also

Detecting incidents with alerting and machine learning

Getting ready

How to do it…

How it works…

There’s more…

See also

Gaining insights with the AI Assistant

Getting ready

How to do it…

How it works…

There’s more…

See also

11

Managing Access Control

Technical requirements

Using built-in roles

Getting ready

How to do it…

How it works…

See also

Defining custom roles

Getting ready

How to do it…

How it works…

There’s more…

See also

Granting additional privileges

Getting ready

How to do it…

How it works…

There’s more…

See also

Managing and securing access to Kibana spaces

Getting ready

How to do it…

How it works…

There’s more…

See also

Managing access with API keys

Getting ready

How to do it…

How it works…

There’s more…

See also

Configuring single sign-on

Getting ready

How to do it…

How it works…

There’s more…

See also

Mapping users and groups to roles

Getting ready

How to do it…

How it works…

There’s more…

12

Elastic Stack Operation

Technical requirements

Setting up an index lifecycle policy

Getting ready

How to do it…

How it works…

There’s more…

See also

Optimizing time series data streams with downsampling

Getting ready

How to do it…

How it works…

There’s more…

See also

Managing the snapshot lifecycle

Getting ready

How to do it…

How it works…

There’s more…

Configuring Elastic Stack components with Terraform

Getting ready

How to do it…

How it works…

There’s more…

See also

Enabling and configuring cross-cluster search

Getting ready

How to do it…

How it works…

There’s more…

See also

13

Elastic Stack Monitoring

Technical requirements

Setting up Stack Monitoring

Getting ready

How to do it…

How it works…

There’s more…

See also

Building custom visualizations for monitoring data

Getting ready

How to do it…

How it works…

There’s more…

Monitoring cluster health via an API

Getting ready

How to do it…

How it works…

There’s more…

See also

Enabling audit logging

Getting ready

How to do it…

How it works…

There’s more…

See also

Index

Other Books You May Enjoy

Preface

In this cookbook, you will explore practical recipes and step-by-step instructions for solving real-world data challenges using the latest versions of the Elastic Stack’s components, including Elasticsearch, Kibana, Elastic Agent, Logstash, and Beats. This book equips you with the knowledge and skills necessary to unlock the full potential of the Elastic Stack.

The book begins with practical guides on installing the stack through various deployment methods. Subsequently, it delves into the ingestion and search of general content data, illustrating how to develop enhanced search experiences. As you progress, you will explore timestamped data ingestion, data transformation, and enrichment using various components of the Elastic Stack. You will also learn how to visualize, explore, and create dashboards with your data using Kibana. Moving forward, you will refine your skills in anomaly detection and data science, employing advanced techniques in data frame analytics and natural language processing. Equipped with these concepts, you will investigate the latest advancements in search technology, including semantic search and generative AI. Additionally, you will explore Elastic Observability use cases for log, infrastructure, and synthetic monitoring, alongside essential strategies for securing the Elastic Stack. Ultimately, you will gain expertise in Elastic Stack operations, enabling you to monitor and manage your system effectively.

By the end of the book, you will have acquired the necessary knowledge and skills to build scalable, reliable, and efficient data analytics and search solutions with the Elastic Stack.

Note

The Elastic Security solution, a significant component of the Elastic Stack, would have merited considerable attention in this book. However, due to considerations regarding the length of the book and the intended audience, we have opted not to include this section in the current edition.

Who this book is for

This book is intended for Elastic Stack users, developers, observability practitioners, and data professionals of all levels, from beginners to experts, seeking practical experience with the Elastic Stack:

Developers will find easy-to-follow recipes for utilizing APIs and features to craft powerful applications.

Observability practitioners will benefit from use cases that cover APM, Kubernetes, and cloud monitoring.

Data engineers and AI enthusiasts will be provided with dedicated recipes focusing on vector search and machine learning.

No prior knowledge of the Elastic Stack is required.

What this book covers

Chapter 1, Getting Started – Installing the Elastic Stack, explores the installation of the Elastic Stack across environments such as Elastic Cloud and Kubernetes, detailing the setup for Elasticsearch, Kibana, and Fleet along with insights on cluster components and deployment strategies for stack optimization.

Chapter 2, Ingesting General Content Data, dives into the data ingestion process, focusing on indexing, updating, and deleting operations within Elasticsearch, and emphasizes analyzers, index mappings, and templates for effective Elasticsearch index management.

Chapter 3, Building Search Applications, guides you through constructing search experiences using Elasticsearch’s Query DSL and new features in Elastic Stack 8, culminating in comprehensive search applications with advanced queries and analytics.

Chapter 4, Timestamped Data Ingestion, covers the ingestion of timestamped data, walking through Elastic Agent deployment with Fleet and in standalone mode, data collection with Elastic integrations and Beats, and the manual setup of data streams and time series data streams.

Chapter 5, Transform Data, delves into data transformation techniques using Elastic Stack tools. You will learn how to structure, enrich, reorganize, and downsample your data to glean actionable insights. This chapter delivers practical know-how on utilizing ingest pipelines, processors, transforms, and Logstash for efficient data manipulation.

Chapter 6, Visualize and Explore Data, shows how to turn transformed data into visualizations, teaching data exploration in Discover, visual creation with Kibana Lens, and the use of dashboards and maps to deeply understand your data.

Chapter 7, Alerting and Anomaly Detection, outlines the setup of alerts and anomaly detection for proactive data management, covering alert creation and monitoring, anomaly investigation, and unsupervised machine learning job implementation.

Chapter 8, Advanced Data Analysis and Processing, delves into machine learning within the Elastic Stack, covering outlier detection, regression, and classification modeling, as well as deploying NLP models for deep data insights.

Chapter 9, Vector Search and Generative AI Integration, explores advanced search technologies and AI integrations, teaching you about vector search, hybrid search, and Generative AI applications for developing sophisticated AI-driven conversational tools.

Chapter 10, Elastic Observability Solution, demonstrates how to employ the Elastic Stack for comprehensive system insights, covering application instrumentation, real-user monitoring, Kubernetes observability, synthetic monitors, and incident detection.

Chapter 11, Managing Access Control, navigates access control within the Elastic Stack, detailing authentication management, custom role definition, Kibana space security, API key utilization, and single sign-on implementation.

Chapter 12, Elastic Stack Operation, provides essential recipes for Elastic Stack management, such as index life cycle, data stream optimization, and snapshot life cycle management, and explores cluster automation with Terraform and cross-cluster search.

Chapter 13, Elastic Stack Monitoring, equips you with techniques for Elastic Stack monitoring and troubleshooting, focusing on the stack monitoring setup, custom visualization creation, cluster health assessment, and audit logging strategies.

To get the most out of this book

Before starting this book, you should have a basic understanding of databases, web servers, and data formats such as JSON. No prior Elastic Stack experience is needed, as the book starts with foundational topics. Familiarity with terminal commands and web technologies will be beneficial for following along. Each chapter progresses into more advanced Elastic Stack applications and techniques.

Software/hardware covered in the book | Operating system requirements
Elastic Stack 8.12 | Windows, macOS, or Linux
Python 3.11+ | Windows, macOS, or Linux
Docker 4.27.0 | Windows, macOS, or Linux
Kubernetes 1.24+ | Windows, macOS, or Linux
Node.js 19+ | Windows, macOS, or Linux
Terraform 1.8.0 | Windows, macOS, or Linux
Amazon Web Services (AWS) | Windows, macOS, or Linux
Google Cloud Platform (GCP) | Windows, macOS, or Linux
Okta | Windows, macOS, or Linux
Ollama | Windows, macOS, or Linux
OpenAI/Azure OpenAI | Windows, macOS, or Linux

If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook. In case there’s an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “The or and and operators yield results that are too broad or too strict; you can use the minimum_should_match parameter to filter less relevant results.”

A block of code is set as follows:

GET /movies/_search
{
  "query": {
    "multi_match": {
      "query": "come home",
      "fields": ["title", "plot"]
    }
  }
}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

GET movies-dense-vector/_search
{
  "knn": {
    "field": "plot_vector",
    "k": 5,
    "num_candidates": 50,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": ".multilingual-e5-small_linux-x86_64",
        "model_text": "romantic moment"
      }
    }
  },
  "fields": [ "title", "plot" ]
}

Any command-line input or output is written as follows:

$ kubectl apply -f elastic-agent-managed-kubernetes.yml
$ sudo metricbeat modules enable tomcat

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: “In Kibana, go to Observability | APM | Services, to check whether the different microservices have been correctly instrumented.”

Tips or important notes

Appear like this.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it..., How it works..., There’s more..., and See also).

To give clear instructions on how to complete a recipe, use these sections as follows:

Getting ready

This section tells you what to expect in the recipe and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There’s more…

This section consists of additional information about the recipe in order to make you more knowledgeable about it.

See also

This section provides helpful links to other useful information for the recipe.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Elastic Stack 8.x Cookbook, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below

https://packt.link/free-ebook/978-1-83763-429-3

Submit your proof of purchase

That’s it! We’ll send your free PDF and other benefits to your email directly.

1

Getting Started – Installing the Elastic Stack

The Elastic Stack is a suite of components that allows you to ingest, store, search, analyze, and visualize your data from diverse sources. Previously known as the ELK Stack, today, it consists of four core components: Elasticsearch, Logstash, Elastic Agent, and Kibana.

Elasticsearch is a distributed search and analytics engine that can handle petabytes of unstructured data. Logstash, Beats, and Elastic Agent are data ingestion tools that can collect, transform, and load data from various sources into Elasticsearch. Kibana is a web-based interface that allows you to visualize and explore your data, as well as access various solutions built on top of the Elastic Stack. All integrate seamlessly so you can use your data for a variety of use cases such as search, analytics, observability, and security.

The Elastic Stack can be deployed on Elastic Cloud, as well as on-premises, and it can be deployed in a hybrid and orchestrated setup. In this chapter, we will guide you through setting up and running Elastic deployments in different environments, including a hosted Elasticsearch service on Elastic Cloud, Kubernetes infrastructure, and self-managed solutions. We will also discuss additional components and nodes within the cluster. By the end of this chapter, you’ll have a comprehensive understanding of the various deployment strategies and how to use the Elastic Stack.

Figure 1.1 illustrates the key components of the Elastic Stack and the relationship between different components from a data flow perspective:

Figure 1.1 – The Elastic Stack components

In this chapter, we are going to learn how to install Elasticsearch, Kibana, and Fleet with different deployment options (Elastic Cloud, self-managed, and Elastic Cloud on Kubernetes (ECK)), highlighted in the right part of Figure 1.1, and then we will proceed to the data ingestion part in the next chapters.

To determine the most suitable deployment option for your needs, Figure 1.2 provides a comparative summary of the key differences among the various deployment methods:

Figure 1.2 – Deployment options comparison

We’ll be covering the following recipes:

Deploying the Elastic Stack on Elastic Cloud

Installing the Elastic Stack with ECK

Installing a self-managed Elastic Stack

Creating and setting up data tiering

Creating and setting up additional Elasticsearch nodes

Creating and setting up Fleet Server

Setting up a snapshot repository

Deploying the Elastic Stack on Elastic Cloud

Elastic Cloud is the most straightforward way to deploy and manage your Elasticsearch, Kibana, Integrations Server (a combined component for the application performance monitoring server and Fleet Server), and other components of the Elastic Stack. This recipe will guide you through the process of getting started with Elastic Cloud, from signing up for an account to creating your first Elastic deployment.

How to do it…

Before we begin, let’s learn how to create a deployment on Elastic Cloud and verify it using this step-by-step guide:

We will create an account on Elastic Cloud:

Visit the Elastic Cloud website at https://cloud.elastic.co/.

Click on the Sign up button (a 14-day trial without needing a credit card is offered by default).

Fill out the registration form with your details, including your name, email address, and desired password.

Next, we will create a deployment.

On the next screen, you’ll be prompted to create your first deployment. You can choose between the following options, as shown in Figure 1.3:

Cloud provider: Google Cloud, Azure, or AWS.

Region: The supported regions for different cloud providers (the list of supported regions can be found here: https://www.elastic.co/guide/en/cloud/current/ec-reference-regions.html).

Hardware profile: You can simply start with the General-purpose profile. Elastic Cloud allows you to change hardware later.

Version: The latest minor version of Elastic Stack 7 or 8.

Figure 1.3 – Creating a cloud deployment

On the next screen (shown in Figure 1.4), you’ll be given a password. Be sure to save it, as you’ll need it to log in to Kibana (the application interface) as well as for command-line operations:

Figure 1.4 – Cloud deployment credentials

Finally, let’s check the created deployment.

After the deployment creation, you will be redirected to the Home page of Kibana, where you can choose one of the data onboarding guides as shown in Figure 1.5:

Figure 1.5 – Kibana onboarding screen

You can also check the deployment status from Elastic Cloud’s main console (https://cloud.elastic.co/home):

Figure 1.6 – Cloud deployment status

You can then click on Manage to see the details of your deployment and management options as shown in Figure 1.7:

Figure 1.7 – Cloud deployment console

How it works…

At this stage, the following components have been provisioned automatically:

2 Elasticsearch hot nodes with 2 GB of RAM

1 Elasticsearch master tie-breaker node with 1 GB of RAM

1 Kibana node with 1 GB of RAM

1 Integrations Server node with 1 GB of RAM

1 Enterprise Search node with 2 GB of RAM

You can see the detailed list view of the components that we just mentioned in your deployment as shown in Figure 1.8. It gives you valuable information about each component such as Health, Size, Role, Zone, Disk, and Memories:

Figure 1.8 – Cloud deployment components view

We also get different endpoints to access different components of the Elastic Stack:

Figure 1.9 – Cloud deployment endpoints

Note

Be sure to save your cloud ID from this screen, as it provides a convenient way to configure Elasticsearch clients, Beats, Elastic Agent, and so on, to send data to your Elastic deployment.
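For instance, in a Beats configuration file such as metricbeat.yml or filebeat.yml, the cloud ID and credentials are typically supplied via the cloud.id and cloud.auth settings, along these lines (the values shown are illustrative placeholders):

# Illustrative snippet - replace with your own cloud ID and credentials
cloud.id: "my-deployment:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvJGFiYzEyMyRkZWY0NTY="
cloud.auth: "elastic:<password>"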

There’s more…

Once you deploy the Elastic Stack on Elastic Cloud, there are different ways to manage and configure your deployment. Let us look at a few of them.

Here’s how to scale and configure your deployment:

Scale/autoscale your deployment to meet your growing needs (https://www.elastic.co/guide/en/cloud/current/ec-autoscaling.html).

Add or remove nodes, change the node type, adjust the node size, and make other configuration changes (more on this in the Creating and setting up additional Elasticsearch nodes recipe in this chapter).

Configure data tiering (see the Creating and setting up data tiering recipe in this chapter for more information).

Monitor and back up your data, and configure your backup repository (more details in the Setting up a snapshot repository recipe of this chapter).

Monitor your deployment health (more details in the Setting up Stack Monitoring recipe in Chapter 13).

Here’s how you secure and control access to your deployment:

Configure authentication methods, such as username/password or single sign-on (SSO) (see Chapter 11).

Set up role-based access control (RBAC) to define user roles and permissions (see Chapter 11).

Configure a deployment traffic filter (https://www.elastic.co/guide/en/cloud/current/ec-traffic-filtering-deployment-configuration.html).

Installing the Elastic Stack with ECK

ECK is the official Kubernetes operator for automating the deployment and management of Elasticsearch and other Elastic components on Kubernetes. ECK enables the use of Kubernetes-native tools and APIs to manage Elasticsearch clusters, offering capabilities for monitoring and securing them. It supports scaling, rolling upgrades, availability zone awareness, and the implementation of hot-warm-cold storage architectures. ECK lets you harness the power and flexibility of Elasticsearch on Kubernetes, both on-premises and in the cloud. In this recipe, we will first install the ECK operator in a Kubernetes cluster and then use it to deploy an Elasticsearch cluster and Kibana.

Technical requirements

Ensure you have a Kubernetes cluster ready before deploying ECK and the Elastic Stack. For this recipe, you can use either minikube or Google Kubernetes Engine (GKE). Elastic Cloud on Kubernetes also supports other Kubernetes distributions such as OpenShift, Amazon Elastic Kubernetes Service (Amazon EKS), and Azure Kubernetes Service (Microsoft AKS). To ensure smooth deployment and optimal performance, allocate appropriate resources to your cluster. Your cluster should have at least 16 GB of RAM and 4 CPU cores to provide a seamless experience during the deployment of ECK, Elasticsearch, Kibana, Elastic Agent, and the sample application.

You can find all the related YAML files on the GitHub repository: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/tree/main/Chapter1/eck.

The snippets of this recipe can be found at the following address: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter1/snippets.md#installing-elastic-stack-with-elastic-cloud-on-kubernetes.

Getting ready

Before installing ECK, you need to prepare your Kubernetes environment and ensure that you have the necessary resources and permissions. This recipe presumes that your Kubernetes cluster is already up and running. Your Kubernetes nodes need to have at least 2 GB of free memory. Make sure to check the supported versions of Kubernetes on the official Elastic documentation website: https://www.elastic.co/support/matrix#matrix_kubernetes.

How to do it…

Let’s start:

First, you need to have an ECK operator deployed in your Kubernetes cluster. Let’s begin by creating the ECK custom resource definitions:

$ kubectl create -f https://download.elastic.co/downloads/eck/2.11.0/crds.yaml

The following Elastic resources will be created in your Kubernetes cluster:

Figure 1.10 – Created resources when deploying ECK

Now that the custom resource definitions have been created, proceed with the installation of the ECK operator:

$ kubectl apply -f https://download.elastic.co/downloads/eck/2.11.0/operator.yaml

Executing the previous command will give you the following output:

Figure 1.11 – Installing the ECK operator in a Kubernetes cluster

Important note

The best practice is to use a dedicated Kubernetes namespace for all workloads related to ECK, which offers enhanced isolation for various applications and robust security with RBAC permissions by default. The provided manifest uses the elastic-system namespace by default.

We can then monitor the operator logs:

$ kubectl -n elastic-system logs -f statefulset.apps/elastic-operator

Now, let’s deploy a three-node Elasticsearch cluster by applying the elasticsearch.yaml file provided in the GitHub repository:

$ kubectl apply -f elasticsearch.yaml

To check the status of Elasticsearch, you can get an overview of the clusters with the following kubectl command:

$ kubectl get elasticsearch
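For reference, the applied manifest has roughly the following shape (a minimal sketch; the resource name is inferred from the service and secret names used later in this recipe, and the actual file in the repository may differ):

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 8.12.2
  nodeSets:
    # a single nodeSet with count: 3 yields the three-node cluster described above
    - name: default
      count: 3
      config:
        node.store.allow_mmap: false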

Note

This might take a couple of minutes if you need to pull the images.

Figure 1.12 shows the results of this command when the cluster has been successfully deployed:

Figure 1.12 – Checking the cluster status

Now, deploy the Kibana instance by applying the kibana.yaml file in your cluster:

$ kubectl apply -f kibana.yaml

Similar to Elasticsearch, you can find details about Kibana instances with the following command:

$ kubectl get kibana

Figure 1.13 – Checking Kibana status

Finally, let’s connect to Kibana. This is quite straightforward, as ECK automatically creates a ClusterIP service for Kibana. Follow the next steps to log in to your Kibana instance.

Get the ClusterIP service created for Kibana:

$ kubectl get service kibana-sample-kb-http

You should expect to see an output like Figure 1.14:

Figure 1.14 – Printing the Kibana ClusterIP

Now, use kubectl port-forward to access Kibana from your host:

$ kubectl port-forward service/kibana-sample-kb-http 5601

Before visiting the Kibana login page, we’ll need to retrieve the password of the elastic user provisioned by the operator with the following command:

$ kubectl get secret elasticsearch-sample-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode; echo

Copy the output of the command.

Now that you’ve forwarded the port, open https://localhost:5601 in your web browser and use the credentials obtained in the previous steps to log in to Kibana, as shown in the following figure:

Figure 1.15 – Log in to Kibana on ECK

Important note

When accessing Kibana, you might see a security warning due to self-signed certificates not being trusted by the browser. You can safely bypass this warning and proceed to Kibana’s URL. For production environments, it’s recommended to use certificates from your own certificate authority (CA) to ensure security.

How it works…

As you have seen, ECK greatly simplifies the setup of Elasticsearch and Kibana, getting you up and running in a few minutes. It accomplishes this by managing a variety of tasks on our behalf. Let’s review what ECK has done for us in the cluster:

Security: Security features are enabled in ECK, ensuring robust protection for all deployed Elastic Stack resources. By default, all resources deployed through ECK are secured. The system provisions a built-in basic authentication user named elastic. Transport Layer Security (TLS) is configured to secure network traffic within and to your Elasticsearch cluster.

Certificates: A self-signed, internally generated CA certificate is used by default for each cluster, providing secure communication within the Elasticsearch cluster. For advanced configurations, you have the option to use externally signed certificates or other custom certificate setups.

Default service exposure: Your cluster is automatically set up with a ClusterIP service, which offers internal network connectivity. You also have the option to configure these services to be of the LoadBalancer type, making them accessible from external networks.

Elasticsearch connection: You may have noticed by looking at the provided kibana.yaml file that there are no explicit Elasticsearch connection details. The information is provided to Kibana with the elasticsearchRef specification, which is handled by the ECK operator.
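To illustrate that last point, a Kibana manifest wired to the sample cluster might look as follows (a sketch assuming the resource names used in this recipe):

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana-sample
spec:
  version: 8.12.2
  count: 1
  # elasticsearchRef lets the operator inject the connection details and certificates
  elasticsearchRef:
    name: elasticsearch-sample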

There’s more…

As an alternative installation method, ECK can also be installed using a Helm chart from the Elastic Helm repository:

$ helm repo add elastic https://helm.elastic.co
$ helm repo update
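Once the repository is added, the operator itself can be installed with a command along these lines (the namespace and release name shown are typical choices; adjust them to your setup):

$ helm install elastic-operator elastic/eck-operator -n elastic-system --create-namespace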

Starting with ECK version 2.8, Logstash can be managed as a custom resource using the operator.

See also

Here is an excellent blog article on ECK in production: https://www.elastic.co/blog/eck-in-production-environment

If you’re interested in deploying ECK with Terraform, check out the following: https://www.elastic.co/blog/installing-eck-with-terraform-on-gcp

Also, there is this practical article on running ECK with Helm: https://www.elastic.co/blog/using-eck-with-helm

Installing a self-managed Elastic Stack

In this recipe, you will learn how to install and manage the Elastic Stack on your local machine, focusing primarily on the essential components: Elasticsearch and Kibana.

Getting ready

Before proceeding with the installation, make sure your system meets the minimum requirements for running Elasticsearch, Kibana, and Fleet Server. Check the official documentation for the specific version you want to install to ensure compatibility with your operating system (https://www.elastic.co/support/matrix).

How to do it…

Let’s first look at how to download Elasticsearch:

Visit the Elasticsearch download page (https://www.elastic.co/downloads/elasticsearch).

By default, the official Elasticsearch download page provides you with the download links for the latest release. Choose the right package for your operating system.

Once the download is complete, extract the contents of the package to a working directory of your choice.

Next, let’s configure Elasticsearch:

Open the Elasticsearch configuration file located in the extracted directory. For example, in Linux, it’s found at config/elasticsearch.yml.

Adjust the settings as needed, such as the cluster name, network settings, and heap size.

Save the configuration file.
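For instance, a minimal configuration for a local test node might contain entries like the following (illustrative values only):

# config/elasticsearch.yml - example values, adjust to your environment
cluster.name: cookbook-cluster
node.name: node-1
network.host: 127.0.0.1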

Now, let’s see how you start Elasticsearch:

Open a terminal or command prompt and navigate to the Elasticsearch directory.

Run the Elasticsearch executable or script that is appropriate for your operating system.

For Linux/Mac, it is the following:

$ ./bin/elasticsearch

For Windows, it is the following:

$ bin\elasticsearch.bat

On the first launch, Elasticsearch will perform an initial security configuration, which includes generating a password for the built-in elastic user, an enrollment token for Kibana (valid for 30 minutes), and certificates and keys for transport and HTTP layers:

The Elasticsearch node is up and running and reachable at HTTPS port 9200. You can check the Elasticsearch node with the following curl command line:

$ curl --cacert <PATH_TO_CERTIFICATE> -u elastic https://localhost:9200

Next, we will download and install Kibana:

Visit the official Kibana download page and locate the Downloads section.

By default, the official Kibana download page provides you with the download links for the latest release of Kibana. Download the appropriate package for your operating system (tar.gz/zip, deb, or rpm).

Extract the downloaded Kibana package to a directory of your choice.

Open a terminal or command prompt and navigate to the Kibana directory.

Run the Kibana executable file (e.g., bin/kibana for Unix-like systems or bin\kibana.bat for Windows) to start Kibana.

In your browser, access Kibana at the default URL, https://localhost:5601; use the enrollment token from the earlier step when Kibana starts, and click the button to confirm the connection with Elasticsearch.

Use the elastic superuser and the previously generated password to log in to Kibana.

How it works…

Starting with Elastic 8.0, security features such as TLS for both inter-node communication and HTTP layer security are enabled by default in self-managed clusters. As a result, certificates and keys are automatically generated during the Elasticsearch installation process. This enables stack-level security by default, with both node-to-node TLS and TLS on the Elasticsearch HTTP API, as we saw during the installation of Kibana.

There’s more…

You can also use Docker as a self-managed deployment option – please refer to the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html.
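For a quick single-node test, the official image can be started along these lines (a sketch; the version tag is assumed to match the 8.12 release used throughout this book):

$ docker network create elastic
$ docker run --name es01 --net elastic -p 9200:9200 -it -m 1GB docker.elastic.co/elasticsearch/elasticsearch:8.12.2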

Creating and setting up data tiering

A data tier consists of several Elasticsearch nodes that have the same data role and usually run on similar hardware. Often, different hardware is configured for each tier; for example, the hot tier might use the most powerful and expensive hardware, while the cold or frozen tiers could utilize less expensive, storage-oriented hardware. Using data tiers is an efficient strategy for reducing hardware requirements in an Elasticsearch cluster while maintaining access to data and the ability to search through it. To illustrate, a single frozen node can keep up to 100 TB of data compared to 2 TB of data for a hot node.

However, there is a caveat: as data moves to colder tiers, query performance can decrease. This is expected since the data is less frequently queried.

Figure 1.16 – Elasticsearch data tiering

As we can see in Figure 1.16, there are four data tiers provided by Elasticsearch:

Hot tier: This tier handles mostly indexing and querying of timestamped data (the most recent and most frequently accessed data). This tier is also referred to as the content tier for non-timestamped data.

Warm tier: This tier is used for less recent timestamped data (more than seven days old) that does not need to be updated. It extends storage capacity up to five times compared to the hot tier.

Cold tier: This tier is used for timestamped data that is less frequently accessed and no longer updated. This tier is built on searchable snapshots technology and can store twice as much data as the warm tier.

Frozen tier: This tier is used for timestamped data that is never updated and rarely queried but needs to be kept for regulatory, compliance, or security use cases such as forensics. The frozen tier stores most of the data in searchable snapshots; only the data needed for a query is pulled and cached on a local disk inside the node.

In this recipe, you’ll learn how to set up data tiering in a self-managed Elasticsearch cluster. We will also discuss implementation on Elastic Cloud.

Getting ready

Make sure your self-managed cluster from the earlier recipe is up and running. For the sake of simplicity, we will create two additional nodes on the same local machine. We’ll add two data tiers to our cluster:

A node for the cold tier

A node for the frozen tier

The code snippets for this recipe can be found at the following link: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter1/snippets.md#creating-and-setting-up-data-tiering.

How to do it on your local machine…

On your local machine, execute the following steps:

Open the elasticsearch.yml file of the cluster you’ve previously set up and uncomment the transport.host setting at the end.

Create two new directories for the new nodes, and let’s call those directories the following:

node-cold
node-frozen

Download and extract the contents of the Elasticsearch package in each directory. Make sure to use the same version and operating system package as previously used in the Installing a self-managed Elastic Stack recipe.

In a separate terminal from where your cluster from the previous recipe is running, navigate to the directory where Elasticsearch is installed and run the following command:

$ ./bin/elasticsearch-create-enrollment-token -s node

This command generates an enrollment token that you’ll copy and use to enroll new nodes with your Elasticsearch cluster.

Go to the cold node directory, open the elasticsearch.yml file, and add the following settings:

node.name: node-cold
node.roles: ["data_cold"]

From the installation directory of the cold node, start Elasticsearch and pass the enrollment token with --enrollment-token:

$ ./bin/elasticsearch --enrollment-token <enrollment-token>

Check that your node has successfully started.

Now, let’s do the same for the frozen node. Open the elasticsearch.yml file in the frozen node directory and add the following settings:

node.name: node-frozen
node.roles: ["data_frozen"]

From the installation directory of the frozen node, start Elasticsearch and pass the enrollment token with --enrollment-token:

$ ./bin/elasticsearch --enrollment-token <enrollment-token>

Check that the new frozen node has successfully started.

How it works (on self-managed)…

Elasticsearch now provides specific roles that match the different data tiers (hot, warm, cold, frozen). This means we can add one of the data_hot, data_warm, data_cold, or data_frozen node roles to the roles setting in the configuration file. Once the appropriate roles are defined in the configuration file, new nodes are introduced into the cluster using an enrollment token. The -s node argument of the elasticsearch-create-enrollment-token command specifies that we’re creating a token to enroll an Elasticsearch node into a cluster.

How to do it on Elastic Cloud…

Adding data tiers on an Elastic Cloud deployment is a more straightforward and streamlined process. There is no configuration file to edit and no infrastructure to provision; just head to your deployment and follow these steps:

On the Elastic Cloud deployment page, click Manage.

Click on Edit on the left navigation pane.

Click on Add capacity for any data tiers you wish to add.

Figure 1.17 – Cloud deployment data tiering

How to do it on ECK…

In ECK, you define your cluster’s topology using a concept called nodeSets. Within the nodeSets attribute, each entry represents a group of Elasticsearch nodes sharing the same Kubernetes and Elasticsearch configurations. For instance, you might have one nodeSets attribute for master nodes, another for your hot tier nodes, and so forth. You can find an example configuration in the GitHub repository: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter1/eck/elasticsearch-data-tiers.yaml.

When examining the provided configuration, it’s clear that there are three nodeSets attributes named hot, cold, and frozen, as illustrated in the following code block. Please note that for readability, the code has been abbreviated; the complete code is accessible at the specified GitHub repository location:

spec:
  version: 8.12.2
  nodeSets:
    - name: hot
      config:
        node.store.allow_mmap: false
      podTemplate:
        ...
      count: 3
    - name: cold
      config:
        node.roles: ["data_cold"]
        node.store.allow_mmap: false
      podTemplate:
        ...
      count: 1
    - name: frozen
      config:
        node.roles: ["data_frozen"]
        node.store.allow_mmap: false
      podTemplate:
        ...
      count: 1

In a real production scenario, additional configuration such as Kubernetes node affinity is necessary. Kubernetes node affinity uses NodeSelector to ensure Elasticsearch workloads are confined to selected Kubernetes nodes. Under the hood, Elasticsearch shard allocation awareness is used to allocate shards to the specified Kubernetes nodes.

There’s more…

In a production scenario, adding new data tiers to a self-managed Elastic Stack cluster is a bit more complex. For high availability and resilience, you’ll need to deploy nodes on separate machines, which requires additional configuration steps not covered in this recipe, such as binding to an address other than localhost.

Data tiers are the first steps of your data management strategy with Elasticsearch. The next step is to define an index life cycle management (ILM) policy that’ll automate the migration of your data between the different tiers. This will be covered in the Setting up index life cycle policy recipe in Chapter 12.
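To give a flavor of what such a policy looks like, here is a sketch of an ILM policy that moves data through the tiers created in this recipe (the policy name, age thresholds, and snapshot repository are hypothetical; Chapter 12 covers this in detail):

PUT _ilm/policy/cookbook-tiering-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "7d", "max_primary_shard_size": "50gb" }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {}
      },
      "frozen": {
        "min_age": "90d",
        "actions": {
          "searchable_snapshot": { "snapshot_repository": "my-snapshot-repo" }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": { "delete": {} }
      }
    }
  }
}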

Data tiering is primarily intended for timestamped data. To fully leverage data tiers, matching infrastructure resources must be allocated for each tier. For instance, warm and cold tiers can use spinning disks rather than SSDs and have a larger disk-to-RAM ratio, enabling them to store more data while still providing read access to it. Meanwhile, the frozen tier depends entirely on searchable snapshots, making it most suitable for long-term retention and infrequent searches.

See also

For more information about data tiering, see https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html

You can also read the blog, Data lifecycle management with data tiers, by Lee Hinman: https://www.elastic.co/blog/elasticsearch-data-lifecycle-management-with-data-tiers

Creating and setting up additional Elasticsearch nodes

Besides data tiers, an Elasticsearch cluster can have a variety of other node roles to function efficiently. Figure 1.18 outlines the several types of nodes available in a cluster:

Figure 1.18 – Elasticsearch node types

Roles such as Master, Machine Learning, or Ingest can be dedicated to specific Elasticsearch instances, and this is often a best practice in a production environment.

In this recipe, we will learn how to configure dedicated nodes for both self-managed deployments and Elastic Cloud.

Getting ready

Ensure that your self-managed cluster from the previous recipe is operational. For simplicity, we will create additional nodes on the same local machine. The nodes will undertake the following roles:

A dedicated master-eligible node
A machine learning node

The snippets for this recipe are available at https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter1/snippets.md#creating-and-setting-up-additional-elasticsearch-nodes.

How to do it...

On your local machine, proceed with the following steps:

1. Create two new directories for the new nodes, which we will name the following:
   node-master
   node-ml
2. Repeat Steps 1 and 2 of the Installing a self-managed Elastic Stack recipe in each directory.
3. In a separate terminal from the one where your cluster from the previous recipe is running, navigate to the directory where Elasticsearch is installed and run the following command:
   $ bin/elasticsearch-create-enrollment-token -s node
4. Copy the enrollment token. You will use it to enroll the new nodes with your Elasticsearch cluster.
5. Navigate to the node-master directory, open the elasticsearch.yml file, and add the following settings:
   node.name: node-master
   node.roles: ["master"]
6. From the node-master installation directory, start Elasticsearch and pass the enrollment token with --enrollment-token:
   $ bin/elasticsearch --enrollment-token <enrollment-token>

Verify that the node has started successfully.

Now, let’s follow the same steps to add a dedicated machine learning node.

1. Open the elasticsearch.yml file in the node-ml directory and add the following settings:
   node.name: node-ml
   node.roles: ["ml"]
2. From the node-ml installation directory, start the node with the following command:
   $ bin/elasticsearch --enrollment-token <enrollment-token>

Check that the new machine learning node has successfully started and joined the cluster.
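One quick way to confirm that both new nodes have joined with the expected roles is the _cat/nodes API. Here is a minimal sketch, assuming the default self-managed setup on localhost; -k skips TLS verification of the auto-generated certificate for brevity, and you will be prompted for the elastic user’s password:

$ curl -k -u elastic "https://localhost:9200/_cat/nodes?v=true&h=name,node.role"
# Expect node-master with role "m" (master) and node-ml with role "l" (machine learning)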

How it works…

As explained in the previous recipe, we use the node.roles setting to assign each node its roles. When node.roles is set explicitly, the node takes on only the roles listed, which is what makes these nodes dedicated.

How to do it on Elastic Cloud...

On Elastic Cloud, dedicated master nodes are provisioned based on the number of Elasticsearch nodes in your deployment. If your deployment has more than six Elasticsearch nodes, dedicated master nodes are automatically created. If your deployment has fewer than six Elasticsearch nodes, a tie-breaker node is set up behind the scenes to ensure high availability.

For machine learning and the other node types, follow the steps outlined here:

1. On the Elastic Cloud deployment page, click Manage.
2. Click on Edit on the left navigation pane.
3. Click on Add capacity for the node type you wish to add (coordinating and ingest, machine learning).

How to do it on ECK…

In ECK, expanding your cluster with additional node types requires you to update your YAML specification using the nodeSets concept discussed when setting up data tiering. By simply adding a nodeSets entry with the necessary role (e.g., ml, master, ingest, etc.), you instruct the operator to allocate those resources within your cluster. A sample YAML file is available at the following link: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter1/eck/elasticsearch-dedicated-master-ml.yaml.
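For instance, a dedicated machine learning nodeSet could look like the following minimal sketch, added alongside your existing entries:

    # A dedicated machine learning nodeSet (illustrative; sizing is up to you)
    - name: ml
      count: 1
      config:
        node.roles: ["ml"]
        node.store.allow_mmap: false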

There’s more…

In a production scenario, it’s always best to have dedicated hardware and hosts for specific node roles. You can also configure voting-only nodes that participate in the election for the master node but don’t serve as the master. A configuration with at least two dedicated master nodes and one voting-only node can be a suitable alternative to three full master nodes.
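In a self-managed setup, a voting-only node is declared through node.roles as well; note that voting_only must be combined with the master role. A minimal elasticsearch.yml sketch:

# elasticsearch.yml: a master-eligible, voting-only node
node.name: node-voting-only
node.roles: ["master", "voting_only"]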

See also

For more information on the machine learning setup and requirements, see the following link: https://www.elastic.co/guide/en/machine-learning/current/setup.html
For an in-depth exploration of the various node roles and their specific functions, see the following link: https://www.elastic.co/guide/en/elasticsearch/reference/8.13/modules-node.html#node-roles

Creating and setting up Fleet Server

Fleet Server is a key component of the new ingest architecture in the Elastic Stack, which revolves around the Elastic Agent. Before delving into this recipe, let’s review some important concepts about Fleet and the Agent.

Fleet serves as the central management component, providing a UI within Kibana that manages Agents and their configurations at scale. The Elastic Agent is a single, unified binary that runs on your hosts and is responsible for data collection tasks such as gathering logs, metrics, security events, and more.

Fleet Server connects the Elastic Agent to Fleet and acts as a control plane for Elastic Agents. It is an essential piece if you intend to use Fleet for centralized management. The schema in Figure 1.19 illustrates the various components and their interactions:

Figure 1.19 – Architecture including Elastic Agent and Fleet Server

In this recipe, we’ll cover the setup of Fleet Server for self-managed deployments and Elastic Cloud.

Getting ready

Make sure you have an Elasticsearch cluster up and running with Kibana connected to the cluster.

For self-managed setups, this recipe assumes that you will be installing Fleet Server on the same local machine as your cluster.

Note

This configuration is not recommended for production environments.

How to do it on a self-managed Elastic Stack…

We will use the quick-start wizard in Kibana for our setup:

1. In Kibana, on the left menu pane, go to Management | Fleet.
2. Click on Add Fleet Servers. This will present instructions for adding a Fleet Server with two options: Quick Start and Advanced. We’ll use the Quick Start option:

Figure 1.20 – Fleet Server configuration

3. Fill in the name and the URL.
4. Click on Generate Fleet Server policy.
5. Copy the generated command and paste it into your terminal.

Figure 1.21 – Fleet Server installation

If the installation is successful, you will see a confirmation showing that Fleet Server is operational and connected.

How it works…

By using the Quick Start option, Fleet automatically creates a Fleet Server instance and an enrollment token object in the background. Note that this option relies on self-signed certificates and is not suitable for production environments. For more details on how to set up Fleet using the Advanced mode, refer to the See also section of this recipe.
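For reference, the command generated by the wizard resembles the following sketch; the service token and policy ID are placeholders supplied by Fleet, the exact set of flags can vary by version, and 8220 is the default Fleet Server port:

# Illustrative only; always use the exact command generated by the wizard
$ sudo ./elastic-agent install \
    --fleet-server-es=https://localhost:9200 \
    --fleet-server-service-token=<service-token> \
    --fleet-server-policy=<policy-id> \
    --fleet-server-port=8220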

Setting up on Elastic Cloud

Elastic Cloud offers a hosted Integrations Server that includes Fleet Server, simplifying the setup process considerably.

To verify the availability of Fleet Server in your cloud deployment, do the following:

1. In Kibana, on the left menu pane, go to Management | Fleet.
2. Look for Elastic Cloud agent policy on the Agents tab:

Figure 1.22 – Centralized management for Elastic Agents

Check that the agent status is healthy.

See also

For configuration samples to set up Fleet Server on ECK, refer to the official ECK documentation. For more on Fleet and Elastic Agent, see the following resources:

To gain deeper insights into Fleet and Elastic Agent, you can also watch this recorded webinar: https://www.elastic.co/webinars/introducing-elastic-agent-and-fleet
To set up Fleet Server for production, check the official documentation: https://www.elastic.co/guide/en/fleet/current/add-fleet-server-on-prem.html

Setting up snapshot repository

After you’ve set up a functional Elastic cluster, we recommend setting up a snapshot repository according to your deployment method. This allows you to back up your valuable data. Elasticsearch features a native capability for data backup and restoration.

When you create a deployment on Elastic Cloud, it also comes with a default repository called the found repository. In this recipe, you’ll learn how to register and manage a snapshot repository for an Amazon S3 bucket with Elastic Cloud, a popular option. The setup concepts also apply to other cloud repositories, such as Google Cloud Storage or Azure Blob Storage, and to self-managed repositories.
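To see which repositories are already registered on a deployment, including the default one, one option is the snapshot API from Kibana’s Dev Tools; a minimal sketch:

# List all registered snapshot repositories and their settings
GET _snapshot/_all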

Later in the book, we will provide a guide on how to configure and execute snapshot and restore operations.

Getting ready

Make sure that your Elastic Cloud deployment is up and running and that you have sufficient permissions to create and configure S3 buckets on AWS.

How to do it…

In the first step, we will create an S3 bucket:

First, let us go to AWS Console | S3 | Create Bucket. Provide a name for the bucket, for instance: elasticsearch-s3-bucket-repository. Make sure to choose Block all public access before proceeding to create the bucket:

Figure 1.23 – Creating an S3 bucket
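If you prefer the command line, the following illustrative AWS CLI sketch is roughly equivalent; the bucket name matches this recipe, while the region is an example to adjust to your account:

# Create the bucket (regions other than us-east-1 also need --create-bucket-configuration)
$ aws s3api create-bucket --bucket elasticsearch-s3-bucket-repository --region us-east-1
# Block all public access on the bucket
$ aws s3api put-public-access-block \
    --bucket elasticsearch-s3-bucket-repository \
    --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true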

Create an AWS policy to allow the Identity and Access Management user (IAM user) to access the S3 Bucket.

1. Navigate to the AWS Management Console, then go to IAM | Policies.
2. Click on Create Policy.
3. Switch to the JSON editor and set up the policy with the following snippet (the snippet can be found at this address: https://github.com/PacktPublishing/Elastic-Stack-8.x-Cookbook/blob/main/Chapter1/snippets.md#sample-aws-s3-policy):
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Sid": "VisualEditor0",
         "Effect": "Allow",
         "Action": "s3:*",
         "Resource": [
           "arn:aws:s3:::elasticsearch-s3-bucket-repository",
           "arn:aws:s3:::elasticsearch-s3-bucket-repository/*"
         ]
       }
     ]
   }
4. On the next screen, give the policy the name elasticsearch-s3-bucket-policy and click on Create Policy.

Create an IAM user and attach the policy we created.

1. Navigate to the AWS Management Console and then go to IAM | Access Management | Users.
2. Click Create User, and provide elastic-s3-default-user as the username:

Figure 1.24 – Creating Elastic S3 default user

On the next screen (Figure 1.25), choose Attach policies directly and attach the policy that you previously generated:

Figure 1.25 – Attaching permission policy to the user

On the next screen (Figure 1.26), click on Create user to complete the user creation:

Figure 1.26 – Finalizing the user creation

Now, we will generate an access key and secret access key.

1. Open the Security credentials tab, and then choose Create access key.
2. On the next screen (Figure 1.27), choose Third-party service for the use case, confirm the recommendation, and click on Next:

Figure 1.27 – Access key configuration

3. On the next screen, click on Create access key.
4. On the last screen of the wizard, choose Download .csv file to download the key pair. Store the .csv file with the keys in a secure location; we will use it in the next step.
5. Store the access secrets in the Elastic Cloud deployment keystore (if you are configuring an on-premises Elasticsearch cluster, you will have to use the elasticsearch-keystore command-line tool: https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-keystore.html).
6. Go to the Elastic Cloud console, navigate to the management console of your deployment, and then go to the Security page. Add settings to the Elasticsearch keystore with Type set to Single string, creating the following keys with the access key and secret access key from the previous step as their values:
   s3.client.secondary.access_key
   s3.client.secondary.secret_key
7. Make sure the keys appear on the Security page of your deployment, as shown in Figure 1.28, and restart the deployment to apply the changes:

Figure 1.28 – Elastic Cloud keystore setting

For a self-managed deployment, you can set up the same keys with the following commands:

$ bin/elasticsearch-keystore add s3.client.secondary.access_key
$ bin/elasticsearch-keystore add s3.client.secondary.secret_key

We can now register the repository with Kibana. Let’s go to Kibana | Management | Stack Management | Snapshot & Restore | Repositories | Register a repository, name it my-custom-s3-repo, and choose AWS S3 as the Repository type option, as shown in Figure 1.29:

Figure 1.29 – Snapshot repository creation

Set Client to secondary; this matches the client name in your s3.client.secondary.* keystore secrets. Make sure to use the exact same bucket name that you created on AWS, elasticsearch-s3-bucket-repository, as shown in Figure 1.30:

Figure 1.30 – Snapshot repository client configuration

Click to verify the repository and make sure that your S3 bucket is successfully connected as a snapshot repository:

Figure 1.31 – Snapshot repository status
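As an alternative to the UI, the same repository can be registered and verified from Kibana’s Dev Tools with the snapshot APIs; a minimal sketch using the names from this recipe:

# Register the S3 repository using the secondary client credentials
PUT _snapshot/my-custom-s3-repo
{
  "type": "s3",
  "settings": {
    "bucket": "elasticsearch-s3-bucket-repository",
    "client": "secondary"
  }
}

# Verify that all nodes can access the repository
POST _snapshot/my-custom-s3-repo/_verify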

How it works…