Implementing Enterprise Observability for Success - Manisha Agrawal - E-Book

Implementing Enterprise Observability for Success E-Book

Manisha Agrawal

Description

Observability can be implemented in multiple ways within an organization, depending on the organization’s needs. It is therefore crucial for organizations to decide whether they need observability and to what extent, what skills and tools will suit them, and how long it will take to implement. Implementing Enterprise Observability for Success provides a step-by-step approach to help you create an observability strategy, understand the principles behind that strategy, and follow the logical steps to plan and execute the implementation.

You’ll learn about observability fundamentals and challenges, the importance of data and analytics along with different tools. Further, you’ll discover the various layers from which data should be collected for setting up observability.

Through real-life examples distilled from the author's experience in implementing observability at an enterprise level, you’ll uncover some of the non-technical and technical drivers of observability, such as the culture of the organization, the hierarchy of stakeholders, the tools at its disposal, and the willingness to invest.

By the end of this book, you’ll be well-equipped to plan the observability journey, identify different stakeholders, spot the technology stack required, and lay out an effective plan for organization-wide adoption.

You can read this e-book in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 276

Publication year: 2023




Implementing Enterprise Observability for Success

Strategically plan and implement observability using real-life examples

Manisha Agrawal

Karun Krishnannair

BIRMINGHAM—MUMBAI

Implementing Enterprise Observability for Success

Copyright © 2023 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Mohd Riyan Khan

Publishing Product Manager: Niranjan Naikwadi

Senior Editor: Sayali Pingale

Technical Editor: Nithik Cheruvakodan

Copy Editor: Safis Editing

Project Manager: Ashwin Kharwa

Proofreader: Safis Editing

Indexer: Tejal Daruwale Soni

Production Designer: Alishon Mendonca

Marketing Coordinator: Agnes D’souza

First published: June 2023

Production reference: 1190523

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80461-569-0

www.packtpub.com

To my mother, Asha Agrawal, for her belief in me and providing me with unconditional love and immeasurable inspiration. To my husband, Samya Maiti, for being my rock and encouraging me to stay focused regardless of the circumstances.

– Manisha Agrawal

To my father, S Krishnan Nair, for all his sacrifices, hard work, and, most importantly, unwavering commitment to his family. I lost him before I could bid a proper goodbye and this book is dedicated to him. To my mother, S Ammukutty Amma, for her unconditional love and great memories. To my wife, Sneha Mohan, for her encouragement and invaluable support. To my two angels, Tanvi and Vamika, for inspiring me and keeping my spirits high in the process.

– Karun Krishnannair

Contributors

About the authors

Manisha Agrawal is a Solution Architect with expertise in implementing scalable monitoring and observability solutions. She is committed to grooming herself and fellow women in IT to fulfill their ambitions and attain the recognition they truly merit.

As an advocate of building repeatable processes that function seamlessly at scale, Manisha possesses sharp attention to detail and a keen sense of improving existing processes. With over 12 years of experience in the finance and retail sectors, she has strong exposure to technology, people, and culture, enabling her to adapt quickly and work collaboratively toward shared goals.

Manisha holds a Bachelor of Technology in Information Technology from Rajasthan University, India, and resides with her husband in Bangalore. When she is not busy with work, Manisha enjoys indulging in her passions for traveling the world and devouring books.

Karun Krishnannair is a technology leader, analytics engineer, and architect with vast experience in working with enterprise customers and a deep understanding of monitoring and observability tools. As a technology leader and an architect, he is an avid believer in systems thinking and a strong proponent of a balanced approach to technology, tools, people, and processes to solve technology and business problems.

Karun earned a Master of Science in telecoms in 2006 and later a Graduate Diploma in Engineering Management, both from RMIT University. He has worked with two large telecommunication providers in Australia, a large financial institution, a telecommunication vendor, and two consultancies spanning over 15 years. Such a diverse background has given him the ability to see technology, people, processes, and organizational challenges from very different perspectives.

Originally from India, Karun now resides with his wife and two daughters in Melbourne, Australia.

About the reviewers

Laura Lu has over 10 years of experience in the IT industry, dedicated to observability and cloud operations. She is a former Dynatrace certified professional and delivered Dynatrace products and services to enterprise customers across APAC. She is currently employed by one of the world’s largest software companies, as an observability expert. She helps the cloud operations team manage their observability services for multiple cloud products at a massive scale. She also has a focus on automation and intelligent incident detection and remediation.

I’m thankful to my former and current employers, who invested heavily in my career and personal growth, and provided the platforms for me to thrive. I also truly believe that what I have achieved today would not be possible without the support and help of all the managers and colleagues that I worked with.

Mark Helotie has worked in IT his entire adult life, with over half of that experience being in the banking and finance sectors. He has focused on the observability/monitoring arena for the last 6-7 years, with a keen interest specifically in cybersecurity. Mark has mentored many people over the years and continues to pour his efforts into the larger community to improve the tools and processes we all use every day in observability.

Table of Contents

Preface

Part 1 – Understanding Observability in the Real World

1

Why Observe?

What is observability?

What was used before observability?

Issues with traditional monitoring techniques

Modern infrastructure

Pre-empting issues

Identifying why and where the problem exists

Key benefits of observability

Summary

2

The Fundamentals of Observability

Understanding logs, metrics, and traces

Logs

Metrics

Traces

Getting to know service views

User experience maps

Customer journey maps (processes)

System maps

Service aggregate maps

Exploring CMDBs

What is a CMDB?

Why is a CMDB important?

CMDB providers and their life cycles

Identifying KPIs

Google’s golden signals

Summary

3

The Real World and Its Challenges

Is observability difficult to implement?

Google versus a financial institution

Diverse service versus focused service

Technology leader versus follower

Challenges faced by organizations in the real world

Infrastructure and architecture complexity

Mindset and culture

A lack of executive support

Tools galore

Mechanisms to measure success

The price tag

Overcoming challenges

Navigating through infrastructure and architectural complexity

Taking stock of your estate

How can executives help?

Tool rationalization and usage

What does success look like?

Cost rationalization

Summary

4

Collecting Data to Set Up Observability

Data collection layer one – Infrastructure

Understanding infrastructure

Collecting data to monitor infrastructure

Using infrastructure data

Data collection layer two – The application

Data collection for monitoring the application

Collecting application log data

APM data

Telemetry

Data collection layer three – the business service

Digital experience monitoring

Synthetic transaction monitoring

Endpoint monitoring

Real user monitoring

Data collection layer four – The organization

Summary

5

Observability Outcomes: Dashboards, Alerts, and Incidents

Getting to know dashboards

Introducing alerts and incidents

Alerts and incidents – the finer details

At what point should an alert be set up?

What should be the frequency of the alert?

How to manage alerts?

Observability consumers – self healing

Summary

Part 2 – Planning and Implementation

6

Gauging the Organization for Observability Implementation

Organization and culture

Assessing and driving the organization’s culture

Being data-driven

Ensuring data literacy

Providing executive endorsement

Establishing a governance model

Summary

7

Achieving and Measuring Observability Success

Exploring observability maturity levels

Initial

Managed

Defined

Quantitatively Managed

Optimized

Understanding people and skills

Technical skills

Communication skills

Problem-solving skills

Mapping skills and maturity levels

Measuring observability

Summary

8

Identifying the Stakeholders

Enhancement drivers of an organization

The actors of observability

How users prompt improvement

Exploring the supporters of observability

Enterprise architects

Enterprise data team

Sourcing team

Compliance and regulatory teams

Introducing the RASCI matrix

Summary

9

Deciding the Tools for Observability

Developing a strategy

Desirable features of observability tools

Build, leverage, or buy?

Exploring observability tools

Emerging observability trends

Standardizing observability for open source projects

Increased adoption of tracing

Enhancing security with observability

Auto-healing

Considering the total cost of ownership for observability

Summary

Part 3 – Use Cases

10

Kickstarting Your Own Observability Journey

Understanding the observability implementation workflow

Preparation – organization-wide change

Implementation – adoption by the organization

Case study 1 – goFast

Identifying the problem

Addressing the problem

Case study 2 – superEats

Identifying the problem

Addressing the problem

Case study 3 – bigBuys

Identifying the problem

Addressing the problem

Case study 4 – gruvyCars

Identifying the problem

Addressing the problem

Summary

Index

Other Books You May Enjoy

Preface

This book is a complete guide for technology leaders and engineers on how to scope out, plan, and implement observability in an enterprise-scale environment. The key topics covered in the book are the following:

Observability concepts and key data formats
Learn how to gauge the organization for implementing observability
Principles for identifying stakeholders, tools, and processes
Develop strategies to self-sustain the observability journey
Case studies and guidance for setting up observability

Observability has become a popular term in the technology industry for monitoring and analytics. It’s considered to be the next level of monitoring, and most vendors now offer an observability solution as part of their product suite. However, determining whether an organization requires observability, the extent to which it’s needed, and the suitable skills and tools to implement it can be challenging. This book presents a systematic approach to guide organizations in developing an observability strategy. It covers the underlying principles and the logical steps for planning and executing the implementation. It also highlights the ownership of the tasks and responsibilities of different teams and how to ensure observability remains a continuous and self-sustaining process. In addition, this book introduces the Observability Maturity Model, the skills required to achieve the levels of this model, and some helpful case studies for inspiring you in your observability journey.

Who this book is for

This book is targeted toward technology leaders, architects, and initiative leads who are responsible for enhancing monitoring and/or implementing observability. The book also benefits engineers, developers, and professionals who are already working on monitoring and analytics and are responsible for scaling the observability implementation across multiple teams or at an organization.

You are expected to have a good understanding of monitoring concepts, a general understanding of IT systems and processes, and familiarity with working with various stakeholders.

What this book covers

Chapter 1, Why Observe?, explains what observability is, what was used before observability, issues with traditional monitoring techniques, and the key benefits of observability.

Chapter 2, The Fundamentals of Observability, covers various types of data required for observability, how to map dependencies and relationships, how to handle configuration items, and KPIs used for measuring performance. When these concepts are combined, they form the building blocks for observability.

Chapter 3, The Real World and Its Challenges, presents some of the challenges that organizations may encounter while implementing observability and potential workarounds for the challenges.

Chapter 4, Collecting Data to Set Up Observability, outlines the different layers of data collection that make up the observability landscape. These layers are infrastructure, then application, business service, and organization. It also covers efficient methods for data collection across all the layers.

Chapter 5, Observability Outcomes: Dashboards, Alerts, and Incidents, discusses dashboards, alerts, and incidents in detail in the context of observability. We will look at what they mean, what the benefits are, who sets them up, who the consumer is, and how to maintain these outcomes.

Chapter 6, Gauging the Organization for Observability Implementation, encourages you to analyze the culture of your organization and find out whether it is a clan, control, create, or compete culture. This helps in planning and implementing a suitable observability strategy and in gauging the effort required for successful implementation. It also introduces the concept of a governance model that helps in developing and maintaining standards and frameworks for observability.

Chapter 7, Achieving and Measuring Observability Success, introduces the maturity levels, namely Initial, Managed, Defined, Quantitatively Managed, and Optimized, and provides guardrails to guide you through the implementation process. This chapter also emphasizes the importance of the skills required by the organization, particularly the application teams, for this cultural shift.

Chapter 8, Identifying the Stakeholders, discusses how Drivers, Users, Actors, and Supporters are the stakeholders of observability. It also provides a RASCI matrix to help you understand how all these stakeholders work together, clearly calling out their responsibilities and ownership.

Chapter 9, Deciding the Tools for Observability, provides guidance on selecting the appropriate observability toolsets for the organization, along with references to tools across different categories. You will find some guidelines on what to consider before buying, building, or leveraging observability tools.

Chapter 10, Kickstarting Your Own Observability Journey, provides ideas on what an observability implementation looks like in the real world and discusses four case studies of fictitious companies that can be used as inspiration, in whole or in part, as suits your organization.

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/EbzLW.

Conventions used

There are a couple of text conventions used throughout this book.

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Do you need to apply Extract, Transform, Load (ETL), or filtering?”

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Implementing Enterprise Observability for Success, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there: you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781804615690

Submit your proof of purchase.

That’s it! We’ll send your free PDF and other benefits to your email directly.

Part 1 – Understanding Observability in the Real World

This first part of the book provides a comprehensive guide to implementing observability in an organization. It covers various topics, including the benefits of observability, the fundamental building blocks of observability, the challenges that may arise during the implementation process, the different layers of data collection required, and the importance of dashboards, alerts, and incidents in the context of observability. The chapters provide practical guidance and tools to help readers assess their organization’s culture, select the appropriate observability tools, and identify the stakeholders. Overall, this part offers a roadmap for organizations looking to implement observability and achieve greater visibility into their operations.

This part has the following chapters:

Chapter 1, Why Observe?
Chapter 2, The Fundamentals of Observability
Chapter 3, The Real World and Its Challenges
Chapter 4, Collecting Data to Set Up Observability
Chapter 5, Observability Outcomes: Dashboards, Alerts, and Incidents

1

Why Observe?

Observability is a fast-growing new discipline, and all organizations want to adopt it. As you will see throughout this book, implementing observability is a journey that involves multiple teams and practices. Before you embark on this journey, it is important to understand what observability means, why it emerged, and how it can help.

This chapter will be the foundation for all the other chapters in this book. Additionally, we will introduce a fictional company that will be used throughout this book to discuss the concepts.

In a nutshell, the following topics will be covered in this chapter:

What is observability?
What was used before observability?
Issues with traditional monitoring techniques
Key benefits of observability

What is observability?

A quick Google search will give you definitions of observability in many forms from a variety of authors, vendors, and organizations. Since this book assumes you have a fair understanding of observability, we are not going to repeat a detailed definition here. However, we will explain a few key concepts that are required for this book. In short, observability is not a tool, a technology, or a strategy, but a concept or capability that forces you to think, at the conceptual stage of application development itself, about how you are going to gain insights into the health of your applications and services. It’s a combination of robust architecture and development practices, streamlined existing data management tools, and adopted and standardized processes that support them.

In simplistic terms, many people call observability next-generation monitoring or supercharged monitoring. But it’s fundamentally different in many ways. For starters, monitoring is fully dependent on a set of tools to generate the information required for operating a healthy application, while a highly observable system itself generates the data that points toward existing or potential problems. To achieve this, system developers and architects have to build observability capabilities into the product as a core function of the application, thereby reducing the dependency on external systems or tools for monitoring. This is the ideal scenario; in reality, however, we still have to depend on external applications to analyze the health of our applications and services. When observability is built into the application, it can reveal a lot more information about itself, and as a result both the dependency on external systems or tools and the cost can be reduced significantly.

Observability does not replace an application’s existing monitoring tools. Rather, it standardizes and amalgamates the capabilities of Application Performance Monitoring (APM) tools, log and metrics management tools, and the data that’s generated by applications, and makes effective use of the distributed tracing methodology.
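To make the idea of an application generating its own health data more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions and not a method prescribed by this book: the logger name shop.orders, the field names trace_id, operation, and duration_ms, and the helper names build_logger and traced are all hypothetical.

```python
import json
import logging
import time
import uuid
from contextlib import contextmanager


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object so tools can parse it."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Fields attached through the `extra=` argument, if present.
            "trace_id": getattr(record, "trace_id", None),
            "operation": getattr(record, "operation", None),
            "duration_ms": getattr(record, "duration_ms", None),
        }
        return json.dumps(payload)


def build_logger(stream=None):
    """Create a logger that writes JSON lines (to stderr when stream is None)."""
    logger = logging.getLogger("shop.orders")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


@contextmanager
def traced(logger, operation, trace_id=None):
    """Time a block of work and log it with an ID shared across services."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()
    try:
        yield trace_id
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        logger.info(
            "operation finished",
            extra={
                "trace_id": trace_id,
                "operation": operation,
                "duration_ms": round(duration_ms, 2),
            },
        )
```

Wrapping a unit of work in `with traced(logger, "place_order"):` emits one structured line carrying a log message, a duration metric, and a trace identifier. Passing the same trace_id to downstream calls is one simple way to approximate distributed tracing without a dedicated tool.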

The Holy Grail of automation is the ability of the applications or systems to find out their issues and problems and self-heal before the users are impacted. Hence, observability can be considered a stepping stone for self-healing applications.

Throughout this book, we will use an example of a fictional company called MK Tea that supplies varieties of tea across the globe. They collect tea leaves from various locations, get them trucked to their plants, sort the tea by flavor, quality, and grade, and then package and ship it all over the world. This entire process has many moving parts – each location where tea grows has different soil, moisture, and altitude characteristics; tea leaves, once collected, are packed and trucked to the plant by a trucking company; tea leaf sorting happens at the plant, which is an important, tedious, and manual process; tea leaves are crushed into powders of different grain sizes or dried and retained as leaves, packaged by machines, and shipped off to suppliers all over the world based on the demand for various flavors. We will see how observability can help MK Tea manage its overwhelming process, which involves human labor, skilled technicians, and fully automated machines.

You can use this example as inspiration to plan for observability in your organization.

What was used before observability?

Observability, as a term, gained traction in IT with Google’s definition stating “observability is defined as a measure of how well internal states of a system can be inferred from knowledge of its external outputs,” which started doing the rounds in technical talks and presentations. The definition itself is much older: it was coined by engineer Rudolf E. Kálmán in 1960 in his paper on control theory. In the modern IT world, observability is just a concept. Even before it became the talk of the town, some engineers were probably already building well-rounded monitoring systems, and the ecosystems around them, that made their services observable. It’s just that they did not know the buzzword!

In a single instance of a web application, you can add some scripting to check the service’s status, use Nagios to monitor the infrastructure, write smart logs and scrape them with scripts or some tool to keep an eye on connectivity and errors, plug results into a ticketing system such as BMC or set up SNMP traps, and there you go! The system is observable, yes that’s true – all aspects of the system are covered, engineers have a hold on the infrastructure and services, they know whether the systems have connectivity, and tickets are raised when something goes wrong. It’s all there. Hold on – there is something still missing though, which we will discover at the end of this section.
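The "some scripting to check the service’s status" mentioned above can be as small as the following Python sketch. It is only an illustration: the function name check_service and the injectable opener parameter are assumptions made here so that the check can be exercised without a real network call, and any URL you pass in is your own.

```python
import urllib.error
import urllib.request


def check_service(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return ("up" or "down", detail) for a single HTTP endpoint.

    `opener` defaults to urllib's urlopen; it is a parameter only so that
    the check can be tried out with a stand-in instead of a live service.
    """
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except (urllib.error.URLError, OSError) as exc:
        # Connection failures, timeouts, and HTTP error responses land here.
        return "down", str(exc)
    if 200 <= status < 300:
        return "up", f"HTTP {status}"
    return "down", f"HTTP {status}"
```

A scheduler such as cron could call this function every minute and raise a ticket whenever it returns "down". Note what the script does not tell you: why the service is down, or which customers are affected.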

When thinking of what was used before observability, we are not talking about mainframe systems that were a black box for decades until some bright brains opened up that tough nut with Syncsort; there is no need to start from the beginning to understand what was used before observability. In the 90s, software and desktops were batch-oriented, had single instances, and focused less on the GUI. The outputs that they emitted were either hardware signals or code that only a few skilled technicians could decipher. With the advent of sophisticated OSs such as Linux, the game started to evolve and you might be surprised that, for a long time, humble commands such as vmstat, top, and syslogs were sought after as monitoring tools for Linux and Unix-based OSs. But we will not start from there either.

As an example, take a look at the following figure for a quick contrast between the humble beginnings of monitoring and its current state:

Figure 1.1 – Monitoring then (left) and now (right) (Creative Commons—Attribution 2.0 Generic—CC BY 2.0)

Let’s fast forward a bit. The world started shrinking with the internet when the era of eCommerce started. All of a sudden, single-instance apps started evolving into monolithic apps (which we know entered the black hole soon after). And this is where we will start!

With eCommerce, infrastructure monitoring and traffic-light monitoring of services were no longer enough. Businesses needed frequent metrics on products, web traffic, and, most importantly, user behavior to assess the current state of the business, along with actionable insights to make future decisions for expansion. These came to be known as business metrics – data for the eyes of the executives. Logs being produced could no longer be at the mercy of the developers; logging frameworks and normalization techniques were introduced to help developers produce meaningful logs that could be used to derive application health and business metrics. Early-age monitoring tools such as Cacti, Nagios, scripts (shell or Python), and some commands could only cater to a handful of the monitoring requirements. Areas such as APM, customer behavior analytics, and measuring incident impact on customers remained largely untouched. As eCommerce platforms gained popularity, the volume of customers increased, and data volumes started exploding way beyond the capacity of the available monitoring tools. System architectures evolved from monolithic to distributed, making it even more difficult for traditional monitoring techniques to provide meaningful insights.

As the tech stack was increasing, each technology or tool started offering a monitoring tool. Windows had Event Viewer and SCOM, Linux had its commands, databases had RockSolid and OEM, Unix had HP products, and Apache servers had highly structured standard logs – this list can go on. Soon, the monitoring space was cluttered with micro-monitoring tools when the need was to have macro monitoring that could provide an end-to-end unified view of the distributed platforms consisting of various technologies.

As per Gartner’s report, log volumes have increased 1,000 times in the last 10 years! And, all these monitoring tools and utilities have started consuming more and more data and keep evolving:

Figure 1.2 – Log volume ingestion growth (source: Gartner)

So, before observability, there was only monitoring, which was limited to a particular technology in most cases. Then, a lot of big data monitoring tools were introduced, such as AppDynamics, New Relic, Splunk, Dynatrace, and others, that could collect data from various sources and make it available to end users on a single screen. The micro-based monitoring bubbles soon started converging toward these tools and a mature ecosystem started shaping up. When you look at the fancy visualizations that these tools offer, it’s hard to believe that monitoring in its primitive days was based on hardware-based signals, commands, and scripts.

Issues with traditional monitoring techniques

Traditional monitoring techniques focused on collecting a few predefined metrics and using them to analyze the system’s health and to drive alerting. IT systems were managed and operated in isolation, and all the IT management and engineering processes in an organization were framed around this construct and followed this isolated approach. Many IT system providers created monitoring tools primarily to monitor the application’s health in isolation.

Let’s discuss the issues with traditional monitoring techniques and why they no longer fit the bill for observability implementations. You may already be past these challenges, but we still recommend reading through each of the challenges as we talk about them from an observability perspective.

Modern infrastructure

Let’s consider a service that depends on three applications. The traditional approach would have identified key parameters that define the health of each of these three individual applications. Each of these applications would be monitored separately, on the assumption that if the applications are healthy individually, then the business service that depends on them (fully or partially) is also healthy and serves customers efficiently. There was no concept of a service in this approach.
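The gap in this assumption can be shown with a small Python sketch. The application names, the latency figures, and both thresholds are invented for illustration, and the sketch assumes that a customer request passes through the three applications one after another.

```python
APP_LATENCY_LIMIT_MS = 500       # hypothetical per-application threshold
SERVICE_LATENCY_LIMIT_MS = 1000  # hypothetical end-to-end target


def per_app_status(latencies_ms):
    """Traditional view: judge every application against its own limit."""
    return {
        app: "green" if latency <= APP_LATENCY_LIMIT_MS else "red"
        for app, latency in latencies_ms.items()
    }


def service_status(latencies_ms):
    """Service view: judge the whole request path against one target.

    Assumes the applications are called sequentially, so latencies add up.
    """
    total = sum(latencies_ms.values())
    return "green" if total <= SERVICE_LATENCY_LIMIT_MS else "red"


# Illustrative numbers only: each application is within its own limit.
latencies = {"web": 400, "orders": 400, "payments": 400}
print(per_app_status(latencies))  # every application reports green
print(service_status(latencies))  # the service as a whole reports red
```

Every application dashboard is green, yet the customer waits 1,200 ms against a 1,000 ms target. Nothing in the per-application view would raise an alert.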

This method would have worked well for a traditional infrastructure, where the application was monolithic and hosted on physical hardware in data centers. This guaranteed a certain amount of resources for the application to run. Then came virtualization, which added another layer on top of the physical hardware, and the guarantee of dedicated resources was gone. The adoption of cloud infrastructure services such as AWS and GCP and cloud-native technologies such as serverless architecture, microservices, and containers has completely decoupled infrastructure and applications, making IT systems more complex and interdependent. These technologies have introduced a level of unpredictability in IT systems’ operations. Hence, the concepts, practices, and tools used for managing and maintaining the health of applications also have to change accordingly.

Pre-empting issues

One of the key issues with the traditional monitoring approach is that you have to decide in advance which metrics need to be collected and monitored. Many of these key indicators or metrics are chosen based on the past experiences of vendors, administrators, and system engineers. With more experience, engineers can come up with more and better key indicators. While this was effective to a certain extent in traditional infrastructure environments, modern distributed architecture has introduced a lot of interdependencies and complexity in IT environments, where the source of problems can vary drastically. Hence, pre-empting potential health indicators or metrics can be quite inaccurate and challenging.

Identifying why and where the problem exists

The main purpose of conventional monitoring is to detect when there is a problem. This provides a simple green, amber, or red health status indication but doesn’t answer why and where the issue originates. Once the issues have been flagged, it’s up to the administrators and engineers to figure out where and why the problem exists. Since modern infrastructure services are very transient, identifying the source of the problem is quite difficult or time-consuming. Hence, answering why and where as quickly as possible is critical in reducing MTTR and maintaining a stable service.

Key benefits of observability

The first step toward implementing observability is not just knowing application design, infrastructure, and business functions – it’s also about considering customer behavior, the impact of incidents, application performance, adoption in the market, and the dollar value, to name a few. All members of the team need to come together to implement observability.

From the inception stage, you will require inputs from architects for design, developers for putting it together, operations for ensuring the right alert triggers, the business for clearly defining what they need, and a strategy to assess customer behavior and impact. As the project proceeds in the development and testing phases, continue to assess measures that help establish the success of a business function. Ensure that those measures are captured in outputs (logs/metrics/traces). Ensure that applications are not seen in silos but can be correlated as per business functions. This will give you visibility into business metrics and their impact on customers when things go south. The responsibility of knowing the fine-grained details of the app is shifted from architects and business analysts to every member of the team.
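As a minimal sketch of capturing such measures in logs, the following example tags every event with the business function it serves so that events from different applications can be correlated. The field names (`business_function`, `transaction_id`, `status`), the application names, and the helper functions are hypothetical and chosen only for illustration; only Python's standard library is used.

```python
import io
import json
import logging
from collections import Counter


def make_logger(stream):
    """Build a logger that writes one JSON document per line to `stream`."""
    logger = logging.getLogger("observability_sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_event(logger, app, business_function, transaction_id, status):
    """Emit an event tagged with the business function it serves."""
    logger.info(json.dumps({
        "app": app,
        "business_function": business_function,
        "transaction_id": transaction_id,
        "status": status,
    }))


def failures_by_business_function(log_text):
    """Count failed events per business function, across all applications."""
    failures = Counter()
    for line in log_text.splitlines():
        event = json.loads(line)
        if event["status"] == "error":
            failures[event["business_function"]] += 1
    return dict(failures)


stream = io.StringIO()
logger = make_logger(stream)
log_event(logger, "web", "checkout", "tx-1", "ok")
log_event(logger, "payments", "checkout", "tx-1", "error")
log_event(logger, "search", "browse", "tx-2", "ok")

summary = failures_by_business_function(stream.getvalue())
print(summary)  # {'checkout': 1}
```

Because the payments failure carries the same business function and transaction ID as the web event, it shows up as a checkout problem rather than as an isolated application error.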

By now, if you have gathered that observability requires planning and hard work to implement, then you are on the right path! Congratulations, you have achieved your first milestone in your observability journey. Observability is not something that you think about at the end of the project so that you can tick a box before it’s released into production. You need to think of observability from the inception of your new projects and plan for it. For existing projects, you need to reframe your perspective and replan your observability strategy. We will talk about this a lot throughout this book. After all, this book has been purposely written to help you plan for observability.

The picture we have painted in this chapter is completely achievable. But what do you get after implementing observability for your applications?

Correlated applications that deliver higher business value

Modern architectures come with crippling complexity, sophisticated infrastructure, smart networks, and an intertwined web of applications. A transaction originating in an on-premises web application may end up traversing containerized applications hosted in the cloud before it reaches completion. Observability lets you embrace this complexity as it focuses on correlating applications. When any application breaks or slows down, you can quickly map out the impact on other applications, business functions, and customers. If your applications are observable, you will find that the conversation in war rooms changes from bringing the application back up to restoring business functions and minimizing customer impact.

An improved customer experience that drives customer loyalty

Observability delivers information faster. A high-severity incident may be super critical for infrastructure but if that particular infrastructure is only serving a very small percentage of low-value customers, it is not a high-priority incident. Observability gives you this information. It also tells you the symptoms before the customers sense them, giving you a thin window to analyze, detect, and act. Sometimes, the issues can’t be fixed in this thin window, but you can still use the time to prepare your response to the customers so that social media doesn’t explode and the service desk responds coherently. All your investments in observability are bound to result in improved customer experience.
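The prioritization reasoning above can be sketched as a small function. This is an illustration under stated assumptions, not a recommended policy: the 5% cutoffs, the use of revenue share as a stand-in for customer value, and the choice to downgrade to "medium" are all hypothetical.

```python
def incident_priority(infra_severity, affected_customers, total_customers,
                      affected_revenue_share,
                      customer_share_cutoff=0.05, revenue_share_cutoff=0.05):
    """Adjust an infrastructure severity using customer impact.

    The cutoffs and the "medium" downgrade are assumed values for illustration.
    """
    customer_share = affected_customers / total_customers
    low_impact = (customer_share < customer_share_cutoff
                  and affected_revenue_share < revenue_share_cutoff)
    if infra_severity == "high" and low_impact:
        return "medium"
    return infra_severity


# 20 of 10,000 customers affected, carrying about 1% of revenue
print(incident_priority("high", 20, 10_000, 0.01))     # medium
# 4,000 of 10,000 customers affected, carrying about 40% of revenue
print(incident_priority("high", 4_000, 10_000, 0.40))  # high
```

The inputs to such a function are only available when infrastructure signals are already correlated with customer and business data.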

Tools rationalization for improved ROI

Cut down the time spent interacting with various teams to identify the epicenter of the problem by integrating available tools that provide relevant insights for your application. Allow the tools to work in their own space but integrate the important metrics (infrastructure, application processes, deployments, database, networks, SRE, business, capacity, and more) from all the tools into a single tool that can easily construct and deconstruct your application, enabling you to measure performance on good days and manage incidents. A single tool, or a carefully chosen set of tools, for observing business functions will also increase transparency in the team, as every member will have access to the same level of insights. Modern applications can generate a ton of data at high velocity. Observability helps optimize the data generation and collection mechanism, which improves reliability and reduces cost by keeping big data problems in check.

Focus on not just tech but also the process