Description

From incomplete collections of knowledge and varying design approaches to technical knowledge barriers, Kubernetes users face various challenges when developing their own operators. Knowing how to write, deploy, and package operators makes cluster management automation much easier – and that's what this book is here to teach you.
Beginning with operators and Operator Framework fundamentals, the book delves into how the different components of Operator Framework (such as the Operator SDK, Operator Lifecycle Manager, and OperatorHub.io) are used to build operators. You’ll learn how to write a basic operator, interact with a Kubernetes cluster in code, and distribute that operator to users. As you advance, you’ll be able to develop a sample operator in the Go programming language using Operator SDK tools before running it locally with Operator Lifecycle Manager, and also learn how to package an operator bundle for distribution. The book covers best practices as well as sample applications and case studies based on real-world operators to help you implement the concepts you’ve learned.
By the end of this Kubernetes book, you’ll be able to build and add application-specific operational logic to a Kubernetes cluster, making it easier to automate complex applications and augment the platform.




The Kubernetes Operator Framework Book

Overcome complex Kubernetes cluster management challenges with automation toolkits

Michael Dame

BIRMINGHAM—MUMBAI

The Kubernetes Operator Framework Book

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Rahul Nair

Publishing Product Manager: Surbhi Suman

Senior Editor: Sapuni Rishana Athiko

Content Development Editor: Yasir Ali Khan

Technical Editor: Shruthi Shetty

Copy Editor: Safis Editing

Project Coordinator: Shagun Saini

Proofreader: Safis Editing

Indexer: Rekha Nair

Production Designer: Prashant Ghare

Marketing Coordinator: Nimisha Dua

Senior Marketing Coordinator: Sanjana Gupta

First published: June 2022

Production reference: 1070622

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80323-285-0

www.packt.com

To my wife, Meaghan, my parents and sister, the boys of Gold Team and Gamma Tetarton, and all of the amazing mentors and colleagues who have guided me along in my career, helping me to grow into a better engineer.

– Mike Dame

Contributors

About the author

Michael Dame is a software engineer who has several years of experience in working with Kubernetes and Operators. He is an active contributor to the Kubernetes project, focusing his work on the Scheduler and Descheduler sub-projects, for which he received a 2021 Kubernetes Contributor Award. Michael has a degree in computer science from Rensselaer Polytechnic Institute and has worked on the Kubernetes control plane and OpenShift Operators as a senior engineer at Red Hat. He currently works at Google as a software engineer for Google Cloud Platform.

I want to thank Meaghan, for supporting and encouraging me throughout the writing process, and the many brilliant colleagues who took the time and had the patience to teach me everything I know about Kubernetes.

About the reviewers

Mustafa Musaji is an experienced architect and engineer in the enterprise IT space. He develops and designs open source software solutions for large organizations looking to adopt transformational changes within their IT estate. Mustafa currently works at Red Hat as an associate principal solutions architect, where he supports customers in the journey to cloud-native applications. That involves not only discussing using container-based deployments but also the underlying orchestration and dependent services.

Drew Hodun is the co-author of O'Reilly's Google Cloud Cookbook. He has had a number of roles at Google, both customer-facing and in product engineering. His focus has been on machine learning applications running on the Google cloud and Kubernetes for various clients, including in the autonomous vehicle, financial services, and media and entertainment industries. Currently, Drew works as a machine learning tech lead working on AI-powered EdTech products. He has spoken at a variety of conferences, including Predictive Analytics World, Google Cloud Next, and O'Reilly's AI Conference. (Twitter: @DrewHodun)

Thomas Lloyd III is an information technology professional who specializes in systems and network administration. He currently works in the non-profit sector, lending his skills and experience to St. Catherine's Center for Children located in Albany, NY. Prior to this, he worked for a variety of private sector companies, where he honed his skill set. He has an A.A.S. in systems and network administration and is currently in the process of earning his B.S. in technology from SUNY Empire.

I'd like to thank the countless individuals who have helped me throughout my education and career. I wouldn't be where I am today without the endless support of my family, friends, professors, and many colleagues who gave me guidance and the opportunity to succeed.

Table of Contents

Preface

Part 1: Essentials of Operators and the Operator Framework

Chapter 1: Introducing the Operator Framework

Technical requirements

Managing clusters without Operators

Demonstrating on a sample application

Reacting to changing cluster states

Introducing the Operator Framework

Exploring Kubernetes controllers

Knowing key terms for Operators

Putting it all together

Developing with the Operator SDK

Managing Operators with OLM

Distributing Operators on OperatorHub.io

Defining Operator functions with the Capability Model

Using Operators to manage applications

Summarizing the Operator Framework

Applying Operator capabilities

Summary

Chapter 2: Understanding How Operators Interact with Kubernetes

Interacting with Kubernetes cluster resources

Pods, ReplicaSets, and Deployments

Custom resource definitions

ServiceAccounts, roles, and RoleBindings (RBAC)

Namespaces

Identifying users and maintainers

Cluster administrators

Cluster users

End users and customers

Maintainers

Designing beneficial features for your operator

Planning for changes in your Operator

Starting small

Iterating effectively

Deprecating gracefully

Summary

Part 2: Designing and Developing an Operator

Chapter 3: Designing an Operator – CRD, API, and Target Reconciliation

Describing the problem

Designing an API and a CRD

Following the Kubernetes API design conventions

Understanding a CRD schema

Example Operator CRD

Working with other required resources

Designing a target reconciliation loop

Level- versus edge-based event triggering

Designing reconcile logic

Handling upgrades and downgrades

Using failure reporting

Reporting errors with logging

Reporting errors with status updates

Reporting errors with events

Summary

Chapter 4: Developing an Operator with the Operator SDK

Technical requirements

Setting up your project

Defining an API

Adding resource manifests

Additional manifests and BinData

Writing a control loop

Troubleshooting

General Kubernetes resources

Operator SDK resources

Kubebuilder resources

Summary

Chapter 5: Developing an Operator – Advanced Functionality

Technical requirements

Understanding the need for advanced functionality

Reporting status conditions

Operator CRD conditions

Using the OLM OperatorCondition

Implementing metrics reporting

Adding a custom service metric

RED metrics

Implementing leader election

Adding health checks

Summary

Chapter 6: Building and Deploying Your Operator

Technical requirements

Building a container image

Building the Operator locally

Building the Operator image with Docker

Deploying in a test cluster

Pushing and testing changes

Installing and configuring kube-prometheus

Redeploying the Operator with metrics

Key takeaways

Troubleshooting

Makefile issues

kind

Docker

Metrics

Additional errors

Summary

Part 3: Deploying and Distributing Operators for Public Use

Chapter 7: Installing and Running Operators with the Operator Lifecycle Manager

Technical requirements

Understanding the OLM

Installing the OLM in a Kubernetes cluster

Interacting with the OLM

Running your Operator

Generating an Operator's bundle

Exploring the bundle files

Building a bundle image

Pushing a bundle image

Deploying an Operator bundle with the OLM

Working with OperatorHub

Installing Operators from OperatorHub

Submitting your own Operator to OperatorHub

Troubleshooting

OLM support

OperatorHub support

Summary

Chapter 8: Preparing for Ongoing Maintenance of Your Operator

Technical requirements

Releasing new versions of your Operator

Adding an API version to your Operator

Updating the Operator CSV version

Releasing a new version on OperatorHub

Planning for deprecation and backward compatibility

Revisiting Operator design

Starting small

Iterating effectively

Deprecating gracefully

Complying with Kubernetes standards for changes

Removing APIs

API conversion

API lifetime

Aligning with the Kubernetes release timeline

Overview of a Kubernetes release

Start of release

Enhancements Freeze

Call for Exceptions

Code Freeze

Test Freeze

GA release/Code Thaw

Retrospective

Working with the Kubernetes community

Summary

Chapter 9: Diving into FAQs and Future Trends

FAQs about the Operator Framework

What is an Operator?

What benefit do Operators provide to a Kubernetes cluster?

How are Operators different from other Kubernetes controllers?

What is the Operator Framework?

What is an Operand?

What are the main components of the Operator Framework?

What programming languages can Operators be written in?

What is the Operator Capability Model?

FAQs about Operator design, CRDs, and APIs

How does an Operator interact with Kubernetes?

What cluster resources does an Operator act on?

What is a CRD?

How is a CRD different from a CR object?

What Kubernetes namespaces do Operators run within?

How do users interact with an Operator?

How can you plan for changes early in an Operator's lifecycle?

How does an Operator's API relate to its CRD?

What are the conventions for an Operator API?

What is a structural CRD schema?

What is OpenAPI v3 validation?

What is Kubebuilder?

What is a reconciliation loop?

What is the main function of an Operator's reconciliation loop?

What are the two kinds of event triggering?

What is a ClusterServiceVersion (CSV)?

How can Operators handle upgrades and downgrades?

How can Operators report failures?

What are status conditions?

What are Kubernetes events?

FAQs about the Operator SDK and coding controller logic

What is the Operator SDK?

How can operator-sdk scaffold a boilerplate Operator project?

What does a boilerplate Operator project contain?

How can you create an API with operator-sdk?

What does a basic Operator API created with operator-sdk look like?

What other code is generated by operator-sdk?

What do Kubebuilder markers do?

How does the Operator SDK generate Operator resource manifests?

How else can you customize generated Operator manifests?

What are go-bindata and go:embed?

What is the basic structure of a control/reconciliation loop?

How does a control loop function access Operator config settings?

What information does a status condition report?

What are the two basic kinds of metrics?

How can metrics be collected?

What are RED metrics?

What is leader election?

What are the two main strategies for leader election?

What are health and ready checks?

FAQs about OperatorHub and the OLM

What are the different ways to compile an Operator?

How does a basic Operator SDK project build a container image?

How can an Operator be deployed in a Kubernetes cluster?

What is the OLM?

What benefit does running an Operator with the OLM provide?

How do you install the OLM in a cluster?

What does the operator-sdk olm status command show?

What is an Operator bundle?

How do you generate a bundle?

What is a bundle image?

How do you build a bundle image?

How do you deploy a bundle with the OLM?

What is OperatorHub?

How do you install an Operator from OperatorHub?

How do you submit an Operator to OperatorHub?

Future trends in the Operator Framework

How do you release a new version of an Operator?

When is it appropriate to add a new API version?

How do you add a new API version?

What is an API conversion?

How do you convert between two versions of an API?

What is a conversion webhook?

How do you add a conversion webhook to an Operator?

What is kube-storage-version-migrator?

How do you update an Operator's CSV?

What are upgrade channels?

How do you publish a new version on OperatorHub?

What is the Kubernetes deprecation policy?

How can API elements be removed in the Kubernetes deprecation policy?

How long are API versions generally supported?

How long is the Kubernetes release cycle?

What is Enhancements Freeze?

What is Code Freeze?

What is Retrospective?

How do Kubernetes community standards apply to Operator development?

Summary

Chapter 10: Case Study for Optional Operators – the Prometheus Operator

A real-world use case

Prometheus overview

Installing and running Prometheus

Configuring Prometheus

Summarizing the problems with manual Prometheus

Operator design

CRDs and APIs

Reconciliation logic

Operator distribution and development

Updates and maintenance

Summary

Chapter 11: Case Study for Core Operator – Etcd Operator

Core Operators – extending the Kubernetes platform

RBAC Manager

The Kube Scheduler Operator

The etcd Operator

etcd Operator design

CRDs

Reconciliation logic

Failure recovery

Stability and safety

Upgrading Kubernetes

Summary

Other Books You May Enjoy

Preface

The emergence of Kubernetes as a standard platform for distributed computing has revolutionized the landscape of enterprise application development. Organizations and developers can now easily write and deploy applications with a cloud-native approach, scaling those deployments to meet their own needs and those of their users. However, with that scale comes increasing complexity and a maintenance burden. In addition, the nature of distributed workloads exposes applications to increased potential points of failure, which can be costly and time-consuming to repair. While Kubernetes is a powerful platform on its own, it is not without its own challenges.

The Operator Framework has been developed specifically to address these pain points by defining a standard process for automating operational tasks in Kubernetes clusters. Kubernetes administrators and developers now have a set of APIs, libraries, management applications, and command-line tools to rapidly create controllers that automatically create and manage their applications (and even core cluster components). These controllers, called Operators, react to the naturally fluctuating state of a Kubernetes cluster to reconcile any deviation from the desired administrative stasis.

This book is an introduction to the Operator Framework for anyone who is interested in, but unfamiliar with, Operators and how they benefit Kubernetes users, with the goal of providing a practical lesson in designing, building, and using Operators. To that end, it is more than just a technical tutorial for writing Operator code (though it does walk through writing a sample Operator in Go). It is also a guide on intangible design considerations and maintenance workflows, offering a holistic approach to Operator use cases and development to guide you toward building and maintaining your own Operators.

Who this book is for

The target audience for this book is anyone who is considering using the Operator Framework for their own development, including engineers, project managers, architects, and hobbyist developers. The content in this book assumes some prior knowledge of basic Kubernetes concepts (such as Pods, ReplicaSets, and Deployments). However, there is no requirement for any prior experience with Operators or the Operator Framework.

What this book covers

Chapter 1, Introducing the Operator Framework, provides a brief introduction to the fundamental concepts and terminology that describe the Operator Framework.

Chapter 2, Understanding How Operators Interact with Kubernetes, provides sample descriptions of the ways that Operators function in a Kubernetes cluster, including not only the technical interactions but also descriptions of different user interactions.

Chapter 3, Designing an Operator – CRD, API, and Target Reconciliation, discusses high-level considerations to take into account when designing a new Operator.

Chapter 4, Developing an Operator with the Operator SDK, provides a technical walkthrough of creating a sample Operator project in Go with the Operator SDK toolkit.

Chapter 5, Developing an Operator – Advanced Functionality, builds on the sample Operator project from the previous chapter to add more complex functionality.

Chapter 6, Building and Deploying Your Operator, demonstrates the processes for compiling and installing an Operator in a Kubernetes cluster by hand.

Chapter 7, Installing and Running Operators with the Operator Lifecycle Manager, provides an introduction to the Operator Lifecycle Manager, which helps to automate the deployment of Operators in a cluster.

Chapter 8, Preparing for Ongoing Maintenance of Your Operator, provides considerations for promoting the active maintenance of Operator projects, including how to release new versions and alignment with upstream Kubernetes release standards.

Chapter 9, Diving into FAQs and Future Trends, provides a distilled summary of the content from previous chapters, broken down into small FAQ-style sections.

Chapter 10, Case Study for Optional Operators – the Prometheus Operator, provides a demonstration of the Operator Framework concepts in a real-world example of an Operator used to manage applications.

Chapter 11, Case Study for Core Operator – Etcd Operator, provides an additional example of Operator Framework concepts applied to the management of core cluster components.

To get the most out of this book

It is assumed that you have at least a foundational understanding of basic Kubernetes concepts and terms, because the Operator Framework builds heavily on these concepts to serve its purpose. These include topics such as basic application deployment and familiarity with command-line tools such as kubectl for interacting with Kubernetes clusters. While direct hands-on experience with these topics is not necessary, it will be helpful.

In addition, administrator access to a Kubernetes cluster is needed in order to complete all of the sample tasks in the book (for example, deploying an Operator in Chapter 6, Building and Deploying Your Operator). The chapters that require a Kubernetes cluster offer some options for creating disposable clusters and basic setup steps for doing so, but in order to focus on the main content, these sections intentionally do not go into thorough detail regarding cluster setup. It is strongly recommended to use a disposable cluster for all examples in order to avoid accidental damage to sensitive workloads.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/The-Kubernetes-Operator-Framework-Book. If there's an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Code in Action

The Code in Action videos for this book can be viewed at https://bit.ly/3m5dlYa.

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781803232850_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "This requires additional resources, such as ClusterRoles and RoleBindings, to ensure the Prometheus Pod has permission to scrape metrics from the cluster and its applications."

A block of code is set as follows:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: sample
spec:
  replicas: 2

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: web-service-monitor
  labels:
    app: web
spec:
  selector:
    matchLabels:
      serviceLabel: webapp

Any command-line input or output is written as follows:

$ export BUNDLE_IMG=docker.io/sample/nginx-bundle:v0.0.2

$ make bundle-build bundle-push

$ operator-sdk run bundle docker.io/sample/nginx-bundle:v0.0.2

Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "Clicking on the Grafana Operator tile opens up the information page for this specific Operator."

Tips or Important Notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you've read The Kubernetes Operator Framework Book, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.

Part 1: Essentials of Operators and the Operator Framework

In this section, you will achieve a basic understanding of the history and purpose of Kubernetes Operators. The fundamental concepts of the Operator Framework will be introduced and you will learn how Operators function in a Kubernetes cluster. This will set the groundwork for more complex concepts, which will be introduced later.

This section comprises the following chapters:

Chapter 1, Introducing the Operator Framework

Chapter 2, Understanding How Operators Interact with Kubernetes

Chapter 1: Introducing the Operator Framework

Managing a Kubernetes cluster is hard. This is partly due to the fact that any microservice architecture is going to be inherently based on the interactions of many small components, each introducing its own potential point of failure. There are, of course, many benefits to this type of system design, such as graceful error handling thanks to the separation of responsibilities. However, diagnosing and reconciling such errors requires significant engineering resources and a keen familiarity with an application's design. This is a major pain point for project teams who migrate to the Kubernetes platform.

The Operator Framework was introduced to the Kubernetes ecosystem to address these problems. This chapter will go over a few general topics to give a broad overview of the Operator Framework. The intent is to provide a brief introduction to the Operator Framework, the problems it solves, how it solves them, and the tools and patterns it provides to users. This will highlight key takeaways for the goals and benefits of using Operators to help administrate a Kubernetes cluster. These topics include the following:

Managing clusters without Operators

Introducing the Operator Framework

Developing with the Operator software development kit (SDK)

Managing Operators with the Operator Lifecycle Manager (OLM)

Distributing Operators on OperatorHub.io

Defining Operator functions with the Capability Model

Using Operators to manage applications

Technical requirements

This chapter does not have any technical requirements because we will only be covering general topics. In later chapters, we will discuss these various topics in depth and include technical prerequisites for following along with them.

The Code in Action video for this chapter can be viewed at: https://bit.ly/3GKJfmE

Managing clusters without Operators

Kubernetes is a powerful microservice container orchestration platform. It provides many different controllers, resources, and design patterns to cover almost any use case, and it is constantly growing. Because of this, applications designed to be deployed on Kubernetes can be very complex.

When designing an application to use microservices, there are a number of concepts to be familiar with. In Kubernetes, these are mainly the native application programming interface (API) resource objects included in the core platform. Throughout this book, we will assume a foundational familiarity with the common Kubernetes resources and their functions.

These objects include Pods, ReplicaSets, Deployments, Services, Volumes, and more. The orchestration of any microservice-based cloud application on Kubernetes relies on integrating these different concepts to weave a coherent whole. This orchestration is what creates the complexity that many application developers struggle to manage.

Demonstrating on a sample application

Take, for example, a simple web application that accepts, processes, and stores user input (such as a message board or chat server). A good, containerized design for an application such as this would be to have one Pod presenting the frontend to the user and a second backend Pod that accepts the user's input and sends it to a database for storage.

Of course, you will then need a Pod running the database software and a Persistent Volume to be mounted by the database Pod. These three Pods will benefit from Services to communicate with each other, and they will also need to share some common environment variables (such as access credentials for the database and environment variables to tweak different application settings).

Here is a diagram of what a sample application of this sort could look like. There are three Pods (frontend, backend, and database), as well as a Persistent Volume:

Figure 1.1 – Simple application diagram with three Pods and a Persistent Volume
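As a rough sketch of just one slice of this diagram, the backend Pod and its Service might be declared with manifests along the following lines (the names, image, and port are illustrative assumptions, not taken from a real project):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: example.com/sample/backend:v1  # illustrative image reference
        ports:
        - containerPort: 8080
        env:
        - name: DATABASE_HOST  # settings like this must stay in sync across Pods
          value: database
---
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
  ports:
  - port: 8080
    targetPort: 8080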

This is just a small example, but it's already evident how even a simple application can quickly involve tedious coordination between several moving parts. In theory, these discrete components will all continue to function cohesively as long as each individual component does not fail. But what about when a failure does occur somewhere in the application's distributed design? It is never wise to assume that an application's valid state will consistently remain that way.

Reacting to changing cluster states

There are a number of reasons a cluster state can change. Some may not even technically be considered a failure, but they are still changes of which the running application must be aware. For example, if your database access credentials change, then that update needs to be propagated to all the Pods that interact with it. Or, a new feature is available in your application that requires tactful rollout and updated settings for the running workloads. This requires manual effort (and, more importantly, time), along with a keen understanding of the application architecture.
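For instance, if those access credentials live in a Secret such as the minimal sketch below (the name and keys are hypothetical), every Pod that consumes it through environment variables only sees the new values after it is restarted, so rotating the credentials still has to be rolled out to the workloads by hand:

apiVersion: v1
kind: Secret
metadata:
  name: db-credentials  # hypothetical Secret holding the database access credentials
type: Opaque
stringData:
  username: app
  password: change-me  # changing this value does not restart the consuming Pods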

Time and effort are even more critical in the case of an unexpected failure. These are the kinds of problems that the Operator Framework addresses automatically. If one of the Pods that make up this application hits an exception, or the application's performance begins to degrade, intervention is required. That means a human engineer must not only know the details of the deployment, but they must also be on-call to maintain uptime at any hour.

There are additional components that can help administrators monitor the health and performance of their applications, such as metrics aggregation servers. However, these components are essentially additional applications that must also be regularly monitored to make sure they are working, so adding them to a cluster can reintroduce the same issues of managing an application manually.

Introducing the Operator Framework

The concept of Kubernetes Operators was introduced in a blog post in 2016 by CoreOS. CoreOS created their own container-native Linux operating system that was optimized for the needs of cloud architecture. Red Hat acquired the company in 2018, and while the CoreOS operating system's official support ended in 2020, their Operator Framework has thrived.

The principal idea behind an Operator is to automate cluster and application management tasks that would normally be done manually by a human. This role can be thought of as an automated extension of support engineers or development-operations (DevOps) teams.

Most Kubernetes users will already be familiar with some of the design patterns of Operators, even if they have never used the Operator Framework before. This is because Operators are a seemingly complicated topic, but ultimately, they are not functionally much different than many of the core components that already automate most of a Kubernetes cluster by default. These components are called controllers, and at its core, any Operator is essentially just a controller.

Exploring Kubernetes controllers

Kubernetes itself is made up of many default controllers. These controllers maintain the desired state of the cluster, as set by users and administrators. Deployments, ReplicaSets, and Endpoints are just a few examples of cluster resources that are managed by their own controllers. Each of these resources involves an administrator declaring the desired cluster state, and it is then the controller's job to maintain that state. If there is any deviation, the controller must act to reconcile the resources it controls.

These controllers work by monitoring the current state of the cluster and comparing it to the desired state. One example is a ReplicaSet with a specification to maintain three replicas of a Pod. Should one of the replicas fail, the ReplicaSet quickly identifies that there are now only two running replicas. It then creates a new Pod to bring stasis back to the cluster.
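A minimal ReplicaSet specification of that sort could look like the following sketch (the name and image are illustrative):

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web
spec:
  replicas: 3  # the desired state: three running copies of the Pod
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21  # illustrative image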

In addition, these core controllers are collectively managed by the Kube Controller Manager, which is another type of controller. It monitors the state of controllers and attempts to recover from errors if one fails or reports the error for human intervention if it cannot automatically recover. So, it is even possible to have controllers that manage other controllers.

In the same way, Kubernetes Operators put the development of operational controllers in the hands of users. This provides administrators with the flexibility to write a controller that can manage any aspect of a Kubernetes cluster or custom application. With the ability to define more specific logic, developers can extend the main benefits of Kubernetes to the unique needs of their own applications.

The Operators that are written following the guidelines of the Operator Framework are designed to function very similarly to native controllers. They do this by also monitoring the current state of the cluster and acting to reconcile it with the desired state. Specifically, an Operator is tailored to a unique workload or component. The Operator then knows how to interact with that component in various ways.

Knowing key terms for Operators

The component that is managed by an Operator is its Operand. An Operand is any kind of application or workload whose state is reconciled by an Operator. Operators can have many Operands, though most Operators manage—at most—just a few (usually just one). The key distinction is that Operators exist to manage Operands, where the Operator is a meta-application in the architectural design of the system.

Operands can be almost any type of workload. While some Operators manage application deployments, many others deploy additional, optional cluster components offering meta-functionality such as database backup and restoration. Some Operators even make core native Kubernetes components their Operands, such as etcd. So, an Operator doesn't even need to be managing your own workloads; they can help with any part of a cluster.

No matter what the Operator is managing, it must provide a way for cluster administrators to interact with it and configure settings for their application. An Operator exposes its configuration options through a Custom Resource.

Custom Resources are created as API objects following the constraints of a matching CustomResourceDefinition (CRD). CRDs are themselves a type of native Kubernetes object that allows users and administrators to extend the Kubernetes platform with their own resource objects beyond what is defined in the core API. In other words, while a Pod is a built-in native API object in Kubernetes, CRDs allow cluster administrators to define MyOperator as another API object and interact with it the same way as native objects.
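As a purely hypothetical example (the group, kind, and fields below are illustrative rather than a real API), a cluster administrator might configure such an Operator by creating a Custom Resource like this:

apiVersion: operators.example.com/v1alpha1
kind: MyOperator
metadata:
  name: cluster
spec:
  replicas: 2      # how many instances of the Operand to run
  logLevel: debug  # an application setting exposed by the Operator

The Operator watches objects of this kind and reconciles the cluster to match whatever is declared in them.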

Putting it all together

The Operator Framework strives to define an entire ecosystem for Operator development and distribution. This ecosystem comprises three pillars that cover the coding, deployment, and publishing of Operators. They are the Operator SDK, OLM, and OperatorHub.

These three pillars are what have made the Operator Framework so successful. They transform the framework from just development patterns to an encompassing, iterative process that spans the entire lifecycle of an Operator. This helps support the contract between Operator developers and users to provide consistent industry standards for their software.

The lifecycle of an Operator begins with development. To help with this, the Operator SDK exists to guide developers in the first steps of creating an Operator. Technically, an Operator does not have to be written with the Operator SDK, but the Operator SDK provides development patterns to significantly reduce the effort needed to bootstrap and maintain an Operator's source code.

While coding and development are certainly important parts of creating an Operator, any project's timeline does not end once the code is compiled. The Operator Framework community recognized that a coherent ecosystem of projects must offer guidance beyond just the initial development stage. Projects need consistent methods for installation, and as software evolves, there is a need to publish and distribute new versions. OLM and OperatorHub help users to install and manage Operators in their cluster, as well as share their Operators in the community.

Finally, the Operator Framework provides a scale of Operator functionality called the Capability Model. The Capability Model provides developers with a way to classify the functional abilities of their Operator by answering quantifiable questions. An Operator's classification, along with the Capability Model, gives users information about what they can expect from the Operator.

Together, these three pillars establish the basis of the Operator Framework and form the design patterns and community standards that distinguish Operators as a concept. Along with the Capability Model, this standardized framework has led to an explosion in the adoption of Operators in Kubernetes.

At this point, we have covered a brief introduction to the core concepts of the Operator Framework. In contrast with a Kubernetes application managed without an Operator, the pillars of the Operator Framework address the problems that application developers encounter. This understanding of the core pillars of the Operator Framework will set us up for exploring each of them in more depth.

Developing with the Operator SDK

The first pillar of the Operator Framework is the Operator SDK. As with any other software development toolkit, the Operator SDK provides packaged functionality and design patterns as code. These include predefined APIs, abstracted common functions, code generators, and project scaffolding tools to easily start an Operator project from scratch.

The Operator SDK is primarily written in Go, but its tooling allows Operators to be written using Go code, Ansible, or Helm. This gives developers the ability to write their Operators from the ground up by coding the CRDs and reconciliation logic themselves, or by taking advantage of automated deployment tools provided by Ansible and Helm to generate their APIs and reconciliation logic depending on their needs.

Developers interact with the Operator SDK through its operator-sdk command-line binary. The binary is available on Homebrew for Mac and is also available directly from the Operator Framework GitHub repository (https://github.com/operator-framework/operator-sdk) as a release, where it can also be compiled from source.

Whether you are planning to develop an Operator with Go, Ansible, or Helm, the Operator SDK binary provides commands to initialize the boilerplate project source tree. These commands include operator-sdk init and operator-sdk create api. The first command initializes a project's source directory with boilerplate Go code, dependencies, hack scripts, and even a Dockerfile and Makefile for compiling the project.

Creating an API for your Operator is necessary to define the CRD required to interact with the Operator once it is deployed in a Kubernetes cluster. This is because CRDs are backed by API type definitions written in Go code. The CRD is generated from these code definitions, and the Operator has logic built in to translate between CRD and Go representations of the object. Essentially, CRDs are how users interact with Operators, and Go code is how the Operator understands the settings. CRDs also add benefits such as structural validation schemas to automatically validate inputs.
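For illustration, the validation portion of such a CRD embeds an OpenAPI v3 structural schema roughly along these lines (the group, kind, and replicas field are illustrative assumptions):

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: webapps.myapp.example.com
spec:
  group: myapp.example.com
  names:
    kind: WebApp
    plural: webapps
    singular: webapp
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
                minimum: 1  # inputs violating the schema are rejected by the API server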

The Operator SDK binary has flags to specify the name and version of the API. It then generates the API types as Go code and corresponding YAML Ain't Markup Language (YAML) files based on best-practice standard definitions. However, you are free to modify the definitions of your API in whichever way you choose.

If we were to initialize a basic Operator for an application such as the one first demonstrated at the start of this chapter, the steps would be relatively simple. They would look like this:

$ mkdir sample-app

$ cd sample-app/

$ operator-sdk init --domain mysite.com --repo github.com/sample/simple-app

$ operator-sdk create api --group myapp --version v1alpha1 --kind WebApp --resource --controller

$ ls -lah

total 112K

drwxr-xr-x   15 mdame staff  480 Nov 15 17:00 .

drwxr-xr-x+ 270 mdame staff 8.5K Nov 15 16:48 ..

drwx------    3 mdame staff   96 Nov 15 17:00 api

drwxr-xr-x    3 mdame staff   96 Nov 15 17:00 bin

drwx------   10 mdame staff  320 Nov 15 17:00 config

drwx------    4 mdame staff  128 Nov 15 17:00 controllers

drwx------    3 mdame staff   96 Nov 15 16:50 hack

-rw-------    1 mdame staff  129 Nov 15 16:50 .dockerignore

-rw-------    1 mdame staff  367 Nov 15 16:50 .gitignore

-rw-------    1 mdame staff  776 Nov 15 16:50 Dockerfile

-rw-------    1 mdame staff 8.7K Nov 15 16:51 Makefile

-rw-------    1 mdame staff  422 Nov 15 17:00 PROJECT

-rw-------    1 mdame staff  218 Nov 15 17:00 go.mod

-rw-r--r--    1 mdame staff  76K Nov 15 16:51 go.sum

-rw-------    1 mdame staff 3.1K Nov 15 17:00 main.go

After this, you would go on to develop the logic of the Operator based on the method you choose. If that's to write Go code directly, it would start by modifying the *.go files in the project tree. For Ansible and Helm deployments, you would begin working on the Ansible roles or Helm chart for your project.

Finally, the Operator SDK binary provides a set of commands to interact with OLM. These include the ability to install OLM in a running cluster, but also install and manage specific Operators within OLM.

Managing Operators with OLM

OLM is the second pillar of the Operator Framework. Its purpose is to facilitate the deployment and management of Operators in a Kubernetes cluster. It is a component that runs within a Kubernetes cluster and provides several commands and features for interacting with Operators.

OLM is primarily used for the installation and upgrade of Operators—this includes fetching and installing any dependencies for those Operators. Users interact with OLM via commands provided by the Operator SDK binary, the Kubernetes command-line tool (kubectl), and declarative YAML.

To get started, OLM can be initialized in a cluster with the following command:

$ operator-sdk olm install

Besides installing Operators, OLM can also make Operators that are currently installed discoverable to users on the cluster. This provides a catalog of already installed Operators available to cluster users. Also, by managing all the known Operators in the cluster, OLM can watch for conflicting Operator APIs and settings that would destabilize the cluster.

Once an Operator's Go code is compiled into an image, it is ready to be installed into a cluster with OLM running. Technically, OLM is not required to run an Operator in any cluster. For example, it is completely possible to deploy an Operator manually in the cluster, just as with any other container-based application. However, due to the advantages and security measures described previously (including its ability to install Operators and its awareness of other installed Operators), it is highly recommended to use OLM to manage cluster Operators.

When developing an Operator, a bundle is generated alongside the compiled image, and that bundle is installed via OLM. The bundle consists of several YAML files that describe the Operator, its CRD, and its dependencies. OLM knows how to process this bundle in its standardized format to properly manage the Operator in a cluster.

Compiling an Operator's code and deploying it can be done with commands such as the ones shown next. The first command shown in the following code snippet builds the bundle of YAML manifests that describe the Operator. Then, it passes that information to OLM to run the Operator in your cluster:

$ make bundle ...

$ operator-sdk run bundle ...

Later chapters will demonstrate exactly how to use these commands and what they do, but the general idea is that these commands first compile the Operator's Go code into an image and a deployable format that's understandable by OLM. But OLM isn't the only part of the Operator Framework that consumes an Operator's bundle—much of the same information is used by OperatorHub to provide information on an Operator.

Once an Operator has been compiled into its image, OperatorHub exists as a platform to share and distribute those images to other users.

Distributing Operators on OperatorHub.io

The final core component of the Operator Framework is OperatorHub.io. As a major open source project, the Operator Framework ecosystem is built on the open sharing and distribution of projects. Therefore, OperatorHub powers the growth of Operators as a Kubernetes concept.

OperatorHub is an open catalog of Operators published and managed by the Kubernetes community. It serves as a central index of freely available Operators, each contributed by developers and organizations. You can see an overview of the OperatorHub.io home page in the following screenshot:

Figure 1.2 – Screenshot of the OperatorHub.io home page, showing some of the most popular Operators

The process for submitting an Operator to OperatorHub for indexing has been standardized to ensure the consistency and compatibility of Operators with OLM. New Operators are reviewed by automated tooling for compliance with this standard definition of an Operator. The process is mainly handled through the open source GitHub repository that provides the backend of OperatorHub. However, OperatorHub does not provide any assistance with the ongoing maintenance of an Operator, which is why it is important for Operator developers to share links to their own open source repositories and contact information where users can report bugs and contribute themselves.

Preparing an Operator for submission to OperatorHub involves generating its bundle and associated manifests. The submission process primarily relies on the Operator's Cluster Service Version (CSV). The CSV is a YAML file that provides most of the metadata to OLM and OperatorHub about your Operator. It includes general information such as the Operator's name, version, and keywords. However, it also defines installation requirements (such as role-based access control (RBAC) permissions), CRDs, APIs, and additional cluster resource objects owned by the Operator.

The specific sections of an Operator's CSV include the following:

The Operator's name and version number, as well as a description of the Operator and its display icon in Base64-encoded image format

Annotations for the Operator

Contact information for the maintainers of the Operator and the open source repository where its code is located

How the Operator should be installed in the cluster

Example configurations for the Operator's CRD

Required CRDs and other resources and dependencies that the Operator needs to run

Because of all the information that it covers, the Operator CSV is usually very long and takes time to prepare properly. However, a well-defined CSV helps an Operator reach a much wider audience. Details of Operator CSVs will be covered in a later chapter.
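A heavily trimmed skeleton gives a sense of its shape; the names, versions, and fields shown here are illustrative, and a real CSV carries far more detail:

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: sample-operator.v0.0.1  # illustrative Operator name and version
spec:
  displayName: Sample Operator
  version: 0.0.1
  description: Manages the sample web application.
  keywords:
  - sample
  maintainers:
  - name: Example Maintainer
    email: maintainer@example.com
  install:
    strategy: deployment
    spec:
      permissions: []  # RBAC rules required by the Operator
      deployments: []  # the Deployment that runs the Operator itself
  customresourcedefinitions:
    owned:
    - name: webapps.myapp.example.com
      kind: WebApp
      version: v1alpha1
      displayName: WebApp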

Defining Operator functions with the Capability Model

The Operator Framework defines a Capability Model (https://operatorframework.io/operator-capabilities/) that categorizes Operators based on their functionality and design. This model helps to break down Operators based on their maturity, and also describes the extent of an Operator's interoperability with OLM and the capabilities users can expect when using the Operator.

The Capability Model is divided into five hierarchical levels. Operators can be published at any one of these levels and, as they grow, may evolve and graduate from one level to the next as features and functionality are added. In addition, the levels are cumulative, with each level generally encompassing all features of the levels below it.

The current level of an Operator is part of the CSV, and this level is displayed on its OperatorHub listing. The level is based on somewhat subjective yet guided criteria and is purely an informational metric.

Each level is named for the specific functionality that defines it: Basic Install, Seamless Upgrades, Full Lifecycle, Deep Insights, and Auto Pilot. The specific levels of the Capability Model are outlined here:

Level I—Basic Install: This level represents the most basic of Operator capabilities. At Level I, an Operator is only capable of installing its Operand in the cluster and conveying the status of the workload to cluster administrators. This means that it can set up the basic resources required for an application and report when those resources are ready to be used by the cluster.

At Level I, an Operator also allows for simple configuration of the Operand. This configuration is specified through the Operator's Custom Resource. The Operator is responsible for reconciling the configuration specifications with the running Operand workload. However, it may not be able to react if the Operand reaches a failed state, whether due to malformed configuration or outside influence.

Going back to our example web application from the start of the chapter, a Level I Operator for this application would handle the basic setup of the workloads and nothing else. This is good for a simple application that needs to be quickly set up on many different clusters, or one that should be easily shared with users for them to install themselves.

Level II—Seamless Upgrades: Operators at Level II offer the features of basic installation, with added functionality around upgrades. This includes upgrades for the Operand but also upgrades for the Operator itself.

Upgrades are a critical part of any application. As bug fixes are implemented and more features are added, being able to smoothly transition between versions helps ensure application uptime. An Operator that handles its own upgrades can either upgrade its Operand when it upgrades itself or manually upgrade its Operand by modifying the Operator's Custom Resource.

For seamless upgrades, an Operator must also be able to upgrade older versions of its Operand (which may exist because they were managed by an older version of the Operator). This kind of backward compatibility is essential for both upgrading to newer versions and handling rollbacks (for example, if a new version introduces a high-visibility bug that can't wait for an eventual fix to be published in a patch version).

Our example web application Operator could offer the same set of features. This means that if a new version of the application were released, the Operator could handle upgrading the deployed instances of the application to the newer version. Or, if changes were made to the Operator itself, then it could manage its own upgrades (and later upgrade the application, regardless of version skew between Operator and Operand).

Level III—Full Lifecycle: Level III Operators offer at least one out of a list of Operand lifecycle management features. Being able to offer management during the Operand's lifecycle implies that the Operator is more than just passively operating on a workload in a set and forget