Description

Kubernetes is a modern cloud native container orchestration tool and one of the most popular open source projects worldwide. The technology is powerful and highly flexible, and Kubernetes engineers are in high demand across the industry.
This book is a comprehensive guide to deploying, securing, and operating modern cloud native applications on Kubernetes. From the fundamentals to Kubernetes best practices, the book covers essential aspects of configuring applications. You'll even explore real-world techniques for running clusters in production, tips for setting up observability for cluster resources, and valuable troubleshooting techniques. Finally, you'll learn how to extend and customize Kubernetes, as well as gain tips for deploying service meshes, serverless tooling, and more on your cluster.
By the end of this Kubernetes book, you’ll be equipped with the tools you need to confidently run and extend modern applications on Kubernetes.

Page count: 447

Publication year: 2021




Cloud Native with Kubernetes

Deploy, configure, and run modern cloud native applications on Kubernetes

Alexander Raul

BIRMINGHAM—MUMBAI

Cloud Native with Kubernetes

Copyright © 2020 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Karan Sadawana

Acquisition Editor: Rahul Nair

Senior Editor: Arun Nadar

Content Development Editor: Pratik Andrade

Technical Editor: Soham Amburle

Copy Editor: Safis Editing

Project Coordinator: Neil Dmello

Proofreader: Safis Editing

Indexer: Manju Arasan

Production Designer: Prashant Ghare

First published: January 2021

Production reference: 1031220

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-83882-307-8

www.packt.com

To my team at Rackner, my family, and my friends for their support in the process. To my girlfriend, for dealing with all the late nights of writing. And to the late Dan Kohn, in memoriam, for introducing me to and evangelizing the amazing Kubernetes community.

– Alexander Raul

Packt.com

Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

- Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
- Improve your learning with Skill Plans built especially for you
- Get a free eBook or video every month
- Fully searchable for easy access to vital information
- Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at packt.com, and as a print book customer you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the author

Alexander Raul is CEO of Rackner, an innovative consultancy that builds, runs, and secures Kubernetes and the cloud for clients ranging from highly funded start-ups to Fortune and Global 500 enterprises. With Rackner, he has personally built and managed large Kubernetes-based platforms and implemented end-to-end DevSecOps for incredible organizations. Though his background and education are technical (he received an aerospace degree from the University of Maryland), he is well versed in the business and strategic arguments for the cloud and Kubernetes – as well as the issues around the adoption of these technologies. Alexander lives in Washington, D.C. – and when he isn't working with clients, he's mountaineering, skiing, or running.

About the reviewer

Zihao Yu is a senior staff software engineer at HBO in New York City. He has been instrumental in Kubernetes and other cloud-native practices and CI/CD projects within the company. He was a keynote speaker at KubeCon North America 2017. He holds a Master of Science degree in computer engineering from Rutgers, The State University of New Jersey, and a Bachelor of Engineering degree from Nanjing University of Science and Technology in China.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Preface

Section 1: Setting Up Kubernetes

Chapter 1: Communicating with Kubernetes

Technical requirements

Introducing container orchestration

What is container orchestration?

Benefits of container orchestration

Popular orchestration tools

Kubernetes' architecture

Kubernetes node types

The Kubernetes control plane

The Kubernetes API server

The Kubernetes scheduler

The Kubernetes controller manager

etcd

The Kubernetes worker nodes

kubelet

kube-proxy

The container runtime

Addons

Authentication and authorization on Kubernetes

Namespaces

Users

Authentication methods

Kubernetes' certificate infrastructure for TLS and security

Authorization options

RBAC

ABAC

Using kubectl and YAML

Setting up kubectl and kubeconfig

Imperative versus declarative commands

Writing Kubernetes resource YAML files

Summary

Questions

Further reading

Chapter 2: Setting Up Your Kubernetes Cluster

Technical requirements

Options for creating a cluster

minikube – an easy way to start

Installing minikube

Creating a cluster on minikube

Managed Kubernetes services

Benefits of managed Kubernetes services

Drawbacks of managed Kubernetes services

AWS – Elastic Kubernetes Service

Getting started

Google Cloud – Google Kubernetes Engine

Getting started

Microsoft Azure – Azure Kubernetes Service

Getting started

Programmatic cluster creation tools

Kubeadm

Kops

Kubespray

Creating a cluster with Kubeadm

Installing Kubeadm

Starting the master nodes

Starting the worker nodes

Setting up kubectl

Creating a cluster with Kops

Installing on macOS

Installing on Linux

Installing on Windows

Setting up credentials for Kops

Setting up state storage

Creating clusters

Creating a cluster completely from scratch

Provisioning your nodes

Creating the Kubernetes certificate authority for TLS

Creating config files

Creating an etcd cluster and configuring encryption

Bootstrapping the control plane component

Bootstrapping the worker node

Summary

Questions

Further reading

Chapter 3: Running Application Containers on Kubernetes

Technical requirements

What is a Pod?

Implementing Pods

Pod paradigms

Pod networking

Pod storage

Namespaces

The Pod life cycle

Understanding the Pod resource spec

Summary

Questions

Further reading

Section 2: Configuring and Deploying Applications on Kubernetes

Chapter 4: Scaling and Deploying Your Application

Technical requirements

Understanding Pod drawbacks and their solutions

Pod controllers

Using ReplicaSets

Replicas

Selector

Template

Testing a ReplicaSet

Controlling Deployments

Controlling Deployments with imperative commands

Harnessing the Horizontal Pod Autoscaler

Implementing DaemonSets

Understanding StatefulSets

Using Jobs

CronJobs

Putting it all together

Summary

Questions

Further reading

Chapter 5: Services and Ingress – Communicating with the Outside World

Technical requirements

Understanding Services and cluster DNS

Cluster DNS

Service proxy types

Implementing ClusterIP

Protocol

Using NodePort

Setting up a LoadBalancer Service

Creating an ExternalName Service

Configuring Ingress

Ingress controllers

Summary

Questions

Further reading

Chapter 6: Kubernetes Application Configuration

Technical requirements

Configuring containerized applications using best practices

Understanding ConfigMaps

Understanding Secrets

Implementing ConfigMaps

From text values

From files

From environment files

Mounting a ConfigMap as a volume

Mounting a ConfigMap as an environment variable

Using Secrets

From files

Manual declarative approach

Mounting a Secret as a volume

Mounting a Secret as an environment variable

Implementing encrypted Secrets

Checking whether your Secrets are encrypted

Disabling cluster encryption

Summary

Questions

Further reading

Chapter 7: Storage on Kubernetes

Technical requirements

Understanding the difference between volumes and persistent volumes

Volumes

Persistent volumes

Persistent volume claims

Attaching Persistent Volume Claims (PVCs) to Pods

Persistent volumes without cloud storage

Installing Rook

The rook-ceph-block storage class

The Rook Ceph filesystem

Summary

Questions

Further reading

Chapter 8: Pod Placement Controls

Technical requirements

Identifying use cases for Pod placement

Kubernetes node health placement controls

Applications requiring different node types

Applications requiring specific data compliance

Multi-tenant clusters

Multiple failure domains

Using node selectors and node name

Implementing taints and tolerations

Multiple taints and tolerations

Controlling Pods with node affinity

Using requiredDuringSchedulingIgnoredDuringExecution node affinities

Using preferredDuringSchedulingIgnoredDuringExecution node affinities

Multiple node affinities

Using inter-Pod affinity and anti-affinity

Pod affinities

Pod anti-affinities

Combined affinity and anti-affinity

Pod affinity and anti-affinity limitations

Pod affinity and anti-affinity namespaces

Summary

Questions

Further reading

Section 3: Running Kubernetes in Production

Chapter 9: Observability on Kubernetes

Technical requirements

Understanding observability on Kubernetes

Understanding what matters for Kubernetes cluster and application health

Using default observability tooling

Metrics on Kubernetes

Logging on Kubernetes

Installing Kubernetes Dashboard

Alerts and traces on Kubernetes

Enhancing Kubernetes observability using the best of the ecosystem

Introducing Prometheus and Grafana

Implementing the EFK stack on Kubernetes

Implementing distributed tracing with Jaeger

Third-party tooling

Summary

Questions

Further reading

Chapter 10: Troubleshooting Kubernetes

Technical requirements

Understanding failure modes for distributed applications

The network is reliable

Latency is zero

Bandwidth is infinite

The network is secure

The topology doesn't change

There is only one administrator

Transport cost is zero

The network is homogeneous

Troubleshooting Kubernetes clusters

Case study – Kubernetes Pod placement failure

Troubleshooting applications on Kubernetes

Case study 1 – Service not responding

Case study 2 – Incorrect Pod startup command

Case study 3 – Pod application malfunction with logs

Summary

Questions

Further reading

Chapter 11: Template Code Generation and CI/CD on Kubernetes

Technical requirements

Understanding options for template code generation on Kubernetes

Helm

Kustomize

Implementing templates on Kubernetes with Helm and Kustomize

Using Helm with Kubernetes

Using Kustomize with Kubernetes

Understanding CI/CD paradigms on Kubernetes – in-cluster and out-of-cluster

Out-of-cluster CI/CD

In-cluster CI/CD

Implementing in-cluster and out-of-cluster CI/CD with Kubernetes

Implementing Kubernetes CI with AWS CodeBuild

Implementing Kubernetes CI with FluxCD

Summary

Questions

Further reading

Chapter 12: Kubernetes Security and Compliance

Technical requirements

Understanding security on Kubernetes

Reviewing CVEs and security audits for Kubernetes

Understanding CVE-2016-1905 – Improper admission control

Understanding CVE-2018-1002105 – Connection upgrading to the backend

Understanding the 2019 security audit results

Implementing tools for cluster configuration and container security

Using admission controllers

Enabling Pod security policies

Using network policies

Handling intrusion detection, runtime security, and compliance on Kubernetes

Installing Falco

Understanding Falco's capabilities

Mapping Falco to compliance and runtime security use cases

Summary

Questions

Further reading

Section 4: Extending Kubernetes

Chapter 13: Extending Kubernetes with CRDs

Technical requirements

How to extend Kubernetes with custom resource definitions

Writing a custom resource definition

Self-managing functionality with Kubernetes operators

Mapping the operator control loop

Designing an operator for a custom resource definition

Using cloud-specific Kubernetes extensions

Understanding the cloud-controller-manager component

Installing cloud-controller-manager

Understanding the cloud-controller-manager capabilities

Using external-dns with Kubernetes

Using the cluster-autoscaler add-on

Integrating with the ecosystem

Introducing the Cloud Native Computing Foundation

Summary

Questions

Further reading

Chapter 14: Service Meshes and Serverless

Technical requirements

Using sidecar proxies

Using NGINX as a sidecar reverse proxy

Using Envoy as a sidecar proxy

Adding a service mesh to Kubernetes

Setting up Istio on Kubernetes

Implementing serverless on Kubernetes

Using Knative for FaaS on Kubernetes

Using OpenFaaS for FaaS on Kubernetes

Summary

Questions

Further reading

Chapter 15: Stateful Workloads on Kubernetes

Technical requirements

Understanding stateful applications on Kubernetes

Popular Kubernetes-native stateful applications

Understanding strategies for running stateful applications on Kubernetes

Deploying object storage on Kubernetes

Installing the Minio Operator

Installing Krew and the Minio kubectl plugin

Starting the Minio Operator

Creating a Minio tenant

Accessing the Minio console

Running DBs on Kubernetes

Running CockroachDB on Kubernetes

Testing CockroachDB with SQL

Implementing messaging and queues on Kubernetes

Deploying RabbitMQ on Kubernetes

Summary

Questions

Further reading

Assessments

Chapter 1 – Communicating with Kubernetes

Chapter 2 – Setting Up Your Kubernetes Cluster

Chapter 3 – Running Application Containers on Kubernetes

Chapter 4 – Scaling and Deploying Your Application

Chapter 5 – Services and Ingress – Communicating with the Outside World

Chapter 6 – Kubernetes Application Configuration

Chapter 7 – Storage on Kubernetes

Chapter 8 – Pod Placement Controls

Chapter 9 – Observability on Kubernetes

Chapter 10 – Troubleshooting Kubernetes

Chapter 11 – Template Code Generation and CI/CD on Kubernetes

Chapter 12 – Kubernetes Security and Compliance

Chapter 13 – Extending Kubernetes with CRDs

Chapter 14 – Service Meshes and Serverless

Chapter 15 – Stateful Workloads on Kubernetes

Other Books You May Enjoy

Preface

The aim of this book is to give you the knowledge and the broad set of tools needed to build cloud-native applications using Kubernetes. Kubernetes is a powerful technology that gives engineers the tools to build cloud-native platforms using containers. The project itself is constantly evolving and contains many different tools to tackle common scenarios.

For the layout of this book, rather than sticking to any one niche area of the Kubernetes toolset, we will first give you a thorough summary of the most important parts of default Kubernetes functionality – giving you all the skills you need in order to run applications on Kubernetes. Then, we'll give you the tools you need in order to deal with security and troubleshooting for Kubernetes in a day 2 scenario. Finally, we'll go past the boundaries of Kubernetes itself and look at some powerful patterns and technologies to build on top of Kubernetes – such as service meshes and serverless.

Who this book is for

This book is for beginners to Kubernetes, but you should be well acquainted with containers and DevOps principles in order to get the most out of this book. A solid grounding in Linux will help but is not completely necessary.

What this book covers

Chapter 1, Communicating with Kubernetes, introduces you to the concept of container orchestration and the fundamentals of how Kubernetes works. It also gives you the basic tools you need in order to communicate with and authenticate with a Kubernetes cluster.

Chapter 2, Setting Up Your Kubernetes Cluster, walks you through creating a Kubernetes cluster in a few different popular ways, both on your local machine and on the cloud.

Chapter 3, Running Application Containers on Kubernetes, introduces you to the most basic building block of running applications on Kubernetes – the Pod. We cover how to create a Pod, as well as the specifics of the Pod lifecycle.

Chapter 4, Scaling and Deploying Your Application, reviews higher-level controllers, which allow the scaling and upgrading of multiple Pods of an application, including autoscaling.

Chapter 5, Services and Ingress – Communicating with the Outside World, introduces several approaches to exposing applications running in a Kubernetes cluster to users on the outside.

Chapter 6, Kubernetes Application Configuration, gives you the skills you need to provide configuration (including secure data) to applications running on Kubernetes.

Chapter 7, Storage on Kubernetes, reviews methods and tools to provide persistent and non-persistent storage to applications running on Kubernetes.

Chapter 8, Pod Placement Controls, introduces several different tools and strategies for controlling and influencing Pod placement on Kubernetes Nodes.

Chapter 9, Observability on Kubernetes, covers multiple tenets of observability in the context of Kubernetes, including metrics, tracing, and logging.

Chapter 10, Troubleshooting Kubernetes, reviews some key ways Kubernetes clusters can fail – as well as how to effectively triage issues on Kubernetes.

Chapter 11, Template Code Generation and CI/CD on Kubernetes, introduces Kubernetes YAML templating tooling and some common patterns for CI/CD on Kubernetes.

Chapter 12, Kubernetes Security and Compliance, covers the basics of security on Kubernetes, including some recent security issues with the Kubernetes project, and tooling for cluster and container security.

Chapter 13, Extending Kubernetes with CRDs, introduces Custom Resource Definitions (CRDs) along with other ways to add custom functionality to Kubernetes, such as operators.

Chapter 14, Service Meshes and Serverless, reviews some advanced patterns on Kubernetes, teaching you how to add a service mesh to your cluster and enable serverless workloads.

Chapter 15, Stateful Workloads on Kubernetes, walks you through the specifics of running stateful workloads on Kubernetes, including a tutorial on running some powerful stateful applications from the ecosystem.

To get the most out of this book

Since Kubernetes is based on containers, some examples in this book may use containers that have changed since publishing. Other illustrative examples may use containers that do not publicly exist in Docker Hub. These examples should be used as a basis for running your own application containers.

In some cases, open source software like Kubernetes can have breaking changes. The book is up to date with Kubernetes 1.19, but always check the documentation (for Kubernetes and for any of the other open source projects covered in the book) for the most up-to-date information and specifications.

If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Cloud-Native-with-Kubernetes. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781838823078_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "In our case, we want to let every authenticated user on the cluster create privileged Pods, so we bind to the system:authenticated group."

A block of code is set as follows:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: full-restriction-policy
  namespace: development
spec:
  policyTypes:
  - Ingress
  - Egress
  podSelector: {}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

spec:
  privileged: false
  allowPrivilegeEscalation: false
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false

Any command-line input or output is written as follows:

helm install falco falcosecurity/falco

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Prometheus also provides an Alerts tab for configuring Prometheus alerts."

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Section 1: Setting Up Kubernetes

In this section, you'll learn what Kubernetes is for, how it is architected, and the basics of communicating with, and creating, a simple cluster, as well as how to run a basic workload.

This part of the book comprises the following chapters:

- Chapter 1, Communicating with Kubernetes
- Chapter 2, Setting Up Your Kubernetes Cluster
- Chapter 3, Running Application Containers on Kubernetes

Chapter 1: Communicating with Kubernetes

This chapter contains an explanation of container orchestration, including its benefits, use cases, and popular implementations. We'll also review Kubernetes briefly, including a layout of the architectural components, and a primer on authorization, authentication, and general communication with Kubernetes. By the end of this chapter, you'll know how to authenticate and communicate with the Kubernetes API.

In this chapter, we will cover the following topics:

- A container orchestration primer
- Kubernetes' architecture
- Authentication and authorization on Kubernetes
- Using kubectl and YAML files

Technical requirements

In order to run the commands detailed in this chapter, you will need a computer running Linux, macOS, or Windows. This chapter will teach you how to install the kubectl command-line tool that you will use in all later chapters.

The code used in this chapter can be found in the book's GitHub repository at the following link:

https://github.com/PacktPublishing/Cloud-Native-with-Kubernetes/tree/master/Chapter1

Introducing container orchestration

We cannot talk about Kubernetes without introducing its purpose. Kubernetes is a container orchestration framework, so let's review what that means in the context of this book.

What is container orchestration?

Container orchestration is a popular pattern for running modern applications both in the cloud and the data center. By using containers – preconfigured application units with bundled dependencies – as a base, developers can run many instances of an application in parallel.

Benefits of container orchestration

There are quite a few benefits that container orchestration offers, but we will highlight the main ones. First, it allows developers to easily build high-availability applications. With multiple instances of an application running, a container orchestration system can be configured to automatically replace any failed instances with new ones.

This can be extended to the cloud by spreading those instances across physical data centers, so if one data center goes down, other instances of the application remain available and downtime is prevented.

Second, container orchestration allows for highly scalable applications. Since new instances of the application can be created and destroyed easily, the orchestration tool can auto-scale up and down to meet demand. Either in a cloud or data center environment, new Virtual Machines (VMs) or physical machines can be added to the orchestration tool to give it a bigger pool of compute to manage. This process can be completely automated in a cloud setting to allow for completely hands-free scaling, both at the micro and macro level.

Popular orchestration tools

There are several highly popular container orchestration tools available in the ecosystem:

- Docker Swarm: Docker Swarm was created by the team behind the Docker container engine. It is easier to set up and run compared to Kubernetes, but somewhat less flexible.
- Apache Mesos: Apache Mesos is a lower-level orchestration tool that manages compute, memory, and storage, in both data center and cloud environments. By default, Mesos does not manage containers, but Marathon – a framework that runs on top of Mesos – is a fully fledged container orchestration tool. It is even possible to run Kubernetes on top of Mesos.
- Kubernetes: As of 2020, much of the work in container orchestration has consolidated around Kubernetes (koo-bur-net-ees), often shortened to k8s. Kubernetes is an open source container orchestration tool that was originally created by Google, with learnings from internal orchestration tools Borg and Omega, which had been in use at Google for years. Since Kubernetes became open source, it has risen in popularity to become the de facto way to run and orchestrate containers in an enterprise environment. There are a few reasons for this, including that Kubernetes is a mature product that has an extremely large open source community. It is also simpler to operate than Mesos, and more flexible than Docker Swarm.

The most important thing to take away from this comparison is that although there are multiple relevant options for container orchestration and some are indeed better in certain ways, Kubernetes has emerged as the de facto standard. With this in mind, let's take a look at how Kubernetes works.

Kubernetes' architecture

Kubernetes is an orchestration tool that can run on cloud VMs, on VMs running in your data center, or on bare metal servers. In general, Kubernetes runs on a set of nodes, each of which can be a VM or a physical machine.

Kubernetes node types

Kubernetes nodes can be many different things – from a VM, to a bare metal host, to a Raspberry Pi. Kubernetes nodes are split into two distinct categories: first, the master nodes, which run the Kubernetes control plane applications; second, the worker nodes, which run the applications that you deploy onto Kubernetes.

In general, for high availability, a production deployment of Kubernetes should have a minimum of three master nodes and three worker nodes, though most large deployments have many more workers than masters.

The Kubernetes control plane

The Kubernetes control plane is a suite of applications and services that run on the master nodes. There are several highly specialized services at play that form the core of Kubernetes functionality. They are as follows:

- kube-apiserver: This is the Kubernetes API server. This application handles instructions sent to Kubernetes.
- kube-scheduler: This is the Kubernetes scheduler. This component handles the work of deciding which nodes to place workloads on, which can become quite complex.
- kube-controller-manager: This is the Kubernetes controller manager. This component provides a high-level control loop that ensures that the desired configuration of the cluster and applications running on it is implemented.
- etcd: This is a distributed key-value store that contains the cluster configuration.

Generally, all of these components take the form of system services that run on every master node. They can be started manually if you want to bootstrap your cluster entirely by hand, but through the use of a cluster creation library or a cloud provider-managed service such as Elastic Kubernetes Service (EKS), this will usually be done automatically in a production setting.

The Kubernetes API server

The Kubernetes API server is a component that accepts HTTPS requests, typically on port 443. It presents a TLS certificate, which can be self-signed, and enforces the authentication and authorization mechanisms that we will cover later in this chapter.

When a configuration request is made to the Kubernetes API server, it will check the current cluster configuration in etcd and change it if necessary.

The Kubernetes API is generally a RESTful API, with endpoints for each Kubernetes resource type, along with an API version that is passed in the query path; for instance, /api/v1.

For the purposes of extending Kubernetes (see Chapter 13, Extending Kubernetes with CRDs), the API also has a set of dynamic endpoints based on API groups, which can expose the same RESTful API functionality to custom resources.

The Kubernetes scheduler

The Kubernetes scheduler decides where instances of a workload should be run. By default, this decision is influenced by workload resource requirements and node status. You can also influence the scheduler via placement controls that are configurable in Kubernetes (see Chapter 8, Pod Placement Controls). These controls can act on node labels, which other pods are already running on a node, and many other possibilities.
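As a small illustration of a placement control, a Pod spec can include a nodeSelector that restricts scheduling to nodes carrying a given label. This is a sketch; the label key and value (disktype: ssd) and the Pod name are purely illustrative, and Pods themselves are covered in Chapter 3:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod
spec:
  # Only schedule this Pod onto nodes labeled disktype=ssd
  nodeSelector:
    disktype: ssd
  containers:
  - name: app
    image: nginx
```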

The Kubernetes controller manager

The Kubernetes controller manager is a component that runs several controllers. Controllers run control loops that ensure that the actual state of the cluster matches that stored in the configuration. By default, these include the following:

The node controller, which ensures that nodes are up and running

The replication controller, which ensures that each workload is scaled properly

The endpoints controller, which handles communication and routing configuration for each workload (see Chapter 5, Services and Ingress – Communicating with the Outside World)

Service account and token controllers, which handle the creation of API access tokens and default accounts

etcd

etcd is a distributed key-value store that houses the configuration of the cluster in a highly available way. An etcd replica runs on each master node and uses the Raft consensus algorithm, which ensures that a quorum is maintained before allowing any changes to the keys or values.

The Kubernetes worker nodes

Each Kubernetes worker node contains components that allow it to communicate with the control plane and handle networking.

First, there is the kubelet, which makes sure that containers are running on the node as dictated by the cluster configuration. Second, kube-proxy provides a network proxy layer to workloads running on each node. And finally, the container runtime is used to run the workloads on each node.

kubelet

The kubelet is an agent that runs on every node (including master nodes, though it has a different configuration in that context). Its main purpose is to receive a list of PodSpecs (more on those later) and ensure that the containers prescribed by them are running on the node. The kubelet gets these PodSpecs through a few different possible mechanisms, but the main way is by querying the Kubernetes API server. Alternately, the kubelet can be started with a file path, which it will monitor for a list of PodSpecs, an HTTP endpoint to monitor, or its own HTTP endpoint to receive requests on.
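The file-based mechanism, for example, is configured through the kubelet's configuration file via the staticPodPath field. The following is a sketch only; the path shown is the conventional default used by kubeadm, and your distribution may use a different one:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Directory the kubelet watches for static Pod manifests
staticPodPath: /etc/kubernetes/manifests
```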

kube-proxy

kube-proxy is a network proxy that runs on every node. Its main purpose is to do TCP, UDP, and SCTP forwarding (either via stream or round-robin) to workloads running on its node. kube-proxy supports the Kubernetes Service construct, which we will discuss in Chapter 5, Services and Ingress – Communicating with the Outside World.

The container runtime

The container runtime runs on each node and is the component that actually runs your workloads. Kubernetes supports CRI-O, Docker, containerd, rktlet, and any valid Container Runtime Interface (CRI) runtime. As of Kubernetes v1.14, the RuntimeClass feature has been moved from alpha to beta and allows for workload-specific runtime selection.

Addons

In addition to the core cluster components, a typical Kubernetes installation includes addons, which are additional components that provide cluster functionality.

For example, Container Network Interface (CNI) plugins such as Calico, Flannel, or Weave provide overlay network functionality that adheres to Kubernetes' networking requirements.

CoreDNS, on the other hand, is a popular addon for in-cluster DNS and service discovery. There are also tools such as Kubernetes Dashboard, which provides a GUI for viewing and interacting with your cluster.

At this point, you should have a high-level idea of the major components of Kubernetes. Next, we will review how a user interacts with Kubernetes to control those components.

Authentication and authorization on Kubernetes

Namespaces are an extremely important concept in Kubernetes, and since they can affect API access as well as authorization, we'll cover them now.

Namespaces

A namespace in Kubernetes is a construct that allows you to group Kubernetes resources in your cluster. They are a method of separation with many possible uses. For instance, you could have a namespace in your cluster for each environment – dev, staging, and production.

By default, Kubernetes will create the default namespace, the kube-system namespace, and the kube-public namespace. Resources created without a specified namespace will be created in the default namespace. kube-system contains the cluster services such as etcd, the scheduler, and any resource created by Kubernetes itself and not users. kube-public is readable by all users by default and can be used for public resources.
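For instance, a dev namespace could be created declaratively with a minimal manifest like the following (the name is illustrative; we'll cover applying YAML files later in this chapter):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev
```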

Users

There are two types of users in Kubernetes – regular users and service accounts.

Regular users are generally managed by a service outside the cluster, whether via private keys, usernames and passwords, or some form of user store. Service accounts, however, are managed by Kubernetes and restricted to specific namespaces. Service accounts can be created automatically by Kubernetes, or manually through calls to the Kubernetes API.
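A manually created service account is a very small resource. As a sketch (the name build-bot is invented for illustration):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-bot
  namespace: default
```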

There are three possible types of requests to the Kubernetes API – those associated with a regular user, those associated with a service account, and anonymous requests.

Authentication methods

In order to authenticate requests, Kubernetes provides several different options: HTTP basic authentication, client certificates, bearer tokens, and proxy-based authentication.

To use bearer token authentication, the requestor sends requests with an Authorization header that has the value Bearer <token value>.

In order to specify which tokens are valid, a CSV file can be provided to the API server application when it starts using the --token-auth-file=filename parameter. A new beta feature (as of the writing of this book), called Bootstrap Tokens, allows for the dynamic swapping and changing of tokens while the API server is running, without restarting it.
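The token file is a simple CSV with at least three columns – token, username, and user UID – optionally followed by a quoted, comma-separated list of group names. A sketch with invented values:

```
31ada4fd-adec-460c-809a-9e56ceb75269,alex,alex-uid,"dev,staging"
```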

Basic username/password authentication is also possible via the Authorization header, using the header value Basic base64encoded(username:password).
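As a quick sketch of how that header value is constructed – the alex/mypass credentials here are invented for illustration:

```shell
# Encode "username:password" as base64 for the Basic auth header.
# These credentials are illustrative only.
credentials=$(printf 'alex:mypass' | base64)
echo "Authorization: Basic $credentials"
```

The resulting header can then be attached to a request with, for example, curl's -H flag.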

Kubernetes' certificate infrastructure for TLS and security

In order to use client certificates (X.509 certificates), the API server must be started using the --client-ca-file=filename parameter. This file needs to contain one or more Certificate Authorities (CAs) that will be used when validating certificates passed with API requests.

In addition to the CA, a Certificate Signing Request (CSR) must be created for each user. At this point, user groups can be included, which we will discuss in the Authorization options section.

For instance, you can use the following:

openssl req -new -key myuser.pem -out myusercsr.pem -subj "/CN=myuser/O=dev/O=staging"

This will create a CSR for the user myuser who is part of groups named dev and staging.

Once the CA and CSR are created, the actual client and server certificates can be created using openssl, easyrsa, cfssl, or any certificate generation tool. TLS certificates for the Kubernetes API can also be created at this point.
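As a hedged sketch of the flow with openssl – all file names and subject fields here are illustrative, not what a real cluster bootstrap would necessarily use:

```shell
# Create an illustrative CA key and self-signed CA certificate
openssl genrsa -out ca.key 2048
openssl req -x509 -new -key ca.key -subj "/CN=kubernetes-ca" -days 365 -out ca.crt

# Create a user key and a CSR (the O= field becomes the user's group)
openssl genrsa -out myuser.key 2048
openssl req -new -key myuser.key -subj "/CN=myuser/O=dev" -out myuser.csr

# Sign the CSR with the CA to produce the client certificate
openssl x509 -req -in myuser.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out myuser.crt -days 365

# Confirm the client certificate chains to the CA
openssl verify -CAfile ca.crt myuser.crt
```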

Since our aim is to get you started running workloads on Kubernetes as soon as possible, we will leave all the various possible certificate configurations out of this book – but both the Kubernetes documentation and the article Kubernetes The Hard Way have some great tutorials on setting up a cluster from scratch. In the majority of production settings, you will not be doing these steps manually.

Authorization options

Kubernetes provides several authorization methods: nodes, webhooks, RBAC, and ABAC. In this book, we will focus on RBAC and ABAC as they are the ones used most often for user authorization. If you extend your cluster with other services and/or custom features, the other authorization modes may become more important.

RBAC

RBAC stands for Role-Based Access Control and is a common pattern for authorization. In Kubernetes specifically, the roles and users of RBAC are implemented using four Kubernetes resources: Role, ClusterRole, RoleBinding, and ClusterRoleBinding. To enable RBAC mode, the API server can be started with the --authorization-mode=RBAC parameter.

Role and ClusterRole resources specify a set of permissions, but do not assign those permissions to any specific users. Permissions are specified using resources and verbs. Here is a sample YAML file specifying a Role. Don't worry too much about the first few lines of the YAML file – we'll get to those soon. Focus on the resources and verbs lines to see how the actions can be applied to resources:

Read-only-role.yaml

apiVersion: rbac.authorization.k8s.io/v1

kind: Role

metadata:

  namespace: default

  name: read-only-role

rules:

- apiGroups: [""]

  resources: ["pods"]

  verbs: ["get", "list"]

The only difference between a Role and ClusterRole is that a Role is restricted to a particular namespace (in this case, the default namespace), while a ClusterRole can affect access to all resources of that type in the cluster, as well as cluster-scoped resources such as nodes.
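For comparison, here is what the same read-only permission set might look like as a ClusterRole – note the absence of a namespace field; the resource name is illustrative:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: read-only-clusterrole
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
```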

RoleBinding and ClusterRoleBinding are resources that associate a Role or ClusterRole with a user or a list of users. The following file represents a RoleBinding resource to connect our read-only-role with a user, readonlyuser:

Read-only-rb.yaml

apiVersion: rbac.authorization.k8s.io/v1

kind: RoleBinding

metadata:

  name: read-only

  namespace: default

subjects:

- kind: User

  name: readonlyuser

  apiGroup: rbac.authorization.k8s.io

roleRef:

  kind: Role

  name: read-only-role

  apiGroup: rbac.authorization.k8s.io

The subjects key contains a list of all entities to associate a role with; in this case, the user readonlyuser. roleRef contains the name of the role to associate, and the type (either Role or ClusterRole).

ABAC

ABAC stands for Attribute-Based Access Control. ABAC works using policies instead of roles. The API server is started in ABAC mode with a file called an authorization policy file, which contains a list of JSON objects called policy objects. To enable ABAC mode, the API server can be started with the --authorization-mode=ABAC and --authorization-policy-file=filename parameters.

In the policy file, each policy object contains information about a single policy: firstly, which subjects it corresponds to, which can be either users or groups, and secondly, which resources can be accessed via the policy. Additionally, a Boolean readonly value can be included to limit the policy to list, get, and watch operations.

A secondary type of policy is associated not with a resource, but with types of non-resource requests, such as calls to the /version endpoint.
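As a sketch, one resource policy and one non-resource policy might look like the following lines in a policy file (one JSON object per line; the user name is illustrative):

```json
{"apiVersion": "abac.authorization.kubernetes.io/v1beta1", "kind": "Policy", "spec": {"user": "readonlyuser", "namespace": "*", "resource": "*", "apiGroup": "*", "readonly": true}}
{"apiVersion": "abac.authorization.kubernetes.io/v1beta1", "kind": "Policy", "spec": {"user": "readonlyuser", "nonResourcePath": "/version", "readonly": true}}
```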

When a request to the API is made in ABAC mode, the API server will check the user and any group it is a part of against the list in the policy file, and see if any policies match the resource or endpoint that the user is trying to access. On a match, the API server will authorize the request.

You should have a good understanding now of how the Kubernetes API handles authentication and authorization. The good news is that while you can directly access the API, Kubernetes provides an excellent command-line tool to simply authenticate and make Kubernetes API requests.

Using kubectl and YAML

kubectl is the officially supported command-line tool for accessing the Kubernetes API. It can be installed on Linux, macOS, or Windows.

Setting up kubectl and kubeconfig

To install the newest release of kubectl, you can use the installation instructions at https://kubernetes.io/docs/tasks/tools/install-kubectl/.

Once kubectl is installed, it needs to be set up to authenticate with one or more clusters. This is done using the kubeconfig file, which looks like this:

Example-kubeconfig

apiVersion: v1

kind: Config

preferences: {}

clusters:

- cluster:

    certificate-authority: fake-ca-file

    server: https://1.2.3.4

  name: development

users:

- name: alex

  user:

    password: mypass

    username: alex

contexts:

- context:

    cluster: development

    namespace: frontend

    user: alex

  name: development

This file is written in YAML and is very similar to other Kubernetes resource specifications that we will get to shortly – except that this file lives only on your local machine.

There are three sections to a Kubeconfig YAML file: clusters, users, and contexts:

The clusters section is a list of clusters that you will be able to access via kubectl, including the CA filename and server API endpoint.

The users section lists users that you will be able to authorize with, including any user certificates or username/password combinations for authentication.

Finally, the contexts section lists combinations of a cluster, a namespace, and a user that combine to make a context. Using the kubectl config use-context command, you can easily switch between contexts, which allows easy switching between cluster, user, and namespace combinations.

Imperative versus declarative commands

There are two paradigms for talking to the Kubernetes API: imperative and declarative. Imperative commands allow you to dictate to Kubernetes "what to do" – that is, "spin up two copies of Ubuntu," "scale this application to five copies," and so on.

Declarative commands, on the other hand, allow you to write a file with a specification of what should be running on the cluster, and have the Kubernetes API ensure that the configuration matches the cluster configuration, updating it if necessary.

Though imperative commands allow you to quickly get started with Kubernetes, it is far better to write some YAML and use a declarative configuration when running production workloads, or workloads of any complexity. The reason for this is that it makes it easier to track changes, for instance via a GitHub repo, or introduce Git-driven Continuous Integration/Continuous Delivery (CI/CD) to your cluster.

Some basic kubectl commands

kubectl provides many convenient commands for checking the current state of your cluster, querying resources, and creating new ones. kubectl is structured so most commands can access resources in the same way.

First, let's learn how to see Kubernetes resources in your cluster. You can do this by using kubectl get resource_type where resource_type is the full name of the Kubernetes resource, or alternately, a shorter alias. A full list of aliases (and kubectl commands) can be found in the kubectl documentation at https://kubernetes.io/docs/reference/kubectl/overview.

We already know about nodes, so let's start with that. To find which nodes exist in a cluster, we can use kubectl get nodes or the alias kubectl get no.

kubectl's get commands return a list of Kubernetes resources that are currently in the cluster. We can run this command with any Kubernetes resource type. To add additional information to the list, you can add the wide output flag: kubectl get nodes -o wide.

Listing resources isn't enough, of course – we need to be able to see the details of a particular resource. For this, we use the describe command, which works similarly to get, except that we can optionally pass the name of a specific resource. If this last parameter is omitted, Kubernetes will return the details of all resources of that type, which will probably result in a lot of scrolling in your terminal.

For example, kubectl describe nodes will return details for all nodes in the cluster, while kubectl describe nodes node1 will return a description of the node named node1.

As you've probably noticed, these commands are all in the imperative style, which makes sense since we're just fetching information about existing resources, not creating new ones. To create a Kubernetes resource, we can use the following:

kubectl create -f /path/to/file.yaml, which is an imperative command

kubectl apply -f /path/to/file.yaml, which is declarative

Both commands take a path to a file, which can be either YAML or JSON – or you can just use stdin. You can also pass in the path to a folder instead of a file, which will create or apply all YAML or JSON files in that folder. create works imperatively, so it will create a new resource, but if you run it again with the same file, the command will fail since the resource already exists. apply works declaratively, so if you run it the first time it will create the resource, and subsequent runs will update the running resource in Kubernetes with any changes. You can use the --dry-run flag to see the output of the create or apply commands (that is, what resources will be created, or any errors if they exist).

To update existing resources imperatively, use the edit command like so: kubectl edit resource_type resource_name – just like with our describe command. This will open up the default terminal editor with the YAML of the existing resource, regardless of whether you created it imperatively or declaratively. You can edit this and save as usual, which will trigger an automatic update of the resource in Kubernetes.

To update existing resources declaratively, you can edit your local YAML resource file that you used to create the resource in the first place, then run kubectl apply -f /path/to/file.yaml. Deleting resources is best accomplished via the imperative command kubectl delete resource_type resource_name.

The last command we'll talk about in this section is kubectl cluster-info, which will show the IP addresses where the major Kubernetes cluster services are running.

Writing Kubernetes resource YAML files

For communicating with the Kubernetes API declaratively, formats of both YAML and JSON are allowed. For the purposes of this book, we will stick to YAML since it is a bit cleaner and takes up less space on the page. A typical Kubernetes resource YAML file looks like this:

resource.yaml

apiVersion: v1

kind: Pod

metadata:

  name: my-pod

spec:

  containers:

  - name: ubuntu

    image: ubuntu:trusty

    command: ["echo"]

    args: ["Hello Readers"]

A valid Kubernetes YAML file has four top-level keys at a minimum. They are apiVersion, kind, metadata, and spec.

apiVersion dictates which version of the Kubernetes API will be used to create the resource. kind specifies what type of resource the YAML file is referencing. metadata provides a location to name the resource, as well as adding annotations and name-spacing information (more on that later). And finally, the spec key will contain all the resource-specific information that Kubernetes needs to create the resource in your cluster.

Don't worry about kind and spec quite yet – we'll get to what a Pod is in Chapter 3, Running Application Containers on Kubernetes.

Summary

In this chapter, we learned the background behind container orchestration, an architectural overview of a Kubernetes cluster, how a cluster authenticates and authorizes API calls, and how to communicate with the API via imperative and declarative patterns using kubectl, the officially supported command-line tool for Kubernetes.

In the next chapter, we'll learn several ways to get started with a test cluster, and master harnessing the kubectl commands you've learned so far.

Questions

What is container orchestration?

What are the constituent parts of the Kubernetes control plane, and what do they do?

How would you start the Kubernetes API server in ABAC authorization mode?

Why is it important to have more than one master node for a production Kubernetes cluster?

What is the difference between kubectl apply and kubectl create?

How would you switch between contexts using kubectl?

What are the downsides of creating a Kubernetes resource declaratively and then editing it imperatively?

Further reading

The official Kubernetes documentation: https://kubernetes.io/docs/home/

Kubernetes The Hard Way: https://github.com/kelseyhightower/kubernetes-the-hard-way

Chapter 2: Setting Up Your Kubernetes Cluster

This chapter contains a review of some of the possibilities for creating a Kubernetes cluster, which we'll need to be able to learn the rest of the concepts in this book. We'll start with minikube, a tool to create a simple local cluster, then touch on some additional, more advanced (and production-ready) tools and review the major managed Kubernetes services from public cloud providers, before we finally introduce the strategies for creating a cluster from scratch.

In this chapter, we will cover the following topics:

Options for creating your first cluster

minikube – an easy way to start

Managed services – EKS, GKE, AKS, and more

Kubeadm – simple conformance

Kops – infrastructure bootstrapping

Kubespray – Ansible-powered cluster creation

Creating a cluster completely from scratch

Technical requirements

In order to run the commands in this chapter, you will need to have the kubectl tool installed. Installation instructions are available in Chapter 1, Communicating with Kubernetes.

If you are actually going to create a cluster using any of the methods in this chapter, you will need to review the specific technical requirements for each method in the relevant project's documentation. For minikube specifically, most machines running Linux, macOS, or Windows will work. For large clusters, please review the specific documentation of the tool you plan to use.

The code used in this chapter can be found in the book's GitHub repository at the following link:

https://github.com/PacktPublishing/Cloud-Native-with-Kubernetes/tree/master/Chapter2

Options for creating a cluster

There are many ways to create a Kubernetes cluster, ranging from simple local tools all the way to fully creating a cluster from scratch.

If you're just getting started with learning Kubernetes, you'll probably want to spin up a simple local cluster with a tool such as minikube.

If you're looking to build a production cluster for an application, you have several options:

You can use a tool such as Kops, Kubespray, or Kubeadm to create the cluster programmatically.

You can use a managed Kubernetes service.

You can create a cluster completely from scratch on VMs or physical hardware.

Unless you have extremely specific demands in terms of cluster configuration (and even then), it is not usually recommended to create your cluster completely from scratch without using a bootstrapping tool.

For most use cases, the decision will be between using a managed Kubernetes service on a cloud provider and using a bootstrapping tool.

In air-gapped systems, using a bootstrapping tool is the only way to go – but some are better than others for particular use cases. In particular, Kops is aimed at making it easier to create and manage clusters on cloud providers such as AWS.

Important note

Not included in this section is a discussion of alternative third-party managed services or cluster creation and administration tools such as Rancher or OpenShift. When making a selection for running clusters in production, it is important to take into account a large variety of factors including the current infrastructure, business requirements, and much more. To keep things simple, in this book we will focus on production clusters, assuming no other infrastructure or hyper-specific business needs – a "clean slate," so to speak.

minikube – an easy way to start

minikube is the easiest way to get started with a simple local cluster. This cluster won't be set up for high availability, and is not aimed at production uses, but it is a great way to get started running workloads on Kubernetes in minutes.

Installing minikube

minikube can be installed on Windows, macOS, and Linux. What follows are the installation instructions for all three platforms, which you can also find by navigating to https://minikube.sigs.k8s.io/docs/start.

Installing on Windows

The easiest installation method on Windows is to download and run the minikube installer from https://storage.googleapis.com/minikube/releases/latest/minikube-installer.exe.

Installing on macOS

Use the following command to download and install the binary. You can find it in the code repository as well:

Minikube-install-mac.sh

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-darwin-amd64 \

&& sudo install minikube-darwin-amd64 /usr/local/bin/minikube

Installing on Linux

Use the following command to download and install the binary:

Minikube-install-linux.sh

curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 \

&& sudo install minikube-linux-amd64 /usr/local/bin/minikube

Creating a cluster on minikube

To get started with a cluster on minikube, simply run minikube start, which will create a simple local cluster with the default VirtualBox VM driver. minikube also has several additional configuration options that can be reviewed at the documentation site.

Running the minikube start command will automatically configure your kubeconfig file so you can run kubectl commands without any further configuration on your newly created cluster.

Managed Kubernetes services

The number of managed cloud providers that offer a managed Kubernetes service is always growing. However, for the purposes of this book, we will focus on the major public clouds and their particular Kubernetes offerings. This includes the following:

Amazon Web Services (AWS) – Elastic Kubernetes Service (EKS)

Google Cloud – Google Kubernetes Engine (GKE)

Microsoft Azure – Azure Kubernetes Service (AKS)

Important note

The number and implementation of managed Kubernetes services is always changing. AWS, Google Cloud, and Azure were selected for this section of the book because they are very likely to continue working in the same manner. Whatever managed service you use, make sure to check the official documentation provided with the service to ensure that the cluster creation procedure is still the same as what is presented in this book.

Benefits of managed Kubernetes services

Generally, the major managed Kubernetes service offerings provide a few benefits. Firstly, all three of the managed service offerings we're reviewing provide a completely managed Kubernetes control plane.

This means that when you use one of these managed Kubernetes services, you do not need to worry about your master nodes. They are abstracted away and may as well not exist. All three of these managed clusters allow you to choose the number of worker nodes when creating a cluster.

Another benefit of a managed cluster is seamless upgrades from one version of Kubernetes to another. Generally, once a new version of Kubernetes (not always the newest version) is validated for the managed service, you should be able to upgrade using a push button or a reasonably simple procedure.

Drawbacks of managed Kubernetes services

Although a managed Kubernetes cluster can make operations easier in many respects, there are also some downsides.

For many of the managed Kubernetes services available, the minimum cost for a managed cluster far exceeds the cost of a minimal cluster created manually or with a tool such as Kops. For production use cases, this is generally not as much of an issue because a production cluster should contain a minimum number of nodes anyway, but for development environments or test clusters, the additional cost may not be worth the ease of operations, depending on your budget.

Additionally, though abstracting away master nodes makes operations easier, it also prevents fine tuning or advanced master node functionality that may otherwise be available on clusters with defined masters.

AWS – Elastic Kubernetes Service

AWS' managed Kubernetes service is called EKS, or Elastic Kubernetes Service. There are a few different ways to get started with EKS, but we'll cover the simplest way.

Getting started

In order to create an EKS cluster, you must provision the proper Virtual Private Cloud (VPC) and Identity and Access Management (IAM) role settings – at which point you can create a cluster through the console. These settings can be created manually through the console, or through infrastructure provisioning tools such as CloudFormation and Terraform. Full instructions for creating a cluster through the console can be found at https://docs.aws.amazon.com/en_pv/eks/latest/userguide/getting-started-console.html.

Assuming you're creating a cluster and VPC from scratch, however, you can instead use a tool called eksctl