Gain hands-on experience of installing OpenShift Origin 3.9 in a production configuration and managing applications using the platform you built
Book Description
Docker containers transform application delivery technologies to make them faster and more reproducible, and to reduce the amount of time wasted on configuration. Managing Docker containers in multi-node or multi-datacenter environments is a big challenge, which is why container management platforms are required. OpenShift is a new generation of container management platforms built on top of both Docker and Kubernetes. It brings additional functionality to the table that is lacking in Kubernetes, and this functionality significantly helps software development teams bring their development processes to a whole new level.
In this book, we'll start with overviews of container architecture, Docker, and CRI-O. Then, we'll look at container orchestration and Kubernetes. We'll cover OpenShift installation and its basic and advanced components. Moving on, we'll dive into concepts such as deploying applications on OpenShift. You'll learn how to set up an end-to-end delivery pipeline while working with applications in OpenShift as a developer or DevOps engineer. Finally, you'll discover how to properly design OpenShift for production environments.
This book gives you hands-on experience of designing, building, and operating OpenShift Origin 3.9, as well as building new applications or migrating existing applications to OpenShift.
Who this book is for
The book is for system administrators, DevOps engineers, solutions architects, or any stakeholder who wants to understand the concept and business value of OpenShift.
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Gebin George
Acquisition Editor: Shrilekha Inani
Content Development Editor: Ronn Kurien
Technical Editor: Prachi Sawant
Copy Editor: Safis Editing
Project Coordinator: Kinjal Bari
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Tom Scaria
Production Coordinator: Shraddha Falebhai
First published: July 2018
Production reference: 1270718
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78899-232-9
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Denis Zuev is a worldwide IT expert with 10+ years' experience. Some people in the industry think that he is not human, and not without reason. His areas of expertise are networks, servers, storage, the cloud, containers, DevOps, SDN/NFV, automation, programming, and web development; you name it, he can do it. He is also known for his certification achievements. At the moment, he holds the following expert-level industry certifications: RHCI, RHCA.VI, 6xCCIE, 4xJNCIE, CCDE, HCIE, and VCIX-NV. He is a contractor and an instructor who works with top companies, including Cisco, Juniper, Red Hat, and AT&T.
Artemii Kropachev is a worldwide IT expert and international consultant with more than 15 years' experience. He has trained, guided, and consulted hundreds of architects, engineers, developers, and IT experts around the world since 2001. His architect-level experience covers solution development, data centers, clouds, DevOps, middleware, and SDN/NFV solutions built on top of any Red Hat or open source technologies. He also possesses one of the highest Red Hat certification levels in the world – RHCA Level XIX.
Aleksey Usov has been working in the IT industry for more than 8 years, including in the position of infrastructure architect on projects of a national scale. He is also an expert in Linux with experience encompassing various cloud and automation technologies, including OpenShift, Kubernetes, OpenStack, and Puppet. At the time of writing, he holds the highest Red Hat certification level in CIS and Russia – RHCA Level XIV.
Our personal gratitude goes to Oleg Babkin, Evgenii Dos, Roman Gorshunov, and Aleksandr Varlamov for all their help with this book. We would like to thank International Computer Concepts for the Supermicro servers we used for most of the tests, and Li9 Technology Solutions for many of the ideas in this book. And, of course, we would like to thank our families for their patience and support; without them, this book would not have been possible.
Gopal Ramachandran is a Red Hat-certified OpenShift expert, and advocate of DevOps and cloud-native technology. Originally from India, where he began his career as a developer, Gopal is now a long-term expat (since 2007) in the Netherlands. He works as a consultant at Devoteam, helping several large and small enterprises in Europe with their digital transformation journeys. He is actively engaged with the local tech community, and shares his thoughts via his Twitter handle @goposky.
Roman Gorshunov is an IT architect and engineer with over 13 years of experience in the industry, primarily focusing on infrastructure solutions running on Linux and UNIX systems for telecoms. He is currently working on the design and development of automated, resilient OpenStack-on-Kubernetes cloud deployments (OpenStack Helm) and CI/CD for the AT&T Network Cloud, based on OpenStack Airship.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Learn OpenShift
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the authors
Acknowledgments
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Containers and Docker Overview
Technical requirements
Containers overview
Container features and advantages
Efficient hardware resource consumption
Application and service isolation
Faster deployment
Microservices architecture
The stateless nature of containers
Docker container architecture
Docker architecture
Docker's main components
Linux containers
Understanding Docker images and layers
Container filesystem
Docker storage drivers
Container image layers
Docker registries
Public registry
Private registry
Accessing registries
Docker Hub overview
Docker installation and configuration
Docker installation
Docker configuration
Using the Docker command line
Using Docker man, help, info
Managing images using Docker CLI
Working with images
Saving and loading images
Uploading images to the Docker registry
Managing containers using Docker CLI
Docker ps and logs
Executing commands inside a container
Starting and stopping containers
Docker port mapping
Inspecting the Docker container
Removing containers
Using environment variables
Passing environment variables to a container
Linking containers
Using persistent storage
Creating a custom Docker image
Customizing images using docker commit
Using Dockerfile build
Using Docker history 
Dockerfile instructions 
Summary
Questions
Further reading
Kubernetes Overview
Technical requirements
Container management systems overview
Kubernetes versus Docker Swarm
Kubernetes key concepts
Kubernetes installation and configuration
Working with kubectl
Getting help
Using the kubectl get command
Running Kubernetes pods
Describing Kubernetes resources 
Editing Kubernetes resources
Exposing Kubernetes services
Using Kubernetes labels
Deleting Kubernetes resources
Kubernetes advanced resources
Creating Kubernetes services using YAML and JSON files
Clearing the virtual environment
Kubernetes limitations
Summary
Questions
Further reading
CRI-O Overview
Technical requirements
Container Runtime and Container Runtime Interface
CRI-O and Open Container Initiative
How CRI-O works with Kubernetes
Installing and working with CRI-O
Stopping your virtual environment
Summary
Questions
Further reading
OpenShift Overview
Cloud technology landscape and the role of PaaS
OpenShift as an extension of Kubernetes
Understanding OpenShift's business value
OpenShift flavors
OpenShift architecture
Summary
Questions
Further reading
Building an OpenShift Lab
Technical requirements
Why use a development environment?
Deployment variants
Working with oc cluster up
System requirements and prerequisites
CentOS 7
macOS
Windows
Accessing OpenShift through a web browser
Working with Minishift
Working with Vagrant
Vagrant installation
Installing OpenShift with an all-in-one Vagrant box
Summary
Questions
Further reading
OpenShift Installation
Technical requirements
Prerequisites
Hardware requirements
Overview of OpenShift installation methods
RPM installation
Containerized installation
Deployment scenarios
Environment preparation
Docker
SELinux
Ansible installation
SSH access
Advanced installation
OpenShift Ansible inventory
OpenShift Ansible playbooks
Installation
Validation
Summary
Questions
Further reading
Managing Persistent Storage
Technical requirements
Persistent versus ephemeral storage
The OpenShift persistent storage concept
Persistent Volumes
Persistent Volume Claims
The storage life cycle in OpenShift
Storage backends comparison
Storage infrastructure setup
Setting up NFS
Installing NFS packages on the server and clients
Configuring NFS exports on the server
Starting and enabling the NFS service
Verification
Configuring GlusterFS shares
Installing packages
Configuring a brick and volume
Configuring iSCSI
Client-side verification
NFS verification
GlusterFS verification
iSCSI verification
Configuring Persistent Volumes (PVs)
Creating PVs for NFS shares
Creating a PV for the GlusterFS volume
PV for iSCSI
Using persistent storage in pods
Requesting persistent volume
Binding a PVC to a particular PV
Using claims as volumes in pod definition
Managing volumes through oc volume
Persistent data for a database container
Summary
Questions
Further reading
Core OpenShift Concepts
Managing projects in OpenShift
Managing users in OpenShift
Creating new applications in OpenShift
Managing pods in OpenShift
Managing services in OpenShift
Managing routes in OpenShift
Summary
Questions
Further reading
Advanced OpenShift Concepts
Technical requirements
Tracking the version history of images using ImageStreams
Importing images
Creating applications directly from Docker images
Manually pushing images into the internal registry
Separating configuration from application code using ConfigMaps
Controlling resource consumption using ResourceQuotas
Controlling resource consumption using LimitRanges
Creating complex stacks of applications with templates
Autoscaling your application depending on CPU and RAM utilization
CPU-based autoscaling
Memory-based autoscaling
Summary
Questions
Further reading
Security in OpenShift
Technical requirements
Authentication
Users and identities
Service accounts
Identity providers
AllowAll
DenyAll
HTPasswd
LDAP
Authorization and role-based access control
Using built-in roles
Creating custom roles
Admission controllers
Security context constraints
Storing sensitive data in OpenShift
What data is considered sensitive?
Secrets
Summary
Questions
Further reading
Managing OpenShift Networking
Technical requirements
Network topology in OpenShift
Tracing connectivity
SDN plugins
ovs-subnet plugin
ovs-multitenant plugin
ovs-networkpolicy plugin
Egress routers
Static IPs for external project traffic
Egress network policies
DNS
Summary
Questions
Further reading
Deploying Simple Applications in OpenShift
Technical requirements
Manual application deployment
Creating a pod
Creating a service
Creating a service using oc expose
Creating a service from a YAML definition
Creating a route
Creating a route by using oc expose
Creating a route from a YAML definition
Using oc new-app
The oc new-app command
Using oc new-app with default options
Advanced deployment
Deploying MariaDB
Summary
Questions
Further reading
Deploying Multi-Tier Applications Using Templates
Technical requirements
OpenShift template overview
Template syntax
Adding templates
Displaying template parameters
Processing a template
Creating a custom template
Developing YAML/JSON template definitions
Exporting existing resources as templates
Using the oc new-app -o command
Using templates to deploy a multi-tier application
The Gogs application template
Creating the Gogs application
Summary
Questions
Further reading
Building Application Images from Dockerfile
Technical requirements
Dockerfile development for OpenShift
Building an application from Dockerfile
A simple Dockerfile build
Dockerfile build customization
Summary
Questions
Further reading
Building PHP Applications from Source Code
Technical requirements
PHP S2I
Building a simple PHP application
Understanding the PHP build process
Starting a new build
Summary
Questions
Further reading
Building a Multi-Tier Application from Source Code
Technical requirements
Building a multi-tier application
WordPress template
Building a WordPress application
Summary
Questions
CI/CD Pipelines in OpenShift
Technical requirements
CI/CD and CI/CD pipelines
Jenkins as CI/CD
Jenkins in OpenShift
Creating Jenkins pipelines in OpenShift
Starting a Jenkins pipeline
Editing Jenkinsfile
Managing pipeline execution
Summary
Questions
Further reading
OpenShift HA Architecture Overview
What is high availability?
HA in OpenShift
Virtual IPs
IP failover
OpenShift infrastructure nodes  
OpenShift masters 
OpenShift etcd
OpenShift nodes  
External storage for OpenShift persistent data
OpenShift backup and restore
Etcd key-value store backup
OpenShift masters 
OpenShift nodes 
Persistent storage  
Summary
Questions
Further reading
OpenShift HA Design for Single and Multiple DCs
OpenShift single-DC HA design
OpenShift infrastructure nodes
OpenShift masters
OpenShift nodes
Etcd key-value store
Persistent storage
Physical placement consideration
Design considerations
OpenShift multi-DC HA design
One OpenShift cluster across all data centers
One OpenShift cluster per data center
Networking
Storage
Application deployment
Summary
Questions
Further reading
Network Design for OpenShift HA
Common network topologies for OpenShift deployments
Data center networks
Access layer switches
Core layer switches
Edge firewalls
Load balancers
Border routers
Cloud networks
SDN
Security groups
Load balancers
Network Address Translation (NAT) gateways
Commonly made mistakes while designing networks for OpenShift
General network requirements and design guidelines for OpenShift deployments
Summary
Questions
Further reading
What is New in OpenShift 3.9?
Major changes in OpenShift 3.9
What to expect from the following OpenShift releases
Summary
Questions
Further reading
Assessments
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Chapter 14
Chapter 15
Chapter 16
Chapter 17
Chapter 18
Chapter 19
Chapter 20
Chapter 21
Other Books You May Enjoy
Leave a review - let other readers know what you think
OpenShift is an application management platform that leverages Docker as an isolated runtime for running applications and Kubernetes for container orchestration. Kubernetes, in turn, drew heavily from Borg, a container orchestration solution developed by Google engineers for managing hundreds of thousands of containers. OpenShift was first introduced on May 4, 2011, and was redesigned in September 2014, with Docker and Kubernetes becoming its main building blocks. Since then, it has seen a large number of improvements and a growing community of users and developers. At the time of writing, the most recent version of OpenShift is 3.9, which became generally available on March 28, 2018, with 3.10 under development. Release 3.9 came directly after 3.7, so technically it incorporates the changes intended for 3.8 as well, and it represents a significant step in the product's life cycle.
Relying on Docker, OpenShift brings the true power of containerization to businesses, allowing them to respond quickly to ever-increasing demand from customers, and to maintain a good reputation by supporting high-availability and multi-data center deployments out of the box. From a business perspective, OpenShift delivers a 531% return on investment over five years, with average annual benefits of $1.29 million and a payback period of 8 months; more details can be found at https://cdn2.hubspot.net/hubfs/4305976/s3-files/idc-business-value-of-openshift-snapshot.pdf.
Developers will find OpenShift's self-service portal easy to use, providing quick access to all features and deployment strategies, supporting unmodified source code, as well as Docker images and Dockerfiles, allowing developers to concentrate on development instead of managing their environment. OpenShift can automate every aspect of your release management by relying on Jenkins pipelines and integration with SCM.
For operations teams, OpenShift provides automatic failover and recovery, as well as high availability and scalability, meaning that they can spend their time on higher-level tasks. It can also be integrated with various SDN technologies developed by vendors other than Red Hat. And the fact that it relies on well-known technologies makes the learning curve shallow.
From a security standpoint, OpenShift can be integrated with corporate identity management solutions for user management and role assignment. It can expose applications to corporate security infrastructure for granular access control and auditing, protect sensitive data used by your applications, and manage access between different applications.
Examples in this book are demonstrated on OpenShift Origin 3.9, but all of them are applicable to Red Hat OpenShift Container Platform™ as well, due to their technical equivalence. The book is built in a modular fashion, so if you feel that you are already familiar with certain topics, feel free to move on to another chapter.
This book is written for professionals who are new to OpenShift, but it also covers advanced topics, such as CI/CD pipelines, high availability, and multi-data center setups. Readers do not require any background in Docker, Kubernetes, or OpenShift, although familiarity with the basic concepts will be beneficial. The book doesn't cover how to work with Linux, though, so at least a year of previous experience with Linux is expected. The primary goal of this book is not so much theoretical knowledge as hands-on experience, which is why we take a practical approach with virtual labs where possible. The book starts by introducing readers to the concept and benefits of containers in general, in order to get newcomers up to speed quickly, and then builds on that foundation to guide readers through the basic and advanced concepts of Kubernetes and OpenShift. The book finishes by providing readers with an architectural reference for a highly available multi-data center setup. Before we started working on this book, we realized that there was very little information available on how to deploy OpenShift across multiple data centers for high availability and fault tolerance. Due in no small part to that, we decided to pool our experience and collaborate on writing this book.
Chapter 1, Containers and Docker Overview, discusses containers and how to use Docker to build images and run containers.
Chapter 2, Kubernetes Overview, explains how Kubernetes orchestrates Docker containers and how to work with its CLI.
Chapter 3, CRI-O Overview, presents CRI-O as a container runtime interface and explains its integration with Kubernetes.
Chapter 4, OpenShift Overview, explains the role of OpenShift as a PaaS and covers the flavors it is available in.
Chapter 5, Building an OpenShift Lab, shows how to set up your own virtual lab on OpenShift using several methods.
Chapter 6, OpenShift Installation, gives you hands-on experience of performing an advanced installation of OpenShift using Ansible.
Chapter 7, Managing Persistent Storage, shows you how OpenShift provides persistent storage to applications.
Chapter 8, Core OpenShift Concepts, walks you through the most important concepts and resources behind OpenShift.
Chapter 9, Advanced OpenShift Concepts, explores OpenShift's resources and explains how to manage them further.
Chapter 10, Security in OpenShift, depicts how OpenShift handles security on multiple levels.
Chapter 11, Managing OpenShift Networking, explores the use cases for each network configuration of a virtual network in OpenShift.
Chapter 12, Deploying Simple Applications in OpenShift, shows you how to deploy a single-container application in OpenShift.
Chapter 13, Deploying Multi-Tier Applications Using Templates, walks you through the deployment of complex applications via templates.
Chapter 14, Building Application Images from Dockerfile, explains how to use OpenShift to build images from Dockerfiles.
Chapter 15, Building PHP Applications from Source Code, explains how to implement the Source-to-Image build strategy in OpenShift.
Chapter 16, Building a Multi-Tier Application from Source Code, shows how to deploy a multi-tier PHP application on OpenShift.
Chapter 17, CI/CD Pipelines in OpenShift, works through implementing CI/CD on OpenShift using Jenkins and Jenkinsfile.
Chapter 18, OpenShift HA Architecture Overview, shows how to bring high availability to various layers of your OpenShift cluster.
Chapter 19, OpenShift HA Design for Single and Multiple DCs, explains what it takes to build a geo-distributed OpenShift cluster.
Chapter 20, Network Design for OpenShift HA, explores the network devices and protocols required to build an HA OpenShift solution.
Chapter 21, What is New in OpenShift 3.9?, gives you an insight into the latest features of OpenShift 3.9 and explains why you might want to use it.
This book assumes that you have practical experience with Linux and open source, are comfortable working with a command-line interface (CLI), are familiar with text editors such as nano or vi/vim, and know how to use SSH to access running machines. Since OpenShift can only be installed on RHEL-derived Linux distributions, previous experience of RHEL/CentOS 7 is preferable, as opposed to Debian-based variants. Knowing about cloud technologies and containers will certainly be a plus, but is not required.
To ensure the smoothest experience, we recommend using a laptop or desktop with an adequate amount of RAM, as this is the most critical resource for OpenShift. You can find all the requirements for your learning environment in the Software and Hardware List section included in the GitHub repository. Using a system with less than 8 GB of RAM may result in occasional failures during the installation of OpenShift and overall instability, which will be distracting, even though it will boost your troubleshooting skills.
Another important aspect concerns the DNS of your environment. Some network providers, such as Cox (https://www.cox.com), redirect requests for all non-existent domains (those that result in an NXDOMAIN response from upstream DNS) to a custom web page with partner search results. Normally, that is not a problem, but during the installation of OpenShift, it will manifest itself by failing the installation. This happens because the local DNS lookup settings for your virtual machines and the containers managed by OpenShift include several search domains to try in order, and the next one is tried only after the previous one has returned NXDOMAIN. So, when your provider intercepts such requests, it returns its own IP instead of NXDOMAIN, which results in the OpenShift installer trying to reach a certain service at that IP for a health check. As expected, the request is not answered and the installation fails. For Cox, this feature is called Enhanced Error Results, so we suggest you opt out of it on your account.
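You can check whether your provider behaves this way before starting the installation. The following is a minimal check, assuming the dig utility (part of the bind-utils package on CentOS 7) is installed, and using an arbitrary non-existent domain name:

$ dig this-domain-does-not-exist-12345.com

A well-behaved resolver returns status: NXDOMAIN in the header of the reply; if you instead see status: NOERROR with an A record pointing at your provider's servers, your DNS responses are being intercepted and you should opt out of the feature or use different upstream DNS servers.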
You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packtpub.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Learn-OpenShift. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/LearnOpenShift_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Let's assume that the template is stored locally as wordpress.yaml."
A block of code is set as follows:
...
node('nodejs') {
  stage('build') {
    openshiftBuild(buildConfig: 'nodejs-mongodb-example', showBuildLogs: 'true')
  }
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
    openshiftBuild(buildConfig: 'nodejs-mongodb-example', showBuildLogs: 'true')
  }
  stage('approval') {
    input "Approve moving to deploy stage?"
  }
  stage('deploy') {
    openshiftDeploy(deploymentConfig: 'nodejs-mongodb-example')
  }
Any command-line input or output is written as follows:
$ vagrant ssh
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Once you click on the Log In button, the following page is displayed."
Feedback from our readers is always welcome.
General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packtpub.com.
This book is much more than just the fundamentals of OpenShift: it is about the past, present, and future of microservices and containers in general. In this book, we are going to cover OpenShift and its ecosystem, including topics such as the fundamentals of containers and Docker basics, and we will work with both Kubernetes and OpenShift in order to become more comfortable with them.
During our OpenShift journey, we will walk you through all the main components of OpenShift and most of the advanced ones. We are going to cover OpenShift security and networking, as well as application development for OpenShift using the most popular built-in OpenShift DevOps tools, such as CI/CD with Jenkins and Source-to-Image (S2I) in conjunction with GitHub.
We will also learn about the most critical part for anyone who would like to actually implement OpenShift in their company: the design. We are going to show you how to properly design and implement OpenShift, examining the most common mistakes made by those who have just started working with it.
This chapter focuses on container and Docker technologies. We will describe container concepts and Docker basics, from the architecture down to the low-level technologies. In this chapter, we will learn how to use the Docker CLI and manage Docker containers and Docker images. A significant part of the chapter is devoted to building and running Docker container images. As part of the chapter, you will be asked to develop a number of Dockerfiles and to containerize several applications.
In this chapter, we will look at the following:
Containers overview
Docker container architecture
Understanding Docker images and layers
Understanding Docker Hub and Docker registries
Installing and configuring Docker software
Using the Docker command line
Managing images via Docker CLI
Managing containers via Docker CLI
Understanding the importance of environment variables inside Docker containers
Managing persistent storage for Docker containers
Building a custom
Docker
image
In this chapter, we are going to use the following technologies and software:
Vagrant
Bash Shell
GitHub
Docker
Firefox (recommended) or any other browser
The Vagrant installation and all the code we use in this chapter are located on GitHub at https://github.com/PacktPublishing/Learn-OpenShift.
Instructions on how to install and configure Docker are provided in this chapter as we learn.
Bash Shell will be used as a part of your virtual environment based on CentOS 7.
Firefox or any other browser can be used to navigate through Docker Hub.
As a prerequisite, you will need a stable internet connection from your laptop.
Traditionally, software applications were developed following a monolithic architecture approach, meaning all the services or components were locked to each other: you could not take out a part and replace it with something else. That approach changed over time into the N-tier approach, which in turn was a step on the way toward container and microservices architectures.
The major drawbacks of the monolithic architecture were its lack of reliability, scalability, and high availability. It was really hard to scale monolithic applications due to their nature, and their reliability was also questionable, because you could rarely operate and upgrade them without downtime. There was no way to efficiently scale out monolithic applications: you could not just add another one, five, or ten instances back to back and let them coexist with each other.
We had monolithic applications in the past, but then people and companies started thinking about application scalability, security, reliability, and high availability (HA). That is what created the N-tier design, a standard application design such as the 3-tier web application, in which we have a web tier, an application tier, and a database backend. Now it is all evolving into microservices. Why do we need them? The short answer is better numbers: microservices are cheaper, much more scalable, and more secure. Containerized applications bring you to a whole new level, where you can benefit from automation and DevOps.
Containers are a new generation of virtual machines, and they bring software development to a whole new level. A container is an isolated set of processes and resources inside a single operating system. This means that containers can provide the same benefits as virtual machines while using far less CPU, memory, and storage. There are several popular container runtimes, including LXC, rkt, and Docker, which is the one we are going to focus on in this book.
This architecture brings a lot of advantages to software development.
Some of the major advantages of containers are as follows:
Efficient hardware resource consumption
Application and service isolation
Faster deployment
Microservices architecture
The stateless nature of containers
Whether you run containers natively on a bare-metal server or use virtualization techniques, containers allow you to utilize resources (CPU, memory, and storage) in a much more efficient manner. In the case of a bare-metal server, containers allow you to run tens or even hundreds of the same or different containers, providing much better resource utilization than the usual single application running on a dedicated server. We have seen servers in the past whose utilization at peak times was only 3%, which is a waste of resources. And if you run several of the same or different applications on the same server, they are going to conflict with each other; even if they work, you are going to face a lot of problems during day-to-day operation and troubleshooting.
If you isolate these applications by introducing popular virtualization techniques such as KVM, VMware, Xen, or Hyper-V, you will run into a different issue: overhead. In order to virtualize your app using any hypervisor, you need to install a guest operating system on top of the hypervisor, and this operating system needs CPU and memory to function; for example, each VM has its own kernel and kernel space associated with it. A perfectly tuned container platform can give you up to four times more containers than standard VMs on the same hardware. That may be insignificant when you have five or ten VMs, but when we are talking about hundreds or thousands, it makes a huge difference.
Imagine a scenario where we have ten different applications hosted on the same server. Each application has a number of dependencies (such as packages and libraries). If you need to update an application, it usually involves updating the process and its dependencies, and if you update all related dependencies, that will most likely affect the other applications and services, possibly stopping them from working properly. Sure, to a degree these issues are addressed by environment managers such as virtualenv for Python and rbenv/rvm for Ruby, and dependencies on shared libraries can be isolated via LD_LIBRARY_PATH, but what if you need different versions of the same package? Containers and virtualization solve that issue: both VMs and containers provide environment isolation for your applications.
But, in comparison to bare-metal application deployment, container technology (for example, Docker) provides a far more efficient way to isolate applications and their libraries from each other. It not only allows these applications to co-exist on the same OS, but also provides effective security, which is a must for every customer-facing and security-sensitive application, and it allows you to update and patch your containerized applications independently of each other.
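As a quick illustration of such isolation, the following sketch runs two versions of the same runtime side by side on one host; it assumes Docker is already installed and the host can pull images from Docker Hub:

$ docker run --rm python:2.7 python --version
$ docker run --rm python:3.6 python --version

Each command starts a container with its own filesystem, libraries, and dependencies, so the two Python versions never conflict with each other, and the --rm flag removes each container as soon as it exits.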
Using container images, discussed later in this book, allows us to speed up container deployment. We are talking about seconds to completely restart a container, versus minutes or tens of minutes with bare-metal servers and VMs. The main reason for this is that a container does not need to restart the whole OS; it just needs to restart the application itself.
Containers bring application deployment to a whole new level by enabling microservices architecture. If you have a monolithic or N-tier application, it usually has many different services communicating with each other. Containerizing your services allows you to break down your application into multiple pieces and work with each of them independently. Let's say you have a standard application that consists of a web server, an application, and a database. You can put it on one or three different servers, in three different VMs, or in three simple containers running each part of the application. All these options require a different amount of effort, time, and resources. Later in this book, you will see how simple it is to do this using containers.
Containers are stateless, which means that you can bring containers up and down and create or destroy them at any time without affecting your application's performance. That is one of the greatest features of containers. We are going to delve into this later in the book.
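The following short sequence illustrates that disposability; it is a sketch that assumes Docker is installed and can pull the httpd image:

$ docker run -d --name web httpd
$ docker rm -f web
$ docker run -d --name web httpd

The replacement container comes up in seconds and is functionally identical to the one that was just destroyed. Any state worth keeping should live outside the container, in persistent storage, which we discuss later in this book.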
Docker is one of the most popular application containerization technologies these days. So why do we want to use Docker when other container options are available? Because collaboration and contribution are key in the era of open source, and Docker has achieved things in this area that other technologies have not.
For example, Docker partnered with other container developers, such as Red Hat, Google, and Canonical, to jointly work on its components. Docker also contributed its software container format and runtime to the Linux Foundation's Open Container Initiative. Docker has made containers very easy to learn about and use.
As we mentioned already, Docker is the most popular container platform. It allows for creating, sharing, and running applications inside Docker containers. Docker separates running applications from the infrastructure. It allows you to speed up the application delivery process drastically. Docker also brings application development to an absolutely new level. In the diagram that follows, you can see a high-level overview of the Docker architecture:
Docker uses a client-server type of architecture:
Docker server: This is a service running as a daemon in an operating system. This service is responsible for downloading, building, and running containers.
Docker client: The CLI tool is responsible for communicating with Docker servers using the REST API.
Docker uses three main components:
Docker containers: Isolated user-space environments running the same or different applications and sharing the same host OS. Containers are created from Docker images.
Docker images: Docker templates that include application libraries and applications. Images are used to create containers, so you can bring up containers immediately. You can create and update your own custom images, as well as download prebuilt images from Docker's public registry.
Docker registries: An image store. Docker registries can be public or private, meaning that you can work with images available over the internet or create your own registry for internal purposes. One popular public Docker registry is Docker Hub, discussed later in this chapter.
As mentioned in the previous section, Docker containers are secured and isolated from each other. In Linux, Docker containers use several standard features of the Linux kernel. This includes:
Linux namespaces: A feature of the Linux kernel that isolates resources from each other, allowing one set of Linux processes to see one group of resources while another set of processes sees a different group. There are several kinds of namespaces in Linux: Mount (mnt), Process ID (PID), Network (net), User ID (user), Control group (cgroup), and Interprocess Communication (IPC). The kernel can place specific system resources that are normally visible to all processes into a namespace. Inside a namespace, a process can see resources associated with other processes in the same namespace. You can associate a process or a group of processes with their own namespace or, in the case of network namespaces, you can even move a network interface into a network namespace. For example, two processes in two different mount namespaces may have different views of what the mounted root filesystem is. Each container can be associated with a specific set of namespaces, and these namespaces are used inside that container only.
Control groups (cgroups): These provide an effective mechanism for resource limitation. With cgroups, you can control and manage system resources per Linux process, increasing overall resource utilization efficiency. Cgroups allow Docker to control resource utilization per container.
SELinux: Security-Enhanced Linux (SELinux) is a mandatory access control (MAC) mechanism used for granular system access, initially developed by the National Security Agency (NSA). It is an additional security layer for Debian and RHEL-based distributions such as Red Hat Enterprise Linux, CentOS, and Fedora. Docker uses SELinux for two main reasons: to protect the host and to isolate containers from each other. Container processes run with limited access to system resources using special SELinux rules.
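You can observe these mechanisms on a running container. The following is a sketch that assumes Docker is installed, pulls the httpd image, and uses the cgroupfs-style memory cgroup path; on systems where Docker uses the systemd cgroup driver, the path will look like /sys/fs/cgroup/memory/system.slice/docker-<ID>.scope instead:

$ docker run -d --name demo --memory 256m httpd
$ PID=$(docker inspect --format '{{.State.Pid}}' demo)
$ sudo ls /proc/$PID/ns
$ CID=$(docker inspect --format '{{.Id}}' demo)
$ cat /sys/fs/cgroup/memory/docker/$CID/memory.limit_in_bytes

The ns directory lists the namespaces (ipc, mnt, net, pid, uts, and so on) that the container's main process belongs to, and the last command prints 268435456, the 256 MB limit enforced on the container by the memory cgroup.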
The beauty of Docker is that it leverages the aforementioned low-level kernel technologies, but hides all complexity by providing an easy way to manage your containers.
A Docker image is a read-only template used to build containers. An image consists of a number of layers that are combined into a single virtual filesystem accessible to Docker applications. This is achieved by using a special technique that combines multiple layers into a single view. Docker images are immutable, but you can add an extra layer and save the result as a new image; in this way, you can add or change an image's content without modifying the image directly. Docker images are the main way to ship, store, and deliver containerized applications. Containers are created from Docker images, so if you do not have the image you need, you have to download or build one.
The container filesystem, used for every Docker image, is represented as a list of read-only layers stacked on top of each other. These layers eventually form a base root filesystem for a container; different storage drivers are used to make this happen. All changes to the filesystem of a running container are made to the topmost image layer of that container, called the container layer. What this means is that several containers may share access to the same underlying image layers while writing their changes locally and independently of each other. This process is shown in the following diagram:
A Docker storage driver is the main component enabling and managing container images. Two main technologies are used for this: copy-on-write and stackable image layers. The storage driver is designed to handle the details of these layers so that they interact with each other. Several drivers are available. They do pretty much the same job, but each one does it differently. The most common storage drivers are AUFS, Overlay/Overlay2, Devicemapper, Btrfs, and ZFS. All storage drivers can be categorized into three different types:
Storage driver category        Storage drivers
Union filesystems              AUFS, Overlay, Overlay2
Snapshotting filesystems       Btrfs, ZFS
Copy-on-write block devices    Devicemapper
As previously mentioned, a Docker image contains a number of layers that are combined into a single filesystem using a storage driver. The layers (also called intermediate images) are generated when commands are executed during the Docker image build process. Usually, Docker images are created using a Dockerfile, the syntax of which will be described later. Each layer represents an instruction in the image's Dockerfile.
Each layer, except the very last one, is read-only:
A Docker image usually consists of several layers, stacked one on top of the other. The top layer has read-write permissions, while all the remaining layers have read-only permissions. This concept is very similar to copy-on-write technology: when you run a container from the image, all changes are made to this top, writable layer.
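To see this layering in practice, you can build a small image and inspect its history. This is a minimal sketch, assuming Docker is installed and the Dockerfile shown by the cat command has been created in the current directory:

$ cat Dockerfile
FROM centos:7
RUN yum install -y httpd
CMD ["/usr/sbin/httpd", "-DFOREGROUND"]
$ docker build -t my-httpd .
$ docker history my-httpd

docker history lists one layer per Dockerfile instruction, with the most recent layer at the top. Only the RUN instruction adds noticeable size; FROM reuses the read-only layers of the base image, and CMD changes image metadata only.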
As mentioned earlier, a Docker image is a way to deliver applications. You can create a Docker image and share it with other users using a public or private registry service. A registry is a stateless, highly scalable server-side application that you can use to store and download Docker images. The Docker registry is an open source project under the permissive Apache license. Once an image is available on a registry service, other users can pull it and use it to create new Docker images or run containers.
Docker supports several types of Docker registries:
Public registry
Private registry
You can start a container from an image stored in a public registry. By default, the Docker daemon looks for and downloads Docker images from Docker Hub, which is a public registry provided by Docker. However, many vendors add their own public registries to the Docker configuration at installation time. For example, Red Hat has its own proven and blessed public Docker registry which you can use to pull Docker images and to build containers.
Some organizations or specific teams don't want to share their custom container images with everyone for a reason. They still need a service for sharing Docker images, but just for internal usage. In that case, a private registry service can be useful. A private registry can be installed and configured as a service on a dedicated server or a virtual machine inside your network.
You can easily install a private Docker registry by running a Docker container from a public registry image. The private Docker registry installation process is no different from running a regular Docker container with additional options.
A Docker registry is accessed via the Docker daemon service using a Docker client. The Docker command line uses a RESTful API to request process execution from the daemon. Most of these commands are translated into HTTP requests and may be transmitted using curl.
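For example, version 2 of the registry HTTP API can be queried directly with curl. The following sketch assumes a private registry running at registry.example.com (a hypothetical hostname) over HTTPS:

$ curl https://registry.example.com/v2/_catalog
$ curl https://registry.example.com/v2/httpd/tags/list

The first request returns a JSON list of the repositories hosted by the registry, and the second returns the tags available for its httpd repository; the docker client issues the same kinds of calls under the hood when you pull or push images.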
The process of using Docker registries is shown in the following section.
A developer can create a Docker image and put it into a private or public registry. Once the image is uploaded, it can be immediately used to run containers or build other images.
Docker Hub is a cloud-based registry service that allows you to build your images and test them, push these images, and link to Docker cloud so you can deploy images on your hosts. Docker Hub provides a centralized resource for container image discovery, distribution and change management, user and team collaboration, and workflow automation throughout the development pipeline.
Docker Hub is the public registry managed by the Docker project, and it hosts a large set of container images, including those provided by major open source projects, such as MySQL, Nginx, Apache, and so on, as well as customized container images developed by the community.
Docker Hub provides some of the following features:
Image repositories: You can find and download images managed by other Docker Hub users. You can also push or pull images from private image libraries you have access to.
Automated builds: You can automatically create new images when you make changes to a source code repository.
Webhooks: The action trigger that allows you to automate builds when there is a push to a repository.
Organizations: The ability to create groups and manage access to image repositories.
In order to start working with Docker Hub, you need to log in to it using a Docker ID. If you do not have one, you can create your Docker ID by following the simple registration process at https://hub.docker.com/. It is completely free.
You can search for and pull Docker images from Docker Hub without logging in; however, to push images, you must log in. Docker Hub gives you the ability to create public and private repositories. Public repositories are publicly available to anyone, while private repositories are restricted to a set of users or organizations.
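A typical push workflow looks like the following sketch, where <docker_id> stands for your own Docker ID and my-httpd is just an example image name:

$ docker login
$ docker tag my-httpd <docker_id>/my-httpd:1.0
$ docker push <docker_id>/my-httpd:1.0

docker login prompts for your Docker ID and password, docker tag gives the local image a repository name under your namespace, and docker push uploads its layers to Docker Hub.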
Docker Hub contains a number of official repositories. These are public, certified repositories from different vendors and Docker contributors. It includes vendors like Red Hat, Canonical, and Oracle.
Docker software is available in two editions: Community Edition (CE) and Enterprise Edition (EE).
Docker CE is a good point from which to start learning Docker and using containerized applications. It is available on different platforms and operating systems. Docker CE comes with an installer so you can start working with containers immediately. Docker CE is integrated and optimized for infrastructure so you can maintain a native app experience while getting started with Docker.
Docker Enterprise Edition (EE) is a Container-as-a-Service (CaaS) platform for IT that manages and secures diverse applications across disparate infrastructures, both on-premises and in the cloud. In other words, Docker EE is similar to Docker CE, but it comes with commercial support from Docker Inc.
Docker software supports a number of platforms and operating systems. The packages are available for most popular operating systems such as Red Hat Enterprise Linux, Fedora Linux, CentOS, Ubuntu Linux, Debian Linux, macOS, and Microsoft Windows.
The Docker installation process is dependent on the particular operating system. In most cases, it is well described on the official Docker portal—https://docs.docker.com/install/. As a part of this book, we will be working with Docker software on CentOS 7.x. Docker installation and configuration on other platforms is not part of this book. If you still need to install Docker on another operating system, just visit the official Docker web portal.
Usually, the Docker node installation process looks like this:
Installation and configuration of an operating system
Docker packages installation
Configuring Docker settings
Running the Docker service
Docker CE is available on CentOS 7 from the standard repositories. The installation process focuses on the docker package installation:
# yum install docker -y
...output truncated for brevity...
Installed:
  docker.x86_64 2:1.12.6-71.git3e8e77d.el7.centos.1
Dependency Installed:
...output truncated for brevity...
Once the installation is completed, you need to run the Docker daemon to be able to manage your containers and images. On RHEL7 and CentOS 7, this just means starting the Docker service like so:
# systemctl start docker
# systemctl enable docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
You can verify that your Docker daemon works properly by showing Docker information provided by the docker info command:
# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
...output truncated for brevity...
Registries: docker.io (secure)
Docker daemon configuration is managed by the Docker configuration file (/etc/docker/daemon.json), and Docker daemon startup options are usually controlled by the docker systemd unit. On Red Hat-based operating systems, some configuration options are also available in /etc/sysconfig/docker and /etc/sysconfig/docker-storage. Modifying these files allows you to change Docker parameters such as the UNIX socket path, listening on TCP sockets, registry configuration, storage backends, and so on.
The Docker daemon listens on unix:///var/run/docker.sock but you can bind Docker to another host/port or a Unix socket. The Docker client (the docker utility) uses the Docker API to interact with the Docker daemon.
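As an illustration, a daemon.json similar to the following would make the daemon listen on both the local UNIX socket and a TCP port, and would allow pulling from a plain-HTTP private registry. Treat this as a sketch rather than a recommended configuration, since exposing the TCP socket without TLS is insecure, and note that registry.example.com is a hypothetical hostname:

$ cat /etc/docker/daemon.json
{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2376"],
  "insecure-registries": ["registry.example.com:5000"]
}

After changing the file, restart the service with systemctl restart docker for the new settings to take effect.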
The Docker client supports dozens of commands, each with numerous options, so an attempt to list them all would just result in a copy of the CLI reference from the official documentation. Instead, we will provide you with the most useful subsets of commands to get you up and running.
You can always check available man pages for all Docker sub-commands using:
$ man -k docker
You will be able to see a list of man pages for Docker and all the available sub-commands.
Another way to get information regarding a command is to use docker COMMAND --help.
The docker utility allows you to manage container infrastructure. All sub-commands can be grouped as follows:
Activity type           Related subcommands
Managing images         search, pull, push, rmi, images, tag, export, import, load, save
Managing containers     run, exec, ps, kill, stop, start
Building custom images  build, commit
Information gathering   info, inspect
The first step in running and using a container on your server or laptop is to search and pull a Docker image from the Docker registry using the docker search command.
Let's search for the web server container. The command to do so is:
$ docker search httpd
NAME                       DESCRIPTION  STARS  OFFICIAL  AUTOMATED
httpd                      ...          1569   [OK]
hypriot/rpi-busybox-httpd  ...          40
centos/httpd                            15               [OK]
centos/httpd-24-centos7    ...          9
Alternatively, we can go to https://hub.docker.com/ and type httpd in the search window. It will give us something similar to the docker search httpd results:
Once the container image is found, we can pull this image from the Docker registry in order to start working with it. To pull a container image to your host, you need to use the docker pull command:
$ docker pull httpd
The output of the preceding command is as follows:
Note that Docker uses the concept of union filesystem layers to build Docker images. This is why you can see seven layers being pulled from Docker Hub: one stacks up onto another, building the final image.
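You can confirm that the image has arrived and list the digests of the layers it is composed of; the exact digests you see will differ as the image gets updated over time:

$ docker images httpd
$ docker inspect --format '{{.RootFS.Layers}}' httpd

The second command prints the sha256 digests of the read-only layers that together form the image's root filesystem.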
