Ansible is a modern, YAML-based automation tool (built on top of Python, one of the world’s most popular programming languages) with a massive and ever-growing user base. Its popularity and Python underpinnings make it essential learning for all in the DevOps space.
This fourth edition of Mastering Ansible provides complete coverage of Ansible automation, from the design and architecture of the tool and basic automation with playbooks to writing and debugging your own Python-based extensions.
You'll learn how to build automation workflows with Ansible’s extensive built-in library of collections, modules, and plugins. You'll then look at extending the modules and plugins with Python-based code and even build your own collections — ultimately learning how to give back to the Ansible community.
By the end of this Ansible book, you'll be confident in all aspects of Ansible automation, from the fundamentals of playbook design to getting under the hood and extending and adapting Ansible to solve new automation challenges.
Page count: 553
Year of publication: 2021
Automate configuration management and overcome deployment challenges with Ansible
James Freeman
Jesse Keating
BIRMINGHAM—MUMBAI
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Rahul Nair
Publishing Product Manager: Meeta Rajani
Senior Editor: Sangeeta Purkayastha
Content Development Editor: Nihar Kapadia
Technical Editor: Shruthi Shetty
Copy Editor: Safis Editing
Project Coordinator: Shagun Saini
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Jyoti Chauhan
First published: November 2015
Second edition: March 2017
Third edition: March 2019
Fourth edition: November 2021
Production reference: 1271021
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80181-878-0
www.packt.com
To Corinne Woolley for helping me to see who I really am, for her care, continued support, and presence in my world. To Ray and Mary, my beloved grandparents, always in my heart. To Ken – for teaching me to take life in my stride.
– James Freeman
James Freeman is an accomplished IT consultant with over 20 years' experience in the technology industry. He has more than 8 years of first-hand experience solving real-world enterprise problems in production environments using Ansible, frequently introducing Ansible as a new technology to businesses and CTOs. In addition, he has authored and facilitated bespoke Ansible workshops and training sessions and has presented at both international conferences and meetups on Ansible.
So many people have made this possible and I would like to thank each and every one of them for their love and support, especially Neeshia Jasmara.
Jesse Keating is an accomplished Ansible user, contributor, and presenter. He has been an active member of the Linux and open source community for over 15 years. He has firsthand experience involving a variety of IT activities, software development, and large-scale system administration. He has presented at numerous conferences and meetups, and has written many articles on a variety of topics.
Mario Vázquez is a software engineer passionate about container technologies, automation, and hybrid cloud. He has been working with Ansible since the early days and has always had Ansible in his toolbelt. These days, Mario is helping partners and customers to move their workloads to Kubernetes across multiple infrastructure providers.
Welcome to Mastering Ansible, your fully updated guide to the most valuable advanced features and functionalities provided by Ansible—the automation and orchestration tool. This book will provide you with the knowledge and skills required to truly understand how Ansible functions at a fundamental level, including all the latest features and changes since the release of version 3.0. In turn, this will allow you to master the advanced capabilities needed to tackle the complex automation challenges of today and the future. You will gain knowledge of Ansible workflows, explore use cases for advanced features, troubleshoot unexpected behavior, extend Ansible through customization, and learn about many of the new and important developments in Ansible, especially around infrastructure and network provisioning.
This book is for Ansible developers and operators who have an understanding of the core elements and applications but are now looking to enhance their skills in applying automation using Ansible.
Chapter 1, The System Architecture and Design of Ansible, looks at the ins and outs of how Ansible goes about performing tasks on behalf of an engineer, how it is designed, and how to work with inventory and variables.
Chapter 2, Migrating from Earlier Ansible Versions, explains the architectural changes you will experience when you migrate from Ansible 2.x to any version from 3.x onward, how to work with Ansible collections, and also how to build your own—essential reading for anyone familiar with earlier Ansible versions.
Chapter 3, Protecting Your Secrets with Ansible, explores the tools available to encrypt data at rest and prevent secrets from being revealed at runtime.
Chapter 4, Ansible and Windows – Not Just for Linux, explores the integration of Ansible with Windows hosts to enable automation in cross-platform environments.
Chapter 5, Infrastructure Management for Enterprises with AWX, provides an overview of the powerful, open source graphical management framework for Ansible known as AWX, and how this might be employed in an enterprise environment.
Chapter 6, Unlocking the Power of Jinja2 Templates, states the varied uses of the Jinja2 templating engine within Ansible and discusses ways to make the most of its capabilities.
Chapter 7, Controlling Task Conditions, explains how to change the default behavior of Ansible to customize task error and change conditions.
Chapter 8, Composing Reusable Ansible Content with Roles, explains how to move beyond executing loosely organized tasks on hosts, and instead build clean, reusable, and self-contained code structures known as roles to achieve the same end result.
Chapter 9, Troubleshooting Ansible, takes you through the various methods that can be employed to examine, introspect, modify, and debug the operations of Ansible.
Chapter 10, Extending Ansible, covers the various ways in which new capabilities can be added to Ansible via modules, plugins, and inventory sources.
Chapter 11, Minimizing Downtime with Rolling Deployments, explains the common deployment and upgrade strategies to showcase the relevant Ansible features.
Chapter 12, Infrastructure Provisioning, examines cloud infrastructure providers and container systems for creating an infrastructure to manage.
Chapter 13, Network Automation, describes the advancements in the automation of network device configuration using Ansible.
To follow the examples provided in this book, you will need access to a computer platform capable of running Ansible. Currently, Ansible can be run on any machine with Python 2.7 or Python 3 (versions 3.5 and higher) installed (Windows is supported for the control machine, but only through a Linux distribution running in the Windows Subsystem for Linux (WSL) layer available on newer versions—see Chapter 4, Ansible and Windows – Not Just for Linux, for details). Operating systems supported include (but are not limited to) Red Hat, Debian, Ubuntu, CentOS, macOS, and FreeBSD.
This book uses the Ansible 4.x.x series release. Ansible installation instructions can be found at https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html.
Some examples use Docker version 20.10.8. Docker installation instructions can be found at https://docs.docker.com/get-docker/.
A handful of examples in this book make use of accounts on both Amazon Web Services (AWS) and Microsoft Azure. More information about these services may be found at https://aws.amazon.com/ and https://azure.microsoft.com, respectively. We also delve into the management of OpenStack with Ansible, and the examples in this book were tested against a single all-in-one instance of DevStack as per the instructions found here: https://docs.openstack.org/devstack/latest/.
Finally, Chapter 13, Network Automation, makes use of Arista vEOS 4.26.2F and Cumulus VX version 4.4.0 in the example code—please see here for more information: https://www.arista.com/en/support/software-download and https://www.nvidia.com/en-gb/networking/ethernet-switching/cumulus-vx/. If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Mastering-Ansible-Fourth-Edition. If there's an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
The Code in Action videos for this book can be viewed at https://bit.ly/3vvkzbP.
We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781801818780_ColorImages.pdf.
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in the text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "This book will assume that there are no settings in the ansible.cfg file that would affect the default operation of Ansible"
A block of code is set as follows:
---
plugin: amazon.aws.aws_ec2
boto_profile: default
Any command-line input or output is written as follows:
ansible-playbook -i mastery-hosts --vault-id
test@./password.sh showme.yaml -v
Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "You simply need to navigate to your profile preferences page and click the Show API Key button."
Tips or important notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you've read Mastering Ansible Fourth Edition, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
In this section, we will explore the fundamentals of how Ansible works and establish a sound basis on which to develop playbooks and workflows. We will also examine and explain the changes you will discover if you are familiar with the older Ansible 2.x releases.
The following chapters are included in this section:
Chapter 1, The System Architecture and Design of Ansible
Chapter 2, Migrating from Earlier Ansible Versions
Chapter 3, Protecting Your Secrets with Ansible
Chapter 4, Ansible and Windows – Not Just for Linux
Chapter 5, Infrastructure Management for Enterprises with AWX
This chapter provides a detailed exploration of the architecture and design of Ansible and how it goes about performing tasks on your behalf. We will cover the basic concepts of inventory parsing and how data is discovered. Then, we will proceed onto playbook parsing. We will take a walk through module preparation, transportation, and execution. Finally, we will detail variable types and find out where variables are located, their scope of use, and how precedence is determined when variables are defined in more than one location. All these things will be covered in order to lay the foundation for mastering Ansible!
In this chapter, we will cover the following topics:
Ansible versions and configurations
Inventory parsing and data sources
Playbook parsing
Execution strategies
Module transport and execution
Ansible collections
Variable types and locations
Magic variables
Accessing external data
Variable precedence (and interchanging this with variable priority ordering)
To follow the examples presented in this chapter, you will need a Linux machine running Ansible 4.3 or later. Almost any flavor of Linux should do. For those who are interested in specifics, all the code presented in this chapter was tested on Ubuntu Server 20.04 LTS, unless stated otherwise, and on Ansible 4.3. The example code that accompanies this chapter can be downloaded from GitHub at https://github.com/PacktPublishing/Mastering-Ansible-Fourth-Edition/tree/main/Chapter01.
Check out the following video to view the Code in Action: https://bit.ly/3E37xpn.
It is assumed that you have Ansible installed on your system. There are many documents out there that cover installing Ansible in a way that is appropriate to the operating system and version that you might be using. However, it is important to note that Ansible versions that are newer than 2.9.x feature some major changes from all of the earlier versions. For everyone reading this book who has had exposure to Ansible 2.9.x and earlier, Chapter 2, Migrating from Earlier Ansible Versions, explains the changes in detail, along with how to address them.
This book will assume the use of Ansible version 4.0.0 (or later), coupled with ansible-core 2.11.1 (or newer), both of which are required and are the latest releases at the time of writing. To discover the version in use on a system where Ansible is already installed, pass the --version argument to either ansible or ansible-playbook, as follows:
ansible-playbook --version
This command should give you an output that's similar to Figure 1.1; note that the screenshot was taken on Ansible 4.3, so you might see an updated version number corresponding to the version of your ansible-core package (for instance, for Ansible 4.3.0, this would be ansible-core 2.11.1, which is the version number that all of the commands will return):
Figure 1.1 – An example output showing the installed version of Ansible on a Linux system
Important note
Note that ansible is the executable for doing ad hoc one-task executions, and ansible-playbook is the executable that will process playbooks to orchestrate multiple tasks. We will cover the concepts of ad hoc tasks and playbooks later in the book.
The configuration for Ansible can exist in a few different locations, where the first file found will be used. The search involves the following:
ANSIBLE_CONFIG: This environment variable is used, provided that it is set.
ansible.cfg: This is located in the current working directory.
~/.ansible.cfg: This is located in the user's home directory.
/etc/ansible/ansible.cfg: The default central Ansible configuration file for the system.
Some installation methods could include placing a config file in one of these locations. Look around to check whether such a file exists and view what settings are in the file to get an idea of how the Ansible operation might be affected. This book assumes that there are no settings in the ansible.cfg file that can affect the default operation of Ansible.
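The search order above can be sketched in a few lines of Python. This is only an illustration of the "first file found wins" rule, not Ansible's own implementation: the function name find_ansible_cfg and its parameters are hypothetical, the environment variable is read under the name ANSIBLE_CONFIG (the name used in the Ansible documentation), and any additional checks that Ansible itself performs are left out.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_ansible_cfg(
    env: Mapping[str, str],
    cwd: Path,
    home: Path,
    system_cfg: Path = Path("/etc/ansible/ansible.cfg"),
) -> Optional[Path]:
    """Return the first existing config file in the documented search order.

    Hypothetical helper for illustration only; it is not part of Ansible.
    """
    candidates = []
    # 1. The environment variable, provided that it is set.
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(Path(env["ANSIBLE_CONFIG"]))
    # 2. ansible.cfg in the current working directory.
    candidates.append(cwd / "ansible.cfg")
    # 3. .ansible.cfg in the user's home directory.
    candidates.append(home / ".ansible.cfg")
    # 4. The central configuration file for the system.
    candidates.append(system_cfg)

    for candidate in candidates:
        if candidate.is_file():
            return candidate  # the first file found is used; the rest are ignored
    return None


if __name__ == "__main__":
    print(find_ansible_cfg(os.environ, Path.cwd(), Path.home()))
```

Running the sketch from different directories is a quick way to see which file would take effect, which is useful when a project-level ansible.cfg silently shadows the one in your home directory.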
In Ansible, nothing happens without an inventory. Even ad hoc actions performed on the localhost require an inventory – although that inventory might just consist of the localhost. The inventory is the most basic building block of Ansible architecture. When executing ansible or ansible-playbook, an inventory must be referenced. Inventories are files or directories that exist on the same system that runs ansible or ansible-playbook. The location of the inventory can be defined at runtime with the --inventory (-i) argument (the older --inventory-file spelling is deprecated) or by defining the path in an Ansible config file.
Inventories can be static or dynamic, or even a combination of both, and Ansible is not limited to a single inventory. The standard practice is to split inventories across logical boundaries, such as staging and production, allowing an engineer to run a set of plays against their staging environment for validation, and then follow with the exact plays run against the production inventory set.
Variable data, such as specific details on how to connect to a particular host in your inventory, can be included along with an inventory in a variety of ways, and we'll explore the options available to you.
The static inventory is the most basic of all the inventory options. Typically, a static inventory will consist of a single file in ini format. Other formats are supported, including YAML, but you will find that ini is commonly used when most people start out with Ansible. Here is an example of a static inventory file describing a single host, mastery.example.name:
mastery.example.name
That is all there is to it. Simply list the names of the systems in your inventory. Of course, this does not take full advantage of all that an inventory has to offer. If every name were listed like this, all plays would have to reference specific hostnames, or the special built-in all group (which, as the name suggests, contains all hosts inside the inventory). This can be quite tedious when developing a playbook that operates across different environments within your infrastructure. At the very least, hosts should be arranged into groups.
A design pattern that works well is arranging your systems into groups based on expected functionality. At first, this might seem difficult if you have an environment where single systems can play many different roles, but that is perfectly fine. Systems in an inventory can exist in more than one group, and groups can even consist of other groups! Additionally, when listing groups and hosts, it is possible to list hosts without a group. These would have to be listed first before any other group is defined. Let's build on our previous example and expand our inventory with a few more hosts and groupings, as follows:
[web]
mastery.example.name
[dns]
backend.example.name
[database]
backend.example.name
[frontend:children]
web
[backend:children]
dns
database
Here, we have created a set of three groups with one system in each, and then two more groups, which logically group all three together. Yes, that's right: you can have groups of groups. The syntax used here is [groupname:children], which indicates to Ansible's inventory parser that this group, going by the name of groupname, is nothing more than a grouping of other groups.
The children, in this case, are the names of the other groups. This inventory now allows writing plays against specific hosts, low-level role-specific groups, or high-level logical groupings, or any combination thereof.
By utilizing generic group names, such as dns and database, Ansible plays can reference these generic groups rather than the explicit hosts within. An engineer can create one inventory file that fills in these groups with hosts from a preproduction staging environment, and another inventory file with the production versions of these groupings. The content of the playbook does not need to change when executing on either a staging or production environment because it refers to the generic group names that exist in both inventories. Simply refer to the correct inventory to execute it in the desired environment.
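To make the idea of groups of groups concrete, here is a minimal Python sketch that reads the example inventory and works out which hosts a group ultimately contains. It handles only plain [group] and [group:children] sections and is in no way Ansible's real inventory parser; the names parse_inventory and resolve are hypothetical.

```python
from collections import defaultdict

INVENTORY = """\
[web]
mastery.example.name
[dns]
backend.example.name
[database]
backend.example.name
[frontend:children]
web
[backend:children]
dns
database
"""


def parse_inventory(text):
    """Parse a tiny subset of the ini inventory format (illustration only)."""
    hosts = defaultdict(list)     # group name -> hosts listed directly in it
    children = defaultdict(list)  # group name -> names of its child groups
    section, kind = None, "hosts"
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            # "[frontend:children]" -> section "frontend", kind "children"
            name, _, suffix = line[1:-1].partition(":")
            section, kind = name, (suffix or "hosts")
        elif section is None:
            hosts["ungrouped"].append(line.split()[0])
        elif kind == "children":
            children[section].append(line)
        elif kind == "hosts":
            # Keep the host name only; inline variables are ignored here.
            hosts[section].append(line.split()[0])
    return hosts, children


def resolve(group, hosts, children):
    """Return the hosts in a group, following child groups, without duplicates."""
    found = list(hosts.get(group, []))
    for child in children.get(group, []):
        for host in resolve(child, hosts, children):
            if host not in found:
                found.append(host)
    return found


if __name__ == "__main__":
    hosts, children = parse_inventory(INVENTORY)
    for group in ("web", "frontend", "backend"):
        print(group, resolve(group, hosts, children))
```

Swapping INVENTORY for a production file with the same group names would leave every call to resolve unchanged, which is the same property that lets a playbook run unmodified against either environment.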
A new play-level keyword, order, was added to Ansible in version 2.4. Prior to this, Ansible processed the hosts in the order specified in the inventory file, and it continues to do so by default, even in newer versions. However, the order keyword can be set to one of the following values for a given play to change the order in which hosts are processed:
inventory: This is the default option. It simply means that Ansible proceeds as it always has, processing the hosts in the order that is specified in the inventory file.
reverse_inventory: This results in the hosts being processed in the reverse order that is specified in the inventory file.
sorted: The hosts are processed in alphabetical order by name.
reverse_sorted: The hosts are processed in reverse alphabetical order.
shuffle: The hosts are processed in a random order, with the order being randomized on each run.
In Ansible, the alphabetical sorting used is alternatively known as lexicographical. Put simply, this means that values are sorted as strings, with the strings being processed from left to right. Therefore, let's say that we have three hosts: mastery1, mastery11, and mastery2. In this list, mastery1 comes first, as the character at position 8 is a 1 and the name ends there. Then comes mastery11, as the character at position 8 is still a 1, but now there is an additional character at position 9. Finally comes mastery2, as character 8 is a 2, and 2 comes after 1. This is important as, numerically, we know that 11 is greater than 2. However, in this list, mastery11 comes before mastery2. You can easily work around this by adding leading zeros to any numbers in your hostnames; for example, mastery01, mastery02, and mastery11 will be processed in the order they have been listed in this sentence, resolving the lexicographical issue described.
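The lexicographical behavior described above is easy to reproduce with ordinary string sorting. The following snippet uses Python's built-in sorted function purely as an illustration of how strings compare; it does not call into Ansible.

```python
hosts = ["mastery2", "mastery11", "mastery1"]

# Strings are compared character by character, from left to right.
as_sorted = sorted(hosts)
as_reverse_sorted = sorted(hosts, reverse=True)

# Zero-padding the numbers makes string order agree with numeric order.
padded = sorted(["mastery11", "mastery02", "mastery01"])

print(as_sorted)          # ['mastery1', 'mastery11', 'mastery2']
print(as_reverse_sorted)  # ['mastery2', 'mastery11', 'mastery1']
print(padded)             # ['mastery01', 'mastery02', 'mastery11']
```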
Inventories provide more than just system names and groupings. Data regarding the systems can be passed along as well. This data could include the following:
Host-specific data to use in templates
Group-specific data to use in task arguments or conditionals
Behavioral parameters to tune how Ansible interacts with a system
Variables are a powerful construct within Ansible and can be used in a variety of ways, not just those described here. Nearly every single thing done in Ansible can include a variable reference. While Ansible can discover data about a system during the setup phase, not all of the data can be discovered. Defining data with the inventory expands this. Note that variable data can come from many different sources, and one source could override another. We will cover the order of variable precedence later in this chapter.
Let's improve upon our existing example inventory and add to it some variable data. We will add some host-specific data and group-specific data:
[web]
mastery.example.name ansible_host=192.168.10.25
[dns]
backend.example.name
[database]
backend.example.name
[frontend:children]
web
[backend:children]
dns
database
[web:vars]
http_port=88
proxy_timeout=5
[backend:vars]
ansible_port=314
[all:vars]
ansible_user=otto
In this example, we defined ansible_host for mastery.example.name to be the IP address of 192.168.10.25. The ansible_host variable is a behavioral inventory variable, which is intended to alter the way Ansible behaves when operating with this host. In this case, the variable instructs Ansible to connect to the system using the IP address provided, rather than performing a DNS lookup on the name using mastery.example.name. There are a number of other behavioral inventory variables that are listed at the end of this section, along with their intended use.
Our new inventory data also provides group-level variables for the web and backend groups. The web group defines http_port, which could be used in an NGINX configuration file, and proxy_timeout, which might be used to determine HAProxy behavior. The backend group makes use of another behavioral inventory parameter to instruct Ansible to connect to the hosts in this group using port 314 for SSH, rather than the default of 22.
Finally, a construct is introduced that provides variable data across all the hosts in the inventory by utilizing a built-in all group. Variables defined within this group will apply to every host in the inventory. In this particular example, we instruct Ansible to log in as the otto user when connecting to the systems. This is also a behavioral change, as the Ansible default behavior is to log in as a user with the same name as the user executing ansible or ansible-playbook on the control host.
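One way to picture how group-level and host-level data combine for a single host is as a series of dictionaries merged from the least specific source to the most specific. The sketch below assumes that simplified rule (group data first, host data last) and is not Ansible's variable manager; merge_vars is a hypothetical helper, and the http_port override on the host is invented purely to show one source overriding another.

```python
def merge_vars(*layers):
    """Merge dictionaries left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Data taken from the example inventory (values kept as strings in this sketch).
web_group_vars = {"http_port": "88", "proxy_timeout": "5"}
host_vars = {"ansible_host": "192.168.10.25"}

# What mastery.example.name ends up with: group data plus its own host data.
effective = merge_vars(web_group_vars, host_vars)

# Hypothetical: if the host line also set http_port, the host value would win.
hypothetical_host_vars = {"ansible_host": "192.168.10.25", "http_port": "8080"}
overridden = merge_vars(web_group_vars, hypothetical_host_vars)

print(effective)
print(overridden["http_port"])  # 8080
```

The full set of sources and their exact ordering is wider than this two-layer picture, and is covered under variable precedence later in the chapter.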
Here is a list of behavioral inventory variables and the behaviors they intend to modify:
ansible_host: This is the DNS name or the Docker container name that Ansible will initiate a connection to.
ansible_port: This specifies the port number that Ansible will use to connect to the inventory host if it is not the default value of 22.
ansible_user: This specifies the username that Ansible will use to connect with the inventory host, regardless of the connection type.
ansible_password: This is used to provide Ansible with the password for authentication to the inventory host in conjunction with ansible_user. Use this for testing purposes only – you should always use a vault to store sensitive data such as passwords (please refer to Chapter 3, Protecting Your Secrets with Ansible).
ansible_ssh_private_key_file: This is used to specify which SSH private key file will be used to connect to the inventory host if you are not using the default one or ssh-agent.
ansible_ssh_common_args: This defines SSH arguments to append to the default arguments for ssh, sftp, and scp.
ansible_sftp_extra_args: This is used to specify additional arguments that will be passed to the sftp binary when called by Ansible.
ansible_scp_extra_args: This is used to specify additional arguments that will be passed to the scp binary when called by Ansible.
ansible_ssh_extra_args: This is used to specify additional arguments that will be passed to the ssh binary when called by Ansible.
ansible_ssh_pipelining: This setting uses a Boolean to define whether SSH pipelining should be used for this host.
ansible_ssh_executable: This setting overrides the path to the SSH executable for this host.
ansible_become: This defines whether privilege escalation (sudo or something else) should be used with this host.
ansible_become_method: This is the method to use for privilege escalation and can be one of sudo, su, pbrun, pfexec, doas, dzdo, or ksu.
ansible_become_user: This is the user to switch to through privilege escalation, typically root on Linux and Unix systems.
ansible_become_password: This is the password to use for privilege escalation. Only use this for testing purposes; you should always use a vault to store sensitive data such as passwords (please refer to Chapter 3, Protecting Your Secrets with Ansible).
ansible_become_exe: This is used to set the executable that is used for the chosen escalation method if you are not using the default one defined by the system.
ansible_become_flags: This is used to set the flags passed to the chosen escalation executable if required.
ansible_connection: This is the connection type of the host. Candidates are local, smart, ssh, paramiko, docker, or winrm (we will look at this in more detail later in the book). The default setting is smart in any modern Ansible distribution (this detects whether the ControlPersist SSH feature is supported and, if so, uses ssh as the connection type; otherwise, it falls back to paramiko).
ansible_docker_extra_args: This is used to specify the extra argument that will be passed to a remote Docker daemon on a given inventory host.
ansible_shell_type: This is used to determine the shell type on the inventory host(s) in question. It defaults to the sh-style syntax but can be set to csh or fish to work with systems that use these shells.
ansible_shell_executable: This sets the shell that Ansible will use on the inventory host(s) in question, overriding the executable setting in ansible.cfg, which defaults to /bin/sh.
ansible_python_interpreter: This is used to manually set the path to Python on a given host in the inventory. For example, some distributions of Linux have more than one Python version installed, and it is important to ensure that the correct one is set. For example, a host might have both /usr/bin/python27 and /usr/bin/python3, and this is used to define which one will be used.
ansible_*_interpreter: This is used for any other interpreted language that Ansible might depend upon (for example, Perl or Ruby). This replaces the interpreter binary with the one that is specified.
A static inventory is great and can be enough for many situations. However, there are times when a statically written set of hosts is just too unwieldy to manage. Consider situations where inventory data already exists in a different system, such as LDAP, a cloud computing provider, or an in-house configuration management database (CMDB) (inventory, asset tracking, and data warehousing) system. It would be a waste of time and energy to duplicate that data and, in the modern world of on-demand infrastructure, that data would quickly grow stale or become disastrously incorrect.
Another example of when a dynamic inventory source might be desired is when your site grows beyond a single set of playbooks. Multiple playbook repositories can fall into the trap of holding multiple copies of the same inventory data, or complicated processes have to be created to reference a single copy of the data. An external inventory can easily be leveraged to access the common inventory data that is stored outside of the playbook repository to simplify the setup. Thankfully, Ansible is not limited to static inventory files.
A dynamic inventory source (or plugin) is an executable that Ansible will call at runtime to discover real-time inventory data. This executable can reach out to external data sources and return data, or it can just parse local data that already exists but might not be in the ini/yaml Ansible inventory format. While it is possible, and easy, to develop your own dynamic inventory source, which we will cover in a later chapter, Ansible provides an ever-growing number of example inventory plugins. This includes, but is not limited to, the following:
OpenStack Nova
Rackspace Public Cloud
DigitalOcean
Linode
Amazon EC2
Google Compute Engine
Microsoft Azure
Docker
Vagrant
Many of these plugins require some level of configuration, such as user credentials for EC2 or an authentication endpoint for OpenStack Nova. Since it is not possible to configure additional arguments for Ansible to pass along to the inventory script, the configuration for the script must either be managed via an ini config file that is read from a known location or environment variables that are read from the shell environment used to execute ansible or ansible-playbook. Also, note that, sometimes, external libraries are required for these inventory scripts to function.
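Because an inventory script cannot be handed extra arguments, it typically gathers its own settings at startup. The sketch below shows one possible approach using Python's standard configparser module. Everything specific in it is an assumption made for illustration: the myinv.ini filename, the [inventory] section, the endpoint and username keys, the MYINV_ environment prefix, and the choice to let environment variables override the file.

```python
import configparser
import os


def load_settings(environ, ini_path):
    """Read settings from an ini file, then let environment variables override them.

    Hypothetical helper: the section name, keys, and MYINV_ prefix are invented.
    """
    parser = configparser.ConfigParser()
    parser.read(ini_path)  # a missing file is silently skipped
    settings = dict(parser["inventory"]) if parser.has_section("inventory") else {}
    for key in ("endpoint", "username"):
        env_value = environ.get("MYINV_" + key.upper())
        if env_value:
            settings[key] = env_value
    return settings


if __name__ == "__main__":
    print(load_settings(os.environ, "myinv.ini"))
```

Letting the environment win over the file is a design choice rather than a rule; it makes it easy to point the same script at a different endpoint for a single run without editing anything on disk.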
When ansible or ansible-playbook is directed at an executable file for an inventory source, Ansible will execute that script with a single argument, --list. This is so that Ansible can get a listing of the entire inventory in order to build up its internal objects to represent the data. Once that data is built up, Ansible will then execute the script with a different argument for every host in the data to discover variable data. The argument used in this execution is --host <hostname>, which will return any variable data that is specific to that host.
There are too many inventory plugins for us to go through each of them in detail in this book. However, similar processes are needed to set up and use just about all of them. So, to demonstrate the process, we will work through the use of the EC2 dynamic inventory.
Many of the dynamic inventory plugins are installed as part of the community.general collection, which is installed, by default, when you install Ansible 4.0.0. Nonetheless, the first part of working with any dynamic inventory plugin is finding out which collection the plugin is part of and, if required, installing that collection. The EC2 dynamic inventory plugin is installed as part of the amazon.aws collection. So, your first step will be to install this collection – you can do this with the following command:
ansible-galaxy collection install amazon.aws
If all goes well, you should see a similar output on your Terminal to that in Figure 1.2:
Figure 1.2 – The installation of the amazon.aws collection using ansible-galaxy
Whenever you install a new plugin or collection, it is always advisable to read the accompanying documentation as some of the dynamic inventory plugins require additional libraries or tools to function correctly. For example, if you refer to the documentation for the aws_ec2 plugin at https://docs.ansible.com/ansible/latest/collections/amazon/aws/aws_ec2_inventory.html, you will see that both the boto3 and botocore libraries are required for this plugin to operate. How you install these libraries will depend on your operating system and Python environment. However, on Ubuntu Server 20.04 (and other Debian variants), it can be done with the following command:
sudo apt install python3-boto3 python3-botocore
Here's the output for the preceding command:
Figure 1.3 – Installing the Python dependencies for the EC2 dynamic inventory script
Now, looking at the documentation for the plugin (often, you can also find helpful hints by looking within the code and any accompanying configuration files), you will note that we need to provide our AWS credentials to this script in some manner. There are several possible ways in which to do this – one example is to use the awscli tool (if you have it installed) to define the configuration, and then reference this configuration profile from your inventory. For example, I configured my default AWS CLI profile using the following command:
aws configure
The output will appear similar to the following screenshot (the secure details have been redacted for obvious reasons!):
Figure 1.4 – Configuring AWS credentials using the AWS CLI utility
With this done, we can now create our inventory definition, telling Ansible which plugin to use, and passing the appropriate parameters to it. In our example here, we simply need to tell the plugin to use the default profile we created earlier. Create a file called mastery_aws_ec2.yml, which contains the following content:
---
plugin: amazon.aws.aws_ec2
boto_profile: default
Finally, we will test our new inventory plugin configuration by passing it to the ansible-inventory command with the --graph parameter:
ansible-inventory -i mastery_aws_ec2.yml --graph
Assuming you have some instances running in AWS EC2, you will see a similar output to the following:
Figure 1.5 – An example output from the dynamic inventory plugin
Voila! We have a listing of our current AWS inventory, along with a glimpse into the automatic grouping performed by the plugin. If you want to delve further into the capabilities of the plugin and view, for example, all the inventory variables assigned to each host (which contain useful information, including instance type and sizing), try passing the --list parameter to ansible-inventory instead of --graph.
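The plugin configuration file can carry more than a profile name. The following is a minimal sketch, assuming the regions, filters, and keyed_groups options documented for the amazon.aws.aws_ec2 plugin. The region, the instance tag called role, and the group prefix are illustrative values, not details of the example environment, so check the documentation for the version of the collection you have installed:

```yaml
---
# mastery_aws_ec2.yml - the filename must end in aws_ec2.yml or aws_ec2.yaml
plugin: amazon.aws.aws_ec2
boto_profile: default

# Assumption: only query a single region (illustrative value)
regions:
  - eu-west-2

# Assumption: only return instances that are currently running
filters:
  instance-state-name: running

# Assumption: instances carry a tag named "role"; an instance tagged
# role=web would be placed in a group called role_web
keyed_groups:
  - key: tags.role
    prefix: role
```

Passing this file to ansible-inventory with the --graph parameter, as before, should show any role_ groups alongside the groups seen earlier. Instances without the tag are simply left out of those groups.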
With the AWS inventory in place, you could use this right away to run a single task or the entire playbook against this dynamic inventory. For example, to use the ansible.builtin.ping module to check Ansible authentication and connectivity to all the hosts in the inventory, you could run the following command:
ansible -i mastery_aws_ec2.yml all -m ansible.builtin.ping
Of course, this is just one example. However, if you follow this process for other dynamic inventory providers, you should get them to work with ease.
In Chapter 10, Extending Ansible, we will develop our own custom inventory plugin to demonstrate how they operate.
Just like static inventory files, it is important to remember that Ansible will parse this data once, and only once, per ansible or ansible-playbook execution. This is a fairly common stumbling point for users of cloud dynamic sources, where, frequently, a playbook will create a new cloud resource and then attempt to use it as if it were part of the inventory. This will fail, as the resource was not part of the inventory when the playbook launched. All is not lost, though! A special module is provided that allows a playbook to temporarily add a host to the in-memory inventory object, that is, the ansible.builtin.add_host module.
This module takes two options: name and groups. The name option should be obvious; it defines the hostname that Ansible will use when connecting to this particular system. The groups option is a comma-separated list of groups to which the new system will be added. Any other option passed to this module will become the host variable data for this host. For example, suppose we want to add a new system, name it newmastery.example.name, add it to the web group, and instruct Ansible to connect to it by way of IP address 192.168.10.30. The task to do this resembles the following:
- name: add new node into runtime inventory
  ansible.builtin.add_host:
    name: newmastery.example.name
    groups: web
    ansible_host: 192.168.10.30
This new host will be available to use – either by way of the name provided or by way of the web group – for the rest of the ansible-playbook execution. However, once the execution has been completed, this host will not be available unless it has been added to the inventory source itself. Of course, if this were a new cloud resource that had been created, the next ansible or ansible-playbook execution that sourced a dynamic inventory from that cloud would pick up the new member.
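To see the in-memory addition carry over to later plays, the add_host task can be placed in one play and the web group targeted in the next. This is a minimal sketch rather than a full provisioning workflow: it assumes the host would have been created by an earlier task (omitted here), and it uses ansible.builtin.debug in the second play so that nothing actually needs to be reachable at 192.168.10.30:

```yaml
---
# Play 1: runs on the control host and registers the new host in memory
- name: add a host at runtime
  hosts: localhost
  gather_facts: false
  tasks:
    - name: add new node into runtime inventory
      ansible.builtin.add_host:
        name: newmastery.example.name
        groups: web
        ansible_host: 192.168.10.30

# Play 2: the web group now includes the host added by the first play
- name: use the newly added host
  hosts: web
  gather_facts: false
  tasks:
    - name: tell us which host we are on
      ansible.builtin.debug:
        var: inventory_hostname
```

If the inventory already defines a web group, the second play runs against its existing members as well as the new host.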
As mentioned earlier, every execution of ansible or ansible-playbook will parse the entire inventory it has been provided with. This is even true when a limit has been applied. Put simply, a limit is applied at runtime by making use of the --limit runtime argument to ansible or ansible-playbook. This argument accepts a pattern, which is essentially a mask to apply to the inventory. The entire inventory is parsed, and at each play, the limit mask that is supplied restricts the play to only run against the pattern that has been specified.
Let's take our previous inventory example and demonstrate the behavior of Ansible with and without a limit. If you recall, we have a special group, all, that we can use to reference all of the hosts within an inventory. Let's assume that our inventory is written out in the current working directory, in a file named mastery-hosts, and we will construct a playbook to demonstrate the host on which Ansible is operating. Let's write this playbook out as mastery.yaml:
---
- name: limit example play
  hosts: all
  gather_facts: false
  tasks:
    - name: tell us which host we are on
      ansible.builtin.debug:
        var: inventory_hostname
The ansible.builtin.debug module is used to print out text or values of variables. We'll use this module a lot in this book to simulate the actual work being done on a host.
Now, let's execute this simple playbook without supplying a limit. For simplicity's sake, we will instruct Ansible to utilize a local connection method, which will execute locally rather than attempt to SSH to these nonexistent hosts. Run the following command:
ansible-playbook -i mastery-hosts -c local mastery.yaml
The output should appear similar to Figure 1.6:
Figure 1.6 – Running the simple playbook on an inventory without a limit applied
As you can see, both the backend.example.name and mastery.example.name hosts were operated on. Now, let's see what happens if we supply a limit, that is, to limit our run to the frontend systems only, by running the following command:
ansible-playbook -i mastery-hosts -c local mastery.yaml --limit frontend
This time around, the output should appear similar to Figure 1.7:
Figure 1.7 – Running the simple playbook on an inventory with a limit applied
Here, we can see that only mastery.example.name was operated on this time. While there are no visual clues that the entire inventory was parsed, if we dive into the Ansible code and examine the inventory object, we will indeed find all the hosts within. Additionally, we will see how the limit is applied every time the object is queried for items.
It is important to remember that regardless of the hosts pattern used in a play, or the limit that is supplied at runtime, Ansible will still parse the entire inventory that is set during each run. In fact, we can prove this by attempting to access the host variable data for a system that would otherwise be masked by our limit. Let's expand our playbook slightly and attempt to access the ansible_port variable from backend.example.name:
---
- name: limit example play
  hosts: all
  gather_facts: false
  tasks:
    - name: tell us which host we are on
      ansible.builtin.debug:
        var: inventory_hostname
    - name: grab variable data from backend
      ansible.builtin.debug:
        var: hostvars['backend.example.name']['ansible_port']
We will still apply our limit by running the playbook with the same command we used in the previous run, which will restrict our operations to just mastery.example.name:
Figure 1.8 – Demonstrating that the entire inventory is parsed even with a limit applied
We have successfully accessed the host variable data (by way of group variables) for a system that was otherwise limited out. This is a key skill to understand, as it allows for more advanced scenarios, such as using delegation to direct a task at a host that is otherwise limited out. For example, delegation can be used to manipulate a load balancer, putting a system into maintenance mode while it is being upgraded, without you having to include the load balancer system in your limit mask.
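Delegating a task to a host outside the limit can be sketched with the delegate_to task keyword. In this illustration, backend.example.name (which the frontend limit masks out) merely stands in for a load balancer, and the debug tasks are placeholders for the real maintenance and upgrade work:

```yaml
---
- name: delegation example play
  hosts: all
  gather_facts: false
  tasks:
    - name: take this host out of service on the load balancer
      ansible.builtin.debug:
        msg: "would disable {{ inventory_hostname }} on the load balancer"
      # Assumption: backend.example.name plays the part of the load balancer
      delegate_to: backend.example.name

    - name: upgrade the host itself
      ansible.builtin.debug:
        msg: "would upgrade {{ inventory_hostname }}"
```

When this is run with --limit frontend, the play still only iterates over mastery.example.name, yet the first task is directed at backend.example.name on its behalf.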
The whole purpose of an inventory source is to have systems to manipulate. The manipulation comes from playbooks (or, in the case of Ansible ad hoc execution, simple single-task plays). You should already have a basic understanding of playbook construction, so we won't spend a lot of time covering that; however, we will delve into some specifics of how a playbook is parsed. Specifically, we will cover the following:
The order of operations
Relative path assumptions
Play behavior keys
The host selection for plays and tasks
Play and task names
Ansible is designed to be as easy as possible for humans to understand. The developers strive to strike the best balance of human comprehension and machine efficiency. To that end, nearly everything in Ansible can be assumed to be executed in a top-to-bottom order; that is, the operation listed at the top of a file will be accomplished before the operation listed at the bottom of a file. Having said that, there are a few caveats and even a few ways to influence the order of operations.
A playbook only has two main operations it can accomplish. It can either run a play, or it can include another playbook from somewhere on the filesystem. The order in which these are accomplished is simply the order in which they appear in the playbook file, from top to bottom. It is important to note that while the operations are executed in order, the entire playbook and any included playbooks are completely parsed before any executions. This means that any included playbook file has to exist at the time of the playbook parsing – they cannot be generated in an earlier operation. This is specific to playbook inclusions but not necessarily to task inclusions that might appear within a play, which will be covered in a later chapter.
Within a play, there are a few more operations. While a playbook is strictly ordered from top to bottom, a play has a more nuanced order of operations. Here is a list of the possible operations and the order in which they will occur:
1. Variable loading
2. Fact gathering
3. The pre_tasks execution
4. Handlers notified from the pre_tasks execution
5. The roles execution
6. The tasks execution
7. Handlers notified from the roles or tasks execution
8. The post_tasks execution
9. Handlers notified from the post_tasks execution
The following is an example play with most of these operations shown:
---
- hosts: localhost
  gather_facts: false
  vars:
    a_var: derp
  pre_tasks:
    - name: pretask
      debug:
        msg: "a pre task"
      changed_when: true
      notify: say hi
  roles:
    - role: simple
      derp: newval
  tasks:
    - name: task
      debug:
        msg: "a task"
      changed_when: true
      notify: say hi
  post_tasks:
    - name: posttask
      debug:
        msg: "a post task"
      changed_when: true
      notify: say hi
  handlers:
    - name: say hi
      debug:
        msg: hi
Regardless of the order in which these blocks are listed in a play, the order detailed in the preceding list of operations is the order in which they will be processed. Handlers (that is, the tasks that can be triggered by other tasks that result in a change) are a special case. There is a utility module, ansible.builtin.meta, that can be used to trigger handler processing at a specific point:
- ansible.builtin.meta: flush_handlers
This will instruct Ansible to process any pending handlers at that point before continuing with the next task or next block of actions within a play. Understanding the order and being able to influence the order with flush_handlers is another key skill to have when there is a need to orchestrate complicated actions; for instance, where things such as service restarts are very sensitive to order. Consider the initial rollout of a service.
The play will have tasks that modify config files and indicate that the service should be restarted when these files change. The play will also indicate that the service should be running. The first time this play happens, the config file will change, and the service will change from not running to running. Then, the handlers will trigger, which will cause the service to restart immediately. This can be disruptive to any consumers of the service. It is better to flush the handlers before a final task to ensure the service is running. This way, the restart will happen before the initial start, so the service will start up once and stay up.
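The rollout scenario described here can be sketched as follows. The service name (myservice), the template file, and the configuration path are hypothetical, and the play assumes the service's package has already been installed:

```yaml
---
- name: initial rollout of a service
  hosts: all
  become: true
  tasks:
    - name: deploy the service configuration
      ansible.builtin.template:
        src: myservice.conf.j2        # hypothetical template
        dest: /etc/myservice.conf     # hypothetical path
      notify: restart myservice

    # Run the pending restart now, before the final task below
    - name: process pending handlers
      ansible.builtin.meta: flush_handlers

    - name: ensure the service is running
      ansible.builtin.service:
        name: myservice
        state: started

  handlers:
    - name: restart myservice
      ansible.builtin.service:
        name: myservice
        state: restarted
```

On the first run, the flushed handler brings the service up with the new configuration, so the final task finds it already running and changes nothing. Without the flush, the final task would start the service and the handler would then restart it at the end of the play.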
When Ansible parses a playbook, there are certain assumptions that can be made about the relative paths of items referenced by the statements in a playbook. In most cases, paths for things such as variable files to include, task files to include, playbook files to include, files to copy, templates to render, and scripts to execute are all relative to the directory where the file that is referencing them resides. Let's explore this with an example playbook and directory listing to demonstrate where the files are:
The directory structure is as follows:
.
├── a_vars_file.yaml
├── mastery-hosts
├── relative.yaml
└── tasks
    ├── a.yaml
    └── b.yaml
The content of a_vars_file.yaml is as follows:
---
something: "better than nothing"
The content of relative.yaml is as follows:
---
- name: relative path play
  hosts: localhost
  gather_facts: false
  vars_files:
    - a_vars_file.yaml
  tasks:
    - name: who am I
      ansible.builtin.debug:
        msg: "I am mastery task"
    - name: var from file
      ansible.builtin.debug:
        var: something
    - ansible.builtin.include: tasks/a.yaml
The content of tasks/a.yaml is as follows:
---
- name: where am I
  ansible.builtin.debug:
    msg: "I am task a"
- ansible.builtin.include: b.yaml
The content of tasks/b.yaml is as follows:
---
- name: who am I
  ansible.builtin.debug:
    msg: "I am task b"
The execution of the playbook is performed with the following command:
ansible-playbook -i mastery-hosts -c local relative.yaml
The output should be similar to Figure 1.9:
Figure 1.9 – The expected output from running a playbook utilizing relative paths
Here, we can clearly see the relative references to the paths and how they are relative to the file referencing them. When using roles, there are some additional relative path assumptions; however, we'll cover that, in detail, in a later chapter.
When Ansible parses a play, there are a few directives it looks for in order to define various behaviors for a play. These directives are written at the same level as the hosts: directive. Here is a list of descriptions for some of the more frequently used keys that can be defined in this section of the playbook:
any_errors_fatal: This Boolean directive is used to instruct Ansible to treat any failure as a fatal error to prevent any further tasks from being attempted. This changes the default, where Ansible will continue until all the tasks have been completed or all the hosts have failed.
connection: This string directive defines which connection system to use for a given play. A common choice to make here is local, which instructs Ansible to do all the operations locally but with the context of the system from the inventory.
collections: This is a list of the collection namespaces used within the play to search for modules, plugins, and roles, and it can be used to prevent the need to enter Fully Qualified Collection Names (FQCNs) – we will learn more about this in Chapter 2, Migrating from Earlier Ansible Versions. Note that this value does not get inherited by role tasks, so you must set it separately in each role in the meta/main.yml file.
gather_facts: This Boolean directive controls whether or not Ansible will perform the fact-gathering phase of the operation, where a special task will run on a host to uncover various facts about the system. Skipping fact gathering – when you are sure that you do not require any of the discovered data – can be a significant time-saver in a large environment.
max_fail_percentage: This number directive is similar to any_errors_fatal, but it is more fine-grained. It allows you to define what percentage of your hosts can fail before the whole operation is halted.
no_log: This is a Boolean to control whether or not Ansible will log (to the screen and/or a configured log file) the command given or the results received from a task. This is important if your task or return deals with secrets. This key can also be applied to a task directly.
port: This is a number directive that defines which port to use for SSH (or any other remote connection plugin) unless this is already configured in the inventory data.
remote_user: This is a string directive that defines which user to log in with on the remote system. The default setting is to connect as the same user that ansible-playbook was started with.
serial: This directive takes a number and controls how many systems Ansible will run the play against at one time. Each batch of hosts completes all the tasks in the play before the next batch begins. This is a drastic change from the normal order of operations, where a task is executed across every system in a play before moving to the next. This is very useful in rolling update scenarios, which we will discuss in later chapters.
become: This is a Boolean directive that is used to configure whether privilege escalation (sudo or something else) should be used on the remote host to execute tasks. This key can also be defined at a task level. Related directives include become_user, become_method, and become_flags. These can be used to configure how the escalation will occur.
strategy: This directive sets the execution strategy to be used for the play.
Many of these keys will be used in the example playbooks throughout this book.
For a full list of available play directives, please refer to the online documentation at https://docs.ansible.com/ansible/latest/reference_appendices/playbooks_keywords.html#play.
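As a brief illustration of how several of these keys sit alongside hosts: in a play, consider the following sketch. The webservers group and the deploy user are hypothetical, and the debug task is a placeholder for real work that would handle a secret:

```yaml
---
- name: rolling update with play behavior keys
  hosts: webservers            # hypothetical group
  remote_user: deploy          # hypothetical login user
  become: true
  gather_facts: false
  serial: 2                    # work through the hosts two at a time
  max_fail_percentage: 25      # halt if more than 25% of a batch fails
  tasks:
    - name: rotate an application secret
      ansible.builtin.debug:
        msg: "would rotate the secret on {{ inventory_hostname }}"
      no_log: true             # keep the task's details out of the output
```

With a batch size of two, a single failed host is 50 percent of the batch, which exceeds the threshold, so the play would stop after that batch.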
With the release of Ansible 2.0, a new way to control play execution behavior was introduced: strategy. A strategy defines how Ansible coordinates each task across the set of hosts. Each strategy is a plugin, and three strategies come with Ansible: linear, debug, and free. The linear strategy, which is the default strategy, is how Ansible has always behaved. As a play is executed, all the hosts for a given play execute the first task.
Once they are all complete, Ansible moves to the next task. The serial directive can create batches of hosts to operate in this way, but the base strategy remains the same. All the targets for a given batch must complete a task before the next task is executed. The debug strategy makes use of the same linear mode of execution described earlier, except that here, tasks are run in an interactive debugging session rather than running to completion without any user intervention. This is especially valuable during the testing and development of complex and/or long-running automation code where you need to analyze the behavior of the Ansible code as it runs, rather than simply running it and hoping for the best!
The free strategy breaks from this traditional linear behavior. When using the free strategy, as soon as a host completes a task, Ansible will execute the next task for that host, without waiting for any other hosts to finish.
This will happen for every host in the set and for every task in the play. Each host will complete the tasks as fast as they can, thus minimizing the execution time of each specific host. While most playbooks will use the default linear strategy, there are situations where the free strategy would be advantageous; for example, when upgrading a service across a large set of hosts. If the play requires numerous tasks to perform the upgrade, which starts with shutting down the service, then it would be more important for each host to suffer as little downtime as possible.
Allowing each host to independently move through the play as fast as it can will ensure that each host is only down for as long as necessary. Without using the free strategy, the entire set will be down for as long as the slowest host in the set takes to complete the tasks.
As the free strategy does not coordinate task completion across hosts, it is not possible to depend on the data that is generated during a task on one host to be available for use in a later task on a different host. There is no guarantee that the first host will have completed the task that generates the data.
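Selecting the free strategy takes a single play-level key. The following sketch mirrors the upgrade scenario described earlier, with debug tasks standing in for the real stop, upgrade, and start steps:

```yaml
---
- name: upgrade a service with the free strategy
  hosts: all
  gather_facts: false
  strategy: free
  tasks:
    - name: stop the service
      ansible.builtin.debug:
        msg: "would stop the service on {{ inventory_hostname }}"

    - name: upgrade the service
      ansible.builtin.debug:
        msg: "would upgrade the service on {{ inventory_hostname }}"

    - name: start the service
      ansible.builtin.debug:
        msg: "would start the service on {{ inventory_hostname }}"
```

Because each host proceeds independently, the output for different hosts may interleave rather than appear grouped task by task. None of these tasks should rely on data produced by another host.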
Execution strategies are implemented as plugins and, as such, custom strategies can be developed to extend Ansible behavior by anyone who wishes to contribute to the project.
The first thing that most plays define (after a name, of course) is a host pattern for the play. This is the pattern used to select hosts out of the inventory object to run the tasks on. Generally, this is straightforward; a host pattern contains one or more blocks indicating a host, group, wildcard pattern, or regular expression (regex) to use for the selection. Blocks are separated by a colon, wildcards are just an asterisk, and regex patterns start with a tilde:
hostname:groupname:*.example:~(web|db)\.example\.com
Advanced usage can include group index selections or even ranges within a group:
webservers[0]:webservers[2:4]
Each block is treated as an inclusion block; that is, all the hosts found in the first pattern are added to all the hosts found in the next pattern, and so on. However, this can be manipulated with control characters to change their behavior. The use of an ampersand defines an inclusion-based selection (all the hosts that exist in both patterns).
The use of an exclamation point defines an exclusion-based selection (all the hosts that exist in the previous patterns but are NOT in the exclusion pattern):
webservers:&dbservers: Hosts must exist in both the webservers and dbservers groups.
webservers:!dbservers: Hosts must exist in the webservers group but not the dbservers group.
Once Ansible parses the patterns, it will then apply restrictions if there are any. Restrictions come in the form of limits or failed hosts. This result is stored for the duration of the play, and it is accessible via the play_hosts variable. As each task is executed, this data is consulted, and an additional restriction could be placed upon it to handle serial operations. As failures are encountered, be it a failure to connect or a failure to execute a task, the failed host is placed in a restriction list so that the host will be bypassed in the next task.
If, at any time, a host selection routine gets restricted down to zero hosts, the play execution will stop with an error. A caveat here is that if the play is configured to have a max_fail_percentage or any_errors_fatal parameter, then the playbook execution stops immediately after the task where this condition is met.
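The intersection and exclusion patterns described earlier can be used directly as the hosts value of a play. This is a minimal sketch that assumes an inventory containing webservers and dbservers groups. The patterns are quoted so that the control characters are read as part of a plain string:

```yaml
---
# Hosts that are members of both groups
- name: hosts in webservers and dbservers
  hosts: "webservers:&dbservers"
  gather_facts: false
  tasks:
    - name: show the selected host
      ansible.builtin.debug:
        var: inventory_hostname

# Hosts in webservers, minus any that are also in dbservers
- name: hosts in webservers but not dbservers
  hosts: "webservers:!dbservers"
  gather_facts: false
  tasks:
    - name: show the selected host
      ansible.builtin.debug:
        var: inventory_hostname
```

A pattern can also be checked without running anything by passing it to the ansible command with the --list-hosts parameter. Remember to quote it on the command line, as characters such as ! and & are special to most shells.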
While not strictly necessary, it is a good practice to label your plays and tasks with names. These names will show up in the command-line output of ansible-playbook and will show up in the log file if the output of ansible-playbook is directed to log to a file. Task names also come in handy when you want to direct ansible-playbook to start at a specific task and to reference handlers.
There are two main points to consider when naming plays and tasks:
The names of the plays and tasks should be unique.
Beware of the kinds of variables that can be used in play and task names.
In general, naming plays and tasks uniquely is a best practice that will help to quickly identify where a problematic task could be residing in your hierarchy of playbooks, roles, task files, handlers, and more. When you first write a small monolithic playbook, names might not seem that important. However, as your use of and confidence in Ansible grows, you will quickly be glad that you named your tasks! Uniqueness is more important when notifying a handler or when starting at a specific task. When task names have duplicates, the behavior of Ansible could be non-deterministic, or at least non-obvious.