This updated edition of Learning Continuous Integration with Jenkins is your one-stop guide to implementing CI/CD with Jenkins, addressing crucial technologies such as cloud computing, containerization, Infrastructure as Code, and GitOps. Tailored to both beginners and seasoned developers, the book provides a practical path to mastering a production-grade, secure, resilient, and cost-effective CI/CD setup.
Starting with a detailed introduction to the fundamental principles of CI, this book systematically takes you through setting up a CI environment using Jenkins and other pivotal DevOps tools within the CI/CD ecosystem. You’ll learn to write pipeline code with AI assistance and craft your own CI pipeline. With the help of hands-on tutorials, you’ll gain a profound understanding of the CI process and Jenkins’ robust capabilities. Additionally, the book teaches you how to expand your CI pipeline with automated testing and deployment, setting the stage for continuous deployment. To help you through the complete software delivery process, this book also covers methods to ensure that your CI/CD setup is maintainable across teams, secure, and performs optimally.
By the end of the book, you’ll have become an expert in implementing and optimizing CI/CD setups across diverse teams.
You can read this e-book in Legimi apps or any app that supports the following format:
Page count: 498
Year of publication: 2024
Learning Continuous Integration with Jenkins
An end-to-end guide to creating operational, secure, resilient, and cost-effective CI/CD processes
Nikhil Pathania
BIRMINGHAM—MUMBAI
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Preet Ahuja
Publishing Product Manager: Suwarna Patil
Book Project Manager: Ashwini C
Senior Content Development Editor: Adrija Mitra
Technical Editor: Arjun Varma
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Hemangini Bari
Production Designer: Prafulla Nikalje
DevRel Marketing Coordinator: Rohan Dobhal
First published: May 2016
Second edition: December 2017
Third edition: February 2024
Production reference: 1090124
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK
ISBN 978-1-83508-773-2
www.packtpub.com
This book is dedicated to the spirited debaters of social media – your passionate, sometimes eyebrow-raising discussions on CI, CD, and DevOps have not only provided entertainment but also ignited the spark to pen down these pages. Here’s to dispelling myths and expanding knowledge, one Jenkins pipeline stage at a time!
– Nikhil Pathania
Nikhil Pathania is a tech expert with deep knowledge in the Software Development Lifecycle (SDLC) domain. His professional identity is shaped by his specialization in Agile methodologies, DevOps practices, cloud technologies, and container solutions. His significant contributions, particularly in implementing CI/CD frameworks at multinational corporations, underline his expertise in optimizing complex software development processes.
In his current role as a solutions architect at Arla Foods, Nikhil spearheads innovative projects in software development and data analytics, reflecting his keen insight into technological advancements and their practical applications.
I’m deeply grateful to Suwarna Rajput (Publishing Product Manager) for her vital role in reigniting the spark in this book project. Immense thanks to Ashwini Gowda (Book Project Manager) and Adrija Mitra (Content Development Editor) for their continuous support throughout the writing process; their constant motivation and belief in making this book better were invaluable.
Immense gratitude to Arjun Varma (Technical Editor), and to Werner Dijkerman and Aditya Soni (Technical Reviewers), for their meticulous review and insightful feedback on the book. Their expertise and constructive suggestions have been invaluable in enhancing the quality and depth of this work.
Special thanks to my wife, Karishma, whose support made the time for writing this book possible. Her belief in me was both my inspiration and foundation.
A lighthearted shoutout to the Danish 37-hour work week for its “unintentional” role in this book’s creation. It’s amazing how a few extra hours each week can lead to chapters of inspiration!
Werner Dijkerman is a freelance platform, Kubernetes (certified), and Dev(Sec)Ops engineer. He currently focuses on and works with cloud-native solutions and tools, including AWS, Ansible, Kubernetes, and Terraform. He focuses on infrastructure as code and monitoring the correct “thing,” with tools such as Zabbix, Prometheus, and the ELK stack. He has a passion for automating everything and avoiding doing anything that resembles manual work. He is an active reader of comics, self-care/psychology, and IT-related books. He is also a technical reviewer of various books about DevOps, CI/CD, and Kubernetes.
A big shoutout to the best software architect in the Netherlands: Ernst Vorsteveld!
Aditya Soni is a DevOps/Site Reliability Engineering (SRE) tech professional who has taken an inspiring journey with technology and achieved a lot in a short period of time. He has worked with product- and service-based companies including Red Hat and Searce, and is currently positioned at Forrester as a DevOps Engineer II. He holds AWS, GCP, Azure, RedHat, and Kubernetes certifications. He mentors and contributes to the open source community. He has been a CNCF ambassador and AWS Community Builder for four years. He leads AWS, CNCF, and HashiCorp user groups for Rajasthan State in India. He has spoken at many conferences, both in-person and virtual. He is a hardcore traveler who loves food, exploring, and self-care, and shares stories on social media as @adityasonittyl.
I would like to thank my parents and family who support me in various ways in my busy schedule. Taking the time to review the book wouldn’t have been easy without them. Thanks to my friends, who always cheer me up and give me confidence whenever I feel lost. Last but not least, thanks to the open source communities, mentors, managers, and co-workers from the initial days to now, who have helped me along my path.
Dear readers, welcome to the newly updated Learning Continuous Integration with Jenkins – Third Edition! Before diving into the book, I invite you to spend a few moments with this Preface. It’s not just a formality; it’s like a friendly chat before embarking on an exciting exploration. Here, you will get a sneak peek into the book’s purpose, the author’s perspective, and the overall tone and style. It’s like getting to know your guide before setting off on an expedition.
When the second edition of Learning Continuous Integration with Jenkins was published in 2017, the landscape of Continuous Integration/Continuous Deployment (CI/CD) was noticeably different from today’s. As 2024 approaches, it’s become clear that the practices and tools I discussed have evolved substantially, making many sections of the previous edition less relevant. This shift in the CI/CD realm inspired me to write this updated edition, bridging the gap between then and now, especially in terms of Jenkins’ application.
Jenkins itself has undergone significant evolution. The advent of Jenkins Configuration as Code (JCasC) and the trend toward deploying Jenkins using Helm charts on Kubernetes exemplify these major shifts. While the core syntax of Jenkins pipeline code has remained mostly stable, the ecosystem surrounding CI/CD has been transformed. The emergence of GitOps, the heightened focus on software composition analysis, and the trend toward container-based applications underscore a broader shift, from monolithic to more modular and scalable architectures.
Another catalyst for this edition has been the confusion and misinformation I’ve witnessed on social media about Agile, CI/CD, GitOps, Infrastructure as Code (IaC), and DevOps. Misinterpretations and partial truths, often spread by less experienced professionals, have clouded the understanding of those new to these practices. This edition aims to dispel these myths, offering a lucid and comprehensive guide to CI/CD methodologies. It focuses not only on tools but also on the principles and practices essential for successful implementation.
The primary aim of this book is to instill a foundational understanding of CI/CD. The book goes beyond explaining CI and CD as mere technical processes, delving into the concepts and circumstances that gave rise to CI/CD practices. It emphasizes a comprehensive grasp of the underlying principles before exploring the tools, focusing on what CI/CD is, why it is crucial, and how it is implemented, with a particular emphasis on its core elements.
As the book progresses, the content becomes more technical, yet it maintains a straightforward approach. You will learn about the efficient and secure deployment of a container-based modular application through the CI/CD process, from development to production, with an emphasis on testing. You will also explore how Jenkins integrates within the broader DevOps ecosystem, working in tandem with various tools.
A critical lesson the book imparts is the “fail fast and shift left” approach. It underscores the importance of encountering and addressing failures in the development or testing stages, rather than in production. This mindset shift is vital – it’s not about preventing failures in production but about ensuring rapid recovery when they occur.
Finally, the book’s purpose extends beyond merely enhancing your knowledge. It is also designed to train your mind, equipping you to think critically and adaptively in the fast-evolving world of DevOps and CI/CD. This dual focus ensures that you are not only well informed but also strategically adept in applying CI/CD principles effectively in real-world scenarios.
In today’s rapidly evolving technological realm, the concept of “continuous” – originally derived from Agile methodologies – has significantly broadened its scope. This concept now underpins a variety of practices, such as continuous testing, continuous improvement, and continuous thinking. Moreover, the principles of CI and CD have expanded their influence beyond traditional software development, finding relevance in fields such as Machine Learning Operations (MLOps).
This book is essential reading in this broadened context. It offers a comprehensive demystification of the core principles of CI and CD, principles that are increasingly relevant across various disciplines. The book delves deeply into the concepts of failing fast, shifting left, and delivering small, frequent, and faster iterations. These are not just software development strategies but also universal principles that are now being applied in areas such as ML and financial operations.
As these principles continue to be adopted in diverse sectors, there is a pressing need for a clear, comprehensive guide that elucidates not just the “how” but also the “why” behind these practices. This book meets that need, providing you with an in-depth understanding of the foundational principles underpinning modern CI/CD practices and their broader applications.
In this updated edition, you’ll discover revised content reflecting the latest developments in Jenkins configurations and deployment strategies. It offers practical insights into scaling Jenkins both horizontally and vertically, delving deeper into the evolving CI/CD ecosystem. Whether you’re new to Jenkins or seeking to refresh your knowledge, this book aims to guide you through the dynamic landscape of CI/CD with clarity and expertise.
The creation of this book is the culmination of extensive research and deep personal experience in software configuration management and DevOps, dating back to 2008. My journey in writing this book began with a thorough exploration of the world before Agile, delving into methodologies such as Waterfall and extreme programming. Understanding these historical contexts was crucial in comprehensively addressing the evolution of software development practices.
A significant focus of my research was on the transition from GUI-based configurations to the concept of configuration as code and pipeline as code, as well as the evolution of Version Control Systems (VCSs). These developments represent fundamental shifts in our approach to software configuration and management, and understanding them is key to grasping the principles of CI and CD.
Every technical aspect discussed in this book has been rigorously tested in practice before being committed to paper. This hands-on approach ensures that the content is not only theoretically sound but also practically applicable. The book places a particular emphasis on cloud and container technologies, acknowledging their growing prominence and critical role in modern software development.
Additionally, the book explores the emerging realm of AI tools in DevOps, illustrating how they can be utilized to write pipeline code. This inclusion reflects my commitment to staying abreast of the latest technological advancements and ensuring that the book’s content remains relevant and forward-thinking.
To enhance understanding and clarity, this book is enriched with a wealth of illustrations and screenshots. These visual aids are designed to make complex concepts more accessible and to provide you with a clearer picture of practical applications. May your experience with Learning Continuous Integration with Jenkins be both enlightening and rewarding, and I trust this book will prove to be an invaluable resource in your professional journey.
This book is designed for a diverse audience, from university students studying Agile software development to seasoned developers, testers, release engineers, and project managers. It offers a comprehensive guide to mastering CI and CD using Jenkins. If you’re already using Jenkins for CI, you can take your project to the next level – CD. Whether you’re a novice to the concepts of Agile and CI/CD or a DevOps engineer seeking advanced insights into JCasC, IaC, or Azure, this resource equips you with the tools to harness Jenkins for improved productivity and streamlined deliveries in the cloud.
You are expected to possess a fundamental understanding of software development processes, although in-depth knowledge is not a prerequisite. You should be familiar with basic concepts, such as writing code and the importance of testing, although a comprehensive grasp of more complex software engineering aspects is not necessary. A basic familiarity with version control, especially Git, would be beneficial, since CI/CD processes are intimately linked with source code management, including tasks such as committing changes, creating branches, and merging code.
Having some grounding in programming, even at a rudimentary level, would be advantageous. You don’t need to be an expert coder, but being able to write and understand simple code in at least one programming language is helpful. While you might understand why testing is crucial in software development, detailed knowledge of automated testing methods or tools is not required.
Chapter 1, The What, How, and Why of Continuous Integration, delves into a comprehensive introduction to CI, guided by the Golden Circle theory. This approach helps us to unravel the “what,” “how,” and “why” of CI. Our primary focus is to define the practice of CI, understand its key principles, and learn the essential elements required to achieve it. We will also explore the reasons behind the practice of CI.
Chapter 2, Planning, Deploying, and Maintaining Jenkins, guides you through planning, deploying, and maintaining a Jenkins server. The aim is to design and deploy a Jenkins setup that is resilient, cost-effective, secure, high-performing, and operational. The chapter starts by examining the Jenkins server architecture, and then it evaluates various deployment scenarios against the Well-Architected Framework. It focuses on the two most popular deployment methods for Jenkins, guiding you through their implementation step by step. This process will integrate crucial DevOps practices, including IaC and JCasC. The chapter also covers the essential aspects of Jenkins server maintenance.
Chapter 3, Securing Jenkins, examines the key aspects of securing Jenkins. Here, the vital measures to enhance the security around who gets to do what on your Jenkins instance are explored. Firstly, the chapter delves into user authentication and permissions by integrating Jenkins with Azure Active Directory (AD). After that, it goes through Cross-Site Request Forgery (CSRF) protection settings inside Jenkins. Lastly, it explores the powerful Jenkins Credentials feature, which allows for secure storage and usage of sensitive information, such as passwords, API keys, and certificates.
Chapter 4, Extending Jenkins, explores the expansive world of Jenkins enhancements, enabling you to tailor its functionalities for specific needs such as CI. Enhancing Jenkins for CI demands the integration of additional tools and services, such as SonarQube, Artifactory, and a VCS, which is what the chapter is all about.
Chapter 5, Scaling Jenkins, teaches you how to scale Jenkins horizontally on the cloud with dynamically produced build agents, using both Virtual Machines (VMs) and containers on an AKS cluster. Both solutions allow organizations to leverage the strengths of each approach. VMs provide flexibility and compatibility with existing infrastructure, while containers offer efficient resource utilization and faster deployment times.
Chapter 6, Enhancing Jenkins Pipeline Vocabulary, explores learning Jenkins pipeline code syntax. The chapter’s aim is to prepare you for the use of AI to write a Jenkins pipeline. To achieve this, the chapter begins with an introduction to the Jenkins pipeline syntax. Next, you will learn about the core and add-on building blocks of pipeline code. The focus is mainly on the structure and skeleton of a pipeline. It also teaches you some internal Jenkins tools to construct pipeline code.
Chapter 7, Crafting AI-Powered Pipeline Code, delves into using ChatGPT to write pipeline code. In this chapter, we embark on an enlightening journey into the world of Artificial Intelligence (AI), with a special focus on ChatGPT, a renowned AI model. As we navigate the evolving landscape of AI, you’ll be equipped with the knowledge to harness ChatGPT to aid in the construction of pipeline code.
Chapter 8, Setting the Stage for Writing Your First CI Pipeline, focuses on planning for CI and understanding the high-level CI design. The chapter begins by explaining the software projects that will be used for CI. Next, we will learn to configure tools such as SonarQube for code quality and Artifactory for Docker image storage. These platforms are vital in our CI pipeline. By the end of the chapter, you’ll have grasped how to analyze a software project for CI, comprehend its architecture, and develop a CI design. We’ll also touch on setting real-time CI triggers through webhooks.
Chapter 9, Writing Your First CI Pipeline, logically rounds up everything you learned in the previous chapters. It’s a step-by-step, hands-on guide that will teach you to create a CI pipeline in Jenkins. You will start by writing CI pipeline code stage by stage. At the end of the chapter, we will take a walkthrough of the CI pipeline run, using the Jenkins Blue Ocean interface.
Chapter 10, Planning for Continuous Deployment, delves into understanding CD. You will be introduced to the concepts of CD and its elements, including GitOps. Subsequently, you will be acquainted with a high-level CD design. This is followed by setting up the Argo CD tool, establishing staging and production environments on Azure Kubernetes Service (AKS), and undertaking other essential steps to run a CD pipeline with Jenkins.
Chapter 11, Writing Your First CD Pipeline, methodically steers through the process of developing a full-fledged CD pipeline. Through the automated CD pipeline, you will master the act of updating an application’s Helm chart on GitHub, consequently triggering a deployment in the staging environment. Post-deployment, the pipeline autonomously monitors the application’s status in staging, runs performance tests, and, upon validation, facilitates further Helm chart modifications on GitHub to initiate deployment in the production environment.
Chapter 12, Enhancing Your CI/CD Pipelines, explores techniques to enhance your CI and CD pipelines. It introduces GitHub Copilot, an AI tool that refines Jenkins pipeline code development, aiming for smarter coding, fewer errors, and faster development. It also discusses Jenkins Shared Libraries, which centralize common code patterns, simplifying the management of multiple pipelines. Additionally, strategies are provided to handle and remove old Jenkins builds, ensuring system optimization. Furthermore, using JFrog Xray, the chapter demonstrates how you can integrate automated security scans in your Jenkins pipeline, guaranteeing not just functional but also secure code deployment.
This book thoroughly covers concepts and procedures executed on the Azure cloud, eliminating the need for extensive setups on your computer. There’s no requirement to install Docker Desktop, any VMs, or a Kubernetes cluster on your local machine. All you need is an Azure subscription and certain CLI tools on your laptop. Here is a concise list:
Tool/subscriptions | OS
An Azure subscription | NA
Visual Studio Code (minimum 1.84.2 or the latest) | Windows, macOS, or Linux
The Azure CLI (minimum 2.49.0 or the latest) | Windows, macOS, or Linux
Helm (minimum v3.12.1 or the latest) | Windows, macOS, or Linux
Git (minimum 2.41.0 or the latest) | Windows, macOS, or Linux
Terraform (minimum v1.5.2 or the latest) | Windows, macOS, or Linux
kubectl (minimum v1.27.3 or the latest) | Windows, macOS, or Linux
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
This book guides you through the process of creating CI and CD pipelines for a container-based, modular, three-tier “Hello World” web application. Once you’ve mastered the concepts and techniques presented, I encourage you to apply your knowledge to a more complex software project. Delve deeper into the intricacies of each DevOps tool’s configuration within the CI and CD platform. Explore the possibilities of configuring Jenkins entirely using JCasC. On the testing front, challenge yourself by integrating more advanced testing methods into your workflow.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Learning-Continuous-Integration-with-Jenkins_Third-Edition. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Now, create a Kubernetes cluster using the az aks create command with the -g option, placing it inside the resource group we created in the previous step. Use the -n option to give your Kubernetes cluster a name.”
A block of code is set as follows:
configuration-as-code
git
sonar

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
jenkins:
  systemMessage: "Jenkins configured automatically by Jenkins Configuration as Code plugin\n\n"
  securityRealm:
    local:
      users:
        - id: jenkins-admin
          password: password

Any command-line input or output is written as follows:
az aks create -g rg-nikhil-sbox -l westeurope \
  -n kubernetes-dev --tier free --node-count 1 \
  --enable-cluster-autoscaler --min-count 1 --max-count 3 \
  --network-plugin kubenet --generate-ssh-keys

The commands presented in this book are tested for execution within a PowerShell command-line shell. While alternative command-line shells may be utilized, some modifications to the commands may be necessary.
Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “From the dashboard, click on Azure Active Directory, and then select App registrations from the menu.”
Tips or important notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you’ve read Learning Continuous Integration with Jenkins, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781835087732
Submit your proof of purchase. That’s it! We’ll send your free PDF and other benefits to your email directly.

Consider a factory with a huge production line that snakes across the floor like a metallic river. Each portion of this assembly line flows flawlessly into the next, but there are also moments when components falter and processes stutter here and there. The actual marvel of modern production, however, is not flawless execution but intelligent contingency plans, the ability to self-correct, and the relentless quest for efficiency. Mistakes are not only problems; they are also chances for improvement and growth. Continuous Integration (CI) represents this same resilient attitude in the digital art of software development.
The blueprint for our own sophisticated CI manufacturing line is the first step in our journey. We use the Golden Circle theory to unravel the “what,” streamline the “how,” and distill the “why” of CI. Before we roll up our sleeves and begin tinkering with the tools and approaches in the upcoming parts, let us first become acquainted with the CI blueprint. In this part, we describe the CI process, become acquainted with its essential concepts, and prepare to answer why we do CI at all. Just as engineers on the assembly line understand that each turn of the screw is critical, you will learn how each facet of CI is critical for the smooth flow of the development life cycle.
This part has the following chapter:
Chapter 1, The What, How, and Why of Continuous Integration

In this chapter, a comprehensive introduction to continuous integration (CI) is presented using the Golden Circle theory (see [1] in the Further reading section at the end of the chapter) that will help us in answering the what, how, and why of CI.
Our focus here is primarily on defining the practice of CI, understanding its key principles, learning the elements required to achieve it, and lastly, answering why we practice CI.
The practical implementation of CI is, however, presented in detail throughout the remaining chapters of the book.
After completing this chapter, you should be able to do the following:
- Explain what continuous integration is
- Learn how to practice continuous integration
- Understand why continuous integration is crucial

In this section, we will try to answer what CI is in two basic steps. We will first attempt to define CI, and then examine its key principles.
CI is a software development practice in which developers regularly integrate their code changes into the project’s shared work area and inspect their changes for build issues and other quality-related problems.
Integration is the act of submitting your modified code to the project’s main branch (the potential software solution). This is technically done by merging your feature branch to the main branch in a trunk-based development (a Git-based branching strategy) and, additionally, by merging the feature branch into the development branch in a Gitflow Workflow-based development (another Git-based branching strategy), as shown in the following figure:
Figure 1.1: The process of frequently integrating your changes with the main code
CI is necessary to detect and resolve issues encountered during and after integration as early in the cycle as possible. This can be understood from Figure 1.2, which depicts various issues encountered during a single CI cycle:
Figure 1.2: The CI process
When a developer commits and pushes (merges) their code changes to the remote branch, they may experience merging difficulties, if not build failures, failed unit tests, or failed integration tests.
A merging problem can emerge if developers do not routinely pull (rebase) the code from the remote trunk branch onto their local branch. Similarly, the code may fail to compile owing to compilation issues, and failing tests indicate a defect, which is a positive thing. If any of these issues occur, the developer must adjust the code to correct it.
Now that we’ve defined CI, let’s look at the key ideas that underpin it. Some of it, I suppose, you’ve already guessed. Let’s dive in.
In the context of CI, a pull simply updates your local branch with the most recent code changes from the remote branch, while a push publishes your code changes to the remote branch.
Git, the popular distributed version control system, inspired the notion of pull/push. However, prior to the popularity of Git, the terms rebase and merge were more prominent. Nevertheless, let’s understand why frequent pulls promote CI.
Working on your local branch for an extended period without often pulling changes is handy, but it is also quite likely to result in a flurry of merging issues. This occurs when, in big development teams, hundreds of code changes are submitted to the remote repository each day, considerably altering it and raising the likelihood of merge conflicts for a developer who seldom clicks the pull button.
On the other hand, such situations are rare in teams that frequently pull and push code changes. This is the most important principle of CI as it helps avoid merge hell.
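To make this concrete, here is a minimal, hedged sketch of a Jenkins pipeline stage that checks whether a feature branch still merges cleanly with the trunk before anything else runs; the remote and branch names are illustrative:

    stage('Verify merge with trunk') {
        steps {
            // Attempt a throwaway merge of the trunk into the feature branch;
            // a conflict fails this stage and stops the pipeline early.
            sh '''
                git fetch origin main
                git merge --no-commit --no-ff origin/main || { git merge --abort; exit 1; }
            '''
        }
    }

Developers achieve the same effect locally by pulling (rebasing) from the trunk frequently before pushing.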
You validate your code change for compilation issues, failing unit tests, failing integration tests, and other quality checks. You do this first in your local workspace (laptop), and then on the build server when you push changes to the remote repository.
This practice makes sure that every commit is validated to see its impact on the system. Afterward, make sure to share the results with the team. The idea is to get instant feedback on the changes that have been made.
We think the following idea is based on Murphy’s law, which states that “anything that can go wrong will go wrong.” Building on that, in the context of the software development life cycle, we may argue that if failure is unavoidable, it is preferable to fail early in development rather than later in production. Pulling code changes often and verifying each code change enhances the likelihood of identifying merge problems and functionality and quality concerns early, hence failing faster.
Let us understand the idea of fail fast a bit differently. The concept of fail fast advocates that all the pipeline stages, such as validating merge issues, checking for compilation issues, unit testing, integration testing, static code analysis, and so on, must be performed sequentially for a particular code change rather than simultaneously. This is because building the code is pointless if there are merge issues; similarly, running unit tests is pointless if the build fails, and so on and so forth. In other words, stop as soon as you find there is an issue and try something else.
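As a minimal sketch (assuming a Maven-based Java project), a declarative Jenkins pipeline naturally encodes this fail-fast ordering, because its stages run sequentially and the first failing stage aborts the rest:

    pipeline {
        agent any
        stages {
            stage('Build') {
                steps { sh 'mvn -B compile' }   // pointless to test if this fails
            }
            stage('Unit Tests') {
                steps { sh 'mvn -B test' }
            }
            stage('Integration Tests') {
                steps { sh 'mvn -B verify' }
            }
            stage('Static Code Analysis') {
                steps { sh 'mvn -B sonar:sonar' }   // illustrative; requires SonarQube configuration
            }
        }
    }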
Automation acts as a catalyst. From finding failures to successfully publishing your application, automation accelerates everything. Automate anything you can, whether it’s code compilation, unit testing, integration testing, static code analysis, packaging the finished app or service, and so on.
Testing is the most time-consuming and repetitive of these operations; thus, automating the testing process may considerably boost the speed of software delivery. The testing process, on the other hand, is more difficult to automate than the build, release, and packaging phases.
It usually takes a lot of effort to automate nearly all the test cases used in a project. It is an activity that matures over time. Hence, when beginning to automate the testing, we need to take a few factors into consideration. Test cases that are of great value and easy to automate must be considered first. For example, automate the testing where the steps are the same but they run every time with different data. Also, automate the testing where a software functionality is tested on various platforms. Additionally, automate the testing that involves a software application running with different configurations.
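For instance, the declarative matrix directive in Jenkins runs the same test stage across several platforms; a hedged sketch, assuming build agents labeled linux and windows exist:

    pipeline {
        agent none
        stages {
            stage('Cross-platform tests') {
                matrix {
                    axes {
                        axis {
                            name 'PLATFORM'
                            values 'linux', 'windows'
                        }
                    }
                    stages {
                        stage('Test') {
                            agent { label "${PLATFORM}" }
                            steps {
                                // Same test suite, different platform per matrix cell
                                echo "Running tests on ${PLATFORM}"
                            }
                        }
                    }
                }
            }
        }
    }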
Later sections will discuss the significance of a CI tool in automating and orchestrating the CI process.
The software development process is analogous to a factory assembly line. The delivery of new features and upgrades flows from left to right. The far left of the pipeline is where development takes place, and the far right is where your application is operating in production. As depicted in the following diagram, consider such an assembly line with a single feedback loop flowing from the operations team to the development team:
Figure 1.3: Example of a late and slow feedback loop
Because there is a single information flow moving backward in such a case, the developers will never know whether their modifications have produced a well-functioning application until the application is launched. Another thing to keep in mind is that the time it takes for the feedback to reach developers equals the time it takes to bring a feature from development to deployment.
One such method with limited feedback loops is the waterfall model of software development. In this model, the first feedback occurs when all the code is integrated, the second when the code is tested, and the final when it is put into production. Nothing tangible can be done quickly when the operations team complains about a failure in production whose code was created a year earlier. We require more immediate feedback.
And that is the aim of continuous feedback: to have more and more feedback loops in our software development life cycle that are closer together and happen faster. One such example is the practice of peer review, which often occurs throughout the development process. During peer review, the proposed modifications are assessed by a senior developer to determine whether the changes will function as intended, as shown in the following figure:
Figure 1.4: Example of a continuous feedback loop
Continuous feedback from all phases of the software development life cycle should be provided with as much clarity as feasible. Moreover, if your software development process is heavily automated, it further helps in achieving continuous feedback.
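In Jenkins, one simple way to tighten the feedback loop is a post section that notifies the team the moment a run fails; a minimal sketch, with the recipient address being illustrative:

    pipeline {
        agent any
        stages {
            stage('Build and Test') {
                steps { sh 'mvn -B verify' }
            }
        }
        post {
            failure {
                // Feedback reaches developers immediately, not when QA or operations complain
                mail to: 'dev-team@example.com',
                     subject: "FAILED: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                     body: "Inspect the run at ${env.BUILD_URL}"
            }
        }
    }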
This brings us to the end of this section. We’ve learned the definition of CI and its key principles; now, let’s look at how to put these principles into action in the next section.
To understand how CI is practiced, you must first realize that it is a constantly evolving ecosystem of tools and technologies, and this section will go through most of them. At the end of this part, it will be obvious how these separate components and their associated tools contribute to the realization of the core concepts of CI outlined previously. Besides, the subsequent chapters of the book are a detailed, thorough elaboration on the practices presented here. As a result, the current section is crucial.
The topics discussed in the following sections are structured sequentially and in accordance with the various stages of an automated CI pipeline:
Figure 1.5: Various stages of a CI pipeline
So, let’s start with the first and foremost aspect of CI: the version control system.
A version control system helps the team in controlling changes to the project’s source code. It does this by monitoring every code change made to the source code, resulting in a detailed history of who modified what at any given time. It enables developers to work in their own workspaces, providing isolation and, at the same time, offering a means for collaboration and conflict resolution. Most significantly, it helps you to maintain a single source of truth for your project.
Everything that is necessary for delivering the software application should ideally be version-controlled and tracked, including software dependencies, which are tracked in a slightly different way using a binary repository manager (a DevOps tool for managing all the artifacts, binaries, files, and container images throughout the software development life cycle of your application), as we shall see later in this chapter. You can even expand on this idea by versioning code to manage and synchronize infrastructure configurations and application deployments. This is known as GitOps.
A version control tool offers features such as tags to mark important milestones in the software development life cycle, and branches to enable independent development that will ultimately be merged back to where it came from.
Branching allows parallel development. It allows developers in a team to work on individual versions of the code at the same time. To understand this, consider a software development project that has all its source code stored in a Git repository with a single main/master branch. Assume there is a requirement to deliver two new features and two different developers from the team are picked to work on them independently, without interfering with each other’s work. The solution is that they would use branches. Each developer would create their own branch and work on it in solitude. When they have completed their work and the CI pipeline running on their separate branches is green, the changes are merged back into the main branch.
Though there are no restrictions on how branches can be utilized, it is recommended that certain industry-standard branching methods be employed. Gitflow Workflow and, more recently, the trunk-based workflow, are two such branching strategies.
Gitflow is a more restrictive development paradigm in which not everyone has the privilege to make modifications to the main source. This keeps bugs to a minimum and code quality at the highest level.

On the other hand, because all developers have access to the primary code, trunk-based development is a more permissive, and therefore riskier, branching strategy. Nevertheless, it allows teams to iterate fast.
Gitflow Workflow is a branching methodology that utilizes multiple branches to organize source code. In the workflow shown in the following diagram, the main/master branch is kept clean and only contains releasable, ready-to-ship code. All development takes place on the feature branches, with the development branch acting as a central location for integrating all features.
Figure 1.6 depicts Gitflow Workflow. As you can see, we have release branches that are taken out from the development branch when there is a stable release. All bug fixes for a release are performed on the release branch. There is also a hotfix branch that is created from the main/master branch when a hotfix is required:
Figure 1.6: Gitflow Workflow
The trunk-based branching technique encourages the use of the main/master branch as the single location to track all significant milestones of a software development life cycle. You may still create tiny, temporary feature branches from the main/master branch, but you must merge them into the main/master branch as soon as possible.
Someone who has been in the IT industry for more than a decade will find it amusing to see that the trunk-based workflow is popular again. When version control and CI were new concepts, the trunk-based workflow was what most teams used; then, Gitflow Workflow became popular for good reasons and remained so for a very long time, only to be overthrown by the trunk-based workflow again.
In the trunk-based branching model, just one CI/CD pipeline that tracks the main/master branch is required. When a commit occurs, the CI/CD pipeline is activated, producing a potentially releasable build. Any commit to the main/master branch might be a release candidate, a hotfix, or a new feature.
Because everyone is permitted to merge changes to the main/master branch, a strict code review procedure is required. This methodology is better suited to high-performance contemporary software projects where the entire CI/CD process is automated from start to finish. Microservice-based software development projects, in particular, can benefit greatly from the trunk-based workflow since they have a faster CI/CD process. Monolithic-based projects, on the other hand, would benefit more from Gitflow Workflow.
Because this branching model lacks feature branches, feature flags can be used instead. Feature flags, also known as feature toggles, are a software development approach that allows you to publish software programs that have dormant features that may be toggled on and off during runtime.
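A minimal feature-flag sketch in Groovy (the flag name and both code paths are purely illustrative) shows how dormant code can ship on the trunk and be toggled at runtime:

    // Both code paths live on the trunk; only the flag decides which one runs.
    def runLegacyCheckoutFlow() { println 'Running the legacy checkout flow' }
    def runNewCheckoutFlow()    { println 'Running the new checkout flow' }

    // FEATURE_NEW_CHECKOUT is a hypothetical runtime toggle, here an environment variable.
    boolean newCheckoutEnabled = System.getenv('FEATURE_NEW_CHECKOUT') == 'true'

    if (newCheckoutEnabled) {
        runNewCheckoutFlow()
    } else {
        runLegacyCheckoutFlow()
    }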
The first and foremost step of any CI pipeline is to check whether the code changes are compilable. Basically, we try to see whether we can generate a working binary that can be tested and deployed. If the code base happens to be in an interpreted language, then we check whether all the dependencies are present and packed together. There is no point moving any further in the CI process if the code changes fail to compile.
The process of CI advocates that you compile every code change, trigger a notification (feedback loop), and fail the whole pipeline run in the event of a failure. As such, the duration of the build run (build time) plays a critical factor in determining the effectiveness of the CI process. If the build time is long (say, several hours), it becomes impractical to build every code change on an infrastructure limited by cost and availability. Automation and CI have no value if pipelines take hours to complete.
Modularizing your application is an effective way to counter long build times.
Remember the principle of continuous feedback: “to have more and more feedback loops in our software development life cycle that are closer and faster.” Unit testing is one such feedback that is closer and faster to the development stage.
Without unit testing, the next level of testing is the long-running integration testing stage, followed by the QA stage from the continuous delivery/deployment pipeline. It will normally take some time for your build to reach the QA engineering team, who will usually raise a defect if they identify an issue. In summary, if unit testing is used, a lengthy feedback cycle can be avoided.
Unit tests enable you to test the smallest testable portion of your program faster. If your source code is well written, unit tests are easy to write and easy to resolve when they fail.
Code coverage is an important CI metric and it’s defined as the proportion of code covered by unit tests. Having a high code coverage score is essential since detecting and resolving defects in QA or production is both time-consuming and expensive.
Any piece of code that lacks a matching unit test is a possible risk; therefore, having a greater code coverage score is crucial. It’s also worth noting that code coverage percentage does not ensure the functionality of the code covered by unit tests.
The percentage of code coverage is calculated during unit testing. Almost every programming language has a coverage tool. However, the findings are mostly uploaded to a static code analysis tool such as SonarQube, where they are shown on a lovely dashboard among other data concerning code quality.
Static code analysis, sometimes known as white-box testing, is a type of software testing that searches for structural features in the code. It determines how resilient or maintainable the code is. Static code analysis is conducted without actually running programs. It differs from functional testing, which explores the functional features of the software and is dynamic.
Static code analysis is the evaluation of software’s inner structures. It doesn’t question the code’s functionality. A static code analysis tool such as SonarQube comes with a dashboard to show various metrics and statistics of each CI pipeline run. Static code analysis is triggered every time a build runs after the unit testing. SonarQube supports many languages, such as Java, C/C++, Objective-C, C#, PHP, Flex, Groovy, JavaScript, Python, PL/SQL, COBOL, and so on.
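In a Jenkins pipeline, the analysis stage typically looks like the following hedged sketch; it assumes a SonarQube server registered in Jenkins under the name sonarqube and a Maven project:

    stage('Static Code Analysis') {
        steps {
            // withSonarQubeEnv injects the server URL and token configured in Jenkins
            withSonarQubeEnv('sonarqube') {
                sh 'mvn -B sonar:sonar'
            }
        }
    }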
When you run static code analysis on your source code, the SonarQube tool generates a report containing important metrics concerning the quality of your code. These metrics are grouped into explicit sections: complexity, duplications, issues, maintainability, reliability, security, size, and tests [2]. Every static code analysis report is published to its own project-specific dashboard on SonarQube, branch by branch. However, the measurements are shown collectively for the project as a score or a number, or by using a time series graph.
Talking about metrics, let’s look at one such section: tests. This covers a variety of metrics relating to code coverage and unit testing, such as the following:
- Uncovered lines: The number of source code lines that are not covered by the unit tests
- Unit tests: The number of unit tests
- Unit test duration: The collective duration of all unit tests
- Unit test errors: The number of failed unit tests

And the list goes on.
Security is another key quality metric section. It includes metrics such as vulnerabilities (the total number of security problems discovered). Remember the Apache Log4j security flaws? This is where you’d detect and stop them before they go any further.
Quality profiles are collections of rules that you choose to apply during static code analysis. This can be done from the SonarQube dashboard. Quality profiles are language-specific, hence you would usually see more than one quality profile assigned to a project in SonarQube.
The issues you see for a particular static code analysis run are generated based on the SonarQube rules they broke. The SonarQube rules [3] are categorized as follows:
Rule Type | Quality Metric Section
Code smell | Maintainability
Bug | Reliability
Vulnerability | Security
Security hotspot | Security
Table 1.1: The SonarQube rule types mapped against their corresponding quality metrics
Quality profiles can have rules added or removed. However, this is normally done by a specific governance team from an organization’s cyber security department, which oversees the evaluation of new and old rules on a regular basis and builds custom quality profiles to be utilized by development teams.
So far, we’ve looked into quality metrics, profiles, and rules. But how can we turn this insight into action? The answer is quality gates. A quality gate serves as a checkpoint through which a pipeline flow must pass. Quality gates are, in practice, combinations of specific conditions. An example of a quality gate condition can go as follows: the number of bugs is less than 10, and the percentage of code coverage is greater than 95%.
CI pipelines can be configured to pass or fail based on the results of the quality gate check. In the subsequent chapters of the book, you will learn how to achieve this in practice. Quality gates are normally decided and applied by development teams. The following diagram depicts the various stages of a CI/CD pipeline, emphasizing the crucial function of the Quality Gate as a pivotal checkpoint where code must meet set quality standards before it can proceed to deployment.
Figure 1.7: SonarQube Quality Gate
If code fails to meet these standards at the Quality Gate, it is returned for additional development and improvement, guaranteeing that only high-quality, thoroughly tested software advances to the Continuous Deployment phase for release and operation.
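With the SonarQube Scanner plugin for Jenkins, this checkpoint can be expressed as a dedicated stage; a minimal sketch that aborts the pipeline when the gate fails:

    stage('Quality Gate') {
        steps {
            // Wait for SonarQube to report the quality gate result back (via webhook);
            // time out rather than block the pipeline forever.
            timeout(time: 10, unit: 'MINUTES') {
                waitForQualityGate abortPipeline: true
            }
        }
    }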
CI pipelines generate binary artifacts that are deployable apps and services. When you run the pipeline for every commit made to the trunk branch, these artifacts accumulate much more quickly. As a result, the built packages must be saved and managed somewhere for subsequent usage, and this is usually done using a binary repository manager. But what exactly is a binary repository manager?
A binary repository manager, such as JFrog Artifactory, is a system for storing, managing, and promoting built artifacts by incorporating metadata. The metadata attached to every artifact is a group of properties.
In addition to storing built packages, the binary repository manager is also responsible for keeping track of dependencies that go into building your project. Let’s see some of the features of a binary repository manager.
A binary repository manager, such as JFrog Artifactory, is essential for not just storing the built artifacts but also accounting for them.
For example, every artifact hosted on the Artifactory server has a unique URL. This enables you to embed a specifically built artifact into configuration management scripts (for example, Chef recipes) and utilize them afterward to distribute packages on multiple machines.
A property (singular) or property set (plural) is another essential feature included with Artifactory. A property is an entity with a key and a value pair, while a property set is a collection of several key and value pairs. The property/property set is attached to the artifacts as XML metadata. This makes it possible to query artifacts using the GUI (short for graphical user interface) or APIs (short for application programming interface), opening the path to automation. Similarly, properties may be assigned, unassigned, or modified using the GUI or API instructions.
Artifact housekeeping is one of the many use cases of the property feature. For example, in a trunk-based CI system, every commit potentially results in a build artifact. This causes many build artifacts to accumulate on the Artifactory server in a very short amount of time. However, not all these artifacts will get published; some may fail QA testing, while others will simply be ignored for a newer version.
This calls for an automation script to identify and clean artifacts that are older than a particular period. Since date of creation is a default auto-assigned property of an artifact, you can use AQL (short for Artifactory Query Language), another feature of Artifactory, to query artifacts using the date of creation property and then delete them.
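A hedged sketch of such housekeeping from a Jenkins stage (the Artifactory URL, repository name, and credential variables are illustrative) might first query the stale artifacts with AQL:

    stage('Find stale artifacts') {
        steps {
            // AQL query for artifacts created more than 30 days ago;
            // a real cleanup job would iterate over the results and delete them.
            sh '''
                curl -u "$ART_USER:$ART_PASS" -X POST \
                    "https://artifactory.example.com/artifactory/api/search/aql" \
                    -H "Content-Type: text/plain" \
                    -d 'items.find({"repo": "docker-dev-local", "created": {"$before": "30d"}})'
            '''
        }
    }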
A binary repository manager such as JFrog Artifactory helps you in architecting a fully automated CI/CD pipeline workflow through another built-in feature called Build Promotion.
To understand how it works, imagine a fully automated continuous deployment pipeline with stages such as downloading the built package from Artifactory, deploying the package in the QA environment, running automated tests, and lastly, deploying the solution in production. This Build Promotion feature in Artifactory takes care of promoting the status of the artifact by updating the Build Info data. Build Promotion is configured to trigger from the various stages of the CD pipeline using APIs. See Figure 1.8 to get an overview of the Build Promotion process.
Build Info data is metadata published along with the artifact by the continuous integration pipeline to Artifactory. It includes the commit ID, CI pipeline URL, build number, and a lot more information:
Figure 1.8: The process of Build Promotion
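A hedged sketch of a promotion call from a CD pipeline stage (the build name, build number, statuses, and repositories are illustrative):

    stage('Promote build') {
        steps {
            // Mark build #42 of my-app as QA-passed and move its artifacts to the QA repository.
            sh '''
                curl -u "$ART_USER:$ART_PASS" -X POST \
                    "https://artifactory.example.com/artifactory/api/build/promote/my-app/42" \
                    -H "Content-Type: application/json" \
                    -d '{"status": "qa-passed", "targetRepo": "docker-qa-local"}'
            '''
        }
    }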
Pre-built software dependencies or libraries from ecosystems such as npm, Maven, and Conda, to mention a few, enable developers to build applications more quickly. Using these open source frameworks, however, presents two major challenges: administration and security.
The Artifactory tool helps you to manage dependencies as well as check them for vulnerabilities. It serves as a one-stop shop for accessing all the different dependent packages needed to build your project.
This is often done by creating as many remote repositories as needed within Artifactory, each of which is linked to its appropriate package source on the internet. With an internal Artifactory URL, a remote package repository in Artifactory allows you to list all accessible packages from the internet. All you must do is reference these libraries using the correct Artifactory URLs when writing your code. Since these Artifactory URLs get tracked by your version control tool (Git), you can easily generate a bill of materials (BOM), used to build a particular version of your software application. A remote repository is only a reference to the actual source; nothing else is ever downloaded to the Artifactory server but the metadata.
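For example, in a Gradle build script (itself written in Groovy), pointing dependency resolution at an illustrative Artifactory remote repository instead of the public internet could look like this sketch:

    // build.gradle: every dependency is resolved through Artifactory,
    // which proxies and caches the public package source.
    repositories {
        maven {
            url 'https://artifactory.example.com/artifactory/maven-remote'  // illustrative URL
        }
    }

    dependencies {
        implementation 'com.google.guava:guava:33.0.0-jre'
    }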
Another advantage of utilizing a binary repository manager such as Artifactory is the built-in security tool that scans open source package libraries for vulnerabilities and checks open source libraries for license violations.
Tools such as JFrog Xray, part of the JFrog suite, scan your package libraries for OSS components and security vulnerabilities and create a report before they enter your source code. Software Composition Analysis (SCA) can also be incorporated as a separate stage in your CI pipeline using other SCA tools, such as Black Duck.
At the heart of the CI system sits a CI/CD tool. Its role is to seamlessly orchestrate other DevOps tools to achieve a fully automated CI/CD workflow. Thus, a good CI/CD tool must provide, at the very least, the following features:
- Seamless integration with various software tools and technologies
- Powerful pipeline syntax to create robust automation
- Intuitive pipeline visualization and notification capabilities

The various stages of a CI pipeline are scripted using their corresponding tools and technologies. For example, to build code written in Java, you require Maven libraries and the Maven build tool. Likewise, to build code written in C++ or C#, you require NuGet packages, the MSBuild tool, and so on.
Similarly, to deploy a built application on a given infrastructure, the team might employ a configuration management tool such as Ansible or Chef, or Helm if it’s a containerized application. And the list goes on if we continue to include all the other stages of a software development life cycle.
Because there is a wide variety of tools and technologies, a good CI/CD tool is expected to integrate well with as many of them as it can.
Jenkins is one such versatile tool that has stood the test of time. Its ecosystem of plugins allows it to integrate with almost all the tools listed on the DevSecOps periodic table [4]:
Figure 1.9: Jenkins, a CI/CD tool
Earlier, automation pipelines were built manually on the CI/CD server using GUI-based parameters. But, as development teams grew and automation pipelines proliferated, engineers found it increasingly challenging to maintain them, because the only method to back up and version the pipelines was to create a backup of the whole CI/CD tool. Jenkins’ Freestyle Job and TeamCity’s Build are two such examples.
But thanks to the pipeline-as-code movement, automation pipelines can now be defined as code, making it possible to version-control them alongside source code. And not just CI/CD pipelines; describing your complete infrastructure as code is also a reality.
Pipeline as code allows you to share, examine, and create longer automation pipelines. Furthermore, to reduce redundant pipeline code, it is even possible to compose reusable pipeline components as libraries using the Shared Libraries feature in Jenkins.
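A minimal sketch of consuming such a library in a Jenkinsfile; the library name and the buildMavenApp step are hypothetical and would be defined in the library’s vars/ directory:

    // Load the shared library configured in Jenkins under the name 'my-shared-lib'
    @Library('my-shared-lib') _

    pipeline {
        agent any
        stages {
            stage('Build') {
                steps {
                    // A custom step from the library, replacing repeated inline pipeline code
                    buildMavenApp()
                }
            }
        }
    }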
Pipeline code, particularly Jenkins pipeline code, is a DSL (short for domain-specific language) built on Groovy.