Explore software engineering methodologies, techniques, and best practices in Go programming to build easy-to-maintain software that can effortlessly scale on demand
Book Description
Over the last few years, Go has become one of the favorite languages for building scalable and distributed systems. Its opinionated design and built-in concurrency features make it easy for engineers to author code that efficiently utilizes all available CPU cores.
This Golang book distills industry best practices for writing lean Go code that is easy to test and maintain, and helps you to explore their practical implementation by creating a multi-tier application called Links 'R' Us from scratch. You'll be guided through all the steps involved in designing, implementing, testing, deploying, and scaling an application. Starting with a monolithic architecture, you'll iteratively transform the project into a service-oriented architecture (SOA) that supports the efficient out-of-core processing of large link graphs. You'll learn about various cutting-edge and advanced software engineering techniques such as building extensible data processing pipelines, designing APIs using gRPC, and running distributed graph processing algorithms at scale. Finally, you'll learn how to compile and package your Go services using Docker and automate their deployment to a Kubernetes cluster.
By the end of this book, you'll know how to think like a professional software developer or engineer and write lean and efficient Go code.
Who this book is for
This Golang programming book is for developers and software engineers looking to use Go to design and build scalable distributed systems effectively. Knowledge of Go programming and basic networking principles is required.
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Richa Tripathi
Acquisition Editor: Karan Gupta
Content Development Editor: Tiksha Sarang
Senior Editor: Storm Mann
Technical Editor: Pradeep Sahu
Copy Editor: Safis Editing
Project Coordinator: Francy Puthiry
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Production Designer: Arvindkumar Gupta
First published: January 2020
Production reference: 1230120
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-83855-449-1
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com, and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Achilleas Anagnostopoulos has been writing code in a multitude of programming languages since the mid-90s. His main interest lies in building scalable, microservice-based distributed systems where components are interconnected via gRPC or message queues. Achilleas has over 4 years of experience building production-grade systems using Go and occasionally enjoys pushing the language to its limits through his experimental gopher-os project: a 64-bit kernel written entirely in Go. He is currently a member of the Juju team at Canonical, contributing to one of the largest open source Go code bases in existence.
Eduard Bondarenko is a long-time software developer. He prefers concise, expressive, and well-commented code, and has worked with many programming languages, such as Ruby, Go, Java, and JavaScript.
Eduard has reviewed a couple of programming books and enjoyed the broad and interesting range of topics they covered. Besides programming, he likes to spend time with his family, play soccer, and travel.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Software Engineering with Golang
Dedication
About Packt
Why subscribe?
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Code in Action
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Software Engineering and the Software Development Life Cycle
A Bird's-Eye View of Software Engineering
What is software engineering?
Types of software engineering roles
The role of the software engineer (SWE)
The role of the software development engineer in test (SDET)
The role of the site reliability engineer (SRE)
The role of the release engineer (RE)
The role of the system architect
A list of software development models that all engineers should know
Waterfall
Iterative enhancement
Spiral
Agile
Lean
Eliminate waste
Create knowledge
Defer commitment
Build in quality
Deliver fast
Respect and empower people
See and optimize the whole
Scrum
Scrum roles
Essential Scrum events
Kanban
DevOps
The CAMS model
The three ways model
Summary
Questions
Further reading
Section 2: Best Practices for Maintainable and Testable Go Code
Best Practices for Writing Clean and Maintainable Go Code
The SOLID principles of object-oriented design
Single responsibility
Open/closed principle
Liskov substitution
Interface segregation
Dependency inversion
Applying the SOLID principles
Organizing code into packages
Naming conventions for Go packages
Circular dependencies
Breaking circular dependencies via implicit interfaces
Sometimes, code repetition is not a bad idea!
Tips and tools for writing lean and easy-to-maintain Go code
Optimizing function implementations for readability
Variable naming conventions
Using Go interfaces effectively
Zero values are your friends
Using tools to analyze and manipulate Go programs
Taking care of formatting and imports (gofmt, goimports)
Refactoring code across packages (gorename, gomvpkg, fix)
Improving code quality metrics with the help of linters
Summary
Questions
Further reading
Dependency Management
What's all the fuss about software versioning?
Semantic versioning
Comparing semantic versions
Applying semantic versioning to Go packages
Managing the source code for multiple package versions
Single repository with versioned folders
Single repository – multiple branches
Vendoring – the good, the bad, and the ugly
Benefits of vendoring dependencies
Is vendoring always a good idea?
Strategies and tools for vendoring dependencies
The dep tool
The Gopkg.toml file
The Gopkg.lock file
Go modules – the way forward
Fork packages
Summary
Questions
Further reading
The Art of Testing
Technical requirements
Unit testing
Mocks, stubs, fakes, and spies – commonalities and differences
Stubs and spies!
Mocks
Introducing gomock
Exploring the details of the project we want to write tests for
Leveraging gomock to write a unit test for our application
Fake objects
Black-box versus white-box testing for Go packages – an example
The services behind the facade
Writing black-box tests
Boosting code coverage via white-box tests
Table-driven tests versus subtests
Table-driven tests
Subtests
The best of both worlds
Using third-party testing frameworks
Integration versus functional testing
Integration tests
Functional tests
Functional tests part deux – testing in production!
Smoke tests
Chaos testing – breaking your systems in fun and interesting ways!
Tips and tricks for writing tests
Using environment variables to set up or skip tests
Speeding up testing for local development
Excluding classes of tests via build flags
This is not the output you are looking for – mocking calls to external binaries
Testing timeouts is easy when you have all the time in the world!
Summary
Questions
Further reading
Section 3: Designing and Building a Multi-Tier System from Scratch
The Links 'R' Us Project
System overview – what are we going to be building?
Selecting an SDLC model for our project
Iterating faster using an Agile framework
Elephant carpaccio – how to iterate even faster!
Requirements analysis
Functional requirements
User story – link submission
User story – search
User story – crawl link graph
User story – calculate PageRank scores
User story – monitor Links 'R' Us health
Non-functional requirements
Service-level objectives
Security considerations
Being good netizens
System component modeling
The crawler
The link filter
The link fetcher
The content extractor
The link extractor
The content indexer
The link provider
The link graph
The PageRank calculator
The metrics store
The frontend
Monolith or microservices? The ultimate question
Summary
Questions
Further reading
Building a Persistence Layer
Technical requirements
Running tests that require CockroachDB
Running tests that require Elasticsearch
Exploring a taxonomy of database systems
Key-value stores
Relational databases
NoSQL databases
Document databases
Understanding the need for a data layer abstraction
Designing the data layer for the link graph component
Creating an ER diagram for the link graph store
Listing the required set of operations for the data access layer
Defining a Go interface for the link graph
Partitioning links and edges for processing the graph in parallel
Iterating Links and Edges
Verifying graph implementations using a shared test suite
Implementing an in-memory graph store
Upserting links
Upserting edges
Looking up links
Iterating links/edges
Removing stale edges
Setting up a test suite for the graph implementation
Scaling across with a CockroachDB-backed graph implementation
Dealing with DB migrations
An overview of the DB schema for the CockroachDB implementation
Upserting links
Upserting edges
Looking up links
Iterating links/edges
Removing stale edges
Setting up a test suite for the CockroachDB implementation
Designing the data layer for the text indexer component
A model for indexed documents
Listing the set of operations that the text indexer needs to support
Defining the Indexer interface
Verifying indexer implementations using a shared test suite
An in-memory Indexer implementation using bleve
Indexing documents
Looking up documents and updating their PageRank score
Searching the index
Iterating the list of search results
Setting up a test suite for the in-memory indexer
Scaling across an Elasticsearch indexer implementation
Creating a new Elasticsearch indexer instance
Indexing and looking up documents
Performing paginated searches
Updating the PageRank score for a document
Setting up a test suite for the Elasticsearch indexer
Summary
Questions
Further reading
Data-Processing Pipelines
Technical requirements
Building a generic data-processing pipeline in Go
Design goals for the pipeline package
Modeling pipeline payloads
Multistage processing
Stageless pipelines – is that even possible?
Strategies for handling errors
Accumulating and returning all errors
Using a dead-letter queue
Terminating the pipeline's execution if an error occurs
Synchronous versus asynchronous pipelines
Synchronous pipelines
Asynchronous pipelines
Implementing a stage worker for executing payload processors
FIFO
Fixed and dynamic worker pools
1-to-N broadcasting
Implementing the input source worker
Implementing the output sink worker
Putting it all together – the pipeline API
Building a crawler pipeline for the Links 'R' Us project
Defining the payload for the crawler
Implementing a source and a sink for the crawler
Fetching the contents of graph links 
Extracting outgoing links from retrieved webpages
Extracting the title and text from retrieved web pages
Inserting discovered outgoing links to the graph
Indexing the contents of retrieved web pages
Assembling and running the pipeline
Summary
Questions
Further reading
Graph-Based Data Processing
Technical requirements
Exploring the Bulk Synchronous Parallel model
Building a graph processing system in Go
Queueing and delivering messages
The Message interface
Queues and message iterators
Implementing an in-memory, thread-safe queue
Modeling the vertices and edges of graphs
Defining the Vertex and Edge types
Inserting vertices and edges into the graph
Sharing global graph state through data aggregation
Defining the Aggregator interface
Registering and looking up aggregators
Implementing a lock-free accumulator for float64 values
Sending and receiving messages
Implementing graph-based algorithms using compute functions
Achieving vertical scaling by executing compute functions in parallel
Orchestrating the execution of super-steps
Creating and managing Graph instances
Solving interesting graph problems
Searching graphs for the shortest path
The sequential Dijkstra algorithm
Leveraging a gossip protocol to run Dijkstra in parallel
Graph coloring
A sequential greedy algorithm for coloring undirected graphs
Exploiting parallelism for undirected graph coloring
Calculating PageRank scores
The model of the random surfer
An iterative approach to PageRank score calculation
Reaching convergence – when should we stop iterating?
Web graphs in the real world – dealing with dead ends
Defining an API for the PageRank calculator
Implementing a compute function to calculate PageRank scores
Summary
Further reading
Communicating with the Outside World
Technical requirements
Designing robust, secure, and backward-compatible REST APIs
Using human-readable paths for RESTful resources
Controlling access to API endpoints
Basic HTTP authentication
Securing TLS connections from eavesdropping
Authenticating to external service providers using OAuth2
Dealing with API versions
Including the API version as a route prefix
Negotiating API versions via HTTP Accept headers
Building RESTful APIs in Go
Building RPC-based APIs with the help of gRPC
Comparing gRPC to REST
Defining messages using protocol buffers
Defining messages
Versioning message definitions
Representing collections
Modeling field unions
The Any type
Implementing RPC services
Unary RPCs
Server-streaming RPCs
Client-streaming RPCs
Bi-directional streaming RPCs
Security considerations for gRPC APIs
Decoupling Links 'R' Us components from the underlying data stores
Defining RPCs for accessing a remote link-graph instance
Defining RPCs for accessing a text-indexer instance
Creating high-level clients for accessing data stores over gRPC 
Summary
Questions
Further reading
Building, Packaging, and Deploying Software
Technical requirements
Building and packaging Go services using Docker
Benefits of containerization
Best practices for dockerizing Go applications
Selecting a suitable base container for your application
A gentle introduction to Kubernetes
Peeking under the hood
Summarizing the most common Kubernetes resource types
Running a Kubernetes cluster on your laptop!
Building and deploying a monolithic version of Links 'R' Us
Distributing computation across application instances
Carving the UUID space into non-overlapping partitions
Assigning a partition range to each pod
Building wrappers for the application services
The crawler service
The PageRank calculator service
Serving a fully functioning frontend to users
Specifying the endpoints for the frontend application
Performing searches and paginating results
Generating convincing summaries for search results
Highlighting search keywords
Orchestrating the execution of individual services
Putting everything together
Terminating the application in a clean way
Dockerizing and starting a single instance of the monolith
Deploying and scaling the monolith on Kubernetes
Setting up the required namespaces
Deploying CockroachDB and Elasticsearch using Helm
Deploying Links 'R' Us
Summary
Questions
Further reading
Section 4: Scaling Out to Handle a Growing Number of Users
Splitting Monoliths into Microservices
Technical requirements
Monoliths versus service-oriented architectures
Is there something inherently wrong with monoliths?
Microservice anti-patterns and how to deal with them
Monitoring the state of your microservices
Tracing requests through distributed systems
The OpenTracing project
Stepping through a distributed tracing example
The provider service
The aggregator service
The gateway
Putting it all together
Capturing and visualizing traces using Jaeger
Making logging your trusted ally
Logging best practices
The devil is in the (logging) details
Shipping and indexing logs inside Kubernetes
Running a log collector on each Kubernetes node
Using a sidecar container to collect logs
Shipping logs directly from the application
Introspecting live Go services
Building a microservice-based version of Links 'R' Us
Decoupling access to the data stores
Breaking down the monolith into distinct services
Deploying the microservices that comprise the Links 'R' Us project
Deploying the link-graph and text-indexer API services
Deploying the web crawler
Deploying the PageRank service
Deploying the frontend service
Locking down access to our Kubernetes cluster using network policies
Summary
Questions
Further reading
Building Distributed Graph-Processing Systems
Technical requirements
Introducing the master/worker model
Ensuring that masters are highly available
The leader-follower configuration
The multi-master configuration
Strategies for discovering nodes
Recovering from errors
Out-of-core distributed graph processing
Describing the system architecture, requirements, and limitations
Modeling a state machine for executing graph computations
Establishing a communication protocol between workers and masters
Defining a job queue RPC service
Establishing protocol buffer definitions for worker payloads
Establishing protocol buffer definitions for master payloads
Defining abstractions for working with bi-directional gRPC streams
Remote worker stream
Remote master stream
Creating a distributed barrier for the graph execution steps
Implementing a step barrier for individual workers
Implementing a step barrier for the master
Creating custom executor factories for wrapping existing graph instances
The workers' executor factory
The master's executor factory
Coordinating the execution of a graph job
Simplifying end user interactions with the dbspgraph package
The worker job coordinator
Running a new job
Transitioning through the stages of the graph's state machine
Handling incoming payloads from the master
Using the master as an outgoing message relay
The master job coordinator
Running a new job
Transitioning through the stages for the graph's state machine
Handling incoming worker payloads
Relaying messages between workers
Defining package-level APIs for working with master and worker nodes
Instantiating and operating worker nodes
Instantiating and operating master nodes
Handling incoming gRPC connections
Running a new job
Deploying a distributed version of the Links 'R' Us PageRank calculator
Retrofitting master and worker capabilities to the PageRank calculator service
Serializing PageRank messages and aggregator values
Defining job runners for the master and the worker
Implementing the job runner for master nodes
The worker job runner
Deploying the final Links 'R' Us version to Kubernetes
Summary
Questions
Further reading
Metrics Collection and Visualization
Technical requirements
Monitoring from the perspective of a site reliability engineer
Service-level indicators (SLIs)
Service-level objectives (SLOs)
Service-level agreements (SLAs)
Exploring options for collecting and aggregating metrics
Comparing push versus pull systems
Capturing metrics using Prometheus
Supported metric types
Automating the detection of scrape targets
Static and file-based scrape target configuration
Querying the underlying cloud provider
Leveraging the API exposed by Kubernetes
Instrumenting Go code
Registering metrics with Prometheus
Vector-based metrics
Exporting metrics for scraping
Visualizing collected metrics using Grafana
Using Prometheus as an end-to-end solution for alerting
Using Prometheus as a source for alert events
Handling alert events
Grouping alerts together
Selectively muting alerts
Configuring alert receivers
Routing alerts to receivers
Summary
Questions
Further reading
Epilogue
Assessments
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Other Books You May Enjoy
Leave a review - let other readers know what you think
Over the last few years, Go has gradually turned into one of the industry's favorite languages for building scalable and distributed systems. The language's opinionated design and built-in concurrency features make it relatively easy for engineers to author code that efficiently utilizes all available CPU cores.
This book distills the industry's best practices for writing lean Go code that is easy to test and maintain and explores their practical implementation by creating a multi-tier application called Links 'R' Us from scratch. You will be guided through all the steps involved in designing, implementing, testing, deploying, and scaling the application. You'll start with a monolithic architecture and iteratively transform the project into a Service-Oriented Architecture (SOA) that supports efficient out-of-core processing of large link graphs. You will learn about various advanced and cutting-edge software engineering techniques such as building extensible data-processing pipelines, designing APIs using gRPC, and running distributed graph processing algorithms at scale. Finally, you will learn how to compile and package your Go services using Docker and automate their deployment to a Kubernetes cluster.
By the end of this book, you will start to think like a professional developer/engineer who can put theory into practice by writing lean and efficient Go code.
This book is for developers and software engineers interested in effectively using Go to design and build scalable distributed systems. This book will also be useful for amateur-to-intermediate level developers who aspire to become professional software engineers.
Chapter 1, A Bird's-Eye View of Software Engineering, explains the difference between software engineering and programming and outlines the different types of engineering roles that you may encounter in small, medium, and large organizations. What's more, the chapter summarizes the basic software development life cycle models that every software engineer (SWE) should be aware of.
Chapter 2, Best Practices for Writing Clean and Maintainable Go Code, explains how the SOLID design principles can be applied to Go projects and provides useful tips for organizing your Go code in packages and writing code that is easy to maintain and test.
Chapter 3, Dependency Management, highlights the importance of versioning Go packages and discusses tools and strategies for vendoring your project dependencies.
Chapter 4, The Art of Testing, advocates the use of primitives such as stubs, mocks, spies, and fake objects for writing comprehensive unit tests for your code. Furthermore, the chapter enumerates the pros and cons of different types of tests (for example, black- versus white-box, integration versus functional) and concludes with an interesting discussion on advanced testing techniques such as smoke testing and chaos testing.
Chapter 5, The Links 'R' Us Project, introduces the hands-on project that we will be building from scratch in the following chapters.
Chapter 6, Building a Persistence Layer, focuses on the design and implementation of the data access layer for two of the Links 'R' Us project components: the link graph and the text indexer.
Chapter 7, Data-Processing Pipelines, explores the basic principles behind data-processing pipelines and implements a framework for constructing generic, concurrent-safe, and reusable pipelines using Go primitives such as channels, contexts, and go-routines. The framework is then used to develop the crawler component for the Links 'R' Us project.
Chapter 8, Graph-Based Data Processing, explains the theory behind the Bulk Synchronous Parallel (BSP) model of computation and implements, from scratch, a framework for executing parallel algorithms against graphs. As a proof of concept, we will be using this framework to investigate parallel versions of popular graph-based algorithms (namely, shortest path and graph coloring) with our efforts culminating in the complete implementation of the PageRank algorithm, a critical component of the Links 'R' Us project.
Chapter 9, Communicating with the Outside World, outlines the key differences between RESTful and gRPC-based APIs with respect to subjects such as routing, security, and versioning. In this chapter, we will also define gRPC APIs for making the link graph and text indexer data stores for the Links 'R' Us project accessible over the network.
Chapter 10, Building, Packaging, and Deploying Software, enumerates the best practices for dockerizing your Go applications and optimizing their size. In addition, the chapter explores the anatomy of a Kubernetes cluster and enumerates the essential list of Kubernetes resources that we can use. As a proof of concept, we will be creating a monolithic version of the Links 'R' Us project and will deploy it to a Kubernetes cluster that you will spin up on your local machine.
Chapter 11, Splitting Monoliths into Microservices, explains the SOA pattern and discusses some common anti-patterns that you should be aware of and pitfalls that you want to avoid when switching from a monolithic design to microservices. To put the ideas from this chapter to the test, we will be breaking down the monolithic version of the Links 'R' Us project into microservices and deploying them to Kubernetes.
Chapter 12, Building Distributed Graph-Processing Systems, combines the knowledge from the previous chapters to create a distributed version of the graph-based data processing framework, which can be used for massive graphs that do not fit in memory (out-of-core processing).
Chapter 13, Metrics Collection and Visualization, enumerates the most popular solutions for collecting and indexing metrics from applications with a focus on Prometheus. After discussing approaches to instrumenting your Go code to capture and export Prometheus metrics, we will delve into the use of tools such as Grafana for metrics visualization, and Alertmanager for setting up alerts based on the aggregated values of collected metrics.
Chapter 14, Epilogue, provides suggestions for furthering your understanding of the material by extending the hands-on project that we have built throughout the chapters of the book.
To get the most out of this book and experiment with the accompanying code, you need to have a fairly good understanding of programming in Go as well as sufficient experience working with the various tools that comprise the Go ecosystem.
In addition, the book assumes that you have a solid grasp of basic networking theory.
Finally, some of the more technical chapters in the book utilize technologies such as Docker and Kubernetes. While a priori knowledge of these technologies is not strictly required, any prior experience using these (or equivalent) systems will certainly prove beneficial in better understanding the topics discussed in those chapters.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Software-Engineering-with-Golang. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
To see the Code in Action please visit the following link: http://bit.ly/37QWeR2.
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838554491_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "In the following code, you can see the definition of a generic Sword type for our upcoming game."
A block of code is set as follows:
    type Sword struct {
        name string // Important tip for RPG players: always name your swords!
    }

    // Damage returns the damage dealt by this sword.
    func (Sword) Damage() int {
        return 2
    }
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
    type Sword struct {
        name string // Important tip for RPG players: always name your swords!
    }

    // Damage returns the damage dealt by this sword.
    func (Sword) Damage() int {
        return 2
    }
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "The following excerpt is part of a system that collects and publishes performance metrics to a key-value store."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
The objective of part one is to familiarize you with the concept of software engineering, the stages of the software development life cycle, and the various roles of software engineers.
This section comprises the following chapter:
Chapter 1, A Bird's-Eye View of Software Engineering
Through the various stages of my career, I have met several people who knew how to code; people whose skill level ranged from beginner to what some would refer to as guru. All those people had different backgrounds and worked for both start-ups and large organizations. For some, coding was seen as a natural progression from their CS studies, while others turned to coding as part of a career change.
Regardless of all these differences, all of them had one thing in common: when asked to describe their current role, all of them used the term software engineer. It is quite a common practice for job candidates to use this term in their CVs as a means to set themselves apart from a globally distributed pool of software developers. A quick random sampling of job specs published online reveals that a lot of companies – and especially high-profile start-ups – also seem to subscribe to this way of thinking, as evidenced by their search for professionals to fill software engineering roles. In reality, as we will see in this chapter, the term software engineer is more of an umbrella term that covers a wide gamut of bespoke roles, each one combining different levels of software development expertise with specialized skills pertaining to topics such as system design, testing, build tools, and operations management.
So, what is software engineering and how does it differ from programming? What set of skills should a software engineer possess and which models, methodologies, and frameworks are at their disposal for facilitating the delivery of complex pieces of software? These are some of the questions that will be answered in this chapter.
This chapter covers the following topics:
A definition of software engineering
The types of software engineering roles that you may encounter in contemporary organizations
An overview of popular software development models and which one to select based on the project type and requirements
Before we dive deeper into this chapter, we need to establish an understanding of some of the basic terms and concepts around software engineering. For starters, how do we define software engineering and in what ways does it differ from software development and programming in general? To begin answering this question, we will start by examining the formal definition of software engineering, as published in IEEE's Standard Glossary of Software Engineering Terminology [7]:
"Software engineering: The application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software."
The main takeaway from this definition is that authoring code is just one of the many facets of software engineering. At the end of the day, any capable programmer can take a well-defined specification and convert it into a fully functioning program without thinking twice about the need to produce clean and maintainable code. A disciplined software engineer, on the other hand, would follow a more systematic approach by applying common design patterns to ensure that the produced piece of software is extensible, easier to test, and well documented in case another engineer or engineering team assumes ownership of it in the future.
Besides the obvious requirement for authoring high-quality code, the software engineer is also responsible for thinking about other aspects of the systems that will be built. Some questions that the software engineer must be able to answer include the following:
What are the business use cases that the software needs to support?
What components comprise the system and how do they interact with each other?
Which technologies will be used to implement the various system components?
How will the software be tested to ensure that its behavior matches the customer's expectations?
How does load affect the system's performance and what is the plan for scaling the system?
To be able to answer these questions, the software engineer needs a special set of skills that, as you are probably aware, go beyond programming. These extra responsibilities and required skills are the main factors that differentiate a software engineer from a software developer.
As we discussed in the previous section, software engineering is an inherently complex, multi-stage process. In an attempt to manage this complexity, organizations around the world have invested a lot of time and effort over the years to break the process down into a set of well-defined stages and train their engineering staff to efficiently deal with each stage.
Some software engineers strive to work across all the stages of the Software Development Life Cycle (SDLC), while others have opted to specialize in and master a particular stage of the SDLC. This gave rise to a variety of software engineering roles, each one with a different set of responsibilities and a required set of skills. Let's take a brief look at the most common software engineering roles that you may encounter when working with both small- and large-sized organizations.
The software engineer (SWE) is the most common role that you are bound to interact with in any organization, regardless of its size. Software engineers play a pivotal role not only in designing and building new pieces of software, but also in operating and maintaining existing and legacy systems.
Depending on their experience level and technical expertise, SWEs are classified into three categories:
Junior engineer: A junior engineer is someone who has recently started their software development career and lacks the necessary experience to build and deploy production-grade software. Companies are usually keen on hiring junior engineers as it allows them to keep their hiring costs low. Furthermore, companies often pair promising junior engineers with senior engineers in an attempt to grow them into mid-level engineers and retain them for longer.
Mid-level engineer: A typical mid-level engineer is someone who has at least three years of software development experience. Mid-level engineers are expected to have a solid grasp of the various aspects of the software development life cycle and are the ones who can exert a significant impact on the amount of code that's produced for a particular project. To this end, they not only contribute code, but also review and offer feedback on the code that's contributed by other team members.
Senior engineer: This class of engineer is well-versed in a wide array of disparate technologies; their breadth of knowledge makes them ideal for assembling and managing software engineering teams, as well as serving as mentors and coaches for less senior engineers. From their years of experience, senior engineers acquire a deep understanding of a particular business domain. This trait allows them to serve as a liaison between their teams and the other, technical or non-technical, business stakeholders.
Another way to classify software engineers is by examining the main focus of their work:
Frontend engineers work exclusively on software that customers interact with. Examples of frontend work include the UI for a desktop application, a single-page web application for a software as a service (SaaS) offering, and a mobile application running on a phone or other smart device.
Backend engineers specialize in building the parts of a system that implement the actual business logic and deal with data modeling, validation, storage, and retrieval.
Full stack engineers are developers who have a good understanding of both frontend and backend technologies and no particular preference for doing frontend or backend work. This class of developers is more versatile, as they can easily move between teams, depending on the project requirements.
The software development engineer in test (SDET) is a role whose origins can be traced back to Microsoft's engineering teams. In a nutshell, SDETs are individuals who, just like their SWE counterparts, take part in software development, but their primary focus lies in software testing and performance.
An SDET's primary responsibility is to ensure that the development team produces high-quality software that is free from defects. A prerequisite for achieving this goal is to be cognizant of the different types of approaches to testing software, including, but not limited to, unit testing, integration testing, white/black-box testing, end-to-end/acceptance testing, and chaos testing. We will be discussing all of these testing approaches in more detail in the following chapters.
The main tool that SDETs use to meet their goals is testing automation. Development teams can iterate much faster when a Continuous Integration (CI) pipeline is in place to automatically test their changes across different devices and CPU architectures. Besides setting up the infrastructure for the CI pipeline and integrating it with the source code repository system that the team uses, SDETs are often tasked with authoring and maintaining a separate set of tests. These tests fall into the following two categories:
Acceptance tests: A set of scripted end-to-end tests that ensure the complete system adheres to all the customer's business requirements before a new version is given the green light for a release.
Performance regression tests: Another set of quality control tests that monitor a series of performance metrics across builds and alert you when a metric exceeds a particular threshold. These tests prove to be a great asset when a service-level agreement (SLA) has been signed: seemingly innocuous changes to the code (for example, switching to a different data structure implementation) may trigger a breach of the SLA, even though all the unit tests pass.
Finally, SDETs collaborate with support teams to transform incoming support tickets into bug reports that the development team can work on. The combination of software development and debugging skills, in conjunction with the SDET's familiarity with the system under development, makes them uniquely capable of tracking down the location of bugs in production code and coming up with example cases (for example, a particular data input or a sequence of actions) that allow developers to reproduce the exact set of conditions that trigger each bug.
The role of the site reliability engineer (SRE) came into the spotlight in 2016 when Google published a book on the subject of Site Reliability Engineering [4]. This book outlined the best practices and strategies that are used internally by Google to run their production systems and has since led to the wide adoption of the role by the majority of companies operating in the SaaS space.
The term was initially coined sometime around 2003 by Ben Treynor, the founder of Google's site reliability team. A site reliability engineer is a software engineer with a strong technical background who also focuses on the operations side of deploying and running production-grade services.
According to the original role definition, SREs spend approximately 50% of their time developing software and the other 50% dealing with ops-related aspects such as the following:
Working on support tickets or responding to alerts
Being on-call
Running manual tasks (for example, upgrading systems or running disaster recovery scenarios)
It is in the best interests of SREs to increase the stability and reliability of the services they operate. After all, no one enjoys being paged at 2 a.m. when a service melts down due to a sudden spike in the volume of incoming requests. The end goal is always to produce services that are highly available and self-healing; services that can automatically recover from a variety of faults without the need for human intervention.
The basic mantra of SREs is to eliminate potential sources of human errors by automating repeated tasks. One example of this philosophy is the use of a Continuous Deployment (CD) pipeline to minimize the amount of time that's required to deploy software changes to production. The benefits of this type of automation become apparent when a critical issue affecting production is identified and a fix must be deployed as soon as possible.
Ultimately, software is designed and built by humans so bugs will undoubtedly creep in. Rather than relying on a rigorous verification process to prevent defects from being deployed to production, SREs operate under the premise that we live in a non-perfect world: systems do crash and buggy software will, at some point, get deployed to production. To detect defective software deployments and mitigate their effects on end users, SREs set up monitoring systems that keep track of various health-related metrics for each deployed service and can trigger automatic rollbacks if a deployment causes an increase in a service's error rate.
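To illustrate the kind of health signal that such a monitoring system might consume, here is a minimal sketch of a Go service exposing a health-check endpoint. The /healthz path and the pingDependencies helper are assumptions made for this example rather than a convention prescribed by this book:

    package main

    import (
        "log"
        "net/http"
    )

    // pingDependencies is a hypothetical stand-in for checks against the
    // service's real dependencies (databases, downstream APIs, and so on).
    func pingDependencies() error { return nil }

    func main() {
        // A monitoring system can probe this endpoint periodically; a
        // non-200 response can feed an alert or trigger an automatic
        // rollback of the latest deployment.
        http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
            if err := pingDependencies(); err != nil {
                http.Error(w, err.Error(), http.StatusServiceUnavailable)
                return
            }
            w.WriteHeader(http.StatusOK)
            _, _ = w.Write([]byte("ok"))
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }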
In a world where complex, monolithic systems are broken down into multiple microservices and continuous delivery has become the new norm, debugging older software releases that are still deployed out in the wild becomes a major pain point for software engineers.
To understand why this can be a pain point, let's take a look at a small example: you arrive at work on a sunny Monday morning only to find out that one of your major customers has filed a bug against the microservice-based software your team is responsible for. To make things even worse, that particular customer is running a long-term support (LTS) release of the software, which means that some, if not all, of the microservices that run on the customer's machines are based on code that is at least a couple of hundred commits behind the current state of development. So, how can you actually come up with a bug reproducer and check whether the bug has already been fixed upstream?
This is where the concept of reproducible builds comes into play. By reproducible builds, we mean that at any point in time we should be able to compile a particular version of all the system components where the resulting artifacts match, bit by bit, the ones that have been deployed by the customer.
A release engineer (RE) is effectively a software engineer who collaborates with all the engineering teams to define and document all the required steps and processes for building and releasing code to production. A prerequisite for a release engineer is having deep knowledge of all the tools and processes that are required for compiling, versioning, testing, and packaging software. Typical tasks for REs include the following:
Authoring makefiles
Implementing workflows for containerizing software artifacts (for example, as Docker or rkt images)
Ensuring all teams use exactly the same build tool (compilers, linkers, and so on) versions and flags
Ensuring that builds are both repeatable and hermetic: changes to external dependencies (for example, third-party libraries) between builds of the same software version should have no effect on the artifacts that are produced by each build
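In the Go ecosystem, much of this repeatability comes from pinning dependency versions in the module files. The following go.mod is a minimal sketch with an illustrative module path and dependency versions; together with the checksums recorded in go.sum, it ensures that two builds of the same software version resolve exactly the same third-party code:

    module example.com/linksrus // illustrative module path

    go 1.13

    require (
        github.com/google/uuid v1.1.1
        google.golang.org/grpc v1.24.0
    )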
The last role that we will be discussing in this section, and one that you will only probably encounter when working on bigger projects or collaborating with large organizations, is the system architect. While software engineering teams focus on building the various components of the system, the architect is the one person who sees the big picture: what components comprise the system, how each component must be implemented, and how all the components fit and interact with each other.
In smaller companies, the role of the architect is usually fulfilled by one of the senior engineers. In larger companies, the architect is a distinct role that's filled by someone with both a solid technical background and strong analytical and communication skills.
Apart from coming up with a high-level, component-based design for the system, the architect is also responsible for making decisions regarding the technologies that will be used during development and setting the standards that all the development teams must adhere to.
Even though architects have a technical background, they rarely get to write any code. As a matter of fact, architects tend to spend a big chunk of their time in meetings with the various internal or external stakeholders, authoring design documents or providing technical direction to the software engineering teams.
The software engineering definition from the previous section alludes to the fact that software engineering is a complicated, multi-stage process. In an attempt to provide a formal description of these stages, academia has put forward the concept of the SDLC.
Over the years, there has been an abundance of alternative model proposals for facilitating software development. The following diagram is a timeline illustrating the years when some of the most popular SDLC models were introduced:
In the upcoming sections, we will explore each of the preceding models in more detail.
The waterfall model is probably the most widely known model out there for implementing the SDLC. It was introduced by Winston Royce in 1970 [11] and defines a series of steps that must be sequentially completed in a particular order. Each stage produces a certain output, for example, a document or some artifact, that is, in turn, consumed by the step that follows.
The following diagram outlines the basic steps that were introduced by the waterfall model:
Requirement collection: During this stage, the customer's requirements are captured and analyzed, and a requirements document is produced.
Design: Based on the requirements document's contents, analysts plan the system's architecture. This step is usually split into two sub-steps: the logical system design, which models the system as a set of high-level components, and the physical system design, where the appropriate technologies and hardware components are selected.
Implementation: The implementation stage is where the design documents from the previous step get transformed by software engineers into actual code.
Verification: The verification stage follows the implementation stage and ensures that the piece of software that was implemented actually satisfies the set of customer requirements that were collected during the requirements gathering step.
Maintenance: The final stage in the waterfall model is when the developed software is deployed and operated by the customer:
One thing to keep in mind is that the waterfall model operates under the assumption that all customer requirements can be collected early on, especially before the project implementation stage begins. Having the full set of requirements available as a set of use cases makes it easier to get a more accurate estimate of the amount of time that's required for delivering the project and the development costs involved. A corollary to this is that software engineers are provided with all the expected use cases and system interactions in advance, thus making testing and verifying the system much simpler.
The waterfall model comes with a set of caveats that make it less favorable to use when building software systems. One potential caveat is that the model describes each stage in an abstract, high-level way and does not provide a detailed view into the processes that comprise each step or even tackle cross-cutting processes (for example, project management or quality control) that you would normally expect to execute in parallel through the various steps of the model.
While this model does work for small- to medium-scale projects, it tends, at least in my view, not to be as efficient for projects such as the ones commissioned by large organizations and/or government bodies. To begin with, the model assumes that analysts are always able to elicit the correct set of requirements from customers. This is not always the case as, oftentimes, customers are not able to accurately describe their requirements or tend to identify additional requirements just before the project is delivered. In addition to this, the sequential nature of this model means that a significant amount of time may elapse between gathering the initial requirements and the actual implementation. During this time – what some would refer to as an eternity in software engineering terms – the customer's requirements may shift. Changes in requirements necessitate additional development effort and this directly translates into increased costs for the deliverable.
The iterative enhancement model that's depicted in the following diagram was proposed in 1975 by Basili and Turner [2] in an attempt to improve on some of the caveats of the waterfall model. By recognizing that requirements may potentially change for long-running projects, the model advocates executing a set of evolution cycles or iterations, with each one being allocated a fixed amount of time out of the project's time budget:
Instead of starting with the full set of specifications, each cycle focuses on building some parts of the final deliverable and refining the set of requirements from the cycle that precedes it. This allows the development team to make full use of any information available at that particular point in time and ensure that any requirement changes can be detected early on and acted upon.
One important rule when applying the iterative model is that the output of each cycle must be a usable piece of software. The last iteration is the most important as its output yields the final software deliverable. As we will see in the upcoming sections, the iterative model has exerted quite a bit of influence in the evolution of most of the contemporary software development models.
The spiral development model was introduced by Barry Boehm in 1986 [5] as an approach to minimize risk when developing large-scale projects associated with significant development costs.
In the context of software engineering, risks are defined as any kind of situation or sequence of events that can cause a project to fail to meet its goals. Examples of various degrees of failure include the following:
Missing the delivery deadline
Exceeding the project budget
Delivering software on time that depends on hardware that isn't available yet
As illustrated in the following diagram, the spiral model combines the ideas and concepts from the waterfall and iterative models with a risk assessment and analysis process. As Boehm points out, a very common mistake that people who are unfamiliar with the model tend to make when seeing this diagram for the first time is to assume that the spiral model is just a sequence of incremental waterfall steps that have to be followed in a particular order for each cycle. To dispel this misconception, Boehm provided the following definition for the spiral model:
"The spiral development model is a risk-driven process model generator."
Under this definition, risk is the primary factor that helps project stakeholders answer the following questions:
What steps should we follow next?
How long should we keep following those steps before we need to reevaluate risk?
At the beginning of each cycle, all the potential sources of risk are identified and mitigation plans are proposed to address any risk concerns. This set of risks is then ordered in terms of importance (for example, the impact on the project and the likelihood of occurrence) and used as input by the stakeholders when planning the steps for the next spiral cycle.
Another common misconception about the spiral model is that the development direction is one-way and can only spiral outward, that is, no backtracking to a previous spiral cycle is allowed. This is generally not the case: stakeholders always try to make informed decisions based on the information that's available to them at a particular point in time. As the project's development progresses, circumstances may change: new requirements may be introduced or additional pieces of previously unknown information may become available. In light of the new information that's available to them, stakeholders may opt to reevaluate prior decisions and, in some cases, roll back development to a previous spiral iteration.
When we talk about agile development, we usually refer to a broader family of software development models that were initially proposed during the early 90s. Agile is a sort of umbrella term that encompasses not only a set of frameworks but also a fairly long list of best practices for software development. If we had to come up with a more specific definition for agile, we would probably define it as follows:
The popularity of agile development and agile frameworks, in particular, skyrocketed with the publication of the Manifesto for Agile Software Development in 2001 [3]. At the time of writing this book, agile development practices have become the de facto standard for the software industry, especially in the field of start-up companies.
In the upcoming sections, we will be digging a bit deeper into some of the most popular models and frameworks in the agile family. While doing a deep dive on each model is outside the scope of this book, a set of additional resources will be provided at the end of this chapter if you are interested in learning more about the following models.
Lean software development is one of the earliest members of the agile family of software development models. It was introduced by Mary and Tom Poppendieck in 2003 [10]. Its roots go back to the lean manufacturing techniques that were introduced by Toyota's production system in the 70s. When applied to software development, the model advocates seven key principles.
This is one of the key philosophies of the lean development model. Anything that does not directly add value to the final deliverable is considered waste and must be removed.
Typical cases of things that are characterized as waste by this model are as follows:
Introduction of non-essential, that is, nice-to-have features when development is underway.
Overly complicated decision-making processes that force development teams to remain idle while waiting for a feature to be signed off – in other words: bureaucracy!
Unnecessary communication between the various project stakeholders and the development teams. This disrupts the focus of the development team and hinders their development velocity.
The development team should never assume that the customers' requirements are static. Instead, the assumption should always be that they are dynamic and can change over time. Therefore, it is imperative for the development team to come up with appropriate strategies to ensure that their view of the world is always aligned with the customer's.
