Proven methodologies and concurrency techniques that will help you write faster and better code with Go programming
Key Features
Book Description
Go is an easy-to-write language that is popular among developers thanks to its features such as concurrency, portability, and ability to reduce complexity. This Golang book will teach you how to construct idiomatic Go code that is reusable and highly performant.
Starting with an introduction to performance concepts, you'll understand the ideology behind Go's performance. You'll then learn how to effectively implement Go data structures and algorithms along with exploring data manipulation and organization to write programs for scalable software. This book covers channels and goroutines for parallelism and concurrency to write high-performance code for distributed systems. As you advance, you'll learn how to manage memory effectively. You'll explore the Compute Unified Device Architecture (CUDA) application programming interface (API), use containers to build Go code, and work with the Go build cache for quicker compilation. You'll also get to grips with profiling and tracing Go code for detecting bottlenecks in your system. Finally, you'll evaluate clusters and job queues for performance optimization and monitor the application for performance regression.
By the end of this Go programming book, you'll be able to improve existing code and fulfill customer requirements by writing efficient programs.
What you will learn
Who this book is for
This Golang book is a must for developers and professionals who have an intermediate-to-advanced understanding of Go programming, and are interested in improving their speed of code execution.
The e-book can be read in Legimi apps or in any app that supports the following format:
Page count: 349
Publication year: 2020
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Richa Tripathi
Acquisition Editor: Alok Dhuri
Content Development Editor: Pathikrit Roy
Senior Editor: Storm Mann
Technical Editor: Pradeep Sahu
Copy Editor: Safis Editing
Project Coordinator: Francy Puthiry
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Production Designer: Joshua Misquitta
First published: March 2020
Production reference: 1230320
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78980-578-9
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Bob Strecansky is a senior site reliability engineer. He graduated with a computer engineering degree from Clemson University with a focus on networking. He has worked in both consulting and industry jobs since graduation. He has worked with large telecom companies and much of the Alexa top 500. He currently works at Mailchimp, working to improve web performance, security, and reliability for one of the world's largest email service providers. He has also written articles for web publications and currently maintains the OpenTelemetry PHP project. In his free time, Bob enjoys tennis, cooking, and restoring old cars. You can follow Bob on the internet to hear musings about performance analysis: Twitter: @bobstrecansky GitHub: @bobstrecansky
Eduard Bondarenko is a long-time software developer. He prefers concise and expressive code with comments and has tried many programming languages, such as Ruby, Go, Java, and JavaScript.
Eduard has reviewed a couple of programming books and enjoyed the breadth and interest of their topics. Besides programming, he likes to spend time with his family, play soccer, and travel.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On High Performance with Go
Dedication
About Packt
Why subscribe?
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Code in Action
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Learning about Performance in Go
Introduction to Performance in Go
Technical requirements
Understanding performance in computer science
A brief note on Big O notation
Methods to gauge long term performance
Optimization strategies overview
Optimization levels
A brief history of Go
The Go standard library
Go toolset
Benchmarking overview
The ideology behind Go performance
Goroutines – performance from the start
Channels – a typed conduit
C-comparable performance
Large-scale distributed systems
Summary
Data Structures and Algorithms
Understanding benchmarking
Benchmark execution
Real-world benchmarking
Introducing Big O notation
Practical Big O notation example
Data structure operations and time complexity
O(1) – constant time
O(log n) – logarithmic time
O(n) – linear time
O(n log n) – quasilinear time
O(n²) – quadratic time
O(2ⁿ) – exponential time
Understanding sort algorithms
Insertion sort
Heap sort
Merge sort
Quick sort
Understanding search algorithms
Linear search
Binary search
Exploring trees
Binary trees
Doubly linked list
Exploring queues
Common queuing functions
Common queuing patterns
Summary
Understanding Concurrency
Understanding closures
Anonymous functions
Anonymous functions with respect to closures
Closures for nesting and deferring work
HTTP handlers with closures
Exploring goroutines
The Go scheduler
Go scheduler goroutine internals
The M struct
The P struct
The G struct
Goroutines in action
Introducing channels
Channel internals
Buffered channels
Ranges over channels
Unbuffered channels
Selects
Introducing semaphores
Understanding WaitGroups
Iterators and the process of iteration
Briefing on generators
Summary
STL Algorithm Equivalents in Go
Understanding algorithms in the STL
Sort
Reverse
Min element and max element
Binary search
Understanding containers
Sequence containers
Array
Vector
Deque
List
Forward list
Container adapters
Queue
Priority queue
Stack
Associative containers
Set
Multiset
Map
Multimap
Understanding function objects
Functors
Iterators
Internal iterators
External iterators
Generators
Implicit iterators
Summary
Matrix and Vector Computation in Go
Introducing Gonum and the Sparse library
Introducing BLAS
Introducing vectors
Vector computations
Introducing matrices
Matrix operations 
Matrix addition
A practical example (matrix subtraction)
Scalar multiplication
Scalar multiplication practical example
Matrix multiplication
Matrix multiplication practical example
Matrix transposition
Matrix transposition practical example
Understanding matrix structures
Dense matrices
Sparse matrices
DOK matrix
LIL matrix
COO matrix
CSR matrix
CSC matrix
Summary
Section 2: Applying Performance Concepts in Go
Composing Readable Go Code
Maintaining simplicity in Go
Maintaining readability in Go
Exploring packaging in Go
Package naming
Packaging layout
Internal packaging
Vendor directory
Go modules
Understanding naming in Go
Understanding formatting in Go
Briefing on interfaces in Go
Comprehending methods in Go
Comprehending inheritance in Go
Exploring reflection in Go
Types
Kinds
Values
Summary
Template Programming in Go
Understanding Go generate
Generated code for protobufs
Protobuf code results
The link toolchain
Introducing Cobra and Viper for configuration programming
Cobra/Viper resulting sets
Text templating
HTML templating
Exploring Sprig
String functions
String slice functions
Default functions
Summary
Memory Management in Go
Understanding Modern Computer Memory - A Primer
Allocating memory
Introducing VSZ and RSS
Understanding memory utilization
Go runtime memory allocation
Memory allocation primer
Memory object allocation
Briefing on limited memory situations
Summary
GPU Parallelization in Go
Cgo – writing C in Go
A simple Cgo example
GPU-accelerated computing – utilizing the hardware
CUDA – utilizing host processes
Docker for GPU-enabled programming
CUDA on GCP
Creating a VM with a GPU 
Install the CUDA driver
Install Docker CE on GCP
Installing NVIDIA Docker on GCP
Scripting it all together
CUDA – powering the program
Summary
Compile Time Evaluations in Go
Exploring the Go runtime
GODEBUG
GCTRACE
GOGC
GOMAXPROCS
GOTRACEBACK
Go build cache
Vendoring dependencies
Caching and vendoring improvements
Debug
PProf/race/trace
Understanding functions
KeepAlive
NumCPU
ReadMemStats
Summary
Section 3: Deploying, Monitoring, and Iterating on Go Programs with Performance in Mind
Building and Deploying Go Code
Building Go binaries
Go build – building your Go code
Build flags
Build information
Compiler and linker flags
Build constraints
Filename conventions
Go clean – cleaning your build directory
Retrieving package dependencies with go get and go mod
Go list
Go run – executing your packages
Go install – installing your binaries
Building Go binaries with Docker
Summary
Profiling Go Code
Understanding profiling
Exploring instrumentation methodologies
Implementing profiling with go test
Manually instrumenting profiling in code
Profiling running service code
Briefing on CPU profiling
Briefing on memory profiling
Extended capabilities with upstream pprof
Comparing multiple profiles
Interpreting flame graphs within pprof
Detecting memory leaks in Go
Summary
Tracing Go Code
Implementing tracing instrumentation
Understanding the tracing format
Understanding trace collection
Movement in the tracing window
Exploring pprof-like traces
Go distributed tracing
Implementing OpenCensus for your application
Summary
Clusters and Job Queues
Clustering in Go
K-nearest neighbors
K-means clustering
Exploring job queues in Go
Goroutines as job queues
Buffered channels as job queues
Integrating job queues
Kafka
RabbitMQ
Summary
Comparing Code Quality Across Versions
Go Prometheus exporter – exporting data from your Go application
APM – watching your distributed system performance
Google Cloud environment setup
Google Cloud Trace code
SLIs and SLOs – setting goals
Measuring traffic
Measuring latency
Measuring errors
Measuring saturation
Grafana
Logging – keeping track of your data
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
Hands-On High Performance with Go is a complete resource with proven methodologies and techniques to help you diagnose and fix performance problems in your Go applications. The book starts with an introduction to the concepts of performance, where you will learn about the ideology behind the performance of Go. Next, you will learn how to implement Go data structures and algorithms effectively and explore data manipulation and organization in order to write programs for scalable software. Channels and goroutines for parallelism and concurrency, used to write high-performance code for distributed systems, are also a core part of this book. Moving on, you'll learn how to manage memory effectively. You'll explore the Compute Unified Device Architecture (CUDA) driver application programming interface (API), use containers to build Go code, and work with the Go build cache for faster compilation. You'll also get a clear picture of profiling and tracing Go code to detect bottlenecks in your system. At the end of the book, you'll evaluate clusters and job queues for performance optimization and monitor the application for performance regression.
This Go book is a must for developers and professionals who have an intermediate-to-advanced understanding of Go programming and are interested in improving their speed of code execution.
Chapter 1, Introduction to Performance in Go, will discuss why performance in computer science is important. You will also learn why performance is important in the Go language.
Chapter 2, Data Structures and Algorithms, deals with data structures and algorithms, which are the basic units of building software, notably complex, performance-critical software. Understanding them will help you to think about how to organize and manipulate data most impactfully in order to write effective, performant software. Also, iterators and generators are essential to Go. This chapter will include explanations of different data structures and algorithms, as well as how their Big O notation is affected.
Chapter 3, Understanding Concurrency, will talk about utilizing channels and goroutines for parallelism and concurrency, which are idiomatic in Go and are the best ways to write high-performance code in your system. Being able to understand when and where to use each of these design patterns is essential to writing performant Go.
Chapter 4, STL Algorithm Equivalents in Go, discusses how many programmers coming from other high-performance languages, namely C++, understand the concept of the standard template library, which provides common programming data structures and functions in a generalized library in order to rapidly iterate and write performant code at scale.
Chapter 5, Matrix and Vector Computation in Go, deals with matrix and vector computations in general. Matrices are important in graphics manipulation and AI, namely image recognition. Vectors can hold a myriad of objects in dynamic arrays. They use contiguous storage and can be manipulated to accommodate growth.
Chapter 6, Composing Readable Go Code, focuses on the importance of writing readable Go code. Understanding the patterns and idioms discussed in this chapter will help you to write Go code that is more easily readable and operable between teams. Also, being able to write idiomatic Go will help raise the level of your code quality and help your project maintain velocity.
Chapter 7, Template Programming in Go, focuses on template programming in Go. Metaprogramming allows the end user to write Go programs that produce, manipulate, and run Go programs. Go has clear, static dependencies, which helps with metaprogramming. It lacks some metaprogramming features that other languages have, such as Python's __getattr__, but we can still generate Go code and compile the resulting code if it's deemed prudent.
Chapter 8, Memory Management in Go, discusses how memory management is paramount to system performance. Being able to utilize a computer's memory footprint to the fullest allows you to keep highly functioning programs in memory so that you don't often have to take the large performance hit of swapping to disk. Being able to manage memory effectively is a core tenet of writing performant Go code.
Chapter 9, GPU Parallelization in Go, focuses on GPU accelerated programming, which is becoming more and more important in today's high-performance computing stacks. We can use the CUDA driver API for GPU acceleration. This is commonly used in topics such as deep learning algorithms.
Chapter 10, Compile Time Evaluations in Go, discusses minimizing dependencies, with each file declaring its own dependencies, while writing a Go program. Go's regular syntax, module support, and interface satisfaction also help to improve compile times. These things, alongside using containers to build Go code and utilizing the Go build cache, help to make Go compilation quicker.
Chapter 11, Building and Deploying Go Code, focuses on how to deploy new Go code. To elaborate further, this chapter explains how we can push this out to one or multiple places in order to test against different environments. Doing this will allow us to push the envelope of the amount of throughput that we have for our system.
Chapter 12, Profiling Go Code, focuses on profiling Go code, which is one of the best ways to determine where bottlenecks live within your Go functions. Performing this profiling will help you to deduce where you can make improvements within your function and how much time individual pieces take within your function call with respect to the overall system.
Chapter 13, Tracing Go Code, deals with a fantastic way to check interoperability between functions and services within your Go program, also known as tracing. Tracing allows you to pass context through your system and evaluate where you are being held up. Whether it's a third-party API call, a slow messaging queue, or an O(n²) function, tracing will help you to find where this bottleneck resides.
Chapter 14, Clusters and Job Queues, focuses on the importance of clustering and job queues in Go as good ways to get distributed systems to work synchronously and deliver a consistent message. Distributed computing is difficult, and it becomes very important to watch for potential performance optimizations within both clustering and job queues.
Chapter 15, Comparing Code Quality Across Versions, deals with what you should do after you have written, debugged, profiled, and monitored Go code that is monitoring your application in the long term for performance regressions. Adding new features to your code is fruitless if you can't continue to deliver a level of performance that other systems in your infrastructure depend on.
This book is for Go professionals and developers seeking to execute their code faster, so an intermediate to advanced understanding of Go programming is necessary to make the most out of this book. The Go language has relatively minimal system requirements. A modern computer with a modern operating system should support the Go runtime and its dependencies. Go is used in many low-power devices that have limited CPU, memory, and I/O capabilities.
You can see the requirements for the language listed at the following URL: https://github.com/golang/go/wiki/MinimumRequirements.
In this book, I used Fedora Linux (version 29 at the time of writing) as the operating system. Instructions on how to install the Fedora Workstation Linux distribution can be found on the Fedora page at the following URL: https://getfedora.org/en/workstation/download/.
Docker is used for many of the examples in this book. You can see the requirements listed for Docker at the following URL: https://docs.docker.com/install/.
In Chapter 9, GPU Parallelization in Go, we discuss GPU programming. To perform the tasks of this chapter, you'll need one of two things:
An NVIDIA-enabled GPU. I used an NVIDIA GeForce GTX 670 in my testing, which has a compute capability of 3.0.
A GPU-enabled cloud instance. Chapter 9 discusses a couple of different providers and methodologies for this. GPUs on Compute Engine work for this. More up-to-date information on GPUs on Compute Engine can be found at the following URL: https://cloud.google.com/compute/docs/gpus.
After you read this book, I hope you'll be able to write more efficient Go code. You'll hopefully be able to quantify and validate your efforts as well.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/bobstrecansky/HighPerformanceWithGo/. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Code in Action videos for this book can be viewed at http://bit.ly/2QcfEJI.
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781789805789_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The following code blocks will show the Next() incantation."
A block of code is set as follows:
// Note the trailing () for this anonymous function invocation
func() {
    fmt.Println("Hello Go")
}()
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
// Note the trailing () for this anonymous function invocation
func() {
    fmt.Println("Hello Go")
}()
Any command-line input or output is written as follows:
$ go test -bench=. -benchtime 2s -count 2 -benchmem -cpu 4
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "The reverse algorithm takes a dataset and reverses the values of the set."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
In this section, you will learn why performance in computer science is important. You will also learn why performance is important in the Go language. Moving on, you will learn about data structures and algorithms, concurrency, STL algorithm equivalents, and the matrix and vector computations in Go.
The chapters in this section include the following:
Chapter 1, Introduction to Performance in Go
Chapter 2, Data Structures and Algorithms
Chapter 3, Understanding Concurrency
Chapter 4, STL Algorithm Equivalents in Go
Chapter 5, Matrix and Vector Computation in Go
This book is written with intermediate to advanced Go developers in mind. These developers will be looking to squeeze more performance out of their Go application. To do this, this book will help to drive the four golden signals as defined in the Site Reliability Engineering Workbook (https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/). If we can reduce latency and errors, as well as increase traffic whilst reducing saturation, our programs will continue to be more performant. Following the ideology of the four golden signals is beneficial for anyone developing a Go application with performance in mind.
In this chapter, you'll be introduced to some of the core concepts of performance in computer science. You'll learn some of the history of the Go computer programming language, how its creators decided that it was important to put performance at the forefront of the language, and why writing performant Go is important. Go is a programming language designed with performance in mind, and this book will take you through some of the highlights on how to use some of Go's design and tooling to your advantage. This will help you to write more efficient code.
In this chapter, we will cover the following topics:
Understanding performance in computer science
A brief history of Go
The ideology behind Go performance
These topics are provided to guide you in beginning to understand the direction you need to take to write highly performant code in the Go language.
For this book, you should have a moderate understanding of the Go language. Some key concepts to understand before exploring these topics include the following:
The Go reference specification:
https://golang.org/ref/spec
How to write Go code:
https://golang.org/doc/code.html
Effective Go:
https://golang.org/doc/effective_go.html
Throughout this book, there will be many code samples and benchmark results. These are all accessible via the GitHub repository at https://github.com/bobstrecansky/HighPerformanceWithGo/.
If you have a question or would like to request a change to the repository, feel free to create an issue within the repository at https://github.com/bobstrecansky/HighPerformanceWithGo/issues/new.
Performance in computer science is a measure of work that can be accomplished by a computer system. Performant code is vital to many different groups of developers. Whether you're part of a large-scale software company that needs to quickly deliver masses of data to customers, an embedded computing device programmer who has limited computing resources available, or a hobbyist looking to squeeze more requests out of the Raspberry Pi that you are using for your pet project, performance should be at the forefront of your development mindset. Performance matters, especially when your scale continues to grow.
It is important to remember that we are sometimes limited by physical bounds. CPU, memory, disk I/O, and network connectivity all have performance ceilings based on the hardware that you either purchase or rent from a cloud provider. There are other systems that may run concurrently alongside our Go programs that can also consume resources, such as OS packages, logging utilities, monitoring tools, and other binaries—it is prudent to remember that our programs are very frequently not the only tenants on the physical machines they run on.
Optimized code generally helps in many ways, including the following:
Decreased response time: The total amount of time it takes to respond to a request.
Decreased latency: The time delay between a cause and effect within a system.
Increased throughput: The rate at which data can be processed.
Higher scalability: More work can be processed within a contained system.
There are many ways to service more requests within a computer system. Adding more individual computers (often referred to as horizontal scaling) or upgrading to more powerful computers (often referred to as vertical scaling) are common practices used to handle demand within a computer system. One of the fastest ways to service more requests without needing additional hardware is to increase code performance. Performance engineering acts as a way to help with both horizontal and vertical scaling. The more performant your code is, the more requests you can handle on a single machine. This pattern can potentially result in fewer or less expensive physical hosts to run your workload. This is a large value proposition for many businesses and hobbyists alike, as it helps to drive down the cost of operation and improves the end user experience.
Big O notation (https://en.wikipedia.org/wiki/Big_O_notation) is commonly used to describe the limiting behavior of a function based on the size of the inputs. In computer science, Big O notation is used to explain how efficient algorithms are in comparison to one another—we'll discuss this more in detail in Chapter 2, Data Structures and Algorithms. Big O notation is important in optimizing performance because it is used as a comparison operator in explaining how well algorithms will scale. Understanding Big O notation will help you to write more performant code, as it will help to drive performance decisions in your code as the code is being composed. Knowing at what point different algorithms have relative strengths and weaknesses helps you to determine the correct choice for the implementation at hand. We can't improve what we can't measure—Big O notation helps us to give a concrete measurement to the problem statement at hand.
As we make our performance improvements, we will need to continually monitor our changes to view impact. Many methods can be used to monitor the long-term performance of computer systems. A couple of examples of these methods would be the following:
Brendan Gregg's USE Method: Utilization, saturation, and errors (www.brendangregg.com/usemethod.html)
Tom Wilkie's RED metrics: Requests, errors, and duration (https://www.weave.works/blog/the-red-method-key-metrics-for-microservices-architecture/)
Google SRE's four golden signals: Latency, traffic, errors, and saturation (https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/)
We will discuss these concepts further in Chapter 15, Comparing Code Quality Across Versions. These paradigms help us to make smart decisions about the performance optimizations in our code, as well as avoid premature optimization. Avoiding premature optimization is crucial for many computer programmers. Very frequently, we have to determine what fast enough is. We can waste our time trying to optimize a small segment of code when many other code paths have an opportunity to improve from a performance perspective. Go's simplicity allows for additional optimization without cognitive overhead or an increase in code complexity. The algorithms that we will discuss in Chapter 2, Data Structures and Algorithms, will help us to avoid premature optimization.
In this book, we will also attempt to understand what exactly we are optimizing for. The techniques for optimizing for CPU or memory utilization may look very different than optimizing for I/O or network latency. Being cognizant of your problem space as well as your limitations within your hardware and upstream APIs will help you to determine how to optimize for the problem statement at hand. Optimization also often shows diminishing returns. Frequently the return on development investment for a particular code hotspot isn't worthwhile based on extraneous factors, or adding optimizations will decrease readability and increase risk for the whole system. If you can determine whether an optimization is worth doing early on, you'll be able to have a more narrowly scoped focus and will likely continue to develop a more performant system.
It can be helpful to understand the cost of baseline operations within a computer system. Peter Norvig, the Director of Research at Google, designed a table (the image that follows) to help developers understand the various common timing operations on a typical computer (https://norvig.com/21-days.html#answers):
Having a clear understanding of how the different parts of a computer interoperate helps us to deduce where our performance optimizations should lie. As we can derive from the table, it takes quite a bit longer to read 1 MB of data sequentially from disk than to send 2 KB over a 1 Gbps network link. Having back-of-the-napkin comparisons for common computer interactions at hand can very much help you to deduce which piece of your code to optimize next. Determining bottlenecks within your program becomes easier when you take a step back and look at a snapshot of the system as a whole.
Breaking performance problems down into small, manageable subproblems that can be improved upon concurrently is a helpful way to approach optimization. Trying to tackle all performance problems at once often leaves the developer stymied and frustrated, and is a common reason why performance efforts fail. Focusing on the bottlenecks in the current system tends to yield results, and fixing one bottleneck will often quickly reveal another. For example, after you fix a CPU utilization problem, you may find that your system's disk can't write the computed values fast enough. Working through bottlenecks in a structured fashion is one of the best ways to create performant and reliable software.
Starting at the bottom of the pyramid in the following image, we can work our way up to the top. This diagram shows a suggested priority for making performance optimizations: the first two levels of the pyramid (the design level and the algorithm and data structures level) will often provide more than ample real-world optimization targets. Changing the design of a program, alongside its algorithms and data structures, is often the most efficient way to improve the speed and quality of code bases:
Design-level decisions often have the most measurable impact on performance. Determining goals during the design level can help to determine the best methodology for optimization. For example, if we are optimizing for a system that has slow disk I/O, we should prioritize lowering the number of calls to our disk. Inversely, if we are optimizing for a system that has limited compute resources, we need to calculate only the most essential values needed for our program's response. Creating a detailed design document at the inception of a new project will help with understanding where performance gains are important and how to prioritize time within the project. Thinking from a perspective of transferring payloads within a compute system can often lead to noticing places where optimization can occur. We will talk more about design patterns in Chapter 3, Understanding Concurrency.
Algorithm and data structure decisions often have a measurable performance impact on a computer program. We should focus on utilizing constant O(1), logarithmic O(log n), linear O(n), and log-linear O(n log n) functions while writing performant code. Avoiding quadratic complexity, O(n²), at scale is also important for writing scalable programs. We will talk more about big O notation and its relation to Go in Chapter 2, Data Structures and Algorithms.
Robert Griesemer, Rob Pike, and Ken Thompson created the Go programming language in 2007. It was originally designed as a general-purpose language with a keen focus on systems programming. The creators designed the Go language with a couple of core tenets in mind:
Static typing
Runtime efficiency
Readability
Usability
Ease of learning
High-performance networking and multiprocessing
Go was publicly announced in 2009, and version 1.0 was released in March 2012. At the time of writing, Go version 1.14 has been released, and Go version 2 is on the horizon. As mentioned, one of Go's initial core architecture considerations was high-performance networking and multiprocessing. This book will cover many of the design considerations that Griesemer, Pike, and Thompson have implemented and evangelized on behalf of their language. The designers created Go because they were unhappy with some of the choices and directions taken in the C++ language. Long-running compilations on large distributed compile clusters were a main source of pain for the creators. During this time, the authors had started learning about the next C++ release, dubbed C++0x (later released as C++11). That release had a great many new features planned, and the Go team decided they wanted to adopt an idiom of less is more in the language they used to do their work.
The authors of the language held their first meeting to discuss starting with the C programming language, building on desired features, and removing extraneous functionality that they didn't feel was important. The team ended up starting from scratch, borrowing only some of the most atomic pieces of C and other languages they were comfortable writing in. After their work started to take form, they realized that they were taking away some of the core traits of other languages, notably headers, circular dependencies, and classes. The authors believe that, even with the removal of many of these fragments, Go can still be more expressive than its predecessors.
The standard library in Go follows this same pattern. It has been designed with both simplicity and functionality in mind. Adding slices, maps, and composite literals to the language helped it to become opinionated early. Go's standard library lives within $GOROOT and is directly importable. Having these default data structures built into the language enables developers to use them effectively. The standard library packages are bundled with the language distribution and are available immediately after you install Go. It is often mentioned that the standard library is a solid reference on how to write idiomatic Go. The reasoning is that these core library packages are written clearly and concisely, and with quite a bit of context. They also handle small but important implementation details well, such as the ability to set timeouts for connections and to explicitly gather data from underlying functions. These language details have helped the language to flourish.
Some of the notable Go runtime features include the following:
Garbage collection for safe memory management (a concurrent, tri-color, mark-sweep collector)
Concurrency to support more than one task simultaneously (more about this in Chapter 3, Understanding Concurrency)
Stack management for memory optimization (segmented stacks were used in the original implementation; stack copying is the current incarnation of Go stack management)
Go's binary release also includes a vast toolset for creating optimized code. Within the Go binary, the go command has a lot of functions that help to build, deploy, and validate code. Let's discuss a couple of the core pieces of functionality as they relate to performance.
Godoc is Go's documentation tool; it keeps documentation at the forefront of program development. A clean implementation, in-depth documentation, and modularity are all core pieces of building a scalable, performant system, and Godoc helps to accomplish these goals by auto-generating documentation. It extracts and generates documentation from the packages it finds within $GOROOT and $GOPATH. After generating this documentation, Godoc runs a web server and displays it as a web page. Documentation for the standard library can be seen on the Go website. As an example, the documentation for the standard library pprof package can be found at https://golang.org/pkg/net/http/pprof/.
The addition of gofmt (Go's code formatting tool) to the language brought a different kind of performance to Go. gofmt allows Go to be very opinionated when it comes to code formatting. Having precise, enforced formatting rules makes it possible to write Go in a way that is sensible for the developer, while letting the tool format the code to follow a consistent pattern across Go projects. Many developers have their IDE or text editor run gofmt when they save the file they are composing. Consistent code formatting reduces cognitive load and allows the developer to focus on other aspects of their code, rather than on whether to use tabs or spaces for indentation. Reducing the cognitive load helps with developer momentum and project velocity.
Go's build system also helps with performance. The go build command is a powerful tool that compiles packages and their dependencies, and it is also helpful for dependency management. The resulting output from the build system is a compiled, statically linked binary that contains all of the elements necessary to run on the platform you've compiled for. Go modules (a feature with preliminary support introduced in Go 1.11 and finalized in Go 1.13) are Go's dependency management system. Having explicit dependency management for a language helps to deliver a consistent experience, with groupings of versioned packages treated as a cohesive unit, allowing for more reproducible builds. Reproducible builds help developers to create binaries via a verifiable path from the source code. The optional step of creating a vendor directory within your project also helps with locally storing and satisfying your project's dependencies.
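A module is declared by a go.mod file at the project root; go build then resolves dependencies and records their versions automatically. A minimal sketch (the module path and the dependency shown are placeholders for illustration):

```
module example.com/myproject

go 1.14

require github.com/pkg/errors v0.9.1
```

Pinning exact versions in go.mod (and their checksums in the companion go.sum file) is what makes builds reproducible across machines.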
