A tutorial leading the aspiring Go developer to full mastery of Golang's distributed features.
Distributed Computing with Go gives developers who already understand how basic Go development works the tools to fulfill the true potential of Golang development in a world of concurrent web and cloud applications. Nikhil starts out by setting up a professional Go development environment. Then you’ll learn the basic concepts and practices of Golang concurrent and parallel development.
In the next few chapters, you’ll find out how to balance resources and data with REST and standard web approaches while keeping concurrency in mind. Most Go applications these days run in a data center or on the cloud, and the chapters that follow build on that assumption. There, you’ll expand your skills considerably by writing a distributed document indexing system over two chapters. This system has to balance a large corpus of documents with considerable analytical demands.
Another use case is the way in which a web application written in Go can be consciously redesigned to take distributed features into account. The chapter is rather interesting for Go developers who have to migrate existing Go applications to computationally and memory-intensive environments. The final chapter relates to the rather onerous task of testing parallel and distributed applications, something that is not usually taught in standard computer science curricula.
This book is for developers who are familiar with the Golang syntax and have a good idea of how basic Go development works. It would be advantageous if you have been through a web application product cycle, although it’s not necessary.
V.N. Nikhil Anurag is a Go developer currently working in Berlin. He speaks at conferences about how to use Go in domains such as concurrency, file systems, and distributed systems. He is also trying to bridge the gap between the rich literature on concurrency and the practice of programming goroutines and channels. He did his Bachelor's in Electronics and Instrumentation Engineering from JNTU, India and Master of Science in Control System from University of Sheffield, UK.
Pages: 226
Year of publication: 2018
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Dominic Shakeshaft
Acquisition Editor: Frank Pohlmann
Project Editor: Radhika Atitkar
Content Development Editor: Monika Sangwan
Technical Editor: Nidhisha Shetty
Copy Editor: Tom Jacob
Proofreader: Safis Editing
Indexer: Rekha Nair
Graphics: Tom Scaria
Production Coordinator: Nilesh Mohite
First published: February 2018
Production reference: 1270218
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78712-538-4
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
V.N. Nikhil Anurag is a Go developer currently working in Berlin. He speaks at conferences about how to use Go in domains such as concurrency, file systems, and distributed systems. He is also trying to bridge the gap between the rich literature on concurrency and the practice of programming goroutines and channels. He did his Bachelor's in Electronics and Instrumentation Engineering from JNTU, India and Master of Science in Control System from University of Sheffield, UK.
Pankaj Khairnar is a cofounder and CTO at Qwentic (A Golang specialized development company). He loves programming, and for the last 10 years, he has been developing highly scalable and distributed enterprise applications using various technologies.
Jinzhu Zhang is a veteran coder and the creator of, or a contributor to, many open source projects, such as GORM. He is on GitHub at github.com/jinzhu.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Distributed Computing with Go
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Developer Environment for Go
GOROOT
GOPATH
src/
pkg/
bin/
Package management
go get
glide
go dep
Structuring a project
Working with book's code
Containers
Docker
Docker versus Virtual Machine (VM)
Understanding Docker
Testing Docker setup
Dockerfile
main.go
Testing in Go
variadic.go
variadic_test.go
Running tests in variadic_test.go
addInt.go
addInt_test.go
Running tests in addInt_test.go
nil_test.go
Running tests in nil_test.go
Summary
Understanding Goroutines
Concurrency and parallelism
Concurrency
Code overview
Serial task execution
Serial task execution with goroutines
Concurrent task execution
Parallelism
Go's runtime scheduler
Goroutine
OS thread or machine
Context or processor
Scheduling with G, M, and P
Gotchas when using goroutines
Single goroutine halting the complete program
Goroutines aren't predictable
Summary
Channels and Messages
Controlling parallelism
Distributed work without channels
Distributed work with channels
What is a channel?
Solving the cashier problem with goroutines
Channels and data communication
Messages and events
Types of channels
The unbuffered channel
The buffered channel
The unidirectional buffer
Closing channels
Multiplexing channels
Summary
The RESTful Web
HTTP and sessions
A brief history of HTTP
HTTP sessions
The REST protocol
The server and client architecture
The standard data format
Resources
Reusing the HTTP protocol
GET
POST
PUT and PATCH
DELETE
Upgradable components
Fundamentals of a REST server
A simple web server
Designing a REST API
The data format
The book resource
GET /api/books
GET /api/books/<id>
POST /api/books
PUT /api/books/<id>
DELETE /api/books/<id>
Unsuccessful requests
Design decisions
The REST server for books API
main.go
books-handler/common.go
books-handler/actions.go
books-handler/handler.go
How to make REST calls
cURL
GET
DELETE
PUT
POST
Postman
net/http
Summary
Introducing Goophr
What is Goophr?
Design overview
OpenAPI specification
Goophr Concierge API definition
Goophr Librarian API definition
Project structure
Summary
Goophr Concierge
Revisiting the API definition
Document feeder – the REST API endpoint
Query handler – the REST API endpoint
Conventions
Code conventions
Diagram conventions
Logical flow diagrams
The doc processor
The doc store
The index processor
The line store
The consolidated flow diagram
Queue workers
Single stores
Buffered channels
The Concierge source code
Running tests
The Concierge server
Summary
Goophr Librarian
The standard indexing model
An example – books with an index of words
The inverted indexing model
An example – the inverted index for words in books
Ranking
Revisiting the API definition
The document indexer – the REST API endpoint
The query resolver – the REST API endpoint
Code conventions
Librarian source code
main.go
common/helpers.go
api/index.go
api/query.go
Testing Librarian
Testing feeder.go using /api/index
Testing /api/query
Summary
Deploying Goophr
Updating Goophr Concierge
Handle multiple Librarians
Aggregated search results
Orchestrating with docker-compose
Environment variables and API ports
The file server
The Goophr source code
librarian/main.go
concierge/main.go
concierge/api/query.go
simple-server/Dockerfile
simple-server/main.go
docker-compose.yaml
.env
Running Goophr with docker-compose
Adding documents to Goophr
Searching for keywords with Goophr
Search – "apple"
Search – "cake"
Search – "apple", "cake"
Individual logs with docker-compose
Authorization on a web server
secure/secure.go
secure/secure_test.go
Test results
Summary
Foundations of Web Scale Architecture
Scaling a web application
The single server instance
Separate layers for the web and database
Multiple server instances
The load balancer
Multi-availability zones
The database
SQL versus NoSQL
Which type of database should we use?
Database replication
Master-replica replication
Master-master replication
Failover cluster replication
Monolith versus microservices
Mediator design pattern
Deployment options
Maintainability of multiple instances
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
The Go programming language was developed at Google to solve the problems they faced while developing software for their infrastructure. They needed a language that was statically typed without slowing down the developer, would compile and execute quickly, would take advantage of multicore processors, and would make working across distributed systems effortless.
The mission of Distributed Computing with Go is to make reasoning about concurrency and parallelism effortless and to give the reader the confidence to design and implement such programs in Go. We will start by digging into goroutines and channels, the two fundamental concepts around which the language is built. Next, we will design and build a distributed search engine using Go and Go's standard library.
This book is for developers who are familiar with the Golang syntax and have a good idea of how basic Go development works. It would be advantageous if you have been through a web application product cycle, although it's not necessary.
Chapter 1, Developer Environment for Go, covers the topics and concepts required to start working with Go and the rest of the book. Some of these topics include Docker and testing in Go.
Chapter 2, Understanding Goroutines, introduces the topic of concurrency and parallelism and then dives deep into the implementation details of goroutines, Go's runtime scheduler, and more.
Chapter 3, Channels and Messages, begins by explaining the complexity of controlling parallelism before introducing strategies to control parallelism, using different types of channels.
Chapter 4, The RESTful Web, provides all the context and knowledge required to start designing and building REST APIs in Go. We will also discuss the interaction with a REST API server using different available approaches.
Chapter 5, Introducing Goophr, opens the discussion on what is meant by a distributed search engine, introduces the OpenAPI specification for describing REST APIs, and uses it to describe the responsibilities of the search engine's components. Finally, we'll describe the project structure.
Chapter 6, Goophr Concierge, dives deep into the first component of Goophr by describing in detail how the component is supposed to work. These concepts are further driven home with the help of architectural and logical flow diagrams. Finally, we'll look at how to implement and test the component.
Chapter 7, Goophr Librarian, is a detailed look at the component that is responsible for maintaining the index for the search terms. We also look at how to search for given terms, how to order our search results, and more. Finally, we'll look at how to implement and test the component.
Chapter 8, Deploying Goophr, brings together everything we have implemented in the previous three chapters and starts the application on the local system. We will then test our design by adding a few documents and searching against them via the REST API.
Chapter 9, Foundations of Web Scale Architecture, is an introduction to the vast and complex topic of how to design and scale a system to meet demand at web scale. We will start with a single instance of a monolith running on a single server and scale it up to span multiple regions, add redundancy safeguards to ensure that the service is never down, and more.
The material in the book is designed to enable a hands-on approach. Throughout the book, a conscious effort has been made to provide all the relevant information to the reader beforehand so that, if the reader chooses, they can try to solve the problem on their own and then refer to the solution provided in the book.
The code in the book does not have any Go dependencies beyond the standard library. This is done in order to ensure that the code examples provided in the book never change, and this also allows us to explore the standard library.
The source code in the book should be placed at $GOPATH/src/distributed-go. The source code for examples given will be located inside the $GOPATH/src/distributed-go/chapterX folder, where X stands for the chapter number.
Download and install Go from the https://golang.org/ website and Docker from the https://www.docker.com/community-edition website.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at http://www.packtpub.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the on-screen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Distributed-Computing-with-Go. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/DistributedComputingwithGo_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example, "Now that we have all the code in place, let's build the Docker image using the Dockerfile file."
A block of code is set as follows:
// addInt.go
package main

func addInt(numbers ...int) int {
    sum := 0
    for _, num := range numbers {
        sum += num
    }
    return sum
}
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
// addInt.go
package main

func addInt(numbers ...int) int {
    sum := 0
    for _, num := range numbers {
        sum += num
    }
    return sum
}
Any command-line input or output is written as follows:
$ cd docker
Bold: Indicates a new term, an important word, or words that you see on the screen. For example, words in menus or dialog boxes appear in the text like this: "Select System info from the Administration panel."
Feedback from our readers is always welcome.
General feedback: Email [email protected], and mention the book's title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report it to us. Please visit http://www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packtpub.com.
Go is a modern programming language built for 21st-century application development. Hardware and technology have advanced significantly over the past decade, and most other languages do not take advantage of these technical advancements. As we shall see throughout the book, Go allows us to build network applications that take advantage of the concurrency and parallelism made available by multicore systems.
In this chapter, we will look at some of the topics required to work through the rest of the book, such as:
Go configuration: GOROOT, GOPATH, and so on
Go package management
Project structure used throughout the book
Container technology and how to use Docker
Writing tests in Go
In order to run or build a Go project, we need to have access to the Go binary and its libraries. A typical installation of Go (instructions can be found at https://golang.org/dl/) on Unix-based systems will place the Go binary at /usr/bin/go. However, it is possible to install Go on a different path. In that case, we need to set the GOROOT environment variable to point to our Go installation path and also append its bin/ directory to our PATH environment variable.
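To illustrate the PATH adjustment for a custom installation, here is a minimal sketch in Go. The appendGoToPath helper and the /opt/go installation path are assumptions made up for this example; the only fact relied on is that the go binary lives in the bin/ directory of the installation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// appendGoToPath (a hypothetical helper) returns the given PATH value with
// the bin/ directory of the Go installation appended to it.
func appendGoToPath(path, goroot string) string {
	goBin := filepath.Join(goroot, "bin")
	if path == "" {
		return goBin
	}
	return path + string(os.PathListSeparator) + goBin
}

func main() {
	// Fall back to /opt/go, an assumed custom installation path,
	// when GOROOT is not set in the environment.
	goroot := os.Getenv("GOROOT")
	if goroot == "" {
		goroot = "/opt/go"
	}
	fmt.Println(appendGoToPath(os.Getenv("PATH"), goroot))
}
```

In practice, this adjustment is usually made once in the shell profile rather than in a program; the sketch only shows which directory has to end up on the PATH.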
Programmers tend to work on many projects and it is good practice to have the source code separate from nonprogramming-related files. It is a common practice to have the source code in a separate location or workspace. Every programming language has its own conventions on how the language-related projects should be set up and Go is no exception to this.
GOPATH is the most important environment variable the developer has to set. It tells the Go compiler where to find the source code for the project and its dependencies. There are conventions within the GOPATH that need to be followed, and they have to do with folder hierarchies.
This is the directory that will contain the source code of our projects and their dependencies. In general, we want our source code to have version control and be hosted on the cloud. It would also be great if we or anyone else could easily use our project. This requires a little extra setup on our part.
Let's imagine that our project is hosted at http://git-server.com/user-name/my-go-project. We want to clone and build this project on our local system. To make it work properly, we need to clone it to $GOPATH/src/git-server.com/user-name/my-go-project. When we build a Go project with dependencies for the first time, we will see that the src/ folder has many directories and subdirectories that contain the dependencies of our project.
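The mapping from a repository URL to its location under src/ can be sketched as follows. The cloneTarget helper and the /home/me/go workspace are hypothetical names chosen for this illustration; the sketch simply drops the URL scheme and joins the host and path under $GOPATH/src.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// cloneTarget (a hypothetical helper) returns the directory under
// $GOPATH/src into which the repository at repoURL should be cloned.
func cloneTarget(gopath, repoURL string) (string, error) {
	u, err := url.Parse(repoURL)
	if err != nil {
		return "", err
	}
	// The scheme is dropped; host and path become the directory hierarchy.
	return filepath.Join(gopath, "src", u.Host, filepath.FromSlash(u.Path)), nil
}

func main() {
	// /home/me/go is an assumed GOPATH used only for this example.
	dir, err := cloneTarget("/home/me/go", "http://git-server.com/user-name/my-go-project")
	if err != nil {
		fmt.Println("invalid URL:", err)
		return
	}
	fmt.Println(dir)
}
```

The same directory doubles as the import path of the project, which is why the clone location matters.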
Go is a compiled programming language; we have the source code and code for the dependencies that we want to use in our project. In general, every time we build a binary, the compiler has to read the source code of our project and dependencies and then compile it to machine code. Compiling unchanged dependencies every time we compile our main program would lead to a very slow build process. This is the reason that object files exist; they allow us to compile dependencies into reusable machine code that can be readily included in our Go binary.
These object files are stored in $GOPATH/pkg; they follow a directory structure similar to that of src/, except that they are within a subdirectory. These directories tend to follow the naming pattern of <OS>_<CPU-Architecture>, because we can build executable binaries for multiple systems:
$ tree $GOPATH/pkg
pkg
└── linux_amd64
    ├── github.com
    │   ├── abbot
    │   │   └── go-http-auth.a
    │   ├── dimfeld
    │   │   └── httppath.a
    │   ├── oklog
    │   │   └── ulid.a
    │   ├── rcrowley
    │   │   └── go-metrics.a
    │   ├── sirupsen
    │   │   └── logrus.a
    │   ├── sony
    │   │   └── gobreaker.a
    └── golang.org
        └── x
            ├── crypto
            │   ├── bcrypt.a
            │   ├── blowfish.a
            │   └── ssh
            │       └── terminal.a
            ├── net
            │   └── context.a
            └── sys
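As a small illustration of the <OS>_<CPU-Architecture> naming pattern, the following sketch builds the name of the object-file directory from the runtime.GOOS and runtime.GOARCH constants of the standard library. The pkgDir helper and the /home/me/go workspace are assumptions introduced for this example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// pkgDir (a hypothetical helper) returns the directory under $GOPATH/pkg
// that holds object files for the given operating system and architecture.
func pkgDir(gopath, goos, goarch string) string {
	return filepath.Join(gopath, "pkg", goos+"_"+goarch)
}

func main() {
	// Directory for the system this program is running on.
	fmt.Println(pkgDir("/home/me/go", runtime.GOOS, runtime.GOARCH))
	// Directory that a build targeting 64-bit Windows would use.
	fmt.Println(pkgDir("/home/me/go", "windows", "amd64"))
}
```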
Go compiles and builds our projects into executable binaries and places them in this directory. Depending on the build specs, they might be executable on your current system or other systems. In order to use the binaries that are available in the bin/ directory, we need to set the corresponding GOBIN=$GOPATH/bin environment variable and make sure that this directory is on our PATH.
In the days of yore, all programs were written from scratch: every utility function and every library needed to run the code had to be written by hand. Nowadays, we don't want to deal with the low-level details on a regular basis; it would be unimaginable to write all the required libraries and utilities from scratch. Go comes with a rich standard library, which will be enough for most of our needs. However, it is possible that we might need a few extra libraries or features not provided by the standard library. Such libraries should be available on the internet, and we can download and add them into our project to start using them.
In the previous section, GOPATH, we discussed how all our projects are saved into qualified paths of the $GOPATH/src/git-server.com/user-name/my-go-project form. This is true for any and all dependencies we might have. There are multiple ways to handle dependencies in Go. Let's look at some of them.
go get is the utility that ships with the Go toolchain for package management. We can install a new package/library by running the following command:
$ go get git-server.com/user-name/library-we-need
This will download and build the source code and then install it as a binary executable (if it can be used as a standalone executable). The go get utility also installs all the dependencies required by the dependency retrieved for our project.
glide is one of the most widely used package management tools in the Go community. It addresses the limitations of go get, but it needs to be installed manually by the developer. The following is a simple way to install and use glide:
$ curl https://glide.sh/get | sh
$ mkdir new-project && cd new-project
$ glide create
$ glide get github.com/last-ent/skelgor # A helper project to generate project skeleton.
$ glide install # In case any dependencies or configuration were manually added.
$ glide up # Update dependencies to latest versions of the package.
$ tree
.
├── glide.lock
├── glide.yaml
└── vendor
    └── github.com
        └── last-ent
            └── skelgor
                ├── LICENSE
                ├── main.go
                └── README.md
In case you do not wish to install glide via curl and sh, other options are available and described in better detail on the project page, available at https://github.com/masterminds/glide.
go dep is a new dependency management tool being developed by the Go community. At the time of writing, it requires Go 1.7 or newer to compile, and it is ready for production use. However, it is still undergoing changes and hasn't yet been merged into Go's standard toolchain.
A project might have more than just the source code for the project, for example, configuration files and project documentation. Depending upon preferences, the way the project is structured can drastically change. However, the most important thing to remember is that the entry point to the whole program is through the main function, which is implemented within main.go as a convention.
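As a minimal sketch of this convention, a main.go can keep the main function thin and delegate the real work to ordinary functions that are easy to test. The greeting function below is a made-up example and is not part of the book's code.

```go
// main.go
package main

import (
	"fmt"
	"os"
)

// greeting (a hypothetical function) holds the program logic so that
// main only has to wire up input and output.
func greeting(name string) string {
	if name == "" {
		name = "world"
	}
	return "hello, " + name
}

// main is the entry point of the whole program.
func main() {
	name := ""
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	fmt.Println(greeting(name))
}
```

Running go run main.go gopher from the project directory should print the greeting for the given name.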
The application we will be building in this book will have the following initial structure:
$ tree
.
├── common
│   ├── helpers.go
│   └── test_helpers.go
└── main.go
The source code discussed throughout the book can be obtained in two ways:
Using go get -u github.com/last-ent/distributed-go
Downloading the code bundle from the website and extracting it to $GOPATH/src/github.com/last-ent/distributed-go
The code for the complete book should now be available at $GOPATH/src/github.com/last-ent/distributed-go, and the code specific to each chapter will be available in that particular chapter number's directory.
For example,
Code for Chapter 1 -> $GOPATH/src/github.com/last-ent/distributed-go/chapter1
Code for Chapter 2 -> $GOPATH/src/github.com/last-ent/distributed-go/chapter2
And so on.
Whenever we discuss code in any particular chapter, it is implied that we are in the respective chapter's folder.
Throughout the book, we will be writing Go programs that will be compiled to binaries and run directly on our system. However, in the later chapters we will be using docker-compose to build and run multiple Go applications. These applications can run without any real problem on our local system; however, our ultimate goal is to be able to run these programs on servers and to be able to access them over the internet.
During the 1990s and early 2000s, the standard way to deploy applications to the internet was to get a server instance, copy the code or binary onto the instance, and then start the program. This worked great for a while, but soon complications began to arise. Here are a few of them:
Code that worked on the developer's machine might not work on the server.
Programs that ran perfectly on a server instance might fail upon applying the latest patch to the server's OS.
For every new instance added as part of a service, various installation scripts had to be run to bring the new instance on par with all the other instances. This can be a very slow process.
