Get acquainted with GCP and manage robust, highly available, and dynamic solutions to drive business objectives
Key Features
● Identify the strengths, weaknesses and ideal use-cases for individual services offered on the Google Cloud Platform
● Make intelligent choices about which cloud technology works best for your use-case
● Leverage Google Cloud Platform to analyze and optimize technical and business processes
Book Description
Using a public cloud platform was considered risky a decade ago, and unconventional even just a few years ago. Today, however, use of the public cloud is completely mainstream - the norm, rather than the exception. Several leading technology firms, including Google, have built sophisticated cloud platforms, and are locked in a fierce competition for market share.
The main goal of this book is to enable you to get the best out of the GCP, and to use it with confidence and competence. You will learn why cloud architectures take the forms that they do, and this will help you become a skilled high-level cloud architect. You will also learn how individual cloud services are configured and used, so that you are never intimidated when you have to build things yourself. You will also learn the right way to use the important GCP services, and the right situations in which to use them.
By the end of this book, you will be able to make the most out of Google Cloud Platform design.
What you will learn
● Set up a GCP account and utilize GCP services using the Cloud Shell, web console, and client APIs
● Harness the power of App Engine, Compute Engine, containers on the Kubernetes Engine, and Cloud Functions
● Pick the right managed service for your data needs, choosing intelligently between Datastore, Bigtable, and BigQuery
● Migrate existing Hadoop, Spark, and Pig workloads with minimal disruption to your existing data infrastructure, by using Dataproc intelligently
● Derive insights about the health, performance, and availability of cloud-powered applications with the help of monitoring, logging, and diagnostic tools in Stackdriver
Who this book is for
If you are a Cloud architect who is responsible for designing and managing robust cloud solutions with Google Cloud Platform, then this book is for you. System engineers and Enterprise architects will also find this book useful. A basic understanding of distributed applications would be helpful, although not strictly necessary. Some working experience on other public cloud platforms would help too.
You can read this e-book in Legimi apps or in any app that supports the following format:
Page count: 348
Year of publication: 2018
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Vijin Boricha
Acquisition Editor: Rohit Rajkumar
Content Development Editor: Abhishek Jadhav
Technical Editor: Mohd Riyan Khan
Copy Editors: Safis Editing, Dipti Mankame
Project Coordinator: Judie Jose
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Graphics: Tom Scaria
Production Coordinator: Shantanu Zagade
First published: June 2018
Production reference: 1220618
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78883-430-8
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Vitthal Srinivasan is a Google Cloud Platform Authorized Trainer and certified Google Cloud Architect and Data Engineer. Vitthal holds master's degrees in math and electrical engineering from Stanford and an MBA from INSEAD. He has worked at Google as well as at other large firms, such as Credit Suisse and Flipkart. He is currently in Loonycorn, a technical video content studio, of which he is a cofounder.
Janani Ravi is a certified Google Cloud Architect and Data Engineer. She has earned her master's degree in electrical engineering from Stanford. She is currently in Loonycorn, a technical video content studio, of which she is a cofounder. Prior to co-founding Loonycorn, she worked at various leading companies, such as Google and Microsoft, for several years as a software engineer.
Judy Raj is a Google Certified Professional Cloud Architect, and she has great experience with the three leading cloud platforms, namely AWS, Azure, and the GCP. She has also worked with a wide range of technologies in machine learning, data science, IoT, robotics, and mobile and web app development. She is currently a technical content engineer in Loonycorn. She holds a degree in computer science and engineering from Cochin University of Science and Technology. Being a driven engineer fascinated with technology, she is a passionate coder, an AI enthusiast, and a cloud aficionado.
Tim Berry is a systems architect and software engineer with over 20 years of experience in building enterprise infrastructure and systems on the internet and mobile platforms. He currently leads a team of SREs building customer solutions on Google Cloud Platform for a managed services provider in the UK. Tim is a Google Certified Professional Cloud Architect and Data Engineer, a Red Hat Certified Engineer, and systems administrator. He holds Red Hat Certified Specialist status for configuration management and containerized application development.
Nisarg M. Vasavada is a content engineer in Loonycorn. He pursued his master's in engineering at GTU, and he has been an active member of the technical education and research community through his publications. He loves writing and believes that simplifying complexities is the biggest responsibility of an author.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Google Cloud Platform for Architects
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Conventions used
Get in touch
Reviews
The Case for Cloud Computing
Genesis
Why Google Cloud Platform (GCP)?
Autoscaling and autohealing
Capital expenditure (CAPEX) versus operating expenses (OPEX)
Career implications
Summary
Introduction to Google Cloud Platform
Global, regional, and zonal resources
Accessing the Google Cloud Platform
Projects and billing
Setting up a GCP account
Using the Cloud Shell
Summary
Compute Choices – VMs and the Google Compute Engine
Google Compute Engine – GCE
Creating VMs
Creating a VM instance using the web console
Creating a VM instance using the command line
VM customization options
Operating system
Compute zone
Machine type
Networks – aka VPCs
Storage options
Persistent disks and local SSDs – block storage for GCE
Understanding persistent disks and local SSDs
Creating and attaching a persistent disk
Linux procedure for formatting and mounting a persistent disk
Sharing a persistent disk between multiple instances
Resizing a persistent disk
More on working with GCE VMs
Rightsizing recommendations
Availability policies
Auto-restart
Preemptibility
Load balancing
Autoscaling and managed instance groups
Billing
Labels and tags
Startup scripts
Snapshots and images
How to snapshot a disk
How to create an image of a disk
Cloud launcher
Deploying LAMP stack using GCE
Modifying GCE VMs
Summary
GKE, App Engine, and Cloud Functions
GKE
Contrasting containers and VMs
What is a container?
Docker containers and Kubernetes – complements, not substitutes
GKE
Creating a Kubernetes cluster and deploying a WordPress container
Using the features of GKE
Storage and persistent disks
Load balancing
Auto scaling
Scaling nodes with the cluster autoscaler
Scaling pods with the horizontal pod autoscaler
Multi-zone clusters
Cloud VPN integration
Rolling updates
The container registry
Federated clusters
Google App Engine – flexible
Hosted Docker containers with App Engine Flex
Running a simple Python application with App Engine Flex 
Cron Jobs with App Engine Flex
Advantages of GKE over Docker on VMs or App Engine Flex
Google App Engine – standard
Hosted web apps with App Engine Standard
Typical App Engine architecture
Deploying and running on App Engine Standard
Traffic splitting
Serverless compute with cloud functions
Cloud Functions triggered by HTTP
Cloud Functions triggered by Pub/Sub
Cloud functions triggered by GCS object notifications
Summary
Google Cloud Storage – Fishing in a Bucket
Knowing when (and when not) to use GCS
Serving Static Content with GCS Buckets
Storage classes – Regional, multi-regional, nearline, and coldline
Working with GCS buckets
Creating buckets
Creating buckets using the web console
Creating buckets using gsutil
Changing the storage class of bucket and objects
Transferring data in and out of buckets
Uploading data to buckets using the web console
Uploading data to buckets using gsutil
Copying data between buckets using the web console
Copying data between buckets using the gsutil command line
Using the Transfer Service (instead of gsutil or the web console)
Transfer Service or gsutil?
Use case – Object Versioning
Object versioning in the Cloud Storage bucket
Use case – object life cycle policies
Managing bucket life cycle using the web console
Manipulating object life-cycle via JSON file
Deleting objects permanently using the web console
Deleting objects permanently using gsutil
Use case – restricting access with both ACLs and IAM
Managing permissions in bucket using the GCP console
Use case – signed and timed URLs
Setting up signed URLs for cloud storage
Use case – reacting to object changes
Setting up object change notifications with the gsutil notification watchbucket
Use case – using customer supplied encryption keys
Use case – auto-syncing folders
Use case – mounting GCS using gcsfuse
Mounting GCS buckets
Use case – offline ingestion options
Summary
Relational Databases
Relational databases, SQL, and schemas
OLTP and the ACID properties
Scaling up versus scaling out
GCP Cloud SQL
Creating a Cloud SQL instance
Creating a database in a Cloud SQL instance
Importing a database
Testing Cloud SQL instances
Use case – managing replicas
Use case – managing certificates
Use case – operating Cloud SQL through VM instances
Automatic backup and restore
Cloud Spanner
Creating a Cloud Spanner instance
Creating a database in Cloud Spanner instances
Querying a database in a Cloud Spanner instance
Interleaving tables in Cloud Spanner
Summary
NoSQL Databases
NoSQL databases
Cloud Bigtable
Fundamental properties of Bigtable
Columnar datastore
Denormalization
Support for ACID properties
Working with Bigtable
When to use Bigtable
Solving hot-spotting
Choosing storage for Bigtable
Solving performance issues
Ideal row key choices
Performing operations on Bigtable
Creating and operating an HBase table using Cloud Bigtable
Exporting/Importing a table from Cloud Bigtable
Scaling GCP Cloud Bigtable
The Google Cloud Datastore
Comparison with traditional databases
Working with Datastore
When to use Datastore
Full indexing and perfect index
Using Datastore
Summary
BigQuery
Underlying data representation of BigQuery
BigQuery public datasets
Legacy versus standard SQL
Working with the BigQuery console
Loading data into a table using BigQuery
Deleting datasets
Working with BigQuery using CLI
BigQuery pricing
Analyzing financial time series with BigQuery
Summary
Identity and Access Management
Resource hierarchy of GCP
Permissions and roles
Units of identity in GCP
Creating a Service Account
Working with cloud IAM – grant a role
Working with IAM – creating a custom role
Summary
Managing Hadoop with Dataproc
Hadoop and Spark
Hadoop on the cloud
Google Cloud Dataproc
Compute options for Dataproc
Working with Dataproc
Summary
Load Balancing
Why load balancers matter now
Taxonomy of GCP load balancers
HTTP(S) load balancing
Configuring HTTP(S) load balancing
Configuring Internal Load Balancing
Other load balancing
Summary
Networking in GCP
Why GCP's networking model is unique
VPC networks and subnets
The default VPC
Internal and external IP addresses
VPN and cloud router
Working with VPCs
Working with custom subnets
Working with firewall rules
Summary
Logging and Monitoring
Logging
Working with logs
More Stackdriver – creating log-based metrics
Monitoring
Summary
Infrastructure Automation
Managed Instance Groups
Cloud deployment manager
Summary
Security on the GCP
Security features at Google and on the GCP
Google-provided tools and options for security
Some security best practices
BeyondCorp – Identity-Aware Proxy
Summary
Pricing Considerations
Compute Engine
Bigtable
BigQuery
Datastore
Cloud SQL
Google Kubernetes Engine
Pub/Sub
Cloud ML Engine
Stackdriver
Video Intelligence API
Key Management Service – KMS
Vision API
Summary
Effective Use of the GCP
Eat the Kubernetes frog
Careful that you don't get nickel-and-dimed
Pay for what you allocate, not what you use
Make friends with the gsuite admins
Try to find reasons to use network peering
Understand how sustained use discounts work
Read the fine print on GCS pricing
Use BigQuery unless you have a specific reason not to
Use pre-emptible instances in your Dataproc clusters
Keep your Dataproc clusters stateless
Understand the unified architecture for batch and stream
Understand the main choices for ML applications
Understand the differences between snapshots and images
Don't be Milton!
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
The Google Cloud Platform is fast emerging as a leading public cloud provider. The GCP, as it is popularly known, is backed by Google’s awe-inspiring engineering expertise and infrastructure, and is able to draw upon the goodwill and respect that Google has come to enjoy. The GCP is one of a handful of public cloud providers to offer the full range of cloud computing services, ranging from Infrastructure as a Service (IaaS) to Platform as a Service (PaaS). There is another reason why the GCP is fast gaining popularity; genre-defining technologies such as TensorFlow and Kubernetes originated at Google before being open-sourced, and the GCP is a natural choice of cloud on which to run them. If you are a cloud professional today, time spent on mastering the GCP is likely to be an excellent investment.
Using a public cloud platform was considered risky a decade ago and unconventional even just a few years ago. Today, however, the use of the public cloud is completely mainstream—the norm, rather than the exception. Several leading technology firms, including Google, have built sophisticated cloud platforms, and they are locked in a fierce competition for market share.
The main goal of this book is to enable you to get the best out of the GCP and to use it with confidence and competence. You will learn why cloud architectures take the forms that they do, and this will help you to become a skilled, high-level cloud architect. You will also learn how individual cloud services are configured and used, so that you are never intimidated when you have to build things yourself. You will also learn the right way to use the important GCP services, and the right situations in which to use them.
By the end of this book, you will be able to make the most out of Google Cloud Platform design.
If you are a Cloud architect who is responsible for designing and managing robust cloud solutions with Google Cloud Platform, then this book is for you. System engineers and Enterprise architects will also find this book useful. A basic understanding of distributed applications would be helpful, although not strictly necessary. Some working experience on other public cloud platforms would help too.
Chapter 1, The Case for Cloud Computing, starts with a brief history of cloud computing. Furthermore, the chapter delves into autohealing and autoscaling.
Chapter 2, Introduction to Google Cloud Platform, gets you into the nitty-gritty of the Google Cloud Platform, describing the diversity and versatility of the platform in terms of the resources available to us.
Chapter 3, Compute Choices – VMs and the Google Compute Engine, explores GCE, which serves as the IaaS offering of the GCP. You will learn to create GCE VMs and about their various aspects, such as disk types and machine types.
Chapter 4, GKE, App Engine, and Cloud Functions, discusses the four compute options on the GCP, ranging from IaaS through PaaS.
Chapter 5, Google Cloud Storage – Fishing in a Bucket, gets you familiar with GCS and gives you an idea of where it would fit within your overall infrastructure.
Chapter 6, Relational Databases, introduces you to RDBMSes and SQL. We then dive deep into Cloud SQL and Cloud Spanner, the relational database services available on the GCP.
Chapter 7, NoSQL Databases, takes you through Bigtable and Datastore. This chapter explains how Bigtable is suited to very large datasets, whereas Datastore is meant for far smaller ones.
Chapter 8, BigQuery, teaches you about the architecture of BigQuery, Google's fully managed, petabyte-scale, serverless data warehouse.
Chapter 9, Identity and Access Management, dives into how IAM lets you control access to all of the GCP resources in terms of roles and permissions.
Chapter 10, Managing Hadoop with Dataproc, helps you to understand Dataproc as a managed and cost-effective solution for Apache Spark and Hadoop workloads.
Chapter 11, Load Balancing, takes you through HTTP, TCP, and network load balancing with reference to its concepts and implementation.
Chapter 12, Networking in GCP, teaches you about Virtual Private Cloud Networks of GCP and their infrastructure and how to create and manage our own VPC networks.
Chapter 13, Logging and Monitoring, discusses how Stackdriver offers logging and monitoring of GCP resources free of charge up to a certain quota, as well as monitoring of both GCP and AWS resources for premium account holders.
Chapter 14, Infrastructure Automation, delves into the idea of how provisioning resources can be done programmatically, using templates, commands, and even code.
Chapter 15, Security on the GCP, covers how Google has planned for security on the GCP, along with the security tools, options, and best practices available to you.
Chapter 16, Pricing Considerations, helps you avoid sticker shock and sudden, unpleasant surprises regarding the pricing of the services that you use.
Chapter 17, Effective Use of the GCP, distills the GCP features and offerings that you learned about in the previous chapters into practical advice on using the platform effectively, so that we conclude our journey on a satisfactory note.
First, go breadth-first. Read each chapter rapidly, paying particular attention to the early bits and to the rhymes. They summarize the key points.
Don’t forget to laugh while reading the rhymes! Seriously, pay attention to each line in the rhymes as they are particularly packed with information.
After you finish going through the entire book quickly, come back to the chapters that relate to your specific use cases and go through them in detail.
For the drills in the book, understand what each step is trying to accomplish, and then try it out on your own. In particular, also search for online updates for your most important use cases—the world of cloud computing and the GCP is changing incredibly fast.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "A public dataset named samples.natality is queried."
A block of code is set as follows:
#standardSQL
SELECT
  weight_pounds, state, year, gestation_weeks
FROM `bigquery-public-data.samples.natality`
ORDER BY weight_pounds DESC
LIMIT 10;
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
#standardSQL
SELECT
  weight_pounds, state, year, gestation_weeks
FROM `bigquery-public-data.samples.natality`
ORDER BY weight_pounds DESC
LIMIT 10;
Any command-line input or output is written as follows:
curl -f -O http://repo1.maven.org/maven2/com/google/cloud/bigtable/bigtable-beam-import/1.1.2/bigtable-beam-import-1.1.2-shaded.jar
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "To upload the datafile, click on the Choose file button."
Feedback from our readers is always welcome.
General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packtpub.com.
Cloud computing is a pretty big deal in the world of technology, and it is also a pretty big deal for those who are not quite in technology. Some developments, for instance, the rise of Java and object-oriented programming, were momentous changes for people who were completely into technology at the time, but it was rare for a non-technical person to have to wake up in the morning, read the newspaper, and ask themselves, Wow, this Java thing is getting pretty big, will this affect my career? Cloud computing, perhaps like machine learning or Artificial Intelligence (AI), is different; there is a real chance that it, by itself, will affect the lives of people far beyond the world of technology. Let's understand why.
You will learn the following topics in this chapter:
A brief history of cloud computing
Autohealing and autoscaling—good technical reasons for moving to the cloud
Some good financial reasons for moving to the cloud
Possible implications of cloud computing on your career
In the beginning, Jeff Bezos created Amazon.com and took the company to a successful Initial Public Offering (IPO) by 1997. Everyone knows Amazon.com, of course, and it has become a force of nature, dominating online retail and diversifying into several other fields. However, in the early 2000s, after the Dotcom bubble burst, the company's future was not quite as certain as it is now. Even so, one of the many things that Amazon was doing right even then was architecting its internal computer systems in a truly robust and scalable way.
Amazon had a lot of users and a lot of traffic, and in order to service that traffic, the company really had to think deeply about how to build scalable, cost-effective compute capacity. Now you could argue rightly that other companies had to think about the same issues too. Google also had a lot of users and a lot of traffic, and it had to think really carefully about how to handle it. Even so, most observers agree that a couple of important differences existed between the two giants. For one, Google's business was (and is) fundamentally a far more profitable one, which means that Google could always afford to overinvest in compute, secure in the knowledge that its money printing press in the ad business would cover the costs. For another, Google's primary technical challenges came in processing and making sense of vast quantities of data (it was basically indexing the entire internet for Google Search). Amazon's primary technical challenges lay around making sure that the inherently spiky traffic of their hundreds of millions of users was serviced just right. The spiky nature of consumer traffic remains a huge consideration for any online retail firm. Just consider Alibaba, which did $25 billion in sales on Singles Day (11/11) in 2017.
Somewhere along the line, Amazon realized that it had created something really cool: a set of APIs and services, a platform in fact that external customers would be willing to pay for, and that would help Amazon monetize excess server capacity it had lying about. Let's not underestimate the magnitude of that achievement; plenty of companies have overinvested in servers and have extra capacity lying around, but virtually none of them have built a platform that other external customers are willing and able to use and to pay top dollar for.
So, in 2006, Amazon launched Elastic Compute Cloud (EC2), basically cloud Virtual Machine (VM) instances, and Simple Storage Service (S3), basically elastic object storage, which to this day are the bedrock of the AWS cloud offerings. Along the way, the other big firms with the money and technical know-how to offer such services jumped in as well. Microsoft launched Azure in 2010, and Google had actually gotten into the act even earlier, in 2008, with the launch of App Engine.
Notice how Amazon's first product offerings were basically Infrastructure as a Service (IaaS), whereas Google's initial offering was Platform as a Service (PaaS). That is a significant fact and, with the benefit of hindsight, a significant mistake on Google's part. If you were a large organization, circa 2010, contemplating moving to the cloud, you were unlikely to bet the house on moving to an untested cloud-specific platform such as App Engine. The path of least resistance for big early adopters was definitely the IaaS route. The first-mover advantage and the smart early focus on IaaS helped Amazon open up a huge lead in the cloud market, one which they still hold on to.
In recent years, however, a host of other cloud providers have crowded into the cloud space, notably Microsoft and, to a lesser extent, Google. That partially has to do with the economics of the cloud business; Amazon first broke out the financials of AWS separately in April 2015 and stunned the world with its size and profitability. Microsoft missed a few important big trends in computing, but after Satya Nadella replaced Steve Ballmer at the helm, he really made the cloud a company-wide priority in a way that mobile, search, and hardware never were. The results are obvious, and if you are a Microsoft shareholder, very gratifying. Microsoft is probably the momentum player in the cloud market right now; many smart observers have realized that Azure is challenging AWS despite the still-significant differences between their market shares.
Okay, you say, all fine and good: if AWS is the market leader, and Azure is the momentum player, then why exactly are we reading and writing a book about the Google Cloud Platform? That's an excellent question; in a nutshell, our considered view is that the GCP is a great technology to jump into right now for a few, very rational reasons, as follows:
Demand-supply: There is a ton of demand for AWS and Azure professionals, but there is also a ton of supply. In contrast, there is growing demand for the GCP, but not yet all that much supply of highly skilled GCP professionals. Careers are made by smart bets on technologies like this one.
PaaS versus IaaS: Notice how we called out Amazon for being smart in focusing on IaaS early on. That made a lot of sense when cloud computing was new and untested. Now, however, everyone trusts the cloud; that model works, and people know it. This means that folks are now ready to give up control in return for great features. PaaS is attractive now, and GCP's PaaS offerings are very competitive.
Kubernetes for hybrid, multi-cloud architectures: You may or may not have heard about this, but Amazon acquired a US-based grocery chain, Whole Foods, some time ago. It gave many current and potential AWS consumers pause for thought: what if Amazon buys up a company in my sector and starts competing with me? As a result, more organizations are likely to want a hybrid, multi-cloud architecture rather than to tie themselves to any one cloud provider. The term hybrid implies that both on-premise data centers and public cloud resources are used, and multi-cloud refers to the fact that more than one cloud provider is in the game. Now, if the world does go the hybrid, multi-cloud way, one clear winner is likely to be a container orchestration technology named Kubernetes. If that does happen, GCP is likely to be a big beneficiary. Kubernetes was developed at Google before being open-sourced, and the GCP offers great support for Kubernetes.
The technical rationale for moving to the cloud can often be summed up in two words—autoscaling and autohealing.
Autoscaling: The idea of autoscaling is simple enough, although the implementations can get quite involved: apps are deployed on compute, and the amount of compute capacity increases or decreases depending on the level of incoming client requests. In a nutshell, all the public cloud providers have services that make autoscaling and autohealing easily available. Autoscaling, in particular, is a huge deal. Imagine a large Hadoop cluster, with, say, 1,000 nodes. Try scaling that; it would probably be a matter of weeks or even months. You'd need to get and configure the machines, reshard the data, and jump through a trillion hoops. With a cloud provider, you'd simply use an elastic version of Hadoop, such as Dataproc on the GCP or Elastic MapReduce (EMR) on AWS, and you'd be in business in minutes. This is not some marketing or sales spiel; the speed of scaling up and down on the cloud is just insane.
Here’s a little rhyme to help you remember the main point of our conversation here—we’ll keep using them throughout the remainder of the book just to mix things up a bit. Oh, and they might sometimes introduce a few new terms or ideas that will be covered at length in the following sections, so don’t let any forward references bother you just yet!
Autohealing: The idea of autohealing is just as important as that of autoscaling, but it is less explicitly understood. Let's say that we deploy an app, which could be a Java JAR, Python package, or Docker container, to a set of compute resources, which again could be cloud VMs, App Engine backends, or pods in a Kubernetes cluster. Those compute resources will have problems from time to time; they will crash, hang, run out of memory, throw exceptions, and misbehave in all kinds of unpredictable ways. If we did nothing about these problems, those compute resources would effectively be out of action, and our total compute capacity would fall and, sooner or later, become insufficient to meet client requests. So, clearly, we need to somehow detect whether our compute resources have gotten sick, and then heal them. In the pre-cloud days, this would have been pretty manual; some poor sap of an engineer would have had to nurse a bare-metal machine or VM back to health. Now, with cloud-based abstractions, individual compute units are much more expendable. We can just take them down and replace them with new ones. Because these units of compute capacity are interchangeable (or fungible, a fancier word that means the same thing), autohealing is now possible.
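To make this concrete, here is a minimal command-line sketch of how autoscaling and autohealing might be switched on for a group of GCE VMs. The template, group, and health-check names, as well as the thresholds, are hypothetical placeholders rather than recommendations; managed instance groups and health checks are covered properly later in the book:
# Create an HTTP health check; VMs that repeatedly fail it are considered unhealthy.
gcloud compute health-checks create http web-health-check \
    --port=80 --check-interval=30s --unhealthy-threshold=3
# Create a managed instance group from an existing instance template.
# Autohealing: unhealthy VMs are automatically recreated after the initial delay.
gcloud compute instance-groups managed create web-mig \
    --zone=us-central1-a --template=web-template --size=2 \
    --health-check=web-health-check --initial-delay=300s
# Autoscaling: let the group grow and shrink with average CPU utilization.
gcloud compute instance-groups managed set-autoscaling web-mig \
    --zone=us-central1-a --min-num-replicas=2 --max-num-replicas=10 \
    --target-cpu-utilization=0.6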
The financial considerations for moving to the cloud are as important as the technical ones, and it is important for architects and technical cloud folks to understand these, so that we don't sound dumb when discussing them in cocktail-party conversations with the business guys.
CAPEX refers to a large upfront spend of money used to get an asset (an asset is a resource that will yield benefits over time, not just in the current period)
OPEX refers to smaller, recurring spends of money for current-period benefit
That contrast makes clear the big difference between using the cloud and going on-premises: if you use the cloud, you don't need to make the big upfront payment. That, in turn, has several associated advantages:
No depreciation: When you buy a house, hopefully, it appreciates, but when you buy a car, it loses a fourth of its value as soon as you drive it out of the dealer's parking lot. Depreciation is called an intangible expense, but try selling a barely used car, and you will find it tangible enough. With the cloud, you don't worry about depreciation; the cloud provider does.
Transparency: Let's face it, folks: big contracts are complicated and have been known to have big payoffs for the people concerned. This is a reality of doing business: procurement and sourcing are messy businesses for a reason. Using the cloud is far more transparent and simpler for internal processes and audits.
Flexibility in capacity planning: The worlds of business and history are littered with unfulfilled ambitions and unrealistic plans and targets that went nowhere. Your firm might have a target to triple revenues in 3 years: such ambitious plans are common enough. If you, as an architect, sign off on a commensurate tripling of your server capacity and that business growth does not happen, the finance guys will come asking you why you overspent, if you are still in the firm 3 years down the line. At that point, you will likely not have the luxury of pointing a finger at the CEO who launched the 3X plan. He will likely have moved on adroitly to another shiny plan.
Note, incidentally, that we did not include straight cost as a reason to move to the cloud. The big cloud providers are all rolling in money; their stock prices are surging on the heady cocktail of high revenue growth and high profitability. The cloud business is a great business to be in, if you can get in. It stands to reason that if the suppliers of cloud services are doing so well financially, the consumers of cloud services are paying for it. So, cloud services are not necessarily cheap; they do, however, offer all of these other attractions, which make them a real win-win for both sides of the bargain.
Our considered opinion is that the move to the cloud is going to affect a lot of folks more than they expect. In particular, employees at a host of IT services companies and system integrators will need to retool fast. Now that's not to say that these companies are clear losers, because the cloud services are pretty complex too and will provide lots of room for several different ecosystems. Some things that used to be hard will now be easy, and some things that used to be easy will now be hard. Workforces will need to be retrained, and the expectations of career trajectories will need to be changed. So, if you are new to the cloud world, here are three topics you might want to spend time really understanding—these are now a lot more important than they used to be:
Containers, Docker, and Kubernetes
Load balancers
Infrastructure-as-code technologies such as Terraform or Google Cloud Deployment Manager
On the other hand, folks who are in the following teams probably need to think long and hard about how to get with today's (and tomorrow's) hot technologies because the cloud is radically simplifying what they currently work on:
Virtual machine and IaaS sysadmins
Physical networking, router, and VPN engineers
Hadoop administrators
You learned about the rise of public cloud computing and how GCP, AWS, and Azure came to where they currently are in the market. We also examined some important technical reasons for switching to the cloud. We looked at some good and bad financial implications of moving to the cloud. Finally, we pointed out some technologies that stand to gain from the rise of the cloud and some others that stand to recede in relevance.
Now let's get into the nitty-gritty of the Google Cloud Platform. Any cloud platform really is all about resources; it allows you to use resources for a fee. What's cool about cloud platforms is the great diversity and versatility in terms of the resources available to us. These might include hardware such as virtual machine instances or persistent disks, services such as BigQuery or Bigtable, or even more complex software such as Cloud ML Engine and the various APIs for machine learning. But in addition to the hardware and software, there is a lot of supporting detail: networking, load balancing, logging, monitoring, and so on. The GCP, like other major cloud platforms, provides a great variety of services; take load balancing, for instance: GCP load balancing options span four different layers (that is, the data link, network, transport, and application layers) of the OSI networking stack.
You will learn the following topics in this chapter:
The difference between regions and zones
The organization of resources in a GCP project
Accessing cloud resources using Cloud Shell and the web console
Now, of course, there is no free lunch in life and you have to pay for (almost) all of this, and the payment models are going to differ. For instance, with persistent disks, you pay for the capacity that you allocate, whereas with cloud storage buckets, you pay for the capacity that you actually use. However, the basic idea is that there are resources that will be billed. All billable resources are grouped into entities named projects.
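As a small illustration of that allocation-versus-usage distinction (the disk and bucket names below are made up for the example), a persistent disk is created at a fixed size that you are billed for whether or not it is full, whereas a Cloud Storage bucket is billed for what it actually holds:
# A 200 GB persistent disk is billed for the full 200 GB, even if it holds only a little data.
gcloud compute disks create my-data-disk --size=200GB --zone=us-central1-a
# A Cloud Storage bucket is billed for what is actually stored; this shows current usage.
gsutil du -sh gs://my-example-bucket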
Let's now look at how resources are structured. The way the Google Cloud Platform operates, all resources are scoped as one of the following:
Global
Regional
Zonal
Now you might think that this geographical location of resources is an implementation detail that you shouldn't have to worry about, but that's only partially true. The scoping actually also determines how you architect your cloud applications and infrastructure.
Regions are geographical regions at the level of a subcontinent—the central US, western Europe, or east Asia. Zones are basically data centers within regions. This mapping between a zone and a data center is loose, and it's not really explicit anywhere, but that really is what a zone is.
These distinctions matter to us as end users because regional and zonal resources are often billed and treated differently by the platform. You will pay more or less depending on the choices you make regarding these levels of scope. The reason you pay more or less is that there are some implicit promises made about performance within regions.
For instance, the Cloud docs tell us that zones within the same region will typically have network latencies of less than 5 milliseconds. What does typical mean? Well, here it is the 95th percentile latency; that is, 95% of all network traffic within a region will have a latency of less than 5 ms. That's a fancy way of saying that within a region, network speeds will be very high, whereas across regions, those speeds will be slower.
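If you would like to see this layout for yourself, the catalog of regions and zones can be queried directly with the gcloud tool; the region name below is just an example, and we will set up the SDK and the Cloud Shell shortly:
# List all available regions.
gcloud compute regions list
# List the zones that belong to one particular region.
gcloud compute zones list --filter="region:us-central1"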
