The ultimate guide to managing, building, and deploying large-scale clusters with Apache Mesos
The book aims to serve DevOps engineers and system administrators who are familiar with the basics of managing a Linux system and its tools.
Apache Mesos is open source cluster management software that provides efficient resource isolation and sharing across distributed applications, or frameworks.
This book will take you on a journey to enhance your knowledge from amateur to master level, showing you how to improve the efficiency, management, and development of Mesos clusters. The architecture is quite complex and this book will explore the difficulties and complexities of working with Mesos.
We begin by introducing Mesos, explaining its architecture and functionality. Next, we provide a comprehensive overview of Mesos features and advanced topics such as high availability, fault tolerance, scaling, and efficiency. Furthermore, you will learn to set up multi-node Mesos clusters on private and public clouds.
We will also introduce several Mesos-based scheduling and management frameworks or applications to enable the easy deployment, discovery, load balancing, and failure handling of long-running services. Next, you will find out how a Mesos cluster can be easily set up and monitored using the standard deployment and configuration management tools.
This advanced guide will show you how to deploy important big data processing frameworks such as Hadoop, Spark, and Storm on Mesos and big data storage frameworks such as Cassandra, Elasticsearch, and Kafka.
This advanced guide provides a detailed step-by-step account of deploying a Mesos cluster. It will demystify the concepts behind Mesos.
Page count: 346
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: May 2016
Production reference: 1200516
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-624-9
www.packtpub.com
Authors
Dipa Dubhashi
Akhil Das
Reviewer
Naveen Molleti
Commissioning Editor
Akram Hussain
Acquisition Editor
Sonali Vernekar
Content Development Editor
Onkar Wani
Technical Editor
Hussain Kanchwala
Copy Editor
Shruti Iyer
Project Coordinator
Bijal Patel
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Kirk D'Penha
Production Coordinator
Aparna Bhagat
Cover Work
Aparna Bhagat
Dipa Dubhashi is an alumnus of the prestigious Indian Institute of Technology and heads product management at Sigmoid. His prior experience includes consulting with ZS Associates besides founding his own start-up. Dipa specializes in envisioning enterprise big data products, developing their roadmaps, and managing their development to solve customer use cases across multiple industries. He advises several leading start-ups as well as Fortune 500 companies about architecting and implementing their next-generation big data solutions. Dipa has also developed a course on Apache Spark for a leading online education portal and is a regular speaker at big data meetups and conferences.
Akhil Das is a senior software developer at Sigmoid primarily focusing on distributed computing, real-time analytics, performance optimization, and application scaling problems using a wide variety of technologies such as Apache Spark and Mesos, among others. He contributes actively to the Apache Spark project and is a regular speaker at big data conferences and meetups, MesosCon 2015 being the most recent one.
We would like to thank several people that helped make this book a reality: Revati Dubhashi, without whose driving force this book would not have seen the light of day; Chithra, for her constant encouragement and support; and finally, Mayur Rustagi, Naveen Molleti, and the entire Sigmoid family for their invaluable guidance and technical input.
Naveen Molleti works at Sigmoid as a technology lead, heading product architecture and scalability. Although he graduated in computer science from IIT Kharagpur in 2011, he has worked for about a decade developing software on various OSes and platforms in a variety of programming languages. He enjoys exploring technologies and platforms and developing systems software and infrastructure.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. It improves resource utilization, simplifies system administration, and supports a wide variety of distributed applications that can be effortlessly deployed leveraging its pluggable architecture.
This book will provide a detailed, step-by-step guide to deploying a Mesos cluster using the standard DevOps tools, to porting frameworks to Mesos effectively, and, in general, to demystifying the concepts behind Mesos.
The book will first establish the raison d'être of Mesos and explain its architecture in an effective manner. From there, it will walk the reader through the complex world of Mesos, moving progressively from simple single-machine setups to highly complex multi-node cluster setups, with new concepts introduced logically along the way. At the end of the journey, the reader will be armed with all the resources he/she requires to effectively manage the complexities of the modern datacenter.
Chapter 1, Introducing Mesos, introduces Mesos, dives deep into its architecture, and introduces some important topics, such as frameworks, resource allocation, and resource isolation. It also discusses the two-level scheduling approach that Mesos employs, provides a detailed overview of its API, and provides a few examples of how Mesos is used in production.
Chapter 2, Mesos Internals, provides a comprehensive overview of Mesos' features and walks the reader through several important topics regarding high availability, fault tolerance, scaling, and efficiency, such as resource allocation, resource reservation, and recovery, among others.
Chapter 3, Getting Started with Mesos, covers how to manually set up and run a Mesos cluster on the public cloud (AWS, GCE, and Azure) as well as in a private datacenter (on premise). It also discusses the various debugging methods and explores how to troubleshoot the Mesos setup in detail.
Chapter 4, Service Scheduling and Management Frameworks, introduces several Mesos-based scheduling and management frameworks or applications that are required for the easy deployment, discovery, load balancing, and failure handling of long-running services.
Chapter 5, Mesos Cluster Deployment, explains how a Mesos cluster can be easily set up and monitored using the standard deployment and configuration management tools used by system administrators and DevOps engineers. It also discusses some of the common problems faced while deploying a Mesos cluster along with their corresponding resolutions.
Chapter 6, Mesos Frameworks, walks the reader through the concept and features of Mesos frameworks in detail. It also provides a detailed overview of the Mesos API, including the new HTTP Scheduler API, and provides a recipe to build custom frameworks on Mesos.
Chapter 7, Mesos Containerizers, introduces the concepts of containers and talks a bit about Docker, probably the most popular container technology available today. It also provides a detailed overview of the different "containerizer" options in Mesos, besides introducing some other topics such as networking for Mesos-managed containers and the fetcher cache. Finally, an example of deploying containerized apps in Mesos is provided for better understanding.
Chapter 8, Mesos Big Data Frameworks, acts as a guide to deploying important big data processing frameworks such as Hadoop, Spark, Storm, and Samza on top of Mesos.
Chapter 9, Mesos Big Data Frameworks 2, guides the reader through deploying important big data storage frameworks such as Cassandra, the Elasticsearch-Logstash-Kibana (ELK) stack, and Kafka on top of Mesos.
To get the most out of this book, you need to have a basic understanding of Mesos and cluster management along with familiarity with Linux. You will also need access to cloud services such as AWS, GCE, and Azure, preferably with machines running Ubuntu or CentOS with 15 GB RAM and four cores.
The book aims to serve DevOps engineers and system administrators who are familiar with the basics of managing a Linux system and its tools.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "For the sake of simplicity, we will simply run the sleep command."
A block of code is set off from the body text. When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold. Any command-line input or output is also set off from the body text.
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Now press the ADD button to add a specific port."
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files from the website by following the steps provided there. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of your preferred archive tool.
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Mesos. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Apache Mesos is open source, distributed cluster management software that came out of AMPLab, UC Berkeley in 2011. It abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be easily built and run effectively. It is referred to as a metascheduler (a scheduler of schedulers) and a "distributed systems kernel/distributed datacenter OS".
It improves resource utilization, simplifies system administration, and supports a wide variety of distributed applications that can be deployed by leveraging its pluggable architecture. It is scalable and efficient and provides a host of features, such as resource isolation and high availability, which, along with a strong and vibrant open source community, makes this one of the most exciting projects.
We will cover the following topics in this chapter: the motivation for a datacenter OS, an introduction to Mesos and its architecture, frameworks, resource allocation and isolation, the two-level scheduling approach, an overview of the Mesos API, and examples of how Mesos is used in production.
Over the past decade, datacenters have graduated from packing multiple applications into a single server box to having large datacenters that aggregate thousands of servers to serve as a massively distributed computing infrastructure. With the advent of virtualization, microservices, cluster computing, and hyperscale infrastructure, the need of the hour is the creation of an application-centric enterprise that follows a software-defined datacenter strategy.
Currently, server clusters are predominantly managed individually, which can be likened to having multiple operating systems on the PC, one each for processor, disk drive, and so on. With an abstraction model that treats these machines as individual entities being managed in isolation, the ability of the datacenter to effectively build and run distributed applications is greatly reduced.
Another way of looking at the situation is comparing running applications in a datacenter to running them on a laptop. One major difference is that when launching a text editor or web browser, we are not required to check which memory modules are free and choose the ones that suit our needs. Herein lies the significance of a platform that acts like a host operating system and allows multiple users to run multiple applications simultaneously by utilizing a shared set of resources.
Datacenters now run varied distributed application workloads, such as Spark, Hadoop, and so on, and need the capability to intelligently match resources and applications. The datacenter ecosystem today has to be equipped to manage and monitor resources and efficiently distribute workloads across a unified pool of resources with the agility and ease to cater to a diverse user base (noninfrastructure teams included). A datacenter OS brings to the table a comprehensive and sustainable approach to resource management and monitoring. This not only reduces the cost of ownership but also allows a flexible handling of resource requirements in a manner that isolated datacenter infrastructure cannot support.
The idea behind a datacenter OS is that of intelligent software that sits above all the hardware in a datacenter and ensures efficient and dynamic resource sharing. Added to this is the capability to constantly monitor resource usage and improve workload and infrastructure management in a seamless way that is not tied to specific application requirements. In its absence, we have a scenario with silos in a datacenter that force developers to build software catering to machine-specific characteristics and make the moving and resizing of applications a highly cumbersome procedure.
The datacenter OS acts as a software layer that aggregates all servers in a datacenter into one giant supercomputer to deliver the benefits of multitenancy, isolation, and resource control across all microservice applications. Another major advantage is the elimination of human-induced error during the continual assigning and reassigning of virtual resources.
From a developer's perspective, this will allow them to easily and safely build distributed applications without restricting them to a bunch of specialized tools, each catering to a specific set of requirements. For instance, let's consider the case of Data Science teams who develop analytic applications that are highly resource intensive. An operating system that can simplify how the resources are accessed, shared, and distributed successfully alleviates their concern about reallocating hardware every time the workloads change.
Of key importance is the relevance of the datacenter OS to DevOps, primarily a software development approach that emphasizes automation, integration, collaboration, and communication between traditional software developers and other IT professionals. With a datacenter OS that effectively transforms individual servers into a pool of resources, DevOps teams can focus on accelerating development and not continuously worry about infrastructure issues.
In a world where distributed computing becomes the norm, the datacenter OS is a boon. With freedom from manually configuring and maintaining individual machines and applications, system engineers need not configure specific machines for specific applications, as all applications would be capable of running on any available resources from any machine, even if other applications are already running on them. Using a datacenter OS results in centralized control and smart utilization of resources that eliminate hardware and software silos to ensure greater accessibility and usability even for noninfrastructure professionals.
One example of an organization administering its hyperscale datacenters via a datacenter OS is Google, with its Borg (and next-generation Omega) systems. The merits of the datacenter OS are undeniable, with benefits ranging from the scalability of computing resources and flexibility to support data sharing across applications to saving team effort, time, and money while launching and managing interoperable cluster applications.
It is this vision of transforming the datacenter into a single supercomputer that Apache Mesos seeks to achieve. Born out of a Berkeley AMPLab research paper in 2011, it has since come a long way with a number of leading companies, such as Apple, Twitter, Netflix, and AirBnB among others, using it in production. Mesosphere is a start-up that is developing a distributed OS product with Mesos at its core.
Mesos is an open-source platform for sharing clusters of commodity servers between different distributed applications (or frameworks), such as Hadoop, Spark, and Kafka among others. The idea is to act as a centralized cluster manager by pooling together all the physical resources of the cluster and making it available as a single reservoir of highly available resources for all the different frameworks to utilize. For example, if an organization has one 10-node cluster (16 CPUs and 64 GB RAM) and another 5-node cluster (4 CPUs and 16 GB RAM), then Mesos can be leveraged to pool them into one virtual cluster of 720 GB RAM and 180 CPUs, where multiple distributed applications can be run. Sharing resources in this fashion greatly improves cluster utilization and eliminates the need for an expensive data replication process per-framework.
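As a quick sanity check on that arithmetic, the following few lines of Python (with the example cluster sizes hard-coded purely for illustration) reproduce the pooled totals:

# Per-cluster specs from the example: (nodes, CPUs per node, GB of RAM per node)
clusters = [(10, 16, 64), (5, 4, 16)]

total_cpus = sum(nodes * cpus for nodes, cpus, _ in clusters)
total_ram_gb = sum(nodes * ram for nodes, _, ram in clusters)

print(total_cpus, "CPUs")      # 180 CPUs
print(total_ram_gb, "GB RAM")  # 720 GB RAM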
Some of the important features of Mesos are scalability, fault tolerance, high availability, and efficient resource isolation and sharing across frameworks.
Mesos is based on the same principles as the Linux kernel and aims to provide a highly available, scalable, and fault-tolerant base for enabling various frameworks to share cluster resources effectively and in isolation. Distributed applications are varied and continuously evolving, a fact that leads Mesos' design philosophy towards a thin interface that allows efficient resource allocation between different frameworks and delegates the task of scheduling and job execution to the frameworks themselves. The two advantages of doing so are that frameworks can implement diverse approaches to scheduling and fault handling and evolve those solutions independently, and that the Mesos core itself stays simple, making it easier to keep scalable and robust.
Mesos' architecture hands over the responsibility of scheduling tasks to the respective frameworks by employing a resource offer abstraction that packages a set of resources and makes offers to each framework. The Mesos master node decides the quantity of resources to offer each framework, while each framework decides which resource offers to accept and which tasks to execute on these accepted resources. This method of resource allocation is shown to achieve a good degree of data locality for each framework sharing the same cluster.
An alternative architecture would implement a global scheduler that took framework requirements, organizational priorities, and resource availability as inputs and provided a task schedule breakdown by framework and resource as output, essentially acting as a matchmaker for jobs and resources with priorities acting as constraints. The challenges with this architecture, such as developing a robust API that could capture all the varied requirements of different frameworks, anticipating new frameworks, and solving a complex scheduling problem for millions of jobs, made the former approach a much more attractive option for the creators.
A Mesos framework sits between Mesos and the application and acts as a layer to manage task scheduling and execution. As its implementation is application-specific, the term is often used to refer to the application itself. Earlier, a Mesos framework could interact with the Mesos API only through the libmesos C++ library, so language bindings that leveraged libmesos heavily were developed for Java, Scala, Python, and Go, among others. Since v0.19.0, changes made to the HTTP-based protocol have enabled developers to write frameworks in the language of their choice without having to rely on the C++ code. A framework consists of two components: a scheduler and an executor.
The scheduler is responsible for deciding which of the resource offers made to it to act upon and for tracking the current state of the cluster. Communication with the Mesos master is handled by the SchedulerDriver module, which registers the framework with the master, launches tasks, and passes messages to other components.
The second component, the executor, is responsible, as its name suggests, for executing tasks on the slave nodes. Communication with the slaves is handled by the ExecutorDriver module, which is also responsible for sending status updates to the scheduler.
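To make these two components concrete, here is a minimal scheduler-side sketch, assuming the classic Python bindings (the mesos.interface and mesos.native packages) are installed; the framework name and the master address passed on the command line are placeholders, and offer handling is deferred to the two-level scheduling discussion later in this chapter:

import sys

import mesos.interface
import mesos.native
from mesos.interface import mesos_pb2


class SleepScheduler(mesos.interface.Scheduler):
    # The base Scheduler class provides default no-op callbacks, so only the
    # ones we care about are overridden here.

    def registered(self, driver, frameworkId, masterInfo):
        # Invoked by the SchedulerDriver once the framework has registered
        # with the Mesos master.
        print("Registered with framework ID %s" % frameworkId.value)

    def statusUpdate(self, driver, update):
        # Task status updates flow back from the executors through the
        # slaves and the master to the scheduler.
        print("Task %s is in state %s" % (update.task_id.value, update.state))


if __name__ == "__main__":
    framework = mesos_pb2.FrameworkInfo()
    framework.user = ""  # Let Mesos fill in the current user.
    framework.name = "Sleep framework (illustrative)"

    # sys.argv[1] is the master address, for example <master-ip>:5050.
    driver = mesos.native.MesosSchedulerDriver(
        SleepScheduler(), framework, sys.argv[1])
    sys.exit(0 if driver.run() == mesos_pb2.DRIVER_STOPPED else 1)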
The Mesos API, discussed later in this chapter, allows programmers to develop their own custom frameworks that can run on top of Mesos. Some other features of frameworks, such as authentication, authorization, and user management, will be discussed at length in Chapter 6, Mesos Frameworks.
Many services and frameworks have been built on Mesos, including Hadoop, Spark, Storm, Samza, Cassandra, Elasticsearch, and Kafka, which are covered later in this book. The list is not exhaustive, and support for new frameworks is added almost every day; refer to http://mesos.apache.org/documentation/latest/frameworks/ for an up-to-date list.
Mesos describes the slave nodes present in the cluster by the following two methods:
Attributes are used to describe certain additional information about a slave node, such as its OS version or whether it has a particular type of hardware. They are expressed as key-value pairs, with support for three value types (scalar, range, and text), and are sent along with the offers to frameworks. For example, an attribute such as os:centos7 pairs the key os with the text value centos7.
Mesos can manage three different types of resources: scalars, ranges, and sets. These are used to represent the different resources that a Mesos slave has to offer. For example, a scalar resource type could be used to represent the amount of CPU on a slave. Each resource is identified by a key string consisting of the resource name and, optionally, the role it is reserved for, with * denoting the default role (for example, cpus(*):4).
Predefined uses and conventions
The Mesos master predefines how it handles certain standard resources, notably cpus, mem, disk, and ports.
In particular, a slave without the cpus and mem resources will never have its resources advertised to any frameworks. Also, the master's user interface interprets the scalars in mem and disk in terms of MB. For example, the value 15000 is displayed as 14.65 GB.
Here are some examples of configuring the Mesos slaves:
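The strings below follow the format accepted by the mesos-slave --resources and --attributes flags; the concrete values, the set name bugs, and the attribute names rack, os, and level are illustrative placeholders, and the small Python helper only splits the strings to make their key-value structure visible:

# Illustrative flag values; in practice they are passed on the command line as
#   mesos-slave --resources='...' --attributes='...'
resources = "cpus:24;mem:24576;disk:409600;ports:[21000-24000];bugs:{a,b,c}"
attributes = "rack:r1;os:centos7;level:10"


def split_pairs(spec):
    # Split a 'key:value;key:value' specification into (key, value) pairs.
    return [item.split(":", 1) for item in spec.split(";")]


for key, value in split_pairs(resources):
    print("resource  %-5s -> %s" % (key, value))

for key, value in split_pairs(attributes):
    print("attribute %-5s -> %s" % (key, value))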
In this case, we have three different types of resources: scalars, a range, and a set. The scalars are called cpus, mem, and disk; the range type is ports; and the remaining resource is a set of text values.
In the case of attributes, we will end up with three attributes, each sent to frameworks as a key-value pair along with resource offers.
Mesos has a two-level scheduling mechanism to allocate resources to and launch tasks on different frameworks. In the first level, the master process that manages slave processes running on each node in the Mesos cluster determines the free resources available on each node, groups them, and offers them to different frameworks based on organizational policies, such as priority or fair sharing. Organizations have the ability to define their own sharing policies via a custom allocation module as well.
In the second level, each framework's scheduler component that is registered as a client with the master accepts or rejects the resource offer made depending on the framework's requirements. If the offer is accepted, the framework's scheduler sends information regarding the tasks that need to be executed and the number of resources that each task requires to the Mesos master. The master transfers the tasks to the corresponding slaves, which assign the necessary resources to the framework's executor component, which manages the execution of all the required tasks in containers. When the tasks are completed, the containers are dismantled, and the resources are freed up for use by other tasks.
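Returning to the scheduler sketch from the frameworks section, the second level of this mechanism is what a framework implements in its resourceOffers callback. The following addition to that sketch (the 0.1 CPU and 32 MB per-task figures are arbitrary) accepts every offer by launching a short sleep task through the default command executor:

import uuid

import mesos.interface
from mesos.interface import mesos_pb2

TASK_CPUS = 0.1
TASK_MEM_MB = 32


class SleepScheduler(mesos.interface.Scheduler):
    # registered() and statusUpdate() as in the earlier skeleton ...

    def resourceOffers(self, driver, offers):
        # Second-level decision: inspect each offer made by the master and
        # decide what, if anything, to run on it.
        for offer in offers:
            task = mesos_pb2.TaskInfo()
            task.task_id.value = str(uuid.uuid4())
            task.slave_id.value = offer.slave_id.value
            task.name = "sleep task"
            task.command.value = "sleep 10"  # run via the default command executor

            cpus = task.resources.add()
            cpus.name = "cpus"
            cpus.type = mesos_pb2.Value.SCALAR
            cpus.scalar.value = TASK_CPUS

            mem = task.resources.add()
            mem.name = "mem"
            mem.type = mesos_pb2.Value.SCALAR
            mem.scalar.value = TASK_MEM_MB

            # Accepting the offer: the master forwards the task to the slave,
            # whose executor runs it in a container and releases the
            # resources once the task finishes.
            driver.launchTasks(offer.id, [task])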
The architecture diagram and accompanying explanation in the Apache Mesos documentation (http://mesos.apache.org/documentation/latest/architecture/) walk through this flow in more detail.
Mesos also provides frameworks with the ability to reject resource offers. A framework can reject the offers that do not meet its requirements. This allows frameworks to support a wide variety of complex resource constraints while keeping Mesos simple at the same time. A policy called delay scheduling, in which frameworks wait for a finite time to get access to the nodes storing their input data, gives a fair level of data locality albeit with a slight latency tradeoff.
If the framework constraints are complex, it is possible that a framework might need to wait before it receives a resource offer that meets its requirements. To tackle this, Mesos allows frameworks to set filters specifying criteria under which certain resources will always be rejected. A framework can set a filter stating that it can run only on nodes with at least 32 GB of RAM free, for example. This allows it to bypass the rejection process, minimize communication overheads, and thus reduce overall latency.
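One concrete way of expressing such a rule with the classic Python bindings is sketched below; the minimum-memory check itself is performed in the framework code, and the Filters message's refuse_seconds field asks the master not to re-offer the declined resources to this framework for the given period (the 32 GB threshold and ten-minute window are illustrative):

from mesos.interface import mesos_pb2

MIN_MEM_MB = 32 * 1024  # require at least 32 GB of memory, as in the example above


def offered_mem(offer):
    # Return the amount of memory (in MB) contained in a resource offer.
    for resource in offer.resources:
        if resource.name == "mem":
            return resource.scalar.value
    return 0


def handle_offer(driver, offer, tasks):
    # Accept offers that satisfy the memory constraint; decline the rest and
    # suppress further offers of those resources for ten minutes.
    if offered_mem(offer) >= MIN_MEM_MB:
        driver.launchTasks(offer.id, tasks)
    else:
        filters = mesos_pb2.Filters()
        filters.refuse_seconds = 600
        driver.declineOffer(offer.id, filters)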
