The ultimate guide to managing, building, and deploying large-scale clusters with Apache Mesos
The book aims to serve DevOps engineers and system administrators who are familiar with the basics of managing a Linux system and its tools.
Apache Mesos is open source cluster management software that provides efficient resource isolation and sharing across distributed applications, or frameworks.
This book will take you on a journey to enhance your knowledge from amateur to master level, showing you how to improve the efficiency, management, and development of Mesos clusters. The architecture is quite complex and this book will explore the difficulties and complexities of working with Mesos.
We begin by introducing Mesos, explaining its architecture and functionality. Next, we provide a comprehensive overview of Mesos features and advanced topics such as high availability, fault tolerance, scaling, and efficiency. Furthermore, you will learn to set up multi-node Mesos clusters on private and public clouds.
We will also introduce several Mesos-based scheduling and management frameworks or applications to enable the easy deployment, discovery, load balancing, and failure handling of long-running services. Next, you will find out how a Mesos cluster can be easily set up and monitored using the standard deployment and configuration management tools.
This advanced guide will show you how to deploy important big data processing frameworks such as Hadoop, Spark, and Storm on Mesos and big data storage frameworks such as Cassandra, Elasticsearch, and Kafka.
This advanced guide provides a detailed step-by-step account of deploying a Mesos cluster. It will demystify the concepts behind Mesos.
Page count: 346
Year of publication: 2016
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: May 2016
Production reference: 1200516
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-624-9
www.packtpub.com
Authors
Dipa Dubhashi
Akhil Das
Reviewer
Naveen Molleti
Commissioning Editor
Akram Hussain
Acquisition Editor
Sonali Vernekar
Content Development Editor
Onkar Wani
Technical Editor
Hussain Kanchwala
Copy Editor
Shruti Iyer
Project Coordinator
Bijal Patel
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Kirk D'Penha
Production Coordinator
Aparna Bhagat
Cover Work
Aparna Bhagat
Dipa Dubhashi is an alumnus of the prestigious Indian Institute of Technology and heads product management at Sigmoid. His prior experience includes consulting with ZS Associates besides founding his own start-up. Dipa specializes in envisioning enterprise big data products, developing their roadmaps, and managing their development to solve customer use cases across multiple industries. He advises several leading start-ups as well as Fortune 500 companies about architecting and implementing their next-generation big data solutions. Dipa has also developed a course on Apache Spark for a leading online education portal and is a regular speaker at big data meetups and conferences.
Akhil Das is a senior software developer at Sigmoid primarily focusing on distributed computing, real-time analytics, performance optimization, and application scaling problems using a wide variety of technologies such as Apache Spark and Mesos, among others. He contributes actively to the Apache Spark project and is a regular speaker at big data conferences and meetups, MesosCon 2015 being the most recent one.
We would like to thank several people that helped make this book a reality: Revati Dubhashi, without whose driving force this book would not have seen the light of day; Chithra, for her constant encouragement and support; and finally, Mayur Rustagi, Naveen Molleti, and the entire Sigmoid family for their invaluable guidance and technical input.
Naveen Molleti works at Sigmoid as a technology lead, heading product architecture and scalability. Although he graduated in computer science from IIT Kharagpur in 2011, he has worked for about a decade developing software on various OSes and platforms in a variety of programming languages. He enjoys exploring technologies and platforms and developing systems software and infrastructure.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. It improves resource utilization, simplifies system administration, and supports a wide variety of distributed applications that can be effortlessly deployed leveraging its pluggable architecture.
This book will provide a detailed, step-by-step guide to deploying a Mesos cluster using the standard DevOps tools, to porting frameworks to Mesos effectively, and, in general, to demystifying the concepts behind Mesos.
The book will first establish the raison d'être of Mesos and explain its architecture in an effective manner. From there, it will walk the reader through the complex world of Mesos, moving progressively from simple single-machine setups to highly complex multi-node cluster setups, with new concepts introduced logically along the way. At the end of the journey, the reader will be armed with all the resources he/she requires to effectively manage the complexities of the modern datacenter.
Chapter 1, Introducing Mesos, introduces Mesos, dives deep into its architecture, and introduces some important topics, such as frameworks, resource allocation, and resource isolation. It also discusses the two-level scheduling approach that Mesos employs, provides a detailed overview of its API, and provides a few examples of how Mesos is used in production.
Chapter 2, Mesos Internals, provides a comprehensive overview of Mesos' features and walks the reader through several important topics regarding high availability, fault tolerance, scaling, and efficiency, such as resource allocation, resource reservation, and recovery, among others.
Chapter 3, Getting Started with Mesos, covers how to manually set up and run a Mesos cluster on the public cloud (AWS, GCE, and Azure) as well as in a private datacenter (on premise). It also discusses the various debugging methods and explores how to troubleshoot the Mesos setup in detail.
Chapter 4, Service Scheduling and Management Frameworks, introduces several Mesos-based scheduling and management frameworks or applications that are required for the easy deployment, discovery, load balancing, and failure handling of long-running services.
Chapter 5, Mesos Cluster Deployment, explains how a Mesos cluster can be easily set up and monitored using the standard deployment and configuration management tools used by system administrators and DevOps engineers. It also discusses some of the common problems faced while deploying a Mesos cluster along with their corresponding resolutions.
Chapter 6, Mesos Frameworks, walks the reader through the concept and features of Mesos frameworks in detail. It also provides a detailed overview of the Mesos API, including the new HTTP Scheduler API, and provides a recipe to build custom frameworks on Mesos.
Chapter 7, Mesos Containerizers, introduces the concepts of containers and talks a bit about Docker, probably the most popular container technology available today. It also provides a detailed overview of the different "containerizer" options in Mesos, besides introducing some other topics such as networking for Mesos-managed containers and the fetcher cache. Finally, an example of deploying containerized apps in Mesos is provided for better understanding.
Chapter 8, Mesos Big Data Frameworks, acts as a guide to deploying important big data processing frameworks such as Hadoop, Spark, Storm, and Samza on top of Mesos.
Chapter 9, Mesos Big Data Frameworks 2, guides the reader through deploying important big data storage frameworks such as Cassandra, the Elasticsearch-Logstash-Kibana (ELK) stack, and Kafka on top of Mesos.
To get the most out of this book, you need to have a basic understanding of Mesos and cluster management along with familiarity with Linux. You will also need access to cloud services such as AWS, GCE, and Azure, preferably with machines running Ubuntu or CentOS with 15 GB RAM and four cores.
The book aims to serve DevOps engineers and system administrators who are familiar with the basics of managing a Linux system and its tools.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "For the sake of simplicity, we will simply run the sleep command."
A block of code is set off from the body text. When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold. Any command-line input or output is also set off from the body text.
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Now press the ADD button to add a specific port."
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files from the website by following the steps provided there. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of your preferred archive tool.
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Mesos. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Apache Mesos is open source, distributed cluster management software that came out of AMPLab, UC Berkeley in 2011. It abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be easily built and run effectively. It is referred to as a metascheduler (a scheduler of schedulers) and a "distributed systems kernel/distributed datacenter OS".
It improves resource utilization, simplifies system administration, and supports a wide variety of distributed applications that can be deployed by leveraging its pluggable architecture. It is scalable and efficient and provides a host of features, such as resource isolation and high availability, which, along with a strong and vibrant open source community, makes this one of the most exciting projects.
We will cover the following topics in this chapter: the motivation for a datacenter OS, an introduction to Mesos and its architecture, frameworks, resource allocation and isolation, the two-level scheduling approach, an overview of the Mesos API, and examples of how Mesos is used in production.
Over the past decade, datacenters have graduated from packing multiple applications into a single server box to having large datacenters that aggregate thousands of servers to serve as a massively distributed computing infrastructure. With the advent of virtualization, microservices, cluster computing, and hyperscale infrastructure, the need of the hour is the creation of an application-centric enterprise that follows a software-defined datacenter strategy.
Currently, server clusters are predominantly managed individually, which can be likened to having multiple operating systems on the PC, one each for processor, disk drive, and so on. With an abstraction model that treats these machines as individual entities being managed in isolation, the ability of the datacenter to effectively build and run distributed applications is greatly reduced.
Another way of looking at the situation is comparing running applications in a datacenter to running them on a laptop. One major difference is that when launching a text editor or web browser, we are not required to check which memory modules are free and choose the ones that suit our needs. Herein lies the significance of a platform that acts like a host operating system and allows multiple users to run multiple applications simultaneously by utilizing a shared set of resources.
Datacenters now run varied distributed application workloads, such as Spark, Hadoop, and so on, and need the capability to intelligently match resources and applications. The datacenter ecosystem today has to be equipped to manage and monitor resources and efficiently distribute workloads across a unified pool of resources with the agility and ease to cater to a diverse user base (noninfrastructure teams included). A datacenter OS brings to the table a comprehensive and sustainable approach to resource management and monitoring. This not only reduces the cost of ownership but also allows a flexible handling of resource requirements in a manner that isolated datacenter infrastructure cannot support.
The idea behind a datacenter OS is that of intelligent software that sits above all the hardware in a datacenter and ensures efficient and dynamic resource sharing. Added to this is the capability to constantly monitor resource usage and improve workload and infrastructure management in a seamless way that is not tied to specific application requirements. In its absence, we have a scenario with silos in a datacenter that force developers to build software catering to machine-specific characteristics and make the moving and resizing of applications a highly cumbersome procedure.
The datacenter OS acts as a software layer that aggregates all servers in a datacenter into one giant supercomputer to deliver the benefits of multitenancy, isolation, and resource control across all microservice applications. Another major advantage is the elimination of human-induced error during the continual assigning and reassigning of virtual resources.
From a developer's perspective, this will allow them to easily and safely build distributed applications without restricting them to a bunch of specialized tools, each catering to a specific set of requirements. For instance, let's consider the case of Data Science teams who develop analytic applications that are highly resource intensive. An operating system that can simplify how the resources are accessed, shared, and distributed successfully alleviates their concern about reallocating hardware every time the workloads change.
Of key importance is the relevance of the datacenter OS to DevOps, primarily a software development approach that emphasizes automation, integration, collaboration, and communication between traditional software developers and other IT professionals. With a datacenter OS that effectively transforms individual servers into a pool of resources, DevOps teams can focus on accelerating development and not continuously worry about infrastructure issues.
In a world where distributed computing becomes the norm, the datacenter OS is a boon. With freedom from manually configuring and maintaining individual machines and applications, system engineers need not configure specific machines for specific applications, as all applications would be capable of running on any available resources from any machine, even if other applications are already running on them. Using a datacenter OS results in centralized control and smart utilization of resources that eliminate hardware and software silos to ensure greater accessibility and usability even for noninfrastructure professionals.
One example of an organization administering its hyperscale datacenters via a datacenter OS is Google, with its Borg (and next-generation Omega) systems. The merits of the datacenter OS are undeniable, with benefits ranging from the scalability of computing resources and flexibility to support data sharing across applications to saving team effort, time, and money while launching and managing interoperable cluster applications.
It is this vision of transforming the datacenter into a single supercomputer that Apache Mesos seeks to achieve. Born out of a Berkeley AMPLab research paper in 2011, it has since come a long way with a number of leading companies, such as Apple, Twitter, Netflix, and AirBnB among others, using it in production. Mesosphere is a start-up that is developing a distributed OS product with Mesos at its core.
Mesos is an open-source platform for sharing clusters of commodity servers between different distributed applications (or frameworks), such as Hadoop, Spark, and Kafka among others. The idea is to act as a centralized cluster manager by pooling together all the physical resources of the cluster and making it available as a single reservoir of highly available resources for all the different frameworks to utilize. For example, if an organization has one 10-node cluster (16 CPUs and 64 GB RAM) and another 5-node cluster (4 CPUs and 16 GB RAM), then Mesos can be leveraged to pool them into one virtual cluster of 720 GB RAM and 180 CPUs, where multiple distributed applications can be run. Sharing resources in this fashion greatly improves cluster utilization and eliminates the need for an expensive data replication process per-framework.
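As a quick sanity check on that arithmetic, the following few lines of Python (with the example cluster sizes hard-coded purely for illustration) reproduce the pooled totals:

# Per-cluster specs from the example: (nodes, CPUs per node, GB of RAM per node)
clusters = [(10, 16, 64), (5, 4, 16)]

total_cpus = sum(nodes * cpus for nodes, cpus, _ in clusters)
total_ram_gb = sum(nodes * ram for nodes, _, ram in clusters)

print(total_cpus, "CPUs")      # 180 CPUs
print(total_ram_gb, "GB RAM")  # 720 GB RAM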
Some of the important features of Mesos are scalability, fault tolerance, high availability, and efficient resource isolation and sharing across frameworks.
Mesos is based on the same principles as the Linux kernel and aims to provide a highly available, scalable, and fault-tolerant base for enabling various frameworks to share cluster resources effectively and in isolation. Distributed applications are varied and continuously evolving, a fact that leads Mesos' design philosophy towards a thin interface that allows efficient resource allocation between different frameworks and delegates the task of scheduling and job execution to the frameworks themselves. The two advantages of doing so are that frameworks can implement diverse approaches to scheduling and fault handling and evolve those solutions independently, and that the Mesos core itself stays simple, making it easier to keep scalable and robust.
Mesos' architecture hands over the responsibility of scheduling tasks to the respective frameworks by employing a resource offer abstraction that packages a set of resources and makes offers to each framework. The Mesos master node decides the quantity of resources to offer each framework, while each framework decides which resource offers to accept and which tasks to execute on these accepted resources. This method of resource allocation is shown to achieve a good degree of data locality for each framework sharing the same cluster.
An alternative architecture would implement a global scheduler that took framework requirements, organizational priorities, and resource availability as inputs and provided a task schedule breakdown by framework and resource as output, essentially acting as a matchmaker for jobs and resources with priorities acting as constraints. The challenges with this architecture, such as developing a robust API that could capture all the varied requirements of different frameworks, anticipating new frameworks, and solving a complex scheduling problem for millions of jobs, made the former approach a much more attractive option for the creators.
A Mesos framework sits between Mesos and the application and acts as a layer to manage task scheduling and execution. As its implementation is application-specific, the term is often used to refer to the application itself. Earlier, a Mesos framework could interact with the Mesos API only through the libmesos C++ library, so language bindings that leveraged libmesos heavily were developed for Java, Scala, Python, and Go, among others. Since v0.19.0, changes made to the HTTP-based protocol have enabled developers to write frameworks in the language of their choice without having to rely on the C++ code. A framework consists of two components: a scheduler and an executor.
The scheduler is responsible for deciding which of the resource offers made to it to act upon and for tracking the current state of the cluster. Communication with the Mesos master is handled by the SchedulerDriver module, which registers the framework with the master, launches tasks, and passes messages to other components.
The second component, the executor, is responsible, as its name suggests, for executing tasks on the slave nodes. Communication with the slaves is handled by the ExecutorDriver module, which is also responsible for sending status updates to the scheduler.
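To make these two components concrete, here is a minimal scheduler-side sketch, assuming the classic Python bindings (the mesos.interface and mesos.native packages) are installed; the framework name and the master address passed on the command line are placeholders, and offer handling is deferred to the two-level scheduling discussion later in this chapter:

import sys

import mesos.interface
import mesos.native
from mesos.interface import mesos_pb2


class SleepScheduler(mesos.interface.Scheduler):
    # The base Scheduler class provides default no-op callbacks, so only the
    # ones we care about are overridden here.

    def registered(self, driver, frameworkId, masterInfo):
        # Invoked by the SchedulerDriver once the framework has registered
        # with the Mesos master.
        print("Registered with framework ID %s" % frameworkId.value)

    def statusUpdate(self, driver, update):
        # Task status updates flow back from the executors through the
        # slaves and the master to the scheduler.
        print("Task %s is in state %s" % (update.task_id.value, update.state))


if __name__ == "__main__":
    framework = mesos_pb2.FrameworkInfo()
    framework.user = ""  # Let Mesos fill in the current user.
    framework.name = "Sleep framework (illustrative)"

    # sys.argv[1] is the master address, for example <master-ip>:5050.
    driver = mesos.native.MesosSchedulerDriver(
        SleepScheduler(), framework, sys.argv[1])
    sys.exit(0 if driver.run() == mesos_pb2.DRIVER_STOPPED else 1)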
The Mesos API, discussed later in this chapter, allows programmers to develop their own custom frameworks that can run on top of Mesos. Some other features of frameworks, such as authentication, authorization, and user management, will be discussed at length in Chapter 6, Mesos Frameworks.
Many services and frameworks have been built on Mesos, including Hadoop, Spark, Storm, Samza, Cassandra, Elasticsearch, and Kafka, which are covered later in this book. The list is not exhaustive, and support for new frameworks is added almost every day; refer to http://mesos.apache.org/documentation/latest/frameworks/ for an up-to-date list.
Mesos describes the slave nodes present in the cluster by the following two methods:
Attributes are used to describe certain additional information about a slave node, such as its OS version or whether it has a particular type of hardware. They are expressed as key-value pairs, with support for three value types (scalar, range, and text), and are sent along with the offers to frameworks. For example, an attribute such as os:centos7 pairs the key os with the text value centos7.
Mesos can manage three different types of resources: scalars, ranges, and sets. These are used to represent the different resources that a Mesos slave has to offer. For example, a scalar resource type could be used to represent the amount of CPU on a slave. Each resource is identified by a key string consisting of the resource name and, optionally, the role it is reserved for, with * denoting the default role (for example, cpus(*):4).
Predefined uses and conventions
The Mesos master predefines how it handles certain standard resources, notably cpus, mem, disk, and ports.
In particular, a slave without the cpus and mem resources will never have its resources advertised to any frameworks. Also, the master's user interface interprets the scalars in mem and disk in terms of MB. For example, the value 15000 is displayed as 14.65 GB.
Here are some examples of configuring the Mesos slaves:
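The strings below follow the format accepted by the mesos-slave --resources and --attributes flags; the concrete values, the set name bugs, and the attribute names rack, os, and level are illustrative placeholders, and the small Python helper only splits the strings to make their key-value structure visible:

# Illustrative flag values; in practice they are passed on the command line as
#   mesos-slave --resources='...' --attributes='...'
resources = "cpus:24;mem:24576;disk:409600;ports:[21000-24000];bugs:{a,b,c}"
attributes = "rack:r1;os:centos7;level:10"


def split_pairs(spec):
    # Split a 'key:value;key:value' specification into (key, value) pairs.
    return [item.split(":", 1) for item in spec.split(";")]


for key, value in split_pairs(resources):
    print("resource  %-5s -> %s" % (key, value))

for key, value in split_pairs(attributes):
    print("attribute %-5s -> %s" % (key, value))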
In this case, we have three different types of resources: scalars, a range, and a set. The scalars are called cpus, mem, and disk; the range type is ports; and the remaining resource is a set of text values.
In the case of attributes, we will end up with three attributes, each sent to frameworks as a key-value pair along with resource offers.
Mesos has a two-level scheduling mechanism to allocate resources to and launch tasks on different frameworks. In the first level, the master process that manages slave processes running on each node in the Mesos cluster determines the free resources available on each node, groups them, and offers them to different frameworks based on organizational policies, such as priority or fair sharing. Organizations have the ability to define their own sharing policies via a custom allocation module as well.
In the second level, each framework's scheduler component that is registered as a client with the master accepts or rejects the resource offer made depending on the framework's requirements. If the offer is accepted, the framework's scheduler sends information regarding the tasks that need to be executed and the number of resources that each task requires to the Mesos master. The master transfers the tasks to the corresponding slaves, which assign the necessary resources to the framework's executor component, which manages the execution of all the required tasks in containers. When the tasks are completed, the containers are dismantled, and the resources are freed up for use by other tasks.
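Returning to the scheduler sketch from the frameworks section, the second level of this mechanism is what a framework implements in its resourceOffers callback. The following addition to that sketch (the 0.1 CPU and 32 MB per-task figures are arbitrary) accepts every offer by launching a short sleep task through the default command executor:

import uuid

import mesos.interface
from mesos.interface import mesos_pb2

TASK_CPUS = 0.1
TASK_MEM_MB = 32


class SleepScheduler(mesos.interface.Scheduler):
    # registered() and statusUpdate() as in the earlier skeleton ...

    def resourceOffers(self, driver, offers):
        # Second-level decision: inspect each offer made by the master and
        # decide what, if anything, to run on it.
        for offer in offers:
            task = mesos_pb2.TaskInfo()
            task.task_id.value = str(uuid.uuid4())
            task.slave_id.value = offer.slave_id.value
            task.name = "sleep task"
            task.command.value = "sleep 10"  # run via the default command executor

            cpus = task.resources.add()
            cpus.name = "cpus"
            cpus.type = mesos_pb2.Value.SCALAR
            cpus.scalar.value = TASK_CPUS

            mem = task.resources.add()
            mem.name = "mem"
            mem.type = mesos_pb2.Value.SCALAR
            mem.scalar.value = TASK_MEM_MB

            # Accepting the offer: the master forwards the task to the slave,
            # whose executor runs it in a container and releases the
            # resources once the task finishes.
            driver.launchTasks(offer.id, [task])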
The architecture diagram and accompanying explanation in the Apache Mesos documentation (http://mesos.apache.org/documentation/latest/architecture/) walk through this flow in more detail.
Mesos also provides frameworks with the ability to reject resource offers. A framework can reject the offers that do not meet its requirements. This allows frameworks to support a wide variety of complex resource constraints while keeping Mesos simple at the same time. A policy called delay scheduling, in which frameworks wait for a finite time to get access to the nodes storing their input data, gives a fair level of data locality albeit with a slight latency tradeoff.
If the framework constraints are complex, it is possible that a framework might need to wait before it receives a resource offer that meets its requirements. To tackle this, Mesos allows frameworks to set filters specifying criteria under which certain resources will always be rejected. A framework can set a filter stating that it can run only on nodes with at least 32 GB of RAM free, for example. This allows it to bypass the rejection process, minimize communication overheads, and thus reduce overall latency.
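One concrete way of expressing such a rule with the classic Python bindings is sketched below; the minimum-memory check itself is performed in the framework code, and the Filters message's refuse_seconds field asks the master not to re-offer the declined resources to this framework for the given period (the 32 GB threshold and ten-minute window are illustrative):

from mesos.interface import mesos_pb2

MIN_MEM_MB = 32 * 1024  # require at least 32 GB of memory, as in the example above


def offered_mem(offer):
    # Return the amount of memory (in MB) contained in a resource offer.
    for resource in offer.resources:
        if resource.name == "mem":
            return resource.scalar.value
    return 0


def handle_offer(driver, offer, tasks):
    # Accept offers that satisfy the memory constraint; decline the rest and
    # suppress further offers of those resources for ten minutes.
    if offered_mem(offer) >= MIN_MEM_MB:
        driver.launchTasks(offer.id, tasks)
    else:
        filters = mesos_pb2.Filters()
        filters.refuse_seconds = 600
        driver.declineOffer(offer.id, filters)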
