Scheduling of Large-scale Virtualized Infrastructures

Flavien Quesnel
Description

System virtualization has become increasingly common in distributed systems because of the functionality and convenience it offers the owners and users of these infrastructures. In Scheduling of Large-scale Virtualized Infrastructures, author Flavien Quesnel examines the management of large-scale virtual infrastructures, with an emphasis on scheduling up to 80,000 virtual machines across 8,000 nodes. The text addresses the need for management software that keeps pace with the increasing size of virtual infrastructures. Administrators and operators of virtual machines will appreciate this guide to improved cooperative software management.


Page count: 173

Publication year: 2014




Contents

List of Abbreviations

Introduction

PART 1 Management of Distributed Infrastructures

1 Distributed Infrastructures Before the Rise of Virtualization

1.1. Overview of distributed infrastructures

1.2. Distributed infrastructure management from the software point of view

1.3. Frameworks traditionally used to manage distributed infrastructures

1.4. Conclusion

2 Contributions of Virtualization

2.1. Introduction to virtualization

2.2. Virtualization and management of distributed infrastructures

2.3. Conclusion

3 Virtual Infrastructure Managers Used in Production

3.1. Overview of virtual infrastructure managers

3.2. Resource organization

3.3. Scheduling

3.4. Advantages

3.5. Limits

3.6. Conclusion

PART 2 Toward a Cooperative and Decentralized Framework to Manage Virtual Infrastructures

4 Comparative Study Between Virtual Infrastructure Managers and Distributed Operating Systems

4.1. Comparison in the context of a single node

4.2. Comparison in a distributed context

4.3. Conclusion

5 Dynamic Scheduling of Virtual Machines

5.1. Scheduler architectures

5.2. Limits of a centralized approach

5.3. Presentation of a hierarchical approach: Snooze

5.4. Presentation of multiagent approaches

5.5. Conclusion

PART 3 DVMS, a Cooperative and Decentralized Framework to Dynamically Schedule Virtual Machines

6 DVMS: A Proposal to Schedule Virtual Machines in a Cooperative and Reactive Way

6.1. DVMS fundamentals

6.2. Implementation

6.3. Conclusion

7 Experimental Protocol and Testing Environment

7.1. Experimental protocol

7.2. Testing framework

7.3. Grid’5000 test bed

7.4. SimGrid simulation toolkit

7.5. Conclusion

8 Experimental Results and Validation of DVMS

8.1. Simulations on Grid’5000

8.2. Real experiments on Grid’5000

8.3. Simulations with SimGrid

8.4. Conclusion

9 Perspectives Around DVMS

9.1. Completing the evaluations

9.2. Correcting the limitations

9.3. Extending DVMS

9.4. Conclusion

Conclusion

Bibliography

List of Tables

List of Figures

Index

 

First published 2014 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK

www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA

www.wiley.com

© ISTE Ltd 2014

The rights of Flavien Quesnel to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2014941926

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISSN 2051-2481 (Print)
ISSN 2051-249X (Online)
ISBN 978-1-84821-620-4

List of Abbreviations

ACO  Ant Colony Optimization
API  Application Programming Interface
BOINC  Berkeley Open Infrastructure for Network Computing
BVT  Borrowed Virtual Time scheduler
CFS  Completely Fair Scheduler
CS  Credit Scheduler
DOS  Distributed Operating System
DVMS  Distributed Virtual Machine Scheduler
EC2  Elastic Compute Cloud
EGEE  Enabling Grids for E-sciencE
EGI  European Grid Infrastructure
GPOS  General Purpose Operating System
I/O  Input/Output
IaaS  Infrastructure as a Service
IP  Internet Protocol
JRE  Java Runtime Environment
JVM  Java Virtual Machine
KSM  Kernel Shared Memory
KVM  Kernel-based Virtual Machine
LHC  Large Hadron Collider
MHz  Megahertz
MPI  Message Passing Interface
NFS  Network File System
NTP  Network Time Protocol
OSG  Open Science Grid
PaaS  Platform as a Service
SaaS  Software as a Service
SCVMM  System Center Virtual Machine Manager
URL  Uniform Resource Locator
VIM  Virtual Infrastructure Manager
VLAN  Virtual Local Area Network
VM  Virtual Machine
WLCG  Worldwide LHC Computing Grid
XSEDE  Extreme Science and Engineering Discovery Environment

Introduction

Context

Nowadays, increasing needs in computing power are satisfied by federating more and more computers (or nodes) to build distributed infrastructures.

Historically, these infrastructures have been managed by means of user-space frameworks [FOS 06, LAU 06] or distributed operating systems [MUL 90, PIK 95, LOT 05, RIL 06, COR 08].

Over the past few years, a new kind of software manager has appeared: managers that rely on system virtualization [NUR 09, SOT 09, VMW 10, VMW 11, APA 12, CIT 12, MIC 12, OPE 12, NIM 13]. System virtualization allows dissociating the software from the underlying node by encapsulating it in a virtual machine [POP 74, SMI 05]. This technology has important advantages for distributed infrastructure providers and users. In particular, it has favored the emergence of cloud computing, and more specifically of infrastructure as a service. In this model, raw virtual machines are provided to users, who can customize them by installing an operating system and applications.

Problem statement and contributions

These virtual machines are created, deployed on nodes and managed during their entire lifecycle by virtual infrastructure managers (VIMs).

Most of the VIMs are highly centralized, which means that a few dedicated nodes commonly handle the management tasks. Although this approach facilitates some administration tasks and is sometimes required, for example, to have a global view of the utilization of the infrastructure, it can lead to problems. As a matter of fact, centralization limits the scalability of VIMs, in other words their ability to be reactive when they have to manage large-scale virtual infrastructures (tens of thousands of nodes) that are increasingly common nowadays [WHO 13].

In this book, we focus on ways to improve the scalability of VIMs; one of them consists of decentralizing the processing of several management tasks.

Decentralization has already been studied through research on distributed operating systems (DOSs). Therefore, we wondered whether VIMs could benefit from the results of this research. To answer this question, we compared the management features proposed by VIMs and DOSs at the node level and at the whole infrastructure level [QUE 11]. We first developed the reflections initiated a few years ago [HAN 05, HEI 06, ROS 07] to show that virtualization technologies have benefited from the research on operating systems, and vice versa. We then extended our study to a distributed context.

Comparing VIMs and DOSs enabled us to identify some possible contributions, especially decentralizing the dynamic scheduling of virtual machines. Dynamic scheduling aims to move virtual machines from one node to another when necessary, for example (1) to enable a system administrator to perform a maintenance operation or (2) to optimize the utilization of the infrastructure by taking into account the evolution of the virtual machines’ resource needs. Dynamic scheduling is still rarely used by VIMs deployed in production, even though several approaches have been proposed in the scientific literature. However, because they rely on a centralized model, these approaches face scalability issues and are not able to react quickly when some nodes are overloaded. This can lead to the violation of the service level agreements proposed to users, since the virtual machines’ resource needs are not satisfied for some time.

To mitigate this problem, several proposals have been made to decentralize the dynamic scheduling of virtual machines [BAR 10, YAZ 10, MAR 11, MAS 11, ROU 11, FEL 12b, FEL 12c]. Yet, almost all of the implemented prototypes use some partially centralized mechanisms, and satisfy the needs of reactivity and scalability only to a limited extent.

The contribution of this book lies precisely in this area of research; more specifically, we propose the distributed virtual machine scheduler (DVMS), a more decentralized application to dynamically schedule virtual machines hosted on a distributed infrastructure. DVMS is deployed as a network of agents organized following a ring topology; these agents cooperate with one another to process, as quickly as possible, the events (linked to overloaded/underloaded node problems) that occur on the infrastructure. DVMS can process several events simultaneously and independently by dynamically partitioning the infrastructure, each partition having a size appropriate to the complexity of the event to be processed. We optimized the traversal of the ring by defining shortcuts, so that a message can leave a partition as quickly as possible instead of crossing each node of this partition. Moreover, we guaranteed that an event would be solved if a solution existed. For this purpose, we let pairs of partitions merge when no free node is left to be absorbed by a partition that needs to grow to solve its event; the partitions must reach a consensus before merging, to avoid deadlocks.
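As an illustration of the partition mechanism described above, here is a minimal sketch (not the actual DVMS implementation; the Node class, the load metric and the threshold are hypothetical simplifications) of how a partition could grow along the ring until it can absorb the excess load of an overloaded node:

```python
# Hypothetical sketch of DVMS-style partition growth along a ring.
# Node, load and threshold are illustrative, not the real DVMS API.

class Node:
    def __init__(self, ident, load):
        self.ident = ident
        self.load = load  # abstract load metric (e.g. CPU utilization)

def grow_partition(ring, start, threshold):
    """Starting from the overloaded node, absorb free nodes along the
    ring until the partition can take over the excess load."""
    partition = [ring[start]]
    excess = ring[start].load - threshold
    i = (start + 1) % len(ring)
    while excess > 0 and i != start:
        node = ring[i]
        if node.load < threshold:  # free node: absorb it into the partition
            partition.append(node)
            excess -= threshold - node.load
        i = (i + 1) % len(ring)
    return partition, excess <= 0  # solved if the excess was absorbed

ring = [Node(0, 90), Node(1, 40), Node(2, 30), Node(3, 80)]
partition, solved = grow_partition(ring, 0, threshold=70)
```

In the real system, a partition that cannot grow any further would instead negotiate a merge with a neighboring partition, as described above.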

We implemented these concepts in a prototype, which we validated (1) by means of simulations (first with a test framework specifically designed to meet our needs, and second with the SimGrid toolkit [CAS 08]) and (2) with real-world experiments on the Grid’5000 test bed [GRI 13] (using Flauncher [BAL 12] to configure the nodes and the virtual machines). We observed that DVMS was particularly reactive when managing virtual infrastructures involving several tens of thousands of virtual machines distributed across thousands of nodes; as a matter of fact, DVMS needed approximately one second to find a solution to the problem linked with an overloaded node, where other prototypes could require several minutes.

Once the prototype had been validated [QUE 12, QUE 13], we focused on the future work on DVMS, and especially on:

– Defining new events corresponding to virtual machine submissions or maintenance operations on a node;
– Adding fault-tolerance mechanisms, so that scheduling can go on even if a node crashes;
– Taking account of the network topology when building partitions, so that nodes can communicate efficiently even when they are linked with one another by a wide area network.

The final goal will be to implement a fully decentralized VIM. This goal should be reached by the Discovery [LEB 12] initiative, which will leverage this work.

Structure of this book

The remainder of this book is structured as follows.

Part 1: management of distributed infrastructures

The first part deals with distributed infrastructures.

In Chapter 1, we present the main types of distributed infrastructures that exist nowadays, and the software frameworks that are traditionally used to manage them.

In Chapter 2, we introduce virtualization and explain its advantages to manage and use distributed infrastructures.

In Chapter 3, we focus on the features and limitations of the main virtual infrastructure managers.

Part 2: toward a cooperative and decentralized framework to manage virtual infrastructures

The second part is a study of the components that are necessary to build a cooperative and decentralized framework to manage virtual infrastructures.

In Chapter 4, we investigate the similarities between virtual infrastructure managers and the frameworks that are traditionally used to manage distributed infrastructures; moreover, we identify some possible contributions, mainly on virtual machine scheduling.

In Chapter 5, we focus on the latest contributions on decentralized dynamic scheduling of virtual machines.

Part 3: DVMS, a cooperative and decentralized framework to dynamically schedule virtual machines

The third part deals with DVMS, a cooperative and decentralized framework to dynamically schedule virtual machines.

In Chapter 6, we present the theory behind DVMS and the implementation of the prototype.

In Chapter 7, we detail the experimental protocol and the tools used to evaluate and validate DVMS.

In Chapter 8, we analyze the experimental results.

In Chapter 9, we describe future work.

PART 1

Management of Distributed Infrastructures

1

Distributed Infrastructures Before the Rise of Virtualization

Organizations having huge needs in computing power can use either powerful mainframes or federations of less powerful computers (called nodes), which are part of distributed infrastructures. The latter solution has become increasingly popular over the past few years; this can be explained by the fact that: (1) a federation of nodes is cheaper than a mainframe for the same computing power and (2) a federation involving a huge number of nodes is more powerful than a mainframe.

In this chapter, we present the main kinds of distributed infrastructures and we focus on their management from the software point of view. In particular, we give an overview of the frameworks that were designed to manage these infrastructures before virtualization became popular.

1.1. Overview of distributed infrastructures

The first distributed infrastructures to appear were clusters; data centers, grids and volunteer computing platforms then followed (see Figure 1.1).

1.1.1. Cluster

The unit generally used in distributed infrastructures is the cluster.

Figure 1.1. Order of appearance of the main categories of distributed infrastructures

DEFINITION 1.1.– Cluster – A cluster is a federation of homogeneous nodes (that is to say, all nodes are identical), which facilitates their maintenance as well as their utilization. These nodes are close to one another (typically in the same room) and are linked by means of a high-performance local area network.

1.1.2. Data center

Clusters can be grouped inside a federation, for example a data center.

DEFINITION 1.2.– Data Center – A data center is a kind of federation of clusters, where these clusters are close to one another (typically the same building or group of buildings) and communicate through a local area network.

The characteristics of the nodes can vary from one cluster to another, especially if these clusters were not built at the same date. Each cluster has its own network; network performance can differ from one network to another.

1.1.3. Grid

Clusters and data centers belonging to several organizations sharing a common goal can be pooled to build a more powerful infrastructure, called a grid.

DEFINITION 1.3.– Grid – A grid is a distributed infrastructure that “enable(s) resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” [FOS 08].

A grid is generally made of heterogeneous nodes.

Moreover, the components of a grid communicate by means of a wide area network, whose performance is worse than that of a local area network; this is especially true for the latency (that is to say, the time required to transmit a message between two distant nodes), and sometimes also for the bandwidth (in other words, the maximum amount of data that can be transferred between two distant nodes per unit of time).
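To illustrate the impact of these two metrics, a first-order estimate of the time needed to transfer a message is the latency plus the serialization time (message size divided by bandwidth). The sketch below uses illustrative LAN/WAN figures, not measurements from any particular grid:

```python
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """First-order estimate: one latency plus the serialization time."""
    return latency_s + size_bytes * 8 / bandwidth_bps

# Transferring 1 MiB over a typical LAN (0.1 ms, 1 Gbit/s)
# versus a WAN (50 ms, 100 Mbit/s); figures are illustrative.
lan = transfer_time(2**20, latency_s=0.0001, bandwidth_bps=1e9)
wan = transfer_time(2**20, latency_s=0.05, bandwidth_bps=1e8)
```

For small messages, latency dominates the WAN cost, which is why the latency gap matters even when bandwidth is comparable.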

There are many grids. Some of them are nationwide, like Grid’5000 [GRI 13] and the infrastructure managed by France Grilles [FRA 13] in France, or FutureGrid [FUT 13], the Open Science Grid (OSG) [OSG 13] and the Extreme Science and Engineering Discovery Environment (XSEDE, previously TeraGrid) [XSE 13] in the USA. Others were implemented on a whole continent by leveraging nationwide grids, like the European Grid Infrastructure (EGI, formerly Enabling Grids for E-sciencE (EGEE)) [EGI 13] in Europe. Finally, other grids are worldwide, like the Worldwide LHC Computing Grid (WLCG) [WIC 13], which relies especially on OSG and EGI to analyze data from the Large Hadron Collider (LHC) of the European Organization for Nuclear Research (CERN).

1.1.4. Volunteer computing platforms

Pooled resources belonging to individuals rather than organizations are the building blocks of volunteer computing platforms.

DEFINITION 1.4.– Volunteer Computing Platform – A volunteer computing platform is similar to a grid, except that it is composed of heterogeneous nodes made available by volunteers (not necessarily organizations) that are typically linked through the Internet.

Berkeley Open Infrastructure for Network Computing (BOINC) [AND 04] is an example of such a platform. It aims to federate Internet users around different research projects, like SETI@home [SET 13]. The goal of SETI@home is to analyze radio communications from space, searching for extra-terrestrial intelligence. Internet users simply need to download the BOINC application and join the project they want to take part in; when they do not use their computer, the application automatically fetches some tasks (for example, computations to perform or data to analyze) from the aforementioned project, processes them and then submits the results to the project.
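The fetch-process-submit cycle described above can be sketched as a simple client loop; the callbacks below are hypothetical stand-ins, not the actual BOINC API:

```python
import time

def volunteer_loop(fetch_task, process, submit_result, is_idle, poll_s=60):
    """Illustrative BOINC-style client loop: work only while idle."""
    while True:
        if is_idle():
            task = fetch_task()           # download a work unit
            if task is None:
                break                     # no work available: stop
            submit_result(process(task))  # upload the computed result
        else:
            time.sleep(poll_s)            # back off while the owner works

# Stub callbacks standing in for the project server and the computation
work, results = [1, 2], []
volunteer_loop(lambda: work.pop(0) if work else None,
               lambda t: t * 2, results.append, lambda: True)
```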

XtremWeb [FED 01] is an application that allows building a platform that is similar to BOINC. However, contrary to BOINC, it allows the tasks that are distributed across the computers of several users to communicate directly with one another.

1.2. Distributed infrastructure management from the software point of view

The management of the aforementioned distributed infrastructures requires taking account of several concerns, especially the connection of users to the system and their identification, submission of tasks, scheduling, deployment, monitoring and termination. These concerns may involve several kinds of resources (see Figure 1.2):

– access nodes, for users’ connections;
– one or several node(s) dedicated to infrastructure management;
– storage nodes, for users’ data;
– worker nodes, to process the tasks submitted by users.

1.2.1. Secured connection to the infrastructure and identification of users

In order to use the infrastructure, users first need to connect to it [LAU 06, COR 08, GRI 13].

This connection can be made in several ways. From the hardware point of view, users may use a private network or the Internet; moreover, they may be authorized to connect to every node of the infrastructure, or only to dedicated nodes (the access nodes). From the software point of view, it is mandatory to decide which application and which protocol to use. This choice is critical to the security of the infrastructure: to identify users, to determine which resources they can access and for how long, to account for the resources they have used so far, and to prevent a malicious user from stealing resources or data from another user.

Figure 1.2. Organization of a distributed infrastructure

1.2.2. Submission of tasks

Once connected to the infrastructure, users should be able to submit tasks [LAU 06, COR 08, GRI 13].

For this purpose, they have to specify the characteristics of tasks, depending on the functionalities provided by the infrastructure:

– programs and/or data required to process the tasks; if necessary, users should be able to upload them to the infrastructure;

– required resources, from a qualitative and quantitative point of view, and the duration of use; users may also mention whether their tasks can be processed in a degraded mode, that is to say with fewer resources than what they asked for;

– the date/time at which processing must start and end;

– links between tasks (if applicable), and possible precedence constraints, which specify that some tasks have to be processed before others; when tasks are linked with one another, the infrastructure manager has to execute coherent actions on these tasks; this is done during scheduling.
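Such a task description could be captured in a simple record; the field names below are hypothetical, not taken from any particular infrastructure manager:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TaskSpec:
    """Illustrative task-submission record covering the characteristics
    listed above; field names are hypothetical."""
    program: str               # program required to process the task
    cpus: int                  # required resources (quantitative)
    memory_mb: int
    walltime_s: int            # duration of use
    degraded_ok: bool = False  # may run with fewer resources than asked
    start_after: Optional[float] = None  # earliest start date/time
    depends_on: List[str] = field(default_factory=list)  # precedence

job = TaskSpec(program="simulate.bin", cpus=4, memory_mb=8192,
               walltime_s=3600, depends_on=["preprocess"])
```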

1.2.3. Scheduling of tasks

DEFINITION 1.5.– Scheduling – Scheduling is the process of assigning resources to tasks, in order to process them [ROT 94, TAN 01, LEU 04, STA 08]. Scheduling is performed by a scheduler.

Scheduling has to take account of the aforementioned characteristics of tasks: required resources, start/end date/time, priority, links between tasks, etc. Scheduling may be static or dynamic.

DEFINITION 1.6.– Static Scheduling – Scheduling is said to be static when each task remains on the same worker node while it is processed. The initial placement of tasks takes account of the resource requirements given by users, not the tasks’ real resource needs.

DEFINITION 1.7.– Dynamic Scheduling – Scheduling is said to be dynamic when tasks can be migrated from one worker node to another while they are processed; dynamic scheduling takes account of the tasks’ real resource needs.

Scheduling is designed to meet one or more goals. Some goals are related to how fast tasks are processed. Others aim to distribute resources across tasks in a fair way. Others are intended for optimal use of resources, for example to balance the workload between resources, or to consolidate it on a few resources to maximize their utilization rate. Others are designed to enforce placement constraints, which can result from affinities or antagonisms between tasks. Finally, some goals may consist of enforcing other kinds of constraints, like precedence constraints.
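The contrast between load balancing and consolidation can be illustrated with a greedy static placement: to balance, place each task on the least-loaded node; to consolidate, place it on the most-loaded node that still has room. This is a simplified sketch (uniform node capacity, not an algorithm from the literature):

```python
def place(tasks, nodes, capacity=100, consolidate=False):
    """Greedy static placement; assumes a feasible placement exists."""
    loads = {n: 0 for n in nodes}
    placement = {}
    for task, demand in tasks:
        fits = [n for n in nodes if loads[n] + demand <= capacity]
        # balance: least-loaded node; consolidate: most-loaded that fits
        key = (lambda n: -loads[n]) if consolidate else (lambda n: loads[n])
        target = min(fits, key=key)
        placement[task] = target
        loads[target] += demand
    return placement

tasks = [("t1", 50), ("t2", 30), ("t3", 20)]
balanced = place(tasks, ["n1", "n2"])
packed = place(tasks, ["n1", "n2"], consolidate=True)
```

Consolidation leaves n2 entirely free, which a provider could then power down, while balancing spreads the load across both nodes.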

Scheduling can take account of the volatility of the infrastructure, which results from the addition or removal of resources. This addition or removal may be wanted, if infrastructure owners desire to make more resources available to users, retire obsolete resources, or perform a maintenance operation on some resources (for example to update applications or replace faulty components). The removal can also be unwanted in case of hardware or software faults, which is more likely to happen if the infrastructure is large.

1.2.4. Deployment of tasks

Once the scheduler has decided which resources to assign to a task, it needs to deploy the latter on the right worker node.

This may require installing and configuring an appropriate runtime, in addition to the copy of programs and data necessary to process the task.

Data associated with the task can be stored: (1) locally on the worker node or (2) remotely, on a shared storage server, on a set of nodes hosting a distributed file system, or in a storage array.

1.2.5. Monitoring the infrastructure

If dynamic scheduling is applied, each task is likely to be migrated from one worker node to another; in this case, the scheduler relies on information collected by the monitoring system. Monitoring is also useful for other purposes.

Monitoring enables system administrators to obtain information on the state of the infrastructure, and especially to be notified in case of hardware or software faults.