Performance Evaluation by Simulation and Analysis with Applications to Computer Networks

Ken Chen

Description

This book is devoted to the two most widely used methodologies for performance evaluation: simulation using specialized software and mathematical modeling. An important part is dedicated to simulation, particularly its theoretical framework and the precautions to be taken in the design of the experimental procedure. These principles are illustrated by concrete examples built with operational simulation languages (OMNeT++, OPNET). Presented as a complementary approach, mathematical modeling is essential to simulation. Since both methodologies rely largely on the theory of probability and statistics in general, and on Markov processes in particular, a reminder of the basic results is also provided.




Contents

List of Tables

List of Figures

List of Listings

Preface

1. Performance Evaluation

1.1. Performance evaluation

1.2. Performance versus resources provisioning

1.3. Methods of performance evaluation

1.4. Modeling

1.5. Types of modeling

1.6. Analytical modeling versus simulation

PART 1: Simulation

2. Introduction to Simulation

2.1. Presentation

2.2. Principle of discrete event simulation

2.3. Relationship with mathematical modeling

3. Modeling of Stochastic Behaviors

3.1. Introduction

3.2. Identification of stochastic behavior

3.3. Generation of random variables

3.4. Generation of U(0, 1) r.v.

3.5. Generation of a given distribution

3.6. Some commonly used distributions and their generation

3.7. Applications to computer networks

4. Simulation Languages

4.1. Simulation languages

4.2. Scheduler

4.3. Generators of random variables

4.4. Data collection and statistics

4.5. Object-oriented programming

4.6. Description language and control language

4.7. Validation

5. Simulation Running and Data Analysis

5.1. Introduction

5.2. Outputs of a simulation

5.3. Mean value estimation

5.4. Running simulations

5.5. Variance reduction

5.6. Conclusion

6. OMNET++

6.1. A summary presentation

6.2. Installation

6.3. Architecture of OMNeT++

6.4. The NED language

6.5. The IDE of OMNeT++

6.6. The project

6.7. A first example

6.8. Data collection and statistics

6.9. A FIFO queue

6.10. An elementary distributed system

6.11. Building large systems: an example with INET

PART 2: Queueing Theory

7. Introduction to the Queueing Theory

7.1. Presentation

7.2. Modeling of the computer networks

7.3. Description of a queue

7.4. Main parameters

7.5. Performance indicators

7.6. Little’s law

8. Poisson Process

8.1. Definition

8.2. Interarrival interval

8.3. Erlang distribution

8.4. Superposition of independent Poisson processes

8.5. Decomposition of a Poisson process

8.6. Distribution of arrival instants over a given interval

8.7. The PASTA property

9. Markov Queueing Systems

9.1. Birth-and-death process

9.2. The M/M/1 queues

9.3. The M/M/∞ queues

9.4. The M/M/m queues

9.5. The M/M/1/K queues

9.6. The M/M/m/m queues

9.7. Examples

10. The M/G/1 Queues

10.1. Introduction

10.2. Embedded Markov chain

10.3. Length of the queue

10.4. Sojourn time

10.5. Busy period

10.6. Pollaczek–Khinchin mean value formula

10.7. M/G/1 queue with server vacation

10.8. Priority queueing systems

11. Queueing Networks

11.1. Generalities

11.2. Jackson network

11.3. Closed network

PART 3: Probability and Statistics

12. An Introduction to the Theory of Probability

12.1. Axiomatic base

12.2. Conditional probability

12.3. Independence

12.4. Random variables

12.5. Some common distributions

12.6. Joint probability distribution of multiple random variables

12.7. Some interesting inequalities

12.8. Convergences

13. An Introduction to Statistics

13.1. Introduction

13.2. Description of a sample

13.3. Parameter estimation

13.4. Hypothesis testing

14. Markov Process

14.1. Stochastic process

14.2. Discrete-time Markov chains

14.3. Continuous-time Markov chain

Bibliography

Index

First published 2015 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK

www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA

www.wiley.com

© ISTE Ltd 2015

The rights of Ken Chen to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2014958253

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library

ISBN 978-1-84821-747-8

List of Tables

3.1. An example of PRNG with the PRNG of Von Neumann

10.1. Distribution of the number of required controls

12.1. Correspondence between the set language and the probability language

13.1. Outcomes of the observation S

13.2. Some useful threshold values for the χ² test

13.3. Chi-square (χ²) test: first partition

13.4. Chi-square (χ²) test: second partition

13.5. Chi-square (χ²) test: a bad partition

List of Figures

List of Listings

6.1. Description in MySource.ned

6.2. Implementation in MySource.cc

6.3. Description in MySink.ned

6.4. Implementation in MySink.cc

6.5. Composition of the model (MyBasic.ned)

6.6. Initialisation file: omnetpp.ini

6.7. Data collection in MySink.cc

6.8. Selection of statistics in MySink.ned

6.9. Description of MyFiFo

6.10. Implementation of MyFiFo

6.11. New description of MySource

6.12. New implementation of MySource

6.13. Parametric configuration of simulation runs

6.14. MyPacket.msg

6.15. Description: TRNode.ned

6.16. Header file: TRNode.h

6.17. Implementation: TRNode.cc

6.18. Definition of a compound module

6.19. Definition of two models

6.20. Configuration of simulation

6.21. The myethernet project and its namespace in package.ned

6.22. An Ethernet-based communication system (MyEthernet.ned)

6.23. An Ethernet card with LLC (MyEthernetLLC.ned)

6.24. Description of MyApp (MyApp.ned)

6.25. Implementation of MyApp (MyApp.cc)

6.26. MyEthernet: simulation configuration (omnetpp.ini)

Preface

When, as a PhD candidate, I discovered the technique of computer-based simulation through the book of G. Fishman [FIS 73] in the documentation centre of INRIA1 (Rocquencourt, France), I was impressed by this numerical tool and its use for performance evaluation, because of its high modeling capability, and because it was relatively easy to learn with my background in computer programming. By practising it, I realized very quickly that, behind its programming interface, this experimental tool is actually governed by a rigorous scientific background of statistics and probability theory. With years of practice, I understood that it is also an art.

This book primarily comes from lectures I give to engineering students in Telecommunications and Computer Networks, as well as to students following the Master’s degree curriculum on the same subject. The material addressed in this book can also be useful for PhD students as well as engineers. This book takes computer networks as its target application domain; however, the tools presented here are generic: they are applicable to computer systems in general, and more widely to many other application domains.

Computer networks are taking an increasingly significant place in modern society. More particularly, this book focuses on a crucial problem: mastering the performance of computer networks. The latter are becoming increasingly complex systems, and their presence and importance keep growing in all sectors. Thus, the control of these systems becomes a major issue and a common concern for many actors: network operators, Internet Service Providers (ISP), equipment manufacturers, etc.

We are interested here, more particularly, in the quantitative aspects of these systems, rather than in their functional aspects. As an example, imagine that you are in charge of setting up a Website to promote a product. You must of course have a good knowledge of the adequate technologies (HTML, PHP, MySQL, etc.) to build this Website. You must also choose the adequate hardware and configuration on which the Website will run. The quantitative aspect then plays a crucial role: the hardware and configuration must meet the expectations of potential customers. There is a real trade-off between:

– the performance to reach: i.e. the identification of some indicators (e.g. the response time of the site), as well as the associated threshold values, related to the proper functioning of the Website and the user’s satisfaction;
– the resource dimensioning: i.e. the assessment of the physical resources (e.g. type of server (processing power), amount of bandwidth) to be provisioned in order to reach this performance.

An over-sizing induces a useless additional cost, whereas a lack of provisioning could leave potential customers unsatisfied. In both cases, there is a negative impact on competitiveness. It is thus necessary to establish a good match between the required performance and the resources to be committed. This trade-off is not easy to reach, and it demands a mastery of the quantitative aspects of the system. This is the objective of this book.

This book presents a set of theoretical results and practical methods which constitute a panoply of complementary tools for the quantitative study of computer networks. The complementarity of these tools is an essential point for the author, since the systems to be studied are complex. This book was designed in this spirit, and this is why it includes two facets:

– computer-based simulation with its languages and its practices;
– mathematical modeling with the queueing theory, and, more generally, the Markov processes.

This book aims to be pragmatic and devotes a significant part to simulation, since the latter presents a double advantage:

– It allows a detailed and faithful modeling;
– It is accessible with a minimum knowledge of computer programming, without requiring advanced expertise in mathematics.

The author presents the practice of simulation through the OMNeT++ simulation environment. This tool was chosen not only for its rich functional features and its popularity, but also for its accessibility: this open-source software is available on the Internet and can be used freely (under conditions similar to the GNU GPL).

A study by simulation, however detailed it may be, is basically a statistical study of a particular configuration of the target system. Mathematical (or analytical) modeling, on the other hand, offers firmly established results with a theoretical background. These theoretical results are general and parametrizable, ready for the study of various scenarios. However, the constraints of mathematical tractability often lead to a model which is less faithful than that of a simulator. Analytical modeling and simulation thus form a pair of complementary tools for the quantitative study of systems.

The author is also concerned with making the theoretical bases of performance evaluation accessible to an audience with an average mathematical level. For this reason, we choose to emphasize concepts, rather than focus on mathematical details. For this reason also, we include a review of the principal concepts and results from the theory of probability and statistics which are used, implicitly or explicitly, in this book.

The structure (see Figure P.1) of the book is presented below:

– we start with a short introduction to the problem of performance evaluation in computer networks. Then, we introduce the general principles of performance evaluation and the principal tools, which are analytical modeling and computer-based simulation. We divide the remainder of the book into three parts, devoted respectively to simulation, analytical models and the underlying theoretical background (probability, statistics and Markov processes);
– the first part (Chapters 2 to 6, pages 13 to 127) is focused on simulation. We deal, of course, with the principles and practice of coding a simulation program. We also emphasize the fact that simulations are basically statistical experiments, so the design and control of a simulation, as well as the collection and interpretation of the data, must follow precise rules, which we will present. This part is illustrated through examples carried out in OMNeT++;
– the second part (Chapters 7 to 11, pages 131 to 199) is devoted to modeling by queueing theory and, more generally, stochastic processes. We present some basic theoretical results, including the M/M/. queues, the M/G/1 queue and its variants (server vacation and priority queueing), as well as queueing networks (mainly Jackson networks);
– in the third part (Chapters 12 to 14, pages 203 to 272), a short introduction to the theory of probability and statistics is provided. Indeed, both analytical modeling and simulation actually model the system as a stochastic system; statistics and the theory of probability thus constitute the theoretical background of these two approaches. This part also contains an introduction to Markov processes. It was conceived to minimize repetitions of the same concepts used in various sections of the book; it also aims to facilitate possible needs for refreshing certain mathematical concepts, and it makes this manuscript self-contained.

Figure P.1. Outline of this book

Over a total of about 272 pages of text, simulation, i.e. Part I and Chapter 13 (Statistics), constitutes roughly 50%; analytical modeling, i.e. Part II and Chapter 14 (Markov processes), constitutes 35%; the basics of probability (Chapter 12) 10%; the rest is occupied by introductory matters (see Figure P.2).

Figure P.2. Different parts of this book

In this book, an effort is made to facilitate cross-referencing (à la HTML documents). Indeed, a printed book is, by definition, a serialized document, whereas the matters addressed in this book contain many mutual overlaps.

Ken CHEN

December 2014

1 Institut national de recherche en informatique et en automatique, the French National Institute for Research in Computer and Control Sciences.

1

Performance Evaluation

In this chapter, we introduce performance evaluation jointly with resource provisioning. We then give a survey of the general aspects of modeling, before focusing on simulation and analytical modeling.

1.1. Performance evaluation

Recent developments in telecommunications are characterized by a double diversity: diversity of applications and diversity of infrastructures. New applications and services appear day by day. The amount of data to be transported is constantly increasing, of which multimedia data are taking an increasingly important part. These data are characterized by their tight time constraints, high flow rates and abrupt variations in time. We are witnessing the advent of the so-called ambient intelligence, i.e. computers and data are everywhere, they can be produced everywhere (cloud, body-area sensor, your smart phone, etc.) and consumed everywhere else (your laptop, your smart phone, a remote monitoring facility, etc.). This ubiquitous/pervasive computing environment is supported by various networking infrastructures, including Digital Subscriber Line (DSL) connections, 3G/4G cellular networks, wireless Local Area Network (LAN), optical networks, etc.

One important issue is the so-called Quality of Service (QoS) of the data transmission. Roughly speaking, the QoS required by an application is the set of parameters (throughput, delay, jitter, reliability, etc.) on which the application relies. For example, for videoconferencing, the nominal end-to-end delay is about 150 ms. If the actual delay is too variable, such that it is frequently higher than 150 ms, we simply cannot perform decent videoconferencing.

It is a fundamental matter to guarantee QoS for different applications. One of the key issues is the performance evaluation, and, in a dual manner, the resource provisioning. Performance evaluation aims to quantitatively assess various parameters of a system running most often in a stochastic environment.

A computer network is a complex system constituted of various components organized in layers, according to the standard International Organization for Standardization/Open System Interconnection (ISO/OSI) seven-layer model. Here are some examples of performance indicators:

– at the Physical layer: bit error rate of a wireless link;
– at the Link/Medium Access Control (MAC) layer: collision probability in a common medium (e.g. wireless fidelity (WiFi)) and point-to-point delay;
– at the network layer: rejection rate in a router and sojourn time in a router;
– at the transport layer: number of Transmission Control Protocol (TCP) connections that can be simultaneously opened;
– at the application layer: response delay to a HyperText Transfer Protocol (HTTP) request.

The control of the performance of a network is achieved through the control of each of its components.

Before a user makes a decision on their choice of telecommunication facility (e.g. MultiProtocol Label Switching (MPLS) connection or WiFi local network), it is normal for them to want to know the parameters characterizing the transmission quality (delay, loss ratio, etc.) offered by this facility, in order to know whether this facility truly matches their needs. In other words, one must know the performance of this facility.

In case it is the user who has full responsibility for the telecommunications facility (typically a LAN), it is then up to them to assess the performance of the system.

When a user is dealing with a service provider, there will be a contractual relation, called in generic terms a service-level agreement (SLA), between the QoS that the provider has to offer and the traffic that the user can submit. For instance: a delay of less than 100 ms for traffic not exceeding 50 Mbps on average and 100 Mbps at peak rate.

1.2. Performance versus resources provisioning

The performance of a system depends strongly on the available resources. For example, a video broadcast at 2 Mbps cannot be achieved if the available bandwidth is only 1 Mbps. The performance of a system and the resource provisioning are two faces of the same problem, which is the matching between the requirements of an application and the resources that have to be committed. For instance, a video traffic with a sustainable rate of 5 Mbps and a peak rate of 10 Mbps can certainly be supported by a dedicated bandwidth of 10 Mbps: the QoS will be perfectly guaranteed, with zero loss ratio and no waiting delay. However, the dedicated 10 Mbps resource has to be paid for. An extremely economical solution is to provide only 5 Mbps (the mean rate), with a buffer absorbing traffic submitted at higher instantaneous rates. The committed resources (5 Mbps and some random access memory (RAM) for buffering) should be less expensive than the dedicated bandwidth of 10 Mbps, but the QoS will be worse, with probably long delays and a high packet loss ratio. An intermediate solution would be a bandwidth of B ∈ (5, 10) Mbps with an adequate amount (L) of buffer. Performance evaluation aims to find a good couple (B, L) for a given requirement (delay, loss ratio).
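To make this trade-off concrete, here is a minimal sketch of such a dimensioning computation, under the simplifying assumption that packet arrivals are Poisson and service times are exponential, so that the system behaves as the M/M/1/K queue studied later in this book (Chapter 9), with K = L + 1 (buffer plus the packet in service). The 8,000-bit packet size and the candidate values of B and L are illustrative assumptions, not taken from the book; the mean delay is obtained through Little’s law (Chapter 7).

#include <cstdio>
#include <cmath>

// Blocking (loss) probability of an M/M/1/K queue with load rho.
double blockingProbability(double rho, int K) {
    if (std::fabs(rho - 1.0) < 1e-12)
        return 1.0 / (K + 1);                            // limit case rho = 1
    return (1.0 - rho) * std::pow(rho, K) / (1.0 - std::pow(rho, K + 1));
}

// Mean number of packets in an M/M/1/K queue.
double meanPackets(double rho, int K) {
    if (std::fabs(rho - 1.0) < 1e-12)
        return K / 2.0;
    return rho / (1.0 - rho)
         - (K + 1) * std::pow(rho, K + 1) / (1.0 - std::pow(rho, K + 1));
}

int main() {
    const double packetBits = 8000.0;                    // assumed mean packet size (illustrative)
    const double lambda = 5e6 / packetBits;              // 5 Mbps mean rate -> 625 pkt/s
    for (double B : {6.0, 8.0, 10.0}) {                  // candidate bandwidths B (Mbps)
        for (int L : {10, 50, 200}) {                    // candidate buffer sizes L (packets)
            double mu    = B * 1e6 / packetBits;         // service rate (pkt/s)
            double rho   = lambda / mu;                  // offered load
            int    K     = L + 1;                        // buffer + packet in service
            double pLoss = blockingProbability(rho, K);
            double N     = meanPackets(rho, K);
            double T     = N / (lambda * (1.0 - pLoss)); // Little's law: T = N / effective rate
            std::printf("B = %4.1f Mbps  L = %3d  loss = %.2e  mean delay = %7.3f ms\n",
                        B, L, pLoss, T * 1e3);
        }
    }
    return 0;
}

Each line of the output gives, for a candidate couple (B, L), the loss ratio and the mean sojourn time, from which one can pick the cheapest couple meeting the required QoS.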

1.2.1. Performance indicators

The choice of relevant performance indicators depends intrinsically on the application. For example, a file transfer can tolerate a relatively large transmission delay but demands lossless quality, whereas videoconferencing, whose standard end-to-end delay is 150 ms, is tolerant to a limited amount of losses.

The performance of a computer network, or more generally that of a network-based computer system, is characterized by a set of parameters:

– throughput;
– delay (end-to-end delay, delay at an intermediary node, etc.);
– load (utilization ratio);
– number of packets in the node (router, switch, etc.);
– loss ratio on a wireless link, rejection ratio in a router.

1.2.2. Resources provisioning

The resources involved in networking systems fall mainly in one of the following three categories:

– processing: this is usually the central processing unit (CPU) power, with impact on performance indicators such as the number of packets routed by a router, the number of HTTP requests processed by a Web server, etc.;

– storage: this includes both volatile storage (RAM) and permanent storage (disk, flash memory). These resources often act as a buffer to absorb the random variations of the submitted work. Their provisioning affects indicators such as the loss ratio in a router;

– transmission: from a modeling point of view, this resource plays a similar role to that of the processing capacity, with the particularity that, here, the “processing” consists of message transmission (Internet Protocol (IP) packets, Ethernet frames, etc.). It affects indicators such as delay or loss ratio (considered jointly with the storage resource).

1.3. Methods of performance evaluation

Two classes of methods are generally used to assess the performance of a system: studies done directly on the system itself versus studies carried out through a model.

1.3.1. Direct study

This approach requires the availability of the physical system to be studied or, at least, that of an exact copy. The advantage of a study conducted directly on the system itself (or on a copy of it) is its intrinsic faithfulness to reality: there is no distortion by the modeling process (see below), which, by definition, cannot produce an exact copy. However, this method is very expensive and often impossible:

– it is, for example, very difficult to stop an operational network in order to turn it into a test bed. The only reasonably feasible operations are measurement gathering;
– this kind of study is by definition infeasible when conceiving a new system (which does not exist yet). We may note that it is precisely at this phase that there is a need to compare multiple design options under various possible scenarios.

There are indeed very few cases where we can carry out studies directly on physical systems, for reasons of availability, safety, etc. Therefore, in many cases, we have no choice but to resort to modeling. This book only considers approaches based on modeling.

1.3.2. Modeling

When it is not possible to work directly on the system itself, it is necessary to work on a model. There are two main approaches:

– physical modeling, in which we build a scale model (reduced model) of the system. This approach is often rather expensive. We will not deal with this kind of model, focusing instead on abstract modeling, by means of mathematical formalism and/or computer technology;
– abstract modeling: this modeling can be done in two ways:
- a mathematical model (analytical model), built using mathematical frameworks such as queueing theory (or graph theory, etc.);
- a computer-based model, built using an appropriate programming language. In practice, we often use specialized software, for instance the OMNeT++ simulation environment.

Building a model requires an abstraction of the real system, and often means abandoning some details that are considered non-essential, and thus negligible, with respect to the goal of the study. The usefulness of a model depends on the following two aspects:

– the relevance of the abstraction, i.e. whether the choice of neglecting certain details is justified;
– the faithfulness of the modeling of the retained components, i.e. whether the latter are correctly modeled.

We will not deal directly with the issue of model building in general, i.e. the manner of choosing the details to be modeled. This choice depends primarily on the comprehension of the system and the goal of the study. Modeling is basically an art that one progressively learns to master through experience.

There is no general rule for assessing the quality of a model, because it depends on the target of the study, which has a strong impact on the choice of the abstraction and of the granularity of the modeling. It is worth noting that a model is usually not a true copy of the target system. Results from a modeling study should therefore always be examined with a critical eye regarding their degree of representativeness.

1.4. Modeling

1.4.1. Shortcomings

Since we have chosen to work with modeling, we must first present its flaws so that readers are sufficiently aware of the precautions to take either in case they have to undertake a study by modeling or if they are simply users of results coming from a model.

A model, either analytical or computer-based, is a more or less simplified transcription of a real system into a virtual representation, by means of mathematical formalism or computer coding. There is therefore always a risk of a lack of fidelity to the original. It may be due to an error of assessment in the modeling. It can also be due to the fact that the tool used is not powerful enough to describe the system in detail, which is the case for mathematical modeling. In the case of computer modeling, although the tool itself is powerful enough to reproduce a computer system in detail, it is often unwise to reproduce all the details, so there is still a need to fix the granularity by choosing the most relevant details. Moreover, a computer model is achieved through programming, so it may contain programming errors, so-called bugs.

1.4.2. Advantages

As aforementioned, it is often too expensive and/or simply not possible to work directly on the real system (or even on a copy or a reduced model of it). Consequently, in spite of all the shortcomings and precautions to be taken, modeling remains the most accessible and effective means to study physical systems. The principal advantage of modeling lies in its power of parametrization and modularity, which makes it possible to generate various situations at will:

– it is by definition the only accessible approach during the design phase of a (future) system. This allows us to study and compare several options, without building physical prototypes;
– we can also build a model of a running operational system, to carry out tests on the model in order to find a better adjustment of the parameters of the running system, without deteriorating the operation of the system itself;
– we can finally put the model of a system under various conditions, in order to locate its limits and to find remedies for undesired situations. A network operator can, for example, study different scenarios about the foreseeable needs of its (potential) users, in order to determine the best strategy for each scenario and the resources to be committed.

1.4.3. Cost of modeling

Modeling demands significant effort, in material resources as well as in human involvement.

Any modeling must begin with a good understanding of the target physical system. For an existing system, this understanding can be obtained by means of long observation campaigns. Searching for similar reported experiences constitutes a very useful alternative and/or complementary way. For a system in the design phase, the observation campaign is replaced by a significant effort of documentary investigation. A good comprehension can be obtained by analyzing similar theoretical and/or case studies reported in the scientific and/or industrial literature.

Mathematical modeling is inexpensive in terms of material resources, but demands the availability of highly skilled experts in mathematics. Moreover, it is not always guaranteed that we can establish a mathematical model that is faithful enough to the real system. Most often, an analytical model is a rather approximate model.

Computer simulation requires, for the model building, a specialized programmer and long hours of development. The simulation phase requires adequate computational resources, as well as long hours of human involvement devoted to running the simulations and then analyzing the results. In addition, commercial simulation software is often rather expensive.

1.5. Types of modeling

We propose here a survey of the different types of modeling, for physical systems of various kinds and as a function of various investigation goals. A model can be:

– static or dynamic: depending on the situation, we can study the system at a specific point in time, or study its evolution in time. The first is called static modeling and the second dynamic modeling. The modeling of a computer network generally belongs to the category of dynamic modeling;
– stochastic or deterministic: depending on the existence or not of random elements, a model can be deterministic or stochastic. Computer network models are generally stochastic;
– continuous or discrete: in a continuous model, the state of the system evolves in a continuous way in time; whereas in a discrete model, the changes only take place punctually at a set of instants. Computer networks are discrete systems.

In conclusion, in this book we will deal mainly with dynamic, stochastic and discrete modeling.

1.6. Analytical modeling versus simulation

This book presents two methodological approaches of modeling, which are:

– analytical modeling, i.e. establishing and solving mathematical models. These models are mainly probabilistic models. Queueing theory is among the most used methods;
– computer-based simulation, i.e. building a model using a software tool, and then conducting a statistical study of the system behavior through virtual experiments based on this model. The most suitable approach for computer networks is discrete event simulation (DES).

This book presents queueing theory and DES. These two tools are often used jointly for the modeling and analysis of computer systems, for performance evaluation and/or resource dimensioning. They are two complementary approaches, both of which are often needed to conclude a study. Indeed:

– analytical modeling leads to firm and general results. Once we have an analytical model, if we want to study a new scenario, it suffices to plug in the adequate parameters of the target scenario and then recompute to get the new results. However, the constraints of mathematical tractability very often lead to a model which is too simplified compared to reality;
– with simulation, the descriptive power provided by the programming tool makes it possible to model the target system in a very detailed way. However, to draw a conclusion with a minimum statistical quality, it is necessary to carry out a certain number of simulation campaigns. Moreover, the conclusion is valid only for a given situation: if we want to study a new scenario, it is necessary to run new simulation campaigns.

The two approaches are thus complementary: analytical modeling makes it possible to get a macroscopic outline of the problem, whereas simulation makes it possible to carry out a more in-depth study of certain details. The link between the two approaches is strengthened by the fact that simulations have to incorporate analytical modeling in certain parts of their implementation: it is indeed far from efficient to reproduce all the behaviors of a system down to the finest detail; certain parts can be replaced by their analytical models.

PART 1

Simulation

2

Introduction to Simulation

In this chapter we present generalities about discrete event simulation, as well as its relationship with mathematical modeling.

2.1. Presentation

Simulation is one of the most used tools for quantitative studies, both in academic research and in the industrial world. It helps to study the dynamic behavior of complex systems such as computer networks.

There are two families of simulation techniques: continuous simulation and event-driven simulation (or discrete event simulation):

– continuous simulation studies the dynamic evolution of a system (physical, mechanical or chemical phenomena) which is described by parameters with continuous values. In fact, it is not really possible, with computer technology, to track the evolution of these parameters in a continuous way; instead, the evolution is made by small steps (in space and in time). We can, for example, simulate the damage caused to the structure of a car following its collision against a wall;
– discrete event simulation (DES) is suitable for systems whose evolution takes place only punctually, at a set of specific instants, following the occurrence of events. For example, the occupancy state of the buffer of a router changes only when the router receives or sends a packet. In this book, we deal solely with discrete event simulation.

The purpose of a simulation is to reproduce, in software, a physical system (in this book, we speak mainly of computer networks), so that we can undertake statistical experiments on it. Through these experiments, we can evaluate the performance of the target system, as if we were performing experiments on the real physical system. For example, we can check whether it is possible to guarantee a certain quality of service (QoS) for a given level of traffic through a given path within the Internet.

As aforementioned, we do not deal with the issue of model building itself. The manner of adequately choosing the salient details depends on one’s understanding of the system and on the goal of the investigation. It is an art that one progressively learns to master. This book deals with the tools and techniques allowing, on one hand, the building of a computer-based model and, on the other hand, its exploitation for statistical studies. There are three principal steps in a simulation study:

1) Modeling (partially using mathematical models) of the target system and its environment. Chapter 3 is devoted to the identification and reproduction of the system and its environment, with their respective probability distributions;
2) Computational building of the model, often using a language dedicated to simulation. Chapter 4 presents some techniques which are specific to the programming of simulation models;
3) Statistical experiments. This involves correctly running simulations and gathering data, then making an accurate analysis and interpretation of the data. Chapter 5 presents some statistical techniques for the control of simulations and the analysis of the data they produce.

In Chapter 6, we provide an introduction to the OMNeT++ simulation language, through which we try to illustrate the various theoretical concepts presented in this book. This chapter is also designed to help the reader understand and take first steps with OMNeT++, an open-source generic simulation tool.

Among the numerous books on simulation, the author would like to mention a small sample [FIS 73, JAI 91, FIS 01, BAN 05, ASM 07]. The books [WEH 10, INC 07] provide some concrete applications to computer networks. Francophone readers can also consult [HEC 03, BEC 06].

2.2. Principle of discrete event simulation

2.2.1. Evolution of an event-driven system

Simulation aims to reproduce the evolution in time of the target system. Discrete simulation applies to systems whose evolution can be described by a succession of events. Each event leads to some evolution of the state of the system; the evolution is thus event-driven. The state, say s, of a system is, by definition, the set of all the parameters needed to describe the system at a given moment. The state of a system at a given instant t, s(t), determines the system at that instant.

EXAMPLE 2.1.– We are interested in the filling of the buffer of a communication facility, say a router. The state of this system (buffer) at the instant t, s(t), can be described by the number of customers (packets) in the buffer. s(t) evolves under the joint effect of two processes:

– the arrival process (reception of a packet) which increases s(t) by 1;
– the service process (transmission of a packet) which decreases s(t) by 1.

Suppose that, at a given instant t1, the system is in state s(t1), and that the service in progress is scheduled to end at an instant t2 > t1. Then:

1) between two consecutive events, the state remains unchanged;
2) at t2, the departure of the serviced packet decreases the state by 1: s(t2)+ = s(t2)− − 1;
3) the exact value of s(t2)−, and thus s(t2)+, cannot be predicted at t1. Indeed, between t1 and t2, there may be other arrivals; two of the plausible events are: no arrival occurs before t2, or an arrival occurs at some instant t3 ∈ (t1, t2);
4) in the second case (an arrival at t3), there are two consequences:
- the arrival at t3 increases the state by 1;
- we must consider other possible arrivals between t3 and t2, each new arrival increasing the state by 1, assuming that the last such one occurs at tl.

We see here that the system’s evolution can be described if its current state is known. We also observe the two driving forces which make it evolve, namely the arrivals and the services.

In this sample scenario, the scheduling policy is assumed to be FIFO. If this is not the case, the evolution becomes more complex. In addition, it may not be possible to describe the state of the system solely by the number of customers. Imagine a system with two classes (two levels of priority, for example): the state of the system must be described, in this case, by (n1, n2), giving respectively the number of packets of each class.

Figure 2.1 shows the sequence of the first 10 events of an M/M/1 queue, composed of a source, a FIFO queue and a sink:

– the source generates the first customer at instant 0 (event 1), which is immediately sent to the (FIFO) queue (event 2). Then, after having been serviced (event 3), the customer is sent to the sink (event 4, which follows event 3) with no time lapse;
– for the second customer, we can see a similar sequence of events, with (5 → 6) → (9 → 10);
– during the service of the second customer, the third customer arrived. The latter has to wait (events 7 and 8).

2.2.2. Model programming

The previous description of the evolution of the system suggests a possibility of carrying out simulations on computers. The two key points are presented in the next sections.

Figure 2.1. Sequence of the first 10 events of an M/M/1 queue

2.2.2.1. Scheduler

The evolution of an event-driven system is observed through the changes of its state. This means that we can represent this evolution by a chained (linked) list, called the Scheduler. Each element of this list contains at least the identity of an event, the instant of its occurrence, as well as the action to be taken. This list is kept sorted in chronological order, by the occurrence instants of the events. In this way, the evolution of a system can be completely described, and we thus obtain a means of simulating the chronological evolution of a system.

At any instant, the scheduler contains all of the events already known, at that moment, to occur at a given instant in the future. Thus, it is also referred to as the Future Events Set (FES) or Future Events List (FEL).
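To illustrate, here is a minimal sketch, in C++, of such a scheduler (it is our own toy example, not the implementation of any particular simulation language): the FES is kept in chronological order, here with a priority queue rather than a chained list, and drives the arrival and departure events of a single FIFO queue with exponential inter-arrival and service times, in the spirit of the M/M/1 example of Figure 2.1. All names and parameter values are illustrative assumptions.

#include <cstdio>
#include <queue>
#include <random>
#include <vector>

// An event: its occurrence instant and the action to be taken.
enum class EventType { Arrival, Departure };

struct Event {
    double    time;   // occurrence instant
    EventType type;   // action to be taken
};

// Chronological order: the event with the smallest instant comes out first.
struct LaterFirst {
    bool operator()(const Event& a, const Event& b) const { return a.time > b.time; }
};

int main() {
    std::mt19937 gen(12345);                                  // reproducible run
    std::exponential_distribution<double> interArrival(0.8);  // arrival rate lambda = 0.8
    std::exponential_distribution<double> service(1.0);       // service rate mu = 1.0

    // The scheduler (Future Events Set), sorted by occurrence instants.
    std::priority_queue<Event, std::vector<Event>, LaterFirst> fes;
    fes.push({interArrival(gen), EventType::Arrival});        // first arrival

    int    s   = 0;    // state s(t): number of customers in the system
    double now = 0.0;  // simulated time

    for (int processed = 0; processed < 10; ++processed) {
        Event ev = fes.top();                                 // next event in time
        fes.pop();
        now = ev.time;                                        // advance the clock

        if (ev.type == EventType::Arrival) {
            ++s;
            fes.push({now + interArrival(gen), EventType::Arrival});      // next arrival
            if (s == 1)                                       // server was idle:
                fes.push({now + service(gen), EventType::Departure});     // service starts now
        } else {
            --s;
            if (s > 0)                                        // FIFO: next customer enters service
                fes.push({now + service(gen), EventType::Departure});
        }
        std::printf("t = %8.4f  %-9s  s(t) = %d\n", now,
                    ev.type == EventType::Arrival ? "arrival" : "departure", s);
    }
    return 0;
}

A real simulation language generalizes this loop: the event elements carry richer actions (typically method calls on model objects), and data collection hooks are invoked at each state change.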

2.2.2.2. Object-oriented programming

With the power of programming languages, we can achieve a description as detailed as needed of the behavior of the target system and its environment. Usually, this is based on object-oriented programming (OOP) languages, such as Java or C++, because of their capacity in terms of modular design. Indeed, most physical systems are too complex to be modeled by a simple probability distribution; for an accurate description, we have to use OOP languages. The latter are particularly suitable for the modeling of computer networks. Indeed, computer networks consist of protocol entities (which are perfectly defined algorithmically), which are then assembled according to profiles (for example, the TCP/IP family).