Large-Scale Experiments on Cluster

Liang Wang

Research output: Thesis › Master's thesis › Theses

Abstract

Large-scale network systems and applications are usually evaluated in one of three ways: simulation, real deployment on the Internet, or an emulated network testbed such as a cluster. Simulations can study very large systems but often abstract away many practical details, whereas real-world tests are often quite small, on the order of a few hundred nodes at most, but run under very realistic conditions. Clusters and other dedicated testbeds offer a middle ground between the two: large systems running real application code. They also typically allow the testbed to be configured for repeatable experiments. In this thesis we explore how to run large BitTorrent experiments in a cluster setup. We chose BitTorrent because its source code is available and it has been a popular target for research.
We first give a detailed anatomy of the BitTorrent system, covering its basic components, logical architecture, key data structures, internal mechanisms, and implementation. We illustrate how the system works by splitting the whole distribution process into small scenarios. We then perform a series of experiments on our cluster with different combinations of parameters to gain a better understanding of system performance. We also make an initial attempt to discuss formally how to design a rational experiment, an issue that has not received as much attention as it should in previous research.
Our contribution is two-fold. First, we show how to tweak and configure the BitTorrent client so that the maximum number of clients can run on a single machine without hitting any of the machine's physical limits. Second, our results show that the behavior of BitTorrent can be very sensitive to its configuration; we revisit some existing BitTorrent research and consider the implications of our findings for previously published results. As we show in this thesis, BitTorrent can change its behavior in subtle ways that are sometimes ignored in published work.
Original language: English
Publication status: Published - 2011
MoE publication type: G2 Master's thesis, polytechnic Master's thesis

Fields of Science

  • 113 Computer and information sciences

Cite this

Wang, Liang. / Large-Scale Experiments on Cluster. 2011. 80 p.