
arXiv:0707.0365v1 [cs.DC] 3 Jul 2007

Performance Analysis of Publish/Subscribe Systems

Heithem Abbes 1, Christophe Cérin 2, Jean-Christophe Dubacq 2, and Mohamed Jemni 1

1 École Supérieure des Sciences et Techniques de Tunis, Unité de recherche UTIC, 5, Av. Taha Hussein, B.P. 56, Bab Mnara, Tunis, Tunisia. Tel: (+216) 71496066, Fax: (+216) 71391166. heithem.u.tn, mohamed.u.tn

2 LIPN, UMR CNRS 7030, Institut Galilée, Université Paris-Nord, 99, avenue Jean-Baptiste Clément, 93430 Villetaneuse, France. Tel: +33 (0)149403578, Fax: +33(0)in,jean-christophe.dubacq}@lipn.univ-paris13.fr

Abstract. The Desktop Grid offers solutions to overcome several challenges and to answer the increasing needs of scientific computing. Its technology consists mainly in exploiting geographically dispersed resources to run complex applications that need large computing power and/or large storage capacity. However, as the number of resources increases, the need for scalability, self-organisation, dynamic reconfiguration, decentralisation and performance becomes more and more essential. Since such properties are exhibited by P2P systems, the convergence of grid computing and P2P computing seems natural. In this context, this paper evaluates the scalability and performance of P2P tools for discovering and registering services. Three protocols are used for this purpose: Bonjour, Avahi and Free-Pastry. We have studied the behaviour of these protocols with respect to two criteria: the elapsed time for registering services and the time needed to discover new services. Our aim is to analyse these results in order to choose the best protocol for building a decentralised middleware for desktop grids.

Key words: Peer-to-Peer systems, Desktop grid, Performance evaluation, Zero-Configuration, Bonjour, Avahi, mDNSResponder, Free-Pastry

1 Introduction

The exploitation of new instruments, such as very high energy accelerators, telescopes and satellites in astrophysics, large imaging databases in biology and medicine, and numerous sensors in geology, has generated an important expansion of the needs in scientific computation. Thus, it becomes necessary to establish new computing infrastructures. At the same time, network equipment has seen an important improvement in transmission speed in recent years, and devices have become equipped with powerful processors and large storage capacities.

These factors favoured the emergence of new infrastructures, such as grid computing, to respond to computation needs at an economical cost. One variant of grid computing is the Desktop Grid, where nodes are merely desktop PCs. This category constitutes the setting of our work.

Research on grids has developed specific software (middleware) for the management of data and resources. Most Desktop Grid middlewares are centralised. In this setting, our work consists in designing a decentralised grid computing middleware based on peer-to-peer systems. To achieve this, we want to take advantage of existing decentralised peer-to-peer systems.

Service discovery in the Grid is among the principal challenges. For instance, the Globus middleware implements a service publish/discovery mechanism based on the Monitoring and Discovery of Services (MDS-2) [7,6], which uses a centralised register server. Although MDS-2 solved the scalability problem using a hierarchical architecture, it is still vulnerable to a single point of failure. Moreover, adaptation to the dynamic nature of servers is another challenge for MDS-2. An alternative consists of using a decentralised approach for service discovery [3,15]. Recently, P2P communities have developed a number of fully decentralised protocols, such as Bonjour [12,17,27], Avahi [21] and Pastry [11,22], for registering, routing and discovering in P2P networks. The core idea behind these protocols is to build self-organised overlay networks when nodes join the grid. On the other hand, it is important to know the performance and the limits of such systems. In this context, several experiments have been done in this work to analyse the performance of Bonjour, Avahi and Free-Pastry.

We choose Bonjour and Avahi (two popular middlewares running on a local area network) because our working context is the connectivity issues that we face when we try to share resources belonging to different institutions. In this paper, we assume that we have a high level middleware able to virtualise the network (so that firewalls and NAT are no longer a problem) and that we are able to run Bonjour and Avahi on top of such middleware. Instant Grid/Private Virtual Cluster (see [10,24]) is one of the candidates for network virtualisation. Its main requirements are: 1) simple network configuration, 2) no degradation of resource security, 3) no need to re-implement existing distributed applications. Under these assumptions, it is reasonable to check whether Bonjour and Avahi can scale up.

This paper is organised as follows. In section 2, we recall the notion of large scale distributed systems by focusing on grid computing and peer-to-peer systems. Then, in section 3, we illustrate the notion of desktop grid. In section 4, we highlight the advantage of peer-to-peer systems in building a new decentralised middleware for grid computing. In section 5, we describe the experimental setup used to evaluate the performance of Bonjour, Avahi and Free-Pastry. In section 6 and section 7, we provide numerical results obtained from several experiments done on Grid’5000 (we have used up to 308 machines). We finish this paper with some perspectives and a conclusion, respectively in section 8 and 9.

2 Distributed large scale systems

2.1 Grid Computing

In [4], Foster and Kesselman define grid computing as follows: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities”.

Thus, a computational grid is a hardware and software infrastructure allowing the sharing of a large number of heterogeneous resources thanks to connections between several sites. The objective of resource sharing is to solve problems faced by organisations which are often multi-site and require a large volume of data and large computation power. These organisations are called virtual organisations (VO). A computational grid is analogous to the electric network, which permits any subscriber, at any time, to access the electric resource instantaneously, whatever its origin or location, via a standardised interface [8,7]. But grid computing offers more services than the electric grid and should guarantee reliability, security and access transparency while taking account of the constraints of high throughput and the choice of the quality of service (QoS).

2.2 Peer-to-peer System

One definition of a peer-to-peer system is: “peer-to-peer refers to a class of systems and applications that employ distributed resources to perform a critical function in a decentralised manner” [1].

Exchanges between systems can concern information, processor cycles, memory or file storage on disk. Contrary to the client/server model, each node is a network entity which has the roles of server and client at the same time. With peer-to-peer, the personal computer can be part of the network. Peer-to-peer concerns a class of applications which require hardware or human resources available on the Internet. We distinguish two types of peer-to-peer systems: 1) file sharing systems such as Gnutella, Napster and Kazaa, which have been very successful on the Internet, and 2) intensive computation oriented systems, equivalent to computational grids, such as SETI@Home [20], XtremWeb [26] and XtremWeb-CH [25].

3 Desktop Grid

Grids aim at providing a powerful infrastructure with quality-of-service (QoS) guarantees to average-size, homogeneous resources and certified communities. In contrast, Peer-to-Peer systems focus on constructing a very large infrastructure from larger communities of untrusted, anonymous individuals and volatile resources. However, the convergence of the two systems seems natural [14,5]. In fact, P2P research focuses more and more on providing infrastructure and diversifying the set of applications, while Grid research is starting to pay attention to increasing scalability. Desktop Grid combines the two concepts.

In this context, we aim at developing a new decentralised desktop grid middleware using the features offered by several peer-to-peer tools such as Bonjour, Avahi and Free-Pastry. We will not build a new middleware from scratch; instead, we will choose the most adequate of these three protocols to build a decentralised middleware. Note that other protocols exist, such as CAN [9] and CHORD [13], but we choose these three protocols because Bonjour and Avahi are two implementations of Zero-configuration, which has already proved successful in local area networks and for small organisations, whereas Free-Pastry, which is very similar to CAN and CHORD, is based on a DHT (Distributed Hash Table).

In the next section, we explain how a Desktop Grid can be built using Peer-to-Peer technology.

4 Using Peer-to-Peer techniques to build Desktop Grid

To provide a powerful Desktop Grid, it is important to have a large number of resources. Therefore, it is necessary to integrate resources made available by several institutions. The bottleneck that limits the scalability of such systems is the centralised character of existing tools (see [26,25] for the XtremWeb platforms or [16] for the Boinc platform). Thus, it is essential that grids have more flexible distributed mechanisms allowing them to be efficiently managed. Such characteristics are offered by Peer-to-Peer systems, which have proved their performance and their ability to manage a very large number of interconnected peers in a decentralised manner. In addition, these systems support high volatility of resources.

Below, we describe three Peer-to-Peer systems, Bonjour, Avahi and Free-Pastry, which are the candidates for our experimental tests.

4.1 Bonjour

Bonjour, also known as zero-configuration networking, enables automatic discovery of computers, devices, and services on IP networks. Bonjour uses industry standard IP protocols to allow devices to automatically discover each other without the need to enter IP addresses or configure DNS servers. Furthermore, Bonjour can allocate IP addresses without a DHCP server, can translate between names and addresses without a DNS server and can locate or advertise services without using a directory server.

At a technical level, zero-configuration is a combination of three technologies: link-local addressing, Multicast DNS, and DNS Service Discovery. Link-local addressing is viewed as a safety net. When DHCP fails or is not available, link-local addressing lets a computer make up an address for itself, so that it can, at least, communicate on the local link, even if wider communication is not possible. Like link-local addressing, Multicast DNS is a safety net, so that when conventional DNS servers are unavailable, unreachable, badly configured or otherwise broken, computers and devices can still refer to each other by name in a way that is not dependent on the correct operation of outside infrastructure. DNS Service Discovery is built on top of DNS. It works not only with Multicast DNS (for discovering local services) but also with good old-fashioned, wide-area Unicast DNS (for discovering remote services).
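As an illustration of the "make up an address for itself" step, the following Python sketch draws a candidate IPv4 link-local address. It assumes the IPv4 link-local range 169.254.0.0/16 with the first and last 256 addresses left unused, as specified for IPv4 link-local addressing; the function name `candidate_link_local` is hypothetical, and the conflict probing that a real host performs before using the address is deliberately omitted.

```python
import ipaddress
import random


def candidate_link_local(rng: random.Random) -> ipaddress.IPv4Address:
    """Pick a pseudo-random candidate in 169.254.1.0 .. 169.254.254.255.

    Sketch only: a real implementation must probe the link for conflicts
    before claiming the address; that step is not shown here.
    """
    third = rng.randint(1, 254)   # skip 169.254.0.x and 169.254.255.x
    fourth = rng.randint(0, 255)
    return ipaddress.IPv4Address(f"169.254.{third}.{fourth}")


if __name__ == "__main__":
    addr = candidate_link_local(random.Random())
    print(addr, addr.is_link_local)
```

The address is only usable on the local link, which matches the safety-net role described above.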

4.2 Avahi

Avahi is a system which facilitates service discovery on a local network. This means that you can plug your laptop or computer into a network and instantly be able to view other people you can chat with, find printers to print to or find files being shared. Avahi is mainly based on an mDNS implementation for Linux. It allows programs to publish and discover services and hosts running on a local network with no specific configuration.

Avahi is an implementation of the DNS Service Discovery and Multicast DNS specifications for Zero-configuration Networking. It uses D-Bus for communication between user applications and a system daemon. The daemon is used to coordinate application efforts in caching replies, which is necessary to minimise the traffic imposed on networks.

4.3 Free-Pastry

Free-Pastry is a generic, scalable and efficient substrate for peer-to-peer applications. Free-Pastry nodes form a decentralised, self-organising and fault-tolerant overlay network within the Internet. Free-Pastry provides efficient request routing, deterministic object location and load balancing in an application-independent manner. Furthermore, Free-Pastry provides mechanisms that support and facilitate application-specific object replication, caching, and fault recovery.

Free-Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming.

Each node in the Free-Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Free-Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Free-Pastry nodes. Each Free-Pastry node keeps track of its immediate neighbours in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Free-Pastry takes into account network locality; it seeks to minimise the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Free-Pastry is completely decentralised, scalable, and self-organising; it automatically adapts to the arrival, departure and failure of nodes.
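The delivery rule, "the node with a nodeId that is numerically closest to the key", can be sketched in a few lines of Python. This is only an illustration of the rule with global knowledge of the live nodes; it is not the Free-Pastry routing procedure, which reaches that node hop by hop. The 16-bit identifier space, the wrap-around distance and the names `circular_distance` and `closest_node` are assumptions made for readability.

```python
from typing import Sequence

ID_BITS = 16                 # assumption: small identifier space for the example
ID_SPACE = 1 << ID_BITS


def circular_distance(a: int, b: int) -> int:
    """Distance between two identifiers on a ring of size ID_SPACE."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def closest_node(key: int, live_node_ids: Sequence[int]) -> int:
    """Return the live nodeId numerically closest to key.

    Ties are broken in favour of the smaller nodeId (an arbitrary choice
    made for this sketch).
    """
    if not live_node_ids:
        raise ValueError("no live nodes")
    return min(live_node_ids, key=lambda n: (circular_distance(n, key), n))


if __name__ == "__main__":
    nodes = [100, 20000, 40000, 65000]
    print(closest_node(150, nodes))
    print(closest_node(65500, nodes))
```

When a node leaves, removing its identifier from the list of live nodes is enough for the key to map to the next closest node, which mirrors the self-organising behaviour described above.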

5 Description of the experimental setup

Our goal is to study the scalability and the response time of the tools described in the previous section. More precisely, we look for the maximum number of registration nodes supported and the response time to discover a given service. Note that the same benchmarks are applied to the three Peer-to-Peer systems (Bonjour, Avahi and Free-Pastry). The experimental platform is Grid’5000, a highly reconfigurable and controllable experimental grid platform gathering 9 sites geographically distributed in France. Every site hosts a cluster of 256 to 1K CPUs. All sites are connected by RENATER (10 Gb/s).

5.1 Specific kernel on Grid’5000

Grid’5000 [23] offers an infrastructure with standard kernels. To run our experimental tests, it is necessary to customise one kernel to support Avahi, Bonjour and Free-Pastry. Thus, we create a specific kernel containing all the packages needed to run our codes (registration and discovery codes for each system). After that, using the two tools OAR and Kadeploy (see [19,18]), we reserve machines and deploy this specific kernel on all of them. We use only one site, and all machines have AMD Opteron processors and a 1 Gb/s network card.

5.2 Sequential registrations

In this test, the first step is to reserve N nodes on Grid’5000 (N will vary from 100 nodes up to a value for which we observe a saturation of the registration service). The number N represents the maximum number of nodes that can be used for the experiment. Each node requests a registration for a given service at a given time. Initially, all nodes have the code needed to request a service but are inactive. Let δ be the activation time, i.e. the delay between two consecutive activations. We activate all the requests sequentially (and we receive back an acknowledgement for each). Indeed, the k-th request is activated at time k × δ. We increase δ to analyse the behaviour of the system when the delay between events becomes larger.

Obviously, at the beginning the number of registrations is small, thus registration will be fast. We increase N until saturation (the registration service no longer responds to a new registration). We aim at analysing the scalability of the system without overloading the network: in this test, only one multicast appears at a given time.
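The activation schedule can be sketched as follows in Python. This is a single-process illustration under stated assumptions, not the benchmark code used on Grid’5000: `register` is a hypothetical blocking callable that returns once the acknowledgement is received, and the function names are ours.

```python
import time
from typing import Callable, List


def activation_times(n: int, delta: float) -> List[float]:
    """Offsets (in seconds) at which requests 1..n are activated: k * delta."""
    return [k * delta for k in range(1, n + 1)]


def run_sequential(n: int, delta: float,
                   register: Callable[[int], None]) -> List[float]:
    """Activate the requests one at a time; return each registration duration."""
    start = time.monotonic()
    durations = []
    for k, offset in enumerate(activation_times(n, delta), start=1):
        # Wait until the scheduled activation time of request k.
        remaining = start + offset - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        t0 = time.monotonic()
        register(k)  # hypothetical call: blocks until the acknowledgement
        durations.append(time.monotonic() - t0)
    return durations


if __name__ == "__main__":
    print(activation_times(3, 0.5))
    print(run_sequential(3, 0.01, lambda k: None))
```

In this sketch, if a registration takes longer than δ, the next request simply starts late; a growing gap between scheduled and actual activation times would be one visible symptom of saturation.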

5.3 Simultaneous registrations

In the first test, the registrations are done sequentially. This leads to a limited number of communications to exchange information. In this experiment, we stress the scalability of the system and its capacity to manage the communications between the registered nodes. Therefore, we request N (the number of reserved nodes) simultaneous registrations and we measure the time to complete the registration step. If we obtain a “reasonable” response time, we increase N until the saturation value. In other words, we are looking for the maximum number of registered nodes that the system can handle when the network is loaded by several multicast packets at the same time.
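A minimal way to express "N simultaneous registrations, then measure the time to complete the registration step" is a barrier that releases all requesters at once. The Python sketch below uses threads on one machine as a stand-in for the N separate nodes of the experiment, which is an assumption of this example; `register` is again a hypothetical blocking callable.

```python
import threading
import time
from typing import Callable


def run_simultaneous(n: int, register: Callable[[int], None]) -> float:
    """Start n registrations at the same instant; return the completion time."""
    barrier = threading.Barrier(n + 1)   # n workers plus the coordinating thread

    def worker(k: int) -> None:
        barrier.wait()                   # block until everyone is ready
        register(k)                      # hypothetical registration call

    threads = [threading.Thread(target=worker, args=(k,))
               for k in range(1, n + 1)]
    for t in threads:
        t.start()
    barrier.wait()                       # releases all workers together
    t0 = time.monotonic()
    for t in threads:
        t.join()
    return time.monotonic() - t0


if __name__ == "__main__":
    elapsed = run_simultaneous(8, lambda k: time.sleep(0.01))
    print(f"all registrations completed in {elapsed:.3f} s")
```

Repeating the call with increasing n, and stopping when the returned time is no longer "reasonable", follows the procedure described in the text.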

5.4 Periodic registrations

It is also important to study the efficiency of the system when some nodes are highly volatile. In such a case, the system needs to be updated by sending the global state to each node.

To simulate such behaviour, we register N nodes, then we cancel ψ services (N > ψ) and we register them again randomly. It is clear that the value of ψ influences the efficiency of the system. Therefore, we vary this value to find the maximum node volatility that the system supports.
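One round of this churn scenario can be sketched in Python as follows. The text does not say how the ψ cancelled services are chosen nor what "randomly" covers, so this sketch assumes a uniform random choice of the cancelled services and a random re-registration order; `register`, `unregister` and `churn_round` are hypothetical names.

```python
import random
from typing import Callable, List, Optional


def churn_round(n: int, psi: int,
                register: Callable[[int], None],
                unregister: Callable[[int], None],
                rng: Optional[random.Random] = None) -> List[int]:
    """Register n services, cancel psi of them, re-register those at random.

    Returns the identifiers of the cancelled services in re-registration order.
    """
    if not 0 <= psi < n:
        raise ValueError("psi must satisfy 0 <= psi < n")
    rng = rng or random.Random()
    for k in range(n):
        register(k)
    cancelled = rng.sample(range(n), psi)   # assumption: uniform random choice
    for k in cancelled:
        unregister(k)
    rng.shuffle(cancelled)                  # assumption: random re-registration order
    for k in cancelled:
        register(k)
    return cancelled


if __name__ == "__main__":
    active = set()
    order = churn_round(10, 4, active.add, active.discard, random.Random(0))
    print(order, sorted(active))
```

Calling `churn_round` with increasing values of psi corresponds to varying ψ in the experiment.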

5.5 Real registrations

In the periodic registration experiment, we simulate only one disconnection/registration cycle, which does not correspond to the real behaviour of operational grid systems, where disconnections are more frequent. In this test, the same set of nodes is connected and disconnected several times. In this way, we approach the behaviour of P2P systems and we measure the efficiency of such systems when they behave as a real grid system.

5.6 Browsing services

The other important metric is the time needed to browse a given service. Indeed, in all the previous tests, we measure the registration time. We also need to measure the discovery time, which is the elapsed time between the end of the registration of a unique service and the date at which a browser node has discovered the service.

Note that the response time depends on the number of replicas of a given service and on the number of registered nodes. The browsing program listens for any new registration or deletion of services. With the four setups described above, we can analyse the performance of the discovery service of the system. We can also increase the number of browsers. We draw the chart where point (i, j, k) is the response time i for browser j when we use a total of k browsers.
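The discovery time and the chart points (i, j, k) can be computed from recorded timestamps as in the Python sketch below. It assumes that the registering node and the browser nodes share a synchronised clock, which the text does not state, and the function name `discovery_points` is hypothetical.

```python
from typing import Dict, List, Tuple


def discovery_points(registration_end: float,
                     discovered_at: Dict[int, float]) -> List[Tuple[float, int, int]]:
    """Build the points (i, j, k) for one run.

    registration_end: time at which the registration of the service ended.
    discovered_at:    browser id j -> time at which browser j saw the service.
    Returns (response time i, browser j, total number of browsers k).
    """
    k = len(discovered_at)
    return [(t - registration_end, j, k)
            for j, t in sorted(discovered_at.items())]


if __name__ == "__main__":
    for point in discovery_points(10.0, {1: 10.5, 2: 12.0}):
        print(point)
```

Collecting these points for runs with different numbers of browsers gives the data for the chart described above.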
