arXiv:0707.0365v1 [cs.DC] 3 Jul 2007
Performance Analysis of Publish/Subscribe Systems
Heithem Abbes 1, Christophe Cérin 2, Jean-Christophe Dubacq 2, and Mohamed Jemni 1
1 École Supérieure des Sciences et Techniques de Tunis, Unité de recherche UTIC, 5, Av. Taha Hussein, B.P. 56, Bab Mnara, Tunis, Tunisia. Tel: (+216) 71496066 Fax: (+216) 71391166 heithem.u.tn,mohamed.u.tn
2 LIPN, UMR CNRS 7030, Institut Galilée, Université Paris-Nord, 99, avenue Jean-Baptiste Clément, 93430 Villetaneuse, France. Tel: +33(0)149403578 Fax:+33(0)in,jean-christophe.dubacq}@lipn.univ-paris13.fr
Abstract. The Desktop Grid offers solutions to several challenges and answers the increasing needs of scientific computing. Its technology consists mainly in exploiting geographically dispersed resources to run complex applications that need large computing power and/or large storage capacity. However, as the number of resources increases, the need for scalability, self-organisation, dynamic reconfiguration, decentralisation and performance becomes more and more essential. Since such properties are exhibited by P2P systems, the convergence of grid computing and P2P computing seems natural. In this context, this paper evaluates the scalability and performance of P2P tools for discovering and registering services. Three protocols are used for this purpose: Bonjour, Avahi and Free-Pastry. We have studied the behaviour of these protocols with respect to two criteria: the elapsed time for registering services and the time needed to discover new services. Our aim is to analyse these results in order to choose the best protocol for creating a decentralised middleware for desktop grids.
Key words: Peer-to-Peer systems, Desktop grid, Performance evaluation, Zero-Configuration, Bonjour, Avahi, mDNSResponder, Free-Pastry
1 Introduction
The exploitation of new instruments, such as very high energy accelerators, telescopes and satellites in astrophysics, large imaging databases in biology and medicine, and numerous sensors in geology, has generated an important expansion of the needs in scientific computation. Thus, it becomes necessary to establish new computing infrastructures. On the other hand, network equipment has seen, in recent years, an important increase in transmission speed, and devices are now equipped with powerful processors and large storage capacities.
These factors favoured the emergence of new infrastructures, such as grid computing, to respond to computation needs at an economical cost. One variant of grid computing is the Desktop Grid, where nodes are merely desktop PCs. This category constitutes the setting of our work.
Research in grid computing has developed specific software (middleware) for the management of data and resources. Most Desktop Grid middlewares are centralised. In this setting, our work consists in designing a decentralised grid computing middleware based on peer-to-peer systems. To realise this, we build on existing decentralised peer-to-peer systems.
Service discovery is among the principal challenges in the Grid. For instance, the Globus middleware implements a service publish/discovery mechanism based on the Monitoring and Discovery of Services (MDS-2) [7,6], which uses a centralised register server. Although MDS-2 solves the scalability problem using a hierarchical architecture, it is still vulnerable to a single point of failure. Moreover, adaptation to the dynamic nature of servers is another challenge for MDS-2. An alternative consists of using a decentralised approach for service discovery [3,15]. Recently, P2P communities have developed a number of fully decentralised protocols, such as Bonjour [12,17,27], Avahi [21] and Pastry [11,22], for registering, routing and discovering in P2P networks. The core idea behind these protocols is to build self-organised overlay networks when nodes join the grid. On the other hand, it is important to know the performance and the limits of such systems. In this context, several experiments have been done in this work to analyse the performance of Bonjour, Avahi and Free-Pastry.
We choose Bonjour and Avahi (two popular middlewares running on a local area network) because our working context is the connectivity issues that we face when we try to share resources belonging to different institutions. In this paper, we assume that we have a high level middleware able to virtualise the network (so that firewalls and NAT are no longer a problem) and that we are able to run Bonjour and Avahi on top of such middleware. Instant Grid/Private Virtual Cluster (see [10,24]) is one of the candidates for network virtualisation. Its main requirements are: 1) simple network configuration, 2) no degradation of resource security, 3) no need to re-implement existing distributed applications. Under these assumptions, it is reasonable to check whether Bonjour and Avahi can scale up.
This paper is organised as follows. In section 2, we recall the notion of large scale distributed systems, focusing on grid computing and peer-to-peer systems. Then, in section 3, we illustrate the notion of desktop grid. In section 4, we highlight the advantage of peer-to-peer systems in building a new decentralised middleware for grid computing. In section 5, we describe the experimental setup used to evaluate the performance of Bonjour, Avahi and Free-Pastry. In sections 6 and 7, we provide numerical results obtained from several experiments done on Grid’5000 (we have used up to 308 machines). We finish this paper with some perspectives and a conclusion, respectively in sections 8 and 9.
2 Distributed large scale systems
2.1 Grid Computing
In [4], Foster and Kesselman define grid computing as follows: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities”.
Thus, a computational grid is a hardware and software infrastructure allowing the sharing of a large number of heterogeneous resources through connections between several sites. The objective of resource sharing is to solve problems faced by organisations which are often multi-site and require a large volume of data and large computation power. These organisations are called virtual organisations (VO). A computational grid is analogous to the electric network, which permits any subscriber, at any time, to access the electric resource instantaneously, whatever its origin or location, via a standardised interface [8,7]. But grid computing offers more services than the electric grid and should guarantee reliability, security and access transparency while taking into account the constraints of high throughput and the choice of quality of service (QoS).
2.2 Peer-to-peer System
One definition of a peer-to-peer system is: “peer-to-peer refers to a class of systems and applications that employ distributed resources to perform a critical function in a decentralised manner” [1].
Exchanges between systems can concern information, processor cycles, memory or file storage on disk. Contrary to the client/server model, each node is a network entity which plays the roles of server and client at the same time. With peer-to-peer, the personal computer can be part of the network. Peer-to-peer concerns a class of applications which require hardware or human resources available on the Internet. We distinguish two types of peer-to-peer systems: 1) file sharing systems such as Gnutella, Napster and Kazaa, which have had great success on the Internet, and 2) systems oriented towards intensive computation, equivalent to computational grids, such as SETI@Home [20], XtremWeb [26] and XtremWeb-CH [25].
3 Desktop Grid
Grids aim at providing a powerful infrastructure with quality-of-service (QoS) guarantees for average-size, homogeneous resources and certified communities. In contrast, Peer-to-Peer systems focus on constructing a very large infrastructure from larger communities of untrusted, anonymous individuals and volatile resources. However, the convergence of the two systems seems natural [14,5]. In fact, P2P research focuses more and more on providing infrastructure and diversifying the set of applications, while Grid research is starting to pay attention to increasing scalability. Desktop Grid combines the two concepts.
In this context, we aim at developing a new decentralised desktop grid middleware using the features offered by several peer-to-peer tools such as Bonjour, Avahi and Free-Pastry. We will not build a new middleware from scratch; rather, we will choose the most adequate of these three protocols to build a decentralised middleware. Note that other protocols exist, such as CAN [9] and CHORD [13], but we choose these three protocols because Bonjour and Avahi are two implementations of Zero-configuration, which has already proved successful in local area networks and small organisations, whereas Free-Pastry, which is very similar to CAN and CHORD, is based on a DHT (Distributed Hash Table).
In the next section, we explain how a Desktop Grid can be built using Peer-to-Peer technology.
4 Using Peer-to-Peer techniques to build Desktop Grid
To provide a powerful Desktop Grid, it is important to have a large number of resources. Therefore, it is necessary to integrate resources made available by several institutions. The bottleneck that limits the scalability of such systems is the centralised character of existing tools (see [26,25] for the XtremWeb platforms or [16] for the Boinc platform). Thus, grids need more flexible distributed mechanisms allowing them to be efficiently managed. Such characteristics are offered by Peer-to-Peer systems, which have proved their performance and their ability to manage a very large number of interconnected peers in a decentralised manner. In addition, these systems support high volatility of resources.
Below, we describe three Peer-to-Peer systems, Bonjour, Avahi and Free-Pastry, which are the candidates for our experiments.
4.1 Bonjour
Bonjour, an implementation of zero-configuration networking, enables automatic discovery of computers, devices, and services on IP networks. Bonjour uses industry standard IP protocols to allow devices to automatically discover each other without the need to enter IP addresses or configure DNS servers. Furthermore, Bonjour can allocate IP addresses without a DHCP server, can translate between names and addresses without a DNS server and can locate or advertise services without using a directory server.
At a technical level, zero-configuration is a combination of three technologies: link-local addressing, Multicast DNS, and DNS Service Discovery. Link-local addressing is viewed as a safety net. When DHCP fails or is not available, link-local addressing lets a computer make up an address for itself, so that it can, at least, communicate on the local link, even if wider communication is not possible. Like link-local addressing, Multicast DNS is a safety net, so that when conventional DNS servers are unavailable, unreachable, badly configured or otherwise broken, computers and devices can still refer to each other by name in a way that does not depend on the correct operation of outside infrastructure. DNS Service Discovery is built on top of DNS. It works not only with Multicast DNS (for discovering local services) but also with good old-fashioned, wide-area Unicast DNS (for discovering remote services).
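As an illustration of how a host can "make up an address for itself", the following minimal Python sketch picks a pseudo-random IPv4 link-local address in the 169.254.0.0/16 range defined by RFC 3927. The function name pick_link_local, the in_use set (which stands in for the probing of the local link that a real implementation performs) and the max_tries bound are assumptions made for this example; they are not taken from Bonjour or Avahi.

```python
import ipaddress
import random

# RFC 3927 reserves 169.254.0.0/16 for IPv4 link-local addresses. Hosts choose
# from 169.254.1.0 to 169.254.254.255 (the first and last /24 are reserved).
LINK_LOCAL_NET = ipaddress.ip_network("169.254.0.0/16")
FIRST = int(ipaddress.ip_address("169.254.1.0"))
LAST = int(ipaddress.ip_address("169.254.254.255"))


def pick_link_local(in_use, rng, max_tries=10):
    """Return a link-local address that is not in `in_use`.

    `in_use` is a set of addresses already claimed on the link. It is a
    hypothetical stand-in for the conflict detection done on a real network.
    """
    for _ in range(max_tries):
        candidate = ipaddress.ip_address(rng.randint(FIRST, LAST))
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free link-local address found")
```

A caller would pass its own random generator, for example pick_link_local(set(), random.Random()), so that different hosts are unlikely to choose the same address.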
4.2 Avahi
Avahi is a system which facilitates service discovery on a local network. This means that you can plug your laptop or computer into a network and instantly be able to view other people you can chat with, find printers to print to or find files being shared. Avahi is mainly based on an mDNS implementation for Linux. It allows programs to publish and discover services and hosts running on a local network with no specific configuration.
Avahi is an implementation of the DNS Service Discovery and Multicast DNS specifications for Zero-configuration Networking. It uses D-Bus for communication between user applications and a system daemon. The daemon coordinates application efforts in caching replies, which is necessary to minimise the traffic imposed on networks.
4.3 Free-Pastry
Free-Pastry is a generic, scalable and efficient substrate for peer-to-peer applications. Free-Pastry nodes form a decentralised, self-organising and fault-tolerant overlay network within the Internet. Free-Pastry provides efficient request routing, deterministic object location and load balancing in an application-independent manner. Furthermore, Free-Pastry provides mechanisms that support and facilitate application-specific object replication, caching, and fault recovery.
Free-Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming.
Each node in the Free-Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Free-Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Free-Pastry nodes. Each Free-Pastry node keeps track of its immediate neighbours in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Free-Pastry takes into account network locality; it seeks to minimise the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Free-Pastry is completely decentralised, scalable, and self-organising; it automatically adapts to the arrival, departure and failure of nodes.
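The delivery rule described above (a message goes to the live node whose nodeId is numerically closest to the key) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses global knowledge of all live nodeIds, whereas a real Free-Pastry node routes hop by hop with local routing state; the identifier width ID_BITS, the circular distance and the tie-breaking rule are choices made for this example, and the names closest_node and circular_distance are hypothetical.

```python
ID_BITS = 128              # assumed identifier width, for illustration only
ID_SPACE = 1 << ID_BITS    # size of the (assumed circular) identifier space


def circular_distance(a, b):
    """Shortest distance between two identifiers on the ring."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def closest_node(key, live_node_ids):
    """Return the live nodeId numerically closest to `key`.

    Ties are broken in favour of the smaller nodeId, which is an arbitrary
    choice made for this sketch.
    """
    if not live_node_ids:
        raise ValueError("no live nodes")
    return min(live_node_ids, key=lambda n: (circular_distance(n, key), n))
```

When a node leaves, removing its nodeId from live_node_ids and calling closest_node again yields the node that takes over the key, which mirrors the self-organising behaviour described in the text.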
5 Description of the experimental setup
Our goal is to study the scalability and the response time of the tools described in the previous section. More precisely, we look for the maximum number of registration nodes supported and the response time to discover a given service. Note that the same benchmarks are applied to the three Peer-to-Peer systems (Bonjour, Avahi and Free-Pastry). The experimental platform is Grid’5000, a highly reconfigurable and controllable experimental grid platform gathering 9 sites geographically distributed in France. Every site hosts a cluster of 256 to 1K CPUs. All sites are connected by RENATER (10 Gb/s).
5.1 Specific kernel on Grid’5000
Grid’5000 [23] offers an infrastructure with standard kernels. To run our experiments, it is necessary to customise one kernel to support Avahi, Bonjour and Free-Pastry. Thus, we create a specific kernel containing all the packages needed to run our codes (registration and discovery codes for each system). After that, using the two tools OAR and Kadeploy (see [19,18]), we reserve machines and deploy this specific kernel on all of them. We use only one site, and all machines have AMD Opteron processors and a 1 Gb/s network card.
5.2 Sequential registrations
In this test, the first step is to reserve N nodes on Grid’5000 (N will vary from 100 nodes up to a value for which we observe a saturation of the registration service). The number N represents the maximum number of nodes that can be used for the experiment. Each node requests a registration for a given service at a given time. Initially, all nodes have the codes needed to request a service but are inactive. Let δ be the activation delay. We activate all the requests sequentially (and we receive back an acknowledgement). Indeed, the k-th request is activated at time k × δ. We increase δ to analyse the behaviour of the system when the delay between events becomes larger.
Obviously, at the beginning the number of registrations is small, so registration is fast. We increase N until saturation (the registration service no longer responds to a new registration). We aim at analysing the scalability of the system without overloading the network: in this test, only one multicast appears at a given time.
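The schedule of the sequential test, in which the k-th request is activated at time k × δ, can be sketched as follows. This is a minimal sketch and not the benchmark code used in the experiments: register is a hypothetical callable assumed to block until the acknowledgement is received, and the sleep and clock parameters exist only so that the driver can be exercised without a real network.

```python
import time


def activation_times(n, delta):
    """Activation time of each request: the k-th request fires at k * delta."""
    if n < 0 or delta < 0:
        raise ValueError("n and delta must be non-negative")
    return [k * delta for k in range(1, n + 1)]


def run_sequential(n, delta, register, sleep=time.sleep, clock=time.perf_counter):
    """Activate n registrations one after the other and time each of them.

    `register(k)` is a hypothetical blocking call standing in for the
    registration request of node k and its acknowledgement.
    """
    start = clock()
    durations = []
    for k, t_k in enumerate(activation_times(n, delta), start=1):
        wait = start + t_k - clock()
        if wait > 0:              # wait until the planned activation time
            sleep(wait)
        t0 = clock()
        register(k)               # returns once the acknowledgement arrives
        durations.append(clock() - t0)
    return durations
```

Because each call to register completes before the next one starts, at most one registration is in flight at any time, which matches the intent of this test.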
5.3 Simultaneous registrations
In the first test, the registrations are done sequentially. This leads to a limited number of communications to exchange information. In this experiment, we stress the scalability of the system and its capacity to manage the communications between the registered nodes. Therefore, we request N (the number of reserved nodes) simultaneous registrations and we compute the time to complete the registration step. If we obtain a “reasonable” response time, we increase N until the saturation value. In other words, we are looking for the maximum number of registered nodes that the system can handle when the network is overloaded by several multicast packets at the same time.
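A minimal Python sketch of the simultaneous test is given below. It is an illustration only: threads on a single host stand in for the N reserved machines, register is a hypothetical blocking registration call, and a barrier is used to release all requests at the same instant.

```python
import threading
import time


def run_simultaneous(n, register):
    """Release n registrations at once and return the time to complete them all.

    `register(k)` is a hypothetical blocking call standing in for the
    registration request issued by node k.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    start = [0.0]

    def mark_start():
        # Runs exactly once, when all workers have reached the barrier.
        start[0] = time.perf_counter()

    barrier = threading.Barrier(n, action=mark_start)

    def worker(k):
        barrier.wait()            # every worker is released together
        register(k)

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(1, n + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start[0]
```

In an actual run, N would be increased step by step and the returned completion time recorded until the service stops responding.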
5.4 Periodic registrations
It is also important to study the efficiency of the system when some nodes are highly volatile. In such a case, the system needs to be updated by sending the global state to each node.
To simulate such behaviour, we register N nodes, then we cancel ψ services (N > ψ) and register them again randomly. It is clear that the value of ψ influences the efficiency of the system. Therefore, we vary this value to find the maximum node volatility that the system supports.
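One round of this test can be sketched as follows. This is a minimal sketch under assumptions: register and unregister are hypothetical callables standing in for the real registration and cancellation requests, and "randomly" is interpreted here as choosing the ψ nodes at random and re-registering them in a random order.

```python
import random


def churn_once(nodes, psi, unregister, register, rng):
    """Cancel psi randomly chosen services, then register them again.

    Returns the cancelled nodes in the order in which they were re-registered.
    """
    if not 0 <= psi < len(nodes):
        raise ValueError("psi must satisfy 0 <= psi < N")
    victims = rng.sample(list(nodes), psi)   # the psi services to cancel
    for node in victims:
        unregister(node)
    rng.shuffle(victims)                     # random re-registration order
    for node in victims:
        register(node)
    return victims
```

Calling churn_once repeatedly on the same set of nodes gives a simple model of nodes that disconnect and reconnect several times.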
5.5 Real registrations
In the periodic registration experiment, we simulate only one disconnection/registration, which does not correspond to the real behaviour of operational grid systems, since disconnections are more frequent. In this test, the same set of nodes is connected and disconnected several times. In this way, we approach the behaviour of real P2P systems and we measure the efficiency of such systems when they behave as a real grid system.
5.6 Browsing services
Another important metric is the time needed to browse a given service. Indeed, in all the previous tests, we compute the registration time. We also need to compute the discovery time, which is the elapsed time between the end of the registration of a unique service and the date at which a browser node has discovered the service.
Note that the response time depends on the number of replicas of a given service and on the number of registered nodes. The browsing program listens for any new registration or deletion of services. With the four setups mentioned before, we can analyse the performance of the discovery service of the system. We can also increase the number of browsers. We draw the chart where point (i,j,k) is the response time i for browser j when we use a total of k browsers.
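The discovery time and the chart points (i,j,k) defined above can be computed as in the following minimal Python sketch. It assumes that the clocks of the registering node and of the browser nodes are synchronised, so that timestamps can be subtracted directly; the function name chart_points and the numbering of browsers from 1 are choices made for this example.

```python
def chart_points(registration_end, discovered_at):
    """Build the points (i, j, k) for one run with k browsers.

    `registration_end` is the time at which the registration finished and
    `discovered_at[j - 1]` is the time at which browser j saw the service.
    Each point holds the response time i of browser j out of k browsers.
    """
    k = len(discovered_at)
    return [(t - registration_end, j, k)
            for j, t in enumerate(discovered_at, start=1)]
```

For example, chart_points(10.0, [10.5, 11.0]) describes a run with two browsers that discovered the service 0.5 and 1.0 time units after the end of the registration.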