Web Semantics: Science, Services and Agents on the World Wide Web xxx (2005) xxx–xxx
Semantic approach to service discovery in a Grid environment
Simone A. Ludwig a,∗, S.M.S. Reyhani b
a School of Computer Science, Cardiff University, Cardiff CF24 3AA, UK
b Department of Information Systems and Computing, Brunel University, Uxbridge UB8 3PH, UK
Received 5 December 2003; received in revised form 26 November 2004; accepted 12 April 2005
Abstract
The fundamental problem that the Grid research and development community is seeking to solve is how to coordinate distributed resources amongst a dynamic set of individuals and organisations in order to achieve a common collaborative goal. The problem arises through the heterogeneity, distribution and sharing of the resources in different virtual organisations. Interoperability is a main issue for applications to function with the Grid. This paper proposes a matchmaking framework for service discovery in Grid environments based on three selection stages: context, semantic and registry selection. It provides a better service discovery process by using semantic descriptions stored in ontologies which specify both the Grid services and the application knowledge. The framework permits Grid applications to specify the criteria a service request is matched with and enables interoperability for the matchmaking process. A proof of concept is given with a prototype implementation, and the matchmaking process is enhanced with a similarity metric which allows the quality of a match to be quantified. A qualitative and quantitative evaluation of the prototype system is given, with an analysis and performance measurements to quantify the scalability of the prototype.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Interoperability; Ontology; Service discovery; Grid
1. Introduction
In the mid-1990s Ian Foster and Carl Kesselman proposed a distributed computing infrastructure for advanced science and engineering which they called “The Grid”. The vision behind the Grid is to supply computing and data resources over the Internet seamlessly, transparently and dynamically when needed, much as the power grid supplies electricity to end users. The Grid originated from trying to solve the information and computational challenges of science [1].
∗ Corresponding author. Tel.: +44 2920 879184; fax: +44 2920 874598. E-mail address: simone.ludwig@cs.cardiff.ac.uk (S.A. Ludwig).
1570-8268/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.websem.2005.04.001
Resource discovery, and as a result also service discovery, is an important issue for the Grid in answering the questions of how a service requester finds the resources/services needed to solve its particular problem and how a service provider makes potential service requesters aware of the computing resources it can offer. Service discovery is a key concept in a distributed Grid environment. It defines a process for locating service providers and retrieving service descriptions. The problem of service discovery in a Grid environment arises through the heterogeneity, distribution and sharing of the resources in different Virtual Organisations (VOs). The two different approaches implemented in the early stages of the Grid software (GLOBUS toolkit, GT [2]) were:
• Monitoring and Discovery Service (MDS),
• Grid Information Service (GIS).
Although these approaches deal only with resource discovery, service discovery can be seen as an extension of resource discovery.
The MDS [3] was initially designed as a centralised way to obtain Grid service information via an LDAP (Lightweight Directory Access Protocol) server. Later designs in MDS-2 have moved to a decentralised approach where Grid information is stored and indexed by index servers that communicate via a registration protocol [4]. Users can then query directory servers. The assignment of content to servers and the overlay topology of those servers are done in an ad hoc fashion.
GIS is a service that allows storing information about the state of the Grid infrastructure [5]. One approach for describing the data is to use a hierarchical model. This is the approach which is currently in place, as GISs have been built on top of directory services. The question arises whether these systems and the hierarchical model will provide sufficient performance and expressiveness. An alternative solution is to use a relational data model, which arguably is more difficult to implement and scale, but allows for more expressiveness with a relational query language.
Due to the lack of expressive and efficient matchmaking in Grid environments, Condor [6] was used. Condor, which is used for high-throughput computing, is a matchmaking framework which was developed with classified advertisements (ClassAds) for solving resource allocation problems in a distributed environment with decentralised ownership of resources [7]. This framework provides a bilateral match where both resource providers and consumers specify their matching policy and requirements. A symmetric requirement is then evaluated for each request–resource pair to determine whether there is a match or not.
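The bilateral match described above can be illustrated with a minimal sketch. It is written in plain Python rather than in the ClassAd language, and all attribute names (`memory_mb`, `os`, `owner`) and the `bilateral_match` helper are hypothetical, chosen only to show that a pair matches when each side's requirement accepts the other side.

```python
from typing import Any, Dict

Ad = Dict[str, Any]  # an advertisement: attributes plus a "requirements" predicate


def bilateral_match(request: Ad, resource: Ad) -> bool:
    """Return True only if each side's requirement accepts the other side."""
    request_ok = request["requirements"](resource)
    resource_ok = resource["requirements"](request)
    return request_ok and resource_ok


# Hypothetical advertisements; the attribute names are invented for illustration.
job = {
    "owner": "alice",
    "requirements": lambda other: other["memory_mb"] >= 512 and other["os"] == "linux",
}
machine = {
    "os": "linux",
    "memory_mb": 1024,
    "requirements": lambda other: other["owner"] in {"alice", "bob"},
}
small_machine = dict(machine, memory_mb=256)  # same policy, too little memory
```

With these definitions, `bilateral_match(job, machine)` succeeds, whereas `bilateral_match(job, small_machine)` fails because the consumer's requirement rejects the resource even though the provider's policy accepts the consumer.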
The Open Grid Services Infrastructure (OGSI) [8] defines a set of conventions and extensions on the use of the Web Services Description Language (WSDL) and XML Schema to enable stateful Web services. It introduces the idea of stateful Web services and defines approaches for creating, naming, and managing the lifetime of instances of services; for declaring and inspecting service state data; for asynchronous notification of service state change; for representing and managing collections of service instances; and for common handling of service invocation faults. Recently, the WS-Resource Framework (WSRF) [9] was proposed as a refactoring and evolution of OGSI aimed at exploiting new Web services standards, specifically WS-Addressing, and at evolving OGSI based on early implementation and application experiences. WSRF retains essentially all of the functional capabilities present in OGSI, while changing some of the syntax (for example, to exploit WS-Addressing) and also adopting a different terminology in its presentation.
Until recently, research on Grids has focused on designing and building Grid middleware that addresses the core problems of Grids, which are resource management and services in a distributed environment. Such services include security and data management. Argonne National Laboratory (ANL) has developed an open-source Grid middleware called GLOBUS [2], which has become the de facto Grid middleware for research and possibly production purposes. From the evolution of the Grid software it can be seen that it went from a middleware approach, where many different tools were combined in a toolbox, to a service-based approach which focuses on application-level issues. The approach proposed in this paper follows this direction by taking this service-based view and presents a framework which is developed on the application level. The approach applies semantics to Grid services and to the applications in order to achieve interoperability within Grid environments. Interactions, such as service requests, with services from the applications and the Grid are matched semantically. As there are many different Grid implementations, and many applications which want to make use of the Grid, there is a need for semantics to make them interoperable with each other. In order to connect applications such as the High Energy Physics (HEP) experiments to the Grid, two interoperability layers are necessary. One interoperability layer is attached to the application layer and the other to the collective layer. The first interoperability layer serves as a dictionary, allowing the different HEP applications to specify their service needs in their “own” application context. The second interoperability layer allows the definition of semantic service descriptions in order to allow a more flexible and dynamic service discovery process [10].
This paper is organised as follows. In Section 2 related efforts are summarised and the differences to the proposed approach are discussed. Section 3 gives an introduction to the background of semantics and ontologies. The framework of the semantic service discovery approach for Grid environments, with a detailed description of the components, is shown in Section 4. Section 5 presents a portal prototype implementation and explains the tools used. In Section 6 the matchmaking process is enhanced by means of a similarity metric. Section 7 presents an evaluation of the system by an introduction of a similarity metric and, finally, Section 8 concludes this paper.
2. Related efforts
During the past few years much effort and research has been invested in the field of resource matching, as described in the following paragraphs. The different approaches are based on resource matching, resource mapping and selection, and developing infrastructural middleware.
myGrid [11] is a multi-organisational project aiming to develop the necessary infrastructural middleware (e.g. provenance, service discovery, workflow enactment, change notification and personalisation) that operates over an existing Web services and Grid infrastructure to support scientists in making use of complex distributed resources. The myGrid project is to provide access to its bioinformatics archives and analysis tools through Web service technologies using open specifications.
Deelman et al. [12] address the problem of automatically generating job workflows for the Grid. They have developed two workflow generators. The first one maps an abstract workflow defined in terms of application-level components to the set of available Grid resources. The second generator takes a wider perspective and not only performs the abstract-to-concrete mapping but also enables the construction of the abstract workflow based on the available components. The system operates in the application domain and chooses application components based on the application metadata attributes.
The GRIP (Grid Interoperability Project) [13] addresses the problem of resource description in the context of a resource broker being developed, which is able to broker for resources described by several Grid middleware systems: GT2, GT3 and Unicore. The approach is based on a semantic solution to resource description. The semantics of the request for resources at an application level needs to be preserved in order to allow appropriate resources to be selected by intermediate agents such as brokers and schedulers. The matchmaking is based on a semantic translation of the different resource description schemas.
Tangmunarunkit et al. [14] have designed and prototyped an ontology-based resource selector that exploits ontologies, background knowledge, and rules for solving resource matching in the Grid to overcome the restrictions and constraints of resource descriptions in the Grid. Traditional resource matching, as done by the Condor Matchmaker [6] or the Portable Batch System [15], is based on symmetric, attribute-based matching. In order to make the matchmaking more flexible and also to consider the structure of VOs, the framework consists of ontology-based matchmakers, resource providers and resource consumers or requesters. Resource providers periodically advertise their resources and capabilities to one or more matchmakers using advertisement messages. The user can then activate the matchmaker by submitting a query asking for resources that satisfy the request specification. The query is then processed by the TRIPLE/XSB deductive database system [16] using matchmaking rules, in combination with background knowledge and ontologies, to find the best match for the request.
All these related projects are trying to overcome the interoperability problem which Grid systems face. However, all of them, except for the myGrid project and the abstract workflow mapping project, are concerned with applying semantics to resources in order to have a more powerful matchmaking technique. The myGrid project focuses on the application level by providing a platform with existing Web services and Grid infrastructure to support scientists in making use of complex distributed resources, whereas the project of Deelman et al. is concerned with mapping complex workflows onto Grid environments. Although the Grid community has produced a number of middleware systems (Globus, Legion [17] and NetSolve [18], to name a few), many areas of the Grid concept remain to be investigated.
The approach proposed in this paper is also concerned with application-level issues and requirements. The main requirements which have driven the development were a high degree of flexibility and expressiveness, support for subsumption and datatypes, and a flexible and modular structure implemented with the latest Web technologies. The main difference to the approaches proposed by others is the concept of a three-step discovery process consisting of application context selection, services selection and registry selection. It allows the application and Grid services semantics to be captured separately, and it supports application developers and Grid services developers in registering application and services semantics separately. For the discovery process, this separation allows a classification of the application semantics in order to find service descriptions in the Grid services ontology.
3. Background to semantics and ontologies
Ontologies contain categories, lexicons contain word senses, terminologies contain terms, directories contain addresses, catalogs contain part numbers, and databases contain numbers, character strings and BLOBs (Binary Large OBjects). All these lists, hierarchies and networks are tightly interconnected collections of signs. But the primary connections are not in the bits and bytes that encode the signs, but in the minds of the people who interpret them. The goal of various metadata proposals is to make those mental connections explicit by tagging the data with more signs. Those metalevel signs themselves have further interconnections, which can be tagged with metametalevel signs. But meaningless data cannot acquire meaning by being tagged with meaningless metadata. The ultimate source of meaning is the physical world and the agents who use signs to represent entities in the world and their intentions concerning them [19].
The so-called Rich Text Format (RTF) is semantically the most impoverished representation for text ever devised. Formatting is an aspect of signs that makes them look pretty, but it fails to address the more fundamental question of what they mean. To address meaning, the markup languages in the SGML (Standard Generalized Markup Language) [20] family were designed with a clean separation between formatting and meaning. When properly used, SGML and its successor XML (Extensible Markup Language) [21] use tags in the text to represent semantics and put the formatting in more easily manageable style sheets. That separation is important, but the semantic tags themselves must have clearly defined semantics. However, most XML manuals do not provide guidelines for representing semantics.
Ontologies are increasingly seen as a key technology for enabling semantics-driven knowledge processing. Communities establish ontologies, or shared conceptual models, to provide a framework for sharing a precise meaning of symbols exchanged during communication. A prerequisite for widespread use of ontologies is a joint standard for their description and exchange.
RDF(S) (Resource Description Framework Schema) [22] is an ontology/knowledge representation language which contains classes and properties (binary relations), range and domain constraints (on properties), and subclass and subproperty (subsumption) relations. RDF(S) is, however, a relatively primitive language; more expressive power would clearly be necessary and desirable to describe resources in sufficient detail. Moreover, such descriptions should be amenable to automated reasoning if they are to be used effectively by automated processes [23].
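To make the role of subclass (subsumption) relations concrete, the following minimal sketch stores a few RDF(S)-style statements as plain Python triples and computes the transitive closure of the subclass relation. It does not use an RDF library; the vocabulary (`GridService`, `SimulationService`, `hasInput`, `DataSet`) and the helper names are hypothetical and serve only as an illustration.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical vocabulary: class hierarchy plus domain/range constraints.
TRIPLES: Set[Triple] = {
    ("GridService", "subClassOf", "Service"),
    ("SimulationService", "subClassOf", "GridService"),
    ("hasInput", "domain", "Service"),
    ("hasInput", "range", "DataSet"),
}


def superclasses(cls: str, triples: Set[Triple]) -> Set[str]:
    """All classes that subsume cls, via the transitive closure of subClassOf."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def is_subsumed_by(cls: str, ancestor: str, triples: Set[Triple]) -> bool:
    """True if cls is the same class as ancestor or a (transitive) subclass of it."""
    return cls == ancestor or ancestor in superclasses(cls, triples)
```

Here `is_subsumed_by("SimulationService", "Service", TRIPLES)` holds although the two classes are only indirectly related, which is the kind of inference a plain name comparison cannot provide.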
These considerations led to the development of the Ontology Inference Layer (OIL) [24] and later to the design of DAML+OIL [25]. DAML+OIL is a more recent proposal for an ontology representation language that has emerged from work under DARPA’s Agent Markup Language (DAML) initiative along with input from leading members of the OIL consortium. DAML+OIL is based on the original OIL language, but differs in a number of ways. DAML+OIL provides greater interoperability on the semantic level. In this way, DAML+OIL extends the basic RDF(S) primitives to provide a more expressive ontology modelling language and some simple terms for creating inferences. In particular, DAML+OIL has moved away from the original frame-like ideas of OIL and is an alternative syntax for a description logic.
The question arises how semantics help the service discovery process. Service discovery in Grid environments to date is based only on particular keyword queries from the user. In the majority of cases this leads to low recall and low precision of the retrieved services. One reason might be that the query keywords are semantically similar to but syntactically different from the terms in service descriptions. Another reason is that the query keywords might be syntactically equivalent to but semantically different from the terms in the service description. Another problem with keyword-based service discovery approaches is that they cannot completely capture the semantics of a user’s query because they do not consider the relations between the keywords. One possible solution for this problem is to use retrieval based on semantics.
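The recall problem of keyword queries can be sketched with a toy example. The service descriptions, the term-to-concept map and both search functions below are hypothetical; the sketch only contrasts exact keyword overlap with a lookup that first maps terms onto shared concepts.

```python
from typing import Dict, List, Set

# Hypothetical service descriptions: service name -> descriptive terms.
SERVICES: Dict[str, Set[str]] = {
    "svc-a": {"simulation", "detector"},
    "svc-b": {"storage", "replica"},
}

# Hypothetical mapping from surface terms to shared concepts.
CONCEPT_OF: Dict[str, str] = {
    "simulation": "Simulation",
    "modelling": "Simulation",
    "storage": "Storage",
    "archive": "Storage",
}


def keyword_search(query: Set[str]) -> List[str]:
    """Services whose description shares at least one literal term with the query."""
    return sorted(name for name, terms in SERVICES.items() if query & terms)


def concept_search(query: Set[str]) -> List[str]:
    """Services sharing at least one concept with the query (unmapped terms stay as-is)."""
    wanted = {CONCEPT_OF.get(term, term) for term in query}
    return sorted(
        name
        for name, terms in SERVICES.items()
        if wanted & {CONCEPT_OF.get(term, term) for term in terms}
    )
```

A query for `{"modelling"}` finds nothing with `keyword_search`, because the term is syntactically different from `simulation`, but `concept_search` retrieves `svc-a` since both terms map to the same concept.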
4. Semantic service discovery framework
This section describes the semantic service discovery framework for a Grid environment. It gives a description of the components of the framework and shows how the matchmaking process is done.
4.1. Framework requirements
The fundamental problem the Grid research and development community is seeking to solve is how to coordinate distributed resources amongst a dynamic set of individuals and organisations in order to achieve a common collaborative goal. The degree of distribution of an application that can run within such an organisation can vary on a scale that runs from a centralised application that uses network resources, but where control and data reside at one location, to an application made up of a number of autonomous components that collaborate to meet some overall application goal. Due to the many different implementations of Grid software distributed all over the world, there is a need to make these implementations interoperable. This leads to the following requirements of the matchmaking framework. The first five requirements are derived from the necessity of using semantics for the service discovery process and the last two requirements are derived from the need to implement a service discovery framework for Grid environments.
1. High degree of flexibility and expressiveness
Different advertisers would want to describe their Grid services with different degrees of complexity and completeness. The description tool or language must be adaptable to these needs. An advertisement may be very descriptive in some points, but leave others less specified. Therefore, the ability to express semi-structured data is required.
2. Support for subsumption
Matching should not be restricted to simple service name comparison. A type system with subsumption relationships is required, so more complex matches can be provided based on these relationships.
3. Support for data types
Attributes such as quantities should be part of the service descriptions. The best way to express and compare this information is by means of data types.
4. Matching process should be efficient
The matching process should be efficient, which means that it should not burden the requester with excessive delays that would prevent its effectiveness.
5. Appropriate syntax for the Grid
The matchmaker must be compatible with Grid/Web technologies and the information must be in a format appropriate for a Grid environment.
6. Flexible and modular structure
The framework should be flexible enough to allow Grid applications to describe their context semantics and Grid services to describe their service semantics in a modular manner.
7. Lookup of matched services
The framework should provide a mechanism to allow the lookup and invocation of matched services.
Starting from these requirements a framework has been developed which is based on semantic service descriptions, and it fulfils the requirements as follows. An important element of semantic matchmaking is a shared ontology. Shared ontologies are needed to ensure that terms have clear and consistent semantics. Otherwise, a match may be found or missed based on an incorrect interpretation of the request. The framework supports flexible semantic matchmaking between advertisements and requests based on the ontologies defined. Minimising false positives and false negatives is achieved with three selection stages in combination with well-defined ontologies. The selection stages are:
• Context selection, where the request is matched within the appropriate application context.
• Semantic selection, where the request is matched semantically.
• Registry selection, where a lookup is performed.
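The three selection stages can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the prototype's implementation: the ontologies are reduced to dictionaries, and every name in it (`hep-app`, `ReconstructionService`, the example URL, and the four functions) is hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical application ontology: per application, application term -> service concept.
APPLICATION_CONTEXTS: Dict[str, Dict[str, str]] = {
    "hep-app": {"reconstruct events": "ReconstructionService"},
}

# Hypothetical Grid service ontology: concept -> parent concept.
SERVICE_PARENT: Dict[str, str] = {
    "FastReconstructionService": "ReconstructionService",
    "ReconstructionService": "GridService",
}

# Hypothetical service registry: concept -> contact details of registered services.
REGISTRY: Dict[str, List[str]] = {
    "FastReconstructionService": ["https://example.org/fast-reco"],
}


def context_selection(app: str, term: str) -> Optional[str]:
    """Stage 1: interpret the request term within the application's own context."""
    return APPLICATION_CONTEXTS.get(app, {}).get(term)


def semantic_selection(concept: str) -> List[str]:
    """Stage 2: concepts equal to, or subsumed by, the requested concept."""
    concepts = set(SERVICE_PARENT) | set(SERVICE_PARENT.values())
    matches = []
    for candidate in sorted(concepts):
        current: Optional[str] = candidate
        while current is not None:  # walk up the hierarchy from the candidate
            if current == concept:
                matches.append(candidate)
                break
            current = SERVICE_PARENT.get(current)
    return matches


def registry_selection(concepts: List[str]) -> List[str]:
    """Stage 3: look up contact details for the matched concepts."""
    return [contact for c in concepts for contact in REGISTRY.get(c, [])]


def discover(app: str, term: str) -> List[str]:
    concept = context_selection(app, term)
    if concept is None:
        return []  # the request has no meaning in this application context
    return registry_selection(semantic_selection(concept))
```

In this sketch a request for `"reconstruct events"` from `hep-app` is resolved to `ReconstructionService`, widened to its subsumed concept `FastReconstructionService`, and finally answered with the contact registered for that concept. Keeping the three dictionaries separate mirrors the separation of application knowledge, service knowledge and registry data.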
Keeping the application and Grid service ontologies separate allows a modular design. Furthermore, it encapsulates the application knowledge from the Grid service knowledge. This allows other applications to specify their application semantics separately from the Grid service semantics. The Grid service ontology is specified by Grid developers and the application ontology is developed by the application users. The matchmaking engine should encourage providers and requesters to be precise with their descriptions. To achieve this, the service provider follows an XML-based description format, the ontology language DAML+OIL. To advertise and register its services the service requester generates a description in the specified DAML+OIL format. Defining the ontologies and the selection stages precisely allows the matchmaking process to be efficient. Semantic matchmaking is based on DAML+OIL ontologies. The advertisements and requests refer to DAML+OIL concepts and the associated semantics. By using DAML+OIL, the matchmaking process can perform inferences on the subsumption hierarchy, leading to the recognition of semantic matches despite syntactical differences between advertisements and requests. The use of DAML+OIL also supports accuracy, which means that no match is recognised when the relation between the advertisement and the request does not derive from the DAML+OIL ontologies used by the registry, where the lookup of the service is performed.
4.2. Matchmaker description
The semantic matchmaking framework in Fig. 1 consists of service requesters (Grid applications), service providers (Grid services) and a service discovery matchmaker. The matchmaking process is designed with respect to the criteria listed in Section 4.1. The processing of a received service request by the matchmaking engine is explained as follows [26]. Depending on the matching modules and the defined application and services ontologies, a semantic match is performed. Every pair of request and advertisement has to go through several different matching modules of the matchmaking process. The final match with the service registry is performed in the registry module. Information is provided to the service requester by sending contact details and related capability descriptions of the relevant service provider.
Fig. 1. Semantic service discovery matchmaker – registration.
Fig. 1 shows the interactions of a service registration process. First, the service providers need to register their services for the matchmaking process. The service provider registers its service semantics in the Grid service ontology (1) and the necessary contact details in the service registry (2). Service semantics comprise a service name, a service description, service attributes (input/output) and metadata information. Furthermore, the service requester specifies the context semantics of the application in the application ontology (3).
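The registration steps (1) to (3) can be outlined as a data structure. The sketch below is only an assumption about how the pieces might be held in memory; the prototype stores them as DAML+OIL ontologies and a service registry, and the class, field and method names used here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceSemantics:
    """Service name, description, input/output attributes and metadata."""
    name: str
    description: str
    inputs: List[str]
    outputs: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Matchmaker:
    service_ontology: Dict[str, ServiceSemantics] = field(default_factory=dict)
    service_registry: Dict[str, str] = field(default_factory=dict)
    application_ontology: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def register_service(self, semantics: ServiceSemantics, contact: str) -> None:
        self.service_ontology[semantics.name] = semantics  # step (1): service semantics
        self.service_registry[semantics.name] = contact    # step (2): contact details

    def register_application_context(self, app: str, context: Dict[str, str]) -> None:
        self.application_ontology[app] = dict(context)     # step (3): context semantics
```

A provider would call `register_service` once per service, while an application would call `register_application_context` to record how its own terms relate to service concepts.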
The interactions of a service request are shown in Fig. 2. The Grid application sends out a request to the service discovery matchmaker (1). The request has to go through the context matching module first. Here, the request is matched within the appropriate context of the application ontology. This means that depending on the service request, which came from one of the applications, the appropriate context ontology is chosen and the first match is performed. Additional parameters are attached to the request and forwarded to the semantic matching module (2). In this module the semantic match is performed. Semantic matchmaking allows the service request to be matched using the semantics (metadata) of services. Having all necessary semantic data, a service lookup is done using a service registry