Optimising Java RMI Programs by Communication Restructuring
Kwok Cheung Yeung and Paul H. J. Kelly
Department of Computing
Imperial College, London, UK
Abstract. We present an automated run-time optimisation framework that can improve the performance of distributed applications written using Java RMI whilst preserving their semantics.
Java classes are modified at load-time in order to intercept RMI calls as they occur. RMI calls are not executed immediately, but are delayed for as long as possible. When a dependence forces execution of the delayed calls, the aggregated calls are sent over to the remote server to be executed in one step. This reduces network overhead and the quantity of data sent, since data can be shared between calls. The sequence of calls may be cached on the server side along with any known constants in order to speed up future calls. A remote server may also make RMI calls to another remote server on behalf of the client if necessary.
Our results show that the techniques can speed up distributed programs significantly, especially when operating across slower networks. We also discuss some of the challenges involved in maintaining program semantics, and show how the approach can be used for more ambitious optimisations in the future.
1 Introduction
Frameworks for distributed programming such as the Common Object Request Broker Architecture (CORBA) [9] and Java Remote Method Invocation (RMI) [12] aim to provide a location-transparent object-oriented programming model, but do not completely succeed, since the cost of a remote call may be several orders of magnitude greater than that of a local call due to marshalling overheads and relatively slow network connections. This means that developers must explicitly code with performance in mind, leading to reduced productivity and increased program complexity.
The usual approach to optimising distributed programs has been to optimise the connection between the communicating hosts, fine-tuning the remote call mechanism and the underlying communication protocol to cut the overhead of each call to a minimum. Although this leads to a general speed-up, it does not help the performance of programs that are slow because they use many fine-grained methods instead of a few coarse-grained ones. Our approach to solving this problem has been to consider all communicating nodes as part of one large program, rather than as many disjoint ones.
We delay the execution of remote calls on the client for as long as possible, until a dependency on the delayed calls blocks further progress. At this point, the delayed calls are executed in one step, after which the blocked operation may proceed. By delaying the execution of remote calls, we build up knowledge of the context in which calls were made on the client. This enables us to find opportunities for optimisation between calls that would have been lost had the calls been executed immediately.
1.1 Contributions
– We present an optimisation tool which can improve the performance of Java/RMI applications by combining static analysis of application bytecode with run-time optimisation of sequences of remote operations. This tool operates on unmodified Java RMI applications, and runs on a standard JVM.
– By aggregating sequences of remote calls to the same server, the total number of message exchanges is reduced. By avoiding redundant parameter and result transfers, the total amount of data transferred can also be reduced. When calls to different servers are aggregated together, results can be forwarded directly from one server to another, bypassing the client in some cases.
– We show how run-time overheads can be reduced by caching execution plans at the servers.
– We demonstrate the use of the tool on a number of examples.
The framework presented here provides the basis for a programme of research aimed at extending aggressive optimisation techniques across distributed systems, and deploying the results in large-scale industrial systems. We conclude with a discussion of the potential for the work, and the challenges that remain.
1.2 Structure
We begin in Section 2 with a discussion of related work. We then give a high-level view of the run-time optimisation framework used to implement our optimisations in Section 3. We proceed to cover the optimisations performed in Section 4, and the challenges involved in maintaining the semantics of the original application in Section 5. We then present some performance results in Section 6, finish off with some suggestions for future work in Section 7, and conclude in Section 8.
2 Related Work
Most work on optimising RMI has concentrated on reducing the run-time overhead of each remote call, by reducing the amount of work done per call or by using more lightweight network protocols. Examples include the UKA serialisation work [14], KaRMI [13], and R-UDP [10]. Similar work has been done on CORBA by Gokhale and Schmidt [7].
Asynchronous RPC [11, 15] aims to overlap client computation with communication and remote execution, replacing results with 'promises', which block the client only when actually used.
A more ambitious approach is the concept of caching the state of a remote object locally [10]. This works well provided that most operations on cached objects are reads. However, a write operation incurs high penalties for all users of the cached object, since the client has to wait for the invalidation of all copies of the object to finish before proceeding. The first request for invalidated data will also incur an extra delay as the server fetches it from the client that performed the last update.
A later implementation of remote-object caching [5] introduces the notion of reduced objects, where only a subset of the remote-object state is cached on the client. The subset that is cached depends on the properties of the invoked methods: e.g. if a called method only accesses immutable variables, then those variables can be cached on the client without needing to deal with consistency issues.
Neither of these approaches to RMI optimisation conflicts with our aggregation optimisations, and although we have not done so ourselves, they could in principle be combined. It may be argued that our optimisations become redundant under certain circumstances (e.g. if the aggregated calls are cached locally).
The concept of aggregating numerous small operations into a single larger operation is very old, and appears in numerous other contexts, especially in the hardware domain. In the context of RPC mechanisms, concepts such as stored procedures in database systems or commands in IBM's San Francisco project [3] are also capable of aggregating calls, but these are explicit mechanisms. Implicit call aggregation is much rarer and harder to implement. One example is the concept of batched futures [2] in the Thor database system.
3 The Veneer Framework
The RMI optimisations are built on top of Veneer, a generalised framework that we have developed to ease the development of run-time optimisation techniques. This framework is written in standard Java, using the BCEL [4] library for bytecode generation and the Soot [16] library for program analysis. Veneer is not tied to any particular JVM implementation, which is essential since it is likely to be used in a heterogeneous environment. We refer to Veneer as a 'virtual JVM', since it behaves like a highly configurable Java virtual machine without actually being one.
The framework presents a simplified model of the Java run-time environment, working with what appears to be a simple interpreter, called an executor. A basic executor, which executes a method with no modifications whatsoever, is shown in Figure 1.
When a method that we are interested in is called, control passes to our executor instead of the original method. The executor is initialised with an execution plan, which is essentially a control-flow graph of the method, with executable code-blocks forming the nodes. The executor sits in a loop which executes the current block, then sets the current block to the next block in line to be executed.
The power of this framework lies in the fact that the plan is a first-order object that we can change while the executor is still running, effectively modifying the code that will be executed. The executor has full control over the process of method execution between blocks, so that we can perform operations such as jumping to arbitrary code-blocks, modifying local variables, or timing operations if necessary.
We minimise the interpretive overhead by delegating as much work as possible to the underlying JVM, and by making the code-blocks as coarse as possible. There is also an option to permit blocks to run continuously without returning to the executor, though certain block types will always force a return.
The mapping of byte-code to code-blocks in the plan, and the methods affected by our framework, are determined by a plug-in policy class. The policy class also contains numerous call-back methods that are invoked on certain events, such as the initial loading of a class.
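As a sketch, a policy might expose hooks along the following lines; the interface name and method signatures here are our own assumptions for illustration, not the actual Veneer API.

    // Hypothetical sketch of a plug-in policy; names and signatures are
    // assumptions, not taken from the Veneer source.
    public interface ExecutionPolicy {
        // Decide whether a given method should run under an executor.
        boolean shouldIntercept(String className, String methodName);

        // Call-back invoked on events such as the initial loading of a class.
        void onClassLoaded(Class<?> loadedClass);
    }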
public class BasicExecutor extends Executor {
    public int execute() throws Exception {
        int next = -1;
        while (block != null
               && !lockWasReleased()) {
            try {
                // Run the current code-block; it returns the
                // index of the next block in the plan.
                next = block.execute(this);
                block = getBlock(next);
            } catch (ExecuteException e) {
                // Pass control to exception handler
                block = getExceptionHandler(e);
                // Propagate exception if no handler
                if (block == null)
                    throw e.getException();
                locals[1] = e.getException();
            }
        }
        return next;
    }
}
Fig. 1. Structure of a basic executor
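Figure 1 assumes supporting types along the following lines; this is a minimal sketch whose names are inferred from the figure itself, and are not necessarily the actual Veneer declarations.

    // Minimal sketch of the types the executor loop assumes; names are
    // inferred from Figure 1 rather than taken from the Veneer source.
    abstract class Executor {
        protected Block block;       // current code-block of the execution plan
        protected Object[] locals;   // local-variable slots of the executing method

        protected abstract Block getBlock(int index);   // successor block by index
        protected abstract Block getExceptionHandler(ExecuteException e);
        protected abstract boolean lockWasReleased();
    }

    interface Block {
        // Execute this code-block; return the index of the next block.
        int execute(Executor context) throws ExecuteException;
    }

    class ExecuteException extends Exception {
        private final Exception wrapped;
        ExecuteException(Exception wrapped) { this.wrapped = wrapped; }
        Exception getException() { return wrapped; }
    }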
4 Optimisations
In this section we detail the RMI optimisations that have been implemented. The examples used to illustrate the optimisations are deliberately simplified for clarity.
4.1 Call Aggregation
Delaying calls to form call aggregates is the core technique upon which this project is based. It is an important optimisation in its own right, and can also open up further optimisation opportunities. For example, consider the following code fragment:
void m(RemoteObject r, int a) {
    int x = r.f(a);
    int y = r.g(x);
    int z = r.h(y);
    System.out.println(z);
}
This program fragment incurs three remote method calls, with six data transfers. However, for this example, we can do better:
– Since all three calls are to the same remote object, they can be aggregated into a single large call, so that the call overhead is incurred only once (see Figure 2).
– x is returned as the result of the call to f from the remote server, but is subsequently passed back to the server during the next call. The same occurs with the variable y. If the values of x and y were retained by the remote object between remote method calls, then the number of communications could be reduced from six to four.
– The variables x and y are unused by the client except as arguments to remote calls on the remote object from which they originated. x and y may therefore be considered dead variables from the client's point of view, and there is no need for their values to be passed back to the client at all, further reducing the total communication to just two messages with payloads of size int.
Fig. 2. Example of call aggregation (shown with and without call aggregation)
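Conceptually, the combined effect of these optimisations is as if the following method were executed on the server in a single exchange. This is an illustrative sketch only: the actual mechanism ships a control-flow graph of code-blocks, and RemoteObjectImpl is a hypothetical server-side implementation class.

    // Illustrative only: the combined effect of aggregating the calls in m().
    // One message carries 'a' to the server; one carries the result back.
    int aggregatedPlan(RemoteObjectImpl r, int a) {
        int x = r.f(a);   // x never leaves the server
        int y = r.g(x);   // y never leaves the server
        return r.h(y);    // only the final value is returned to the client
    }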
Client-side Implementation. We have created a Veneer policy that only affects methods that are statically determined to contain potentially remote method calls. Calls are deemed to be potentially remote if they are invoked via an interface, and declare java.rmi.RemoteException or one of its super-classes in their throws clause. A run-time check is later used to ensure that the potential remote call is actually remote. Note that it is not sufficient just to check that the receiver of the call implements java.rmi.Remote, since a local object may also implement that interface.
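The run-time check might look like the following helper, which tests whether the receiver is an actual RMI stub rather than merely a class implementing java.rmi.Remote. This is a sketch under our own assumptions, not Veneer's code; the class and method names are hypothetical.

    import java.lang.reflect.Proxy;
    import java.rmi.Remote;
    import java.rmi.server.RemoteStub;

    final class RemoteCheck {
        // Treat a receiver as actually remote only if it is an RMI stub:
        // an rmic-generated stub, or a dynamic proxy for a Remote interface.
        static boolean isActuallyRemote(Object receiver) {
            if (receiver instanceof RemoteStub)
                return true;   // classic rmic-generated stub
            return receiver instanceof Remote
                    && Proxy.isProxyClass(receiver.getClass());  // dynamic stub
        }
    }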
The client runs under the control of the Veneer framework using this policy. If the executor encounters a confirmed remote call during the course of execution, it places the call in a queue and proceeds to the next instruction. Sequences of adjacent calls to the same remote object are grouped together into remote plans. Remote plans also contain metadata about the calls, such as variable liveness and data dependencies. Calls to other remote objects will not force execution unless the target of the call is defined by a previous delayed call, leading to a control dependency. However, even this condition is relaxed by server forwarding, detailed in Section 4.2.
When a non-remote block is encountered with delayed calls remaining in the queue, a decision has to be made whether or not to force execution of the calls. In general, it is safe to execute the current block without forcing if there are no dependencies between the current instruction and the delayed operations. If dependencies exist, or if it is impossible to tell, then we must force execution.
We detect data dependencies by noting attempts to access data returned by RMI calls. Since the results of RMI calls are constructed by deserialising the data returned by the server, there can be no other references to the returned data except for the local variable that the result of the remote call was placed in. We therefore regard local code that accesses locals that should contain the results of RMI calls as being dependent on the delayed calls.
This scheme is rather conservative, such that even simple assignments from one local variable to another can force the execution of the delayed plans. We hope to improve this in the future using improved static analysis. The scheme also cannot detect indirect data dependencies: for example, if an RMI call modifies a remote database which the client proceeds to access using another API, then that access will go unnoticed.
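For illustration, the following hypothetical client fragment shows which statements may run with calls still delayed and which force execution under the conservative scheme described above; RemoteIface and localWork are our own names.

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    interface RemoteIface extends Remote {
        int f(int a) throws RemoteException;
    }

    class ForcingExample {
        static int localWork(int b) { return b * 2; }   // purely local computation

        static void client(RemoteIface r, int a, int b) throws RemoteException {
            int x = r.f(a);        // potentially remote: queued, not yet executed
            int w = localWork(b);  // no dependence on x: runs with the call delayed
            int y = x;             // reads x: forces the queued call to execute,
                                   // even though this is only a local copy
            System.out.println(w + y);
        }
    }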
When executing local code in the presence of delayed remote calls, we must ensure that the variables used by the delayed calls are not overwritten or modified by the local code. This is done by making a copy of all locals supplied to the delayed calls that may be touched by the local code.
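Continuing the hypothetical example above: if the local code writes to a variable that a delayed call reads, the framework first snapshots the value for the plan.

    // Hypothetical fragment extending ForcingExample.
    static int client2(RemoteIface r, int a) throws RemoteException {
        int x = r.f(a);   // delayed; the plan snapshots the current value of 'a'
        a = a + 1;        // local write to 'a': the delayed call keeps its copy,
                          // so it still sees the value 'a' had when f was queued
        return x + a;     // reading x here forces the delayed call
    }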
On forcing execution, the queue of delayed remote plans is traversed, with plans being sent one by one, along with the set of data used by each plan, to the corresponding remote proxy on the server side via a standard RMI invocation, to be executed there. The proxy call may either return successfully or throw an exception.
If the call returns successfully, then the variables defined by the plan that are still live are copied back into the locals of the executing method. If an exception was thrown, then the executor goes through the normal process of finding a handler for the exception within the method, and propagating it up the call chain if one is not found.
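A possible shape for the server-side entry point is sketched below; the interface name and signature are our assumptions about how a plan and its data might be shipped in a single standard RMI call, not the actual Veneer interface.

    import java.io.Serializable;
    import java.rmi.Remote;

    // Hypothetical remote proxy interface; name and signature are assumptions.
    interface PlanProxy extends Remote {
        // Executes a serialised remote plan against the supplied input values
        // and returns the plan variables that are still live on the client.
        // Declared to throw Exception so that both RemoteException and
        // exceptions raised by the plan itself propagate back to the client.
        Serializable[] executePlan(Serializable plan, Serializable[] inputs)
                throws Exception;
    }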
The same Veneer policy also runs a remote proxy server on startup, which first registers itself with a naming service via JNDI. The proxy keeps track of all remote objects present on the JVM by inserting a small callback into the constructors of all remote classes at load time.
