The Weaves Reconfigurable Programming Framework
Srinidhi Varadarajan
Department of Computer Science
Virginia Tech, Blacksburg, VA 24061-0106
(srinidhi@cs.vt.edu)
1 Abstract
This research proposes a language independent intra-process framework for object based composition of unmodified code modules. Intuitively, the two major programming models - threads and processes - can be considered as extremes along a sharing axis. Multiple threads through a process share all global state, whereas instances of a process (or independent processes) share no global state. Weaves provide the generalized framework that allows arbitrary (selective) sharing of state between multiple control flows through a process. In the Weaves framework a single process has the same level of complexity as a workstation, with independent “sub-processes”, state sharing and scheduling, all of which is achieved without requiring any modification to existing code bases. Furthermore, the framework
allows dynamic instantiation of code modules and control flows through them.
In effect, weaves create intra-process modules (similar to objects in OOP) from code written in any language. Applications can be built by instantiating Weaves to form Tapestries of dynamically interacting code. The Weaves paradigm allows objects to be arbitrarily shared – it is a true superset of both processes as well as threads, with code sharing and fast context switching time similar to threads. Weaves do not require any special support from either the language or application code - practically any code can be weaved. Weaves also include support for fast automatic checkpointing and recovery with no application support. This paper presents the elements of the Weaves framework and results from our implementation that works by reverse-analyzing source-code independent ELF object files. The current implementation has been validated over Sweep3D, a benchmark for 3D discrete ordinates neutron transport [Koch et al., 1992], and a user-level port of the Linux 2.4 family kernel TCP/IP protocol stack. Performance results show that the context switch overhead in the Weaves framework is almost identical to threads.
2 Introduction
With the growing complexity of software code bases, application designers are increasingly moving to
wards component based models, where applications are built by compositional modeling. The success of the object oriented programming paradigm, with its support for encapsulation and compartmentalization of functionality, has played a major role in facilitating this mode of application development. However, to make effective use of the OO paradigm, codes need to be written in a language that supports OOP. While this is the approach of choice for the next generation of applications, we argue that it is infeasible for several areas. Areas such as scientific computing have vast repositories of legacy non-OO codes. Rewriting and validating these applications in an OO language is an enormous software engineering endeavor.
Furthermore, with the development of increasingly sophisticated sensors, compute devices have begun interacting with the real world. This interaction presents a new dimension to application adaptivity. Given the vast stream of data that flows from sensor arrays, it may not be possible to apriori determine all the possible modes in which an application may interact with the environment. In this scenario, applications
would need to evolve both in code and state to meet the needs of a dynamically changing environment, a notion we call reconfigurable programming.
This proposal presents a new language independent intra-process framework called Weaves, for object based composition of unmodified code modules. In effect, weaves create intra-process modules (similar to objects in OOP) from code written in any language. Applications can be built by instantiating Weaves to form Tapestries of dynamically interacting code. The Weaves paradigm allows objects to be arbitrarily shared - it is a true superset of both processes as well as threads, with code sharing and fast context switching time similar to threads. The main feature of Weaves is that it does not require any special support from either the language or application code - practically any code can be weaved. Our current prototype implementation uses post-compiler analysis on ELF object files, supporting the vast majority of UNIX platforms and language compilers that use ELF format.
The Weaves framework being very general, presents several new directions for developing high-performance codes. First, it enables run time composition of legacy applications, without requiring that any code be rewritten or even modified. Second, weaves enables run-time adaptivity. Run-time adaptation requires support for checkpointing and recovery and support for dynamic insertion and removal of code modules. Currently, the majority of applications implement their own user-level checkpointing, which adds a lot to the complexity and maintainability of these code bases. The Weaves framework includes transparent support for fast checkpointing and recovery with no applicatio
n support. Our post-compiler analysis automatically determines the necessary state that needs to be saved and restored and presents a simple interface to this functionality. Finally, weaves support run-time flow migration, which enables automatic load-balancing of composed codes on parallel machines. The ability to migrate parts of an application as opposed to the whole gives finer grain control over load-balancing decisions.
This paper is organized as follows. In section 2 we motivate the needs for a transparent component-based framework. Section 3 describes the elements of the Weaves framework, including support for tuple spaces, automatic checkpointing and recovery, and code migration. In section 4, we present implementation details and evaluation results from the current prototype. Section 5 presents ongoing work in categorizing adaptivity, augmenting our declarative tapestry configuration with a high-level functional composition specification language, and using Weaves to provide run-time systems support for migration of parallel communication codes in a computational grid environment.
3 Motivation
The primary motivating application for the Weaves framework is a large-scale network emulation test-bed called the Open Network Emulator (ONE). The goals of the ONE project are
1. To provide a protocol development environment that closely models real world networks. This
requires scalability to the order of hundreds of thousands of virtual network nodes, where a network node is an end-host, a router or a network switching device.
2. To support execution of unmodified protocol and application code.
3. To integrate direct code execution and protocol simulation within a single framework. In the
traditional model, direct code execution operates in real time and hence lacks the controllability of virtual time simulation.
The above goals present several interesting research challenges. First, given the scalability goal - support for hundreds of thousands of virtual nodes - we need to analyze its impact on the architecture of the ONE. Even on a cluster computer with hundreds of physical nodes, we need the ability to emulate hundreds of virtual nodes – each with multiple applications and a protocol stack - on a single physical CPU. At this level of scalability, process context switching time becomes a major bottleneck.
The second major challenge stems from the need to support the direct execution of unmodified application and protocol code and network simulation. A direct code execution environment (DCEE) eli
minates the verification and validation problems associated with converting the event driven simulated version of the protocol to a real-world code implementation. However, this does not mean that DCEEs solve all problems, and network simulation is no longer necessary. Distributed network protocols, such as routing protocols are usually modeled as state machines. This makes it easier to convert the model into the event driven form needed for network simulation. An iterative design process is then used to refine the protocol and finally it is implemented in code. It is easy to see why a DCEE will be a cumbersome hindrance to this process. At each stage, the state machine has to be converted to code form before it can be tested - an unnecessary burden. The goal of the ONE is to present an integrated environment that supports both direct code execution of network protocols as well as event-based simulation.
Support for the above goals argues for a component-based design. However, component technologies such as CORBA and COM/DCOM require significant buy-in, in the form of commitment to a particular style of programming or an implementation technology. This is a serious hurdle for large, legacy code bases, which cannot be easily ported to a new modeling framework. The primary requirement here is for a transparent framework that supports modeling, composition, execution, adaptation, and control of high performance codes, which leads us to the Weaves reconfigurable programming framework.
4 The Weaves Reconfigurable Programming Framework
The design goal of the ONE requires support for multiple virtual hosts – each with a protocol stack and multiple applications - within a single process. To operate within the confines of the direct code execution paradigm, the above design goal necessitates a new programming paradigm. We will illustrate this need by a series of examples culminating in a paradigm that can support the compositional model shown in Figure 2.
Figure 1: Design goal of the Weaves framework. We need to support multiple unmodified network applications linked to protocol stacks, within a single process.
Before we attempt to design a programming model that can support the interaction shown in Figure 2, let us start with a case depicted in Figure 3. Here, two telnet applications are linked against a single IP stack, emulation one virtual end-host.
Figure 2: A simple compositional model with 2 telnet applications linked with a single IP stack. The composition creates a single virtual host.
In order to model the composition shown in Figure 3, let us start with a simple case, which we call process per virtual node model as shown in Figure 4a. This model allows a single network application – say telnet – to be linked to an IP stack. The external references of the telnet application will be bound to the stack, which then uses an underlying PDES layer. Multiple such processes can be run on a work
station to create a virtual network. This simple approach has several problems. First, we cannot link multiple telnet applications to the same IP stack without modifying the application. Since the IP stack in a network node is shared between all applications running at that node, traffic from the various applications can interfere with each other. For instance take a network node that supports both real time applications such as videoconferencing along with best effort applications such as FTP. The traffic from the real time application interferes with the best effort FTP application within the IP stack, resulting in possible less than adequate performance for the real-time application. This effect cannot be studied in the simple process per virtual node model presented here.
The second major problem with the process per virtual node model arises from scalability concerns. A large virtual network will involve several tens to hundreds of virtual nodes, each of which is a process. Inter-process context switch time will be a major concern.
Finally, the PDES algorithm needs to synchronize virtual time between all the processes that constitute the virtual network. If we use a process per virtual node, we increase the number of instantiations of the PDES algorithm and hence the number of independent time lines. This increases the chances of virtual nodes going out of synchronization, causing temporal rollbacks. If we can support multiple virtual nodes within a process, all virtual nodes within the process use a common timeline and hence d
o not cause unnecessary rollbacks. This reduces the scope and scale of temporal rollback to distinct physical processors participating in the simulation.
Virtual Host 1Virtual Host 2 (a) Process per virtual node model
(b) Threads model
Figure 3: The composition shown in figure 3 modeled under (a) process per virtual node model and (b) threads model. Neither of these models can achieve the composition shown in figure 3.
To address the above concerns, let us build a new model based on threads as shown in Figure 4b. In the threads model, the composition shown in figure 3 can be achieved by modifying the telnet application source to create two threads, or inserting a piece of stub code that creates two threads, with the start function of each thread set to the entry point - main() - of the telnet application. As before, the emulated IP stack linked to the application will resolve external references to the IP layer.
The major problem with this approach arises from updates to global variables, in particular telnet is not thread safe. Since threads share global variables, a telnet thread modifying a global variable will inadvertently change the state of the other – unrelated - telnet thread causing erroneous behavior. Ideally, we need 2 copies of all the global variables used in the telnet application. In programs that are explicitly threaded in design, sharing of global state in intentional. In our case, this sharing is neither intentional nor necessarily desirable. On the other hand, to create a shared IP stack, we need to share global state within the IP stack between the threads running through the IP stack. We also need to serialize access to global state through the use of mutex locks, which involves significant modifications to the source code. Even if we could make these modifications, we cannot ensure that the resultant code is deadlock free. In effect, we are back to our original verification and validation problem faced in network simulation. There is no easy way to formally ensure that the threaded version of the network application is equivalent to the unthreaded version.
The threads example illustrates the crux of our problem – conflicting needs – the need to avoid sharing global state between threads of the telnet application and the need for sharing global state between the threads running through the IP stack. The intuition here is that we need a programming model that allows arbitrary sharing of global state. Such a model can subsume both the thread and process model
s, since it can allow both complete sharing of global state as in threads as well as no sharing as in processes.
The solution to this problem through the use of a compiler-based approach evolves from our earlier work on EtheReal, a real time switch that provides bandwidth guarantees for switched Ethernet networks. One of the major components of EtheReal is an automatic fault detection and recovery mechanism that operates within the temporal confines of a real-time system. The fault detection and recovery mechanism relies on the fast construction of a distributed spanning tree – an algorithm that is critical for Ethernet routing. While the prototype implementation of EtheReal over a 3 switch network showed the feasibility of the approach, the size of the prototype network was too small to verify scalability. Secondly, we needed a controlled testbed environment to verify the correct operation of the spanning tree algorithm, in particular its ability to withstand the message loss during spanning tree construction.
springframework jar包下载To test the scalability and correctness of the spanning tree code, we decided to build a simulator that partially implements a direct code execution environment. The simulator simulates a point-to-point switched Ethernet network with multiple virtual Ethernet switches and exposes an interface similar to the internals of the Linux operating system. This allowed us to directly execute the spanning tree code
from the EtheReal implementation on the simulator, with one modification. All global state variables in the fault recovery code in the simulation are array versions of their counterparts in the implementation. Each virtual switch accesses its copy of the global variables by indexing into the array with its virtual switch identifier. In effect, the simulator uses a single copy of the spanning tree code and creates virtual instances of the algorithm by switching the state associated with the code. The principle here is similar to multi-tasking. If we can capture the current state of a state machine, it should be possible to replace the state to create a new instantiation of the machine.
If we take the next step along these lines, we could avoid the task of manually replacing global variables with arrays, by automating the process. This leads us to the first step towards Weaves reconfigurable programming framework.
4.1 Defining the Framework.
The major components of the reconfigurable programming framework are:
1. Module: A module is any object file or collection of object files defined by the user. Modules
have:
a. A data context, which is the global state of the module scoped within the object files of
the module.
b. A code context, which is the code contained within the object files that constitute the
module. The code context may have multiple entry point and exit point functions.
2. Bead: A bead is an instantiation of a module. Multiple instantiations of a module have
independent data contexts, but share the same code context.
3. Weave: A weave is a collection of data contexts belonging to beads of different modules. The
definition of a weave forms the core of the reconfigurable programming framework.
Traditionally, a process has a single name space mapped to a single address space. Weaves allow users to define multiple namespaces within a single address space, with user-defined control over the creation of a namespace.
4. String: A string is a thread of execution that operates within a single weave. Similar to the
threads model, multiple strings may execute within a single weave. However, a single string cannot operate under multiple weaves. Intuitively, a string operates within a single namespace.
Allowing a string to operate under multiple namespaces would violate the single valued nature of variables.
5. Tapestry: A tapestry is a set of weaves, which describes the structure of the composed
application. The physical manifestation of a tapestry is a single process.
The above definitions have equivalents in object-oriented programming. A module is similar to a class and a bead - which is an instantiation of a module – is similar to an object. Tapestries are somewhat similar to object hierarchies. The major exception is that interaction between beads within a tapestry involves runtime binding. We chose to use our own terminology to (a) avoid overloading the semantics of well-known OOP terms and (b) avert the implication that the framework requires the use of an OOP language.
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。
发表评论