
IV Seminário Fluminense de Engenharia

Niterói, RJ, Brazil, November 8-10, 2005

An Open Source Distributed Application for Network Data Sharing

Douglas Teixeira, Luiz Claudio Magalhaes and Maria Luiza Sanchez

Centro Tecnológico, Escola de Engenharia – Universidade Federal Fluminense

Rua Passo da Pátria, 156 – Bloco D – São Domingos – CEP: 24.210-240

Niterói, RJ – Brasil

{vidal,mluiza,schara}@midiacom.uff.br

ABSTRACT

This paper describes an application built for information sharing over networks.

It is a useful tool built on the UDP transport protocol for real-time sharing of any kind of information, such as text typed by users or information available locally on the machines. It is easy to adapt and to modify. In addition, we discuss our experience with concurrent network programming over UDP, including unexpected Operating System behaviors. We present the mechanisms developed to obtain good usability and performance while keeping the exchange of information between concurrent processes consistent and reliable.

Keywords: Communication protocols; Distributed network applications.

1. INTRODUCTION AND BACKGROUND

The work presented here is part of a larger project on Interactive Television within the free-software initiative funded by the Brazilian Government. The project has been developed at the MIDIACOM lab of the Telecommunications Engineering Department of the Federal Fluminense University. This paper presents a C implementation that has become an efficient and practical tool for network data sharing over UDP (STEVENS, W.RICHARD, 2001). The project is a fully distributed network application that allows users to interact in real time for dynamic information sharing. It also allows users to change, with simple code modifications, the type of the shared data: instead of sharing user-supplied information, it can share the processes running on the local machine, the external users who are logged in, and so on. Although the implementation is short, the resulting application is fairly complex because of the mechanisms needed to overcome some OS (Operating System) difficulties. During our work we had to deal with unexpected OS behaviors related to the environment we chose: Slackware Linux with kernel version 2.4.26, and also kernel 2.6.9 for testing. Reporting those experiences and our solutions may help programmers who face the same issues.

The Internet has become an important subject of study and a platform for emerging applications that serve many different needs. Most applications use TCP because it offers stronger semantics, namely reliable delivery, which the underlying Internet Protocol does not provide. Today, both the bandwidth available to Internet applications and their complexity have increased. UDP gives application programmers more freedom than TCP to design and build their own approaches at the transport level. Thus, depending on the purpose, UDP can serve as a base on which application programmers build their own protocols.

In terms of functionality, our application is very similar to the NETBIOS network service (wwwbiosguide, 2005). However, NETBIOS is a network identification service, while our application implements a database that works in a similar way to identify machines. An example of a complex distributed application using UDP can be found in (ZHUAN, GEELS, STOICA AND KATZ; SHELLEY, DENNIS, ION AND RANDY, 2003), where a keep-alive framework was built with a high level of reliability and efficiency. In (L. MAGALHAES AND R. KRAVETS, 2001) there is a suite of protocols, each for a different purpose, all implemented over UDP. In (L. MAGALHAES AND R. KRAVETS, 2001), which is related to the previous work, a new transport protocol for multimedia applications is specified; it addresses mobility and packet priority, and therefore control and flow mechanisms are also implemented. A good reference for implementing distributed network applications is (BIRMAN, KENNETH P, 1995), which includes interesting discussions of the evolution of distributed OS and related subjects. Basic knowledge about distributed OS and network applications can be found in (TANENBAUM, ANDREW S, 2000), and a good reference for distributed programming is (GALLMEISTER, BILL O, 1995).

This paper is organized as follows: Section 2 presents our application; Section 3 discusses the problems we had to deal with and the two internal mechanisms we developed to solve them; finally, Section 4 concludes the paper with a discussion of how the application can be applied to other data sharing purposes and of its impact on the stations.

2. APPLICATION OVERVIEW

The initial idea was to create a client-server pair for dynamic registration and querying of user data over the network. We instead developed one single application that allows the user both to register data and to query machines for registered data. The final result is therefore a fully distributed single application that dynamically shares any kind of information held at the local machine and searches the entire network for other registered information. A search can be made by specifying the wanted information or the machine name. The application was written in C on the Slackware distribution of Linux, version 10, with kernel version 2.4.26 and the gcc compiler.

Figure 1 shows the block diagram of the whole application. The program starts with a menu where the user can choose one of the following options: Input local data, Delete local data, Change local data, Search and Quit. If no information is stored, the change and delete options have no effect. If “Input local data” is chosen, the user is asked to type whatever he or she wants to share. The program then returns to the same menu and waits for a new option. At the same time, a child process starts to answer any request for the existing data, identified by the id of the querying packets. The options for changing and deleting local data remain available.

If the Search option is chosen, a broadcast probe is sent to find machines that are running the application and have data ready to share. Those machines answer the querying machine directly through its IP address. The retrieved information is displayed and stored together with the name of the machine it came from.

The local information to be shared resides in the following structure:

struct packet_data {
    char name_host[50];
    char data_buffer[INFO_SIZE];
    char ip[16];
    char id;
};

The name_host, ip and id members are filled in by the program and need no interaction from the user. The structure stores the local machine name and IP address, the data (data_buffer), whose length is configurable and which may be filled by the user at run time or by the program, and an “id” char that identifies the type of packet. Only three types of packets are defined: probing, answer and kill.
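The packet layout and its three types can be sketched as follows. This is only an illustration: the paper names the packet types but not their encoding, so the values of `ID_PROBING`, `ID_ANSWER` and `ID_KILL`, the value chosen for `INFO_SIZE`, and the helper `make_packet` are all assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define INFO_SIZE 4096   /* assumed value; the paper only says it is configurable */

/* Assumed encodings for the three packet types named in the text. */
enum { ID_PROBING = 'P', ID_ANSWER = 'A', ID_KILL = 'K' };

struct packet_data {
    char name_host[50];
    char data_buffer[INFO_SIZE];
    char ip[16];
    char id;
};

/* Hypothetical helper: fill a packet with the local host name, the given
 * dotted-quad IP string, optional data, and the packet type. Every string
 * member stays NUL-terminated because the struct is zeroed first and the
 * copies stop one byte short of each buffer. */
void make_packet(struct packet_data *pkt, char id,
                 const char *ip, const char *data)
{
    assert(pkt != NULL && ip != NULL);
    memset(pkt, 0, sizeof *pkt);
    if (gethostname(pkt->name_host, sizeof pkt->name_host - 1) != 0)
        strcpy(pkt->name_host, "unknown");
    strncpy(pkt->ip, ip, sizeof pkt->ip - 1);
    if (data != NULL)
        strncpy(pkt->data_buffer, data, sizeof pkt->data_buffer - 1);
    pkt->id = id;
}
```

A caller would build an answer with `make_packet(&pkt, ID_ANSWER, "192.168.0.7", "some text")` and pass `&pkt` and `sizeof pkt` to `sendto`.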

Probing packets are sent when the user asks to discover other computers in the network, that is, through the “Search Network for Data” menu option. When selected, this option suspends the main program and starts a child process. The child process sends broadcast packets over the network, searching for terminals that are running the application and have data to share (the application may run without any data to share, but in that case it does not answer querying calls). The child process waits for answers from the network for a time chosen by the programmer (we currently use 5 seconds). On receiving an answer to the query, it displays the machine’s name and the shared data and stores this information in a log file. Programmers are free to filter the incoming data by specifying the machine’s name or the expected information. After that, the broadcasting child process is killed and the main program returns to ...

...

its activity. Data returned by a broadcast probe is kept in local buffers only until more data is received from the same answering machine. In this way we guarantee network data consistency every time an application performs a broadcast search. Furthermore, probing packets do not follow the packet structure above, since they have no local data to send. They are therefore smaller and carry only the information that network machines need in order to answer: host name, id and ip. They need no interaction from the application programmer, and the packet is declared as “pkt_data_broad” in the main program. Sending the ip address at the application level is necessary because of an OS peculiarity: in our environment, the recvfrom function call did not fill in the sender address structure for broadcast datagrams, so processes receiving network broadcast datagrams could not answer the sending process directly. We have overcome this by explicitly storing the querying machine’s ip address in the packet.
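A minimal sketch of this workaround is shown below: the probe carries the sender's IP address as text, and the answering side converts that text into a unicast destination instead of relying on the address returned by recvfrom. The struct name `pkt_probe`, the functions, the port `APP_PORT` and the id value are assumptions made for illustration; the paper gives only the member list and the variable name “pkt_data_broad”.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define APP_PORT   5000   /* hypothetical port; not given in the paper */
#define ID_PROBING 'P'    /* assumed encoding of the probing id */

/* Reduced probing packet: host name, ip and id, without a data buffer. */
struct pkt_probe {
    char name_host[50];
    char ip[16];
    char id;
};

/* Querying side: build a probe that carries the sender's own IP as text. */
void fill_probe(struct pkt_probe *p, const char *host, const char *own_ip)
{
    assert(p != NULL && host != NULL && own_ip != NULL);
    memset(p, 0, sizeof *p);
    strncpy(p->name_host, host, sizeof p->name_host - 1);
    strncpy(p->ip, own_ip, sizeof p->ip - 1);
    p->id = ID_PROBING;
}

/* Answering side: derive the unicast reply address from the ip member.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int reply_address(const struct pkt_probe *p, struct sockaddr_in *dst)
{
    char ip[sizeof p->ip];

    assert(p != NULL && dst != NULL);
    memcpy(ip, p->ip, sizeof ip);
    ip[sizeof ip - 1] = '\0';            /* never trust network input */
    memset(dst, 0, sizeof *dst);
    dst->sin_family = AF_INET;
    dst->sin_port = htons(APP_PORT);
    return inet_pton(AF_INET, ip, &dst->sin_addr) == 1 ? 0 : -1;
}

/* Open a UDP socket that is allowed to send to a broadcast address. */
int open_broadcast_socket(void)
{
    int on = 1;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s >= 0 &&
        setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof on) != 0) {
        close(s);
        return -1;
    }
    return s;
}
```

The querying process would send the filled probe with `sendto` on the broadcast socket; each answering machine would call `reply_address` on the received probe and send its answer to the resulting address.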

Answer packets are sent in reply to querying calls. Since a machine blocked in a recvfrom function call accepts any kind of packet sent to it, the application accepts only packets with the answer id while a query is in progress, which avoids packet mismatches. Answer packets are sent by a child process that runs in parallel with the main program and is created with the fork function call. This child process starts when the application has data to share. After the user has entered data via the “Input Local Data” menu option, the main program returns to the starting point of its loop and searches the local buffer for data; if data is present, the child process starts and waits quietly for probing or kill packets. When a probing packet arrives, the application reads the sending machine’s ip address from the “ip” structure member and creates another child process to handle the answer. The new child process then sends its own name, ip address, shared data and answer id to the querying machine in the answer packet. When the waiting child process receives a “kill” packet, it changes or deletes the local shared data. Every time a probing packet arrives at the waiting child process, another child process is created and then killed, which ensures an immediate answer and a fast return to the waiting state.
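On the querying side, the id filter and the bounded waiting time could look like the sketch below. It is not the authors' code: the function names, the use of select with a monotonic clock, the value of `INFO_SIZE` and the encoding of `ID_ANSWER` are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

#define INFO_SIZE 4096   /* assumed value */
#define ID_ANSWER 'A'    /* assumed encoding of the answer id */

struct packet_data {
    char name_host[50];
    char data_buffer[INFO_SIZE];
    char ip[16];
    char id;
};

/* Open a UDP socket bound to `port` on all interfaces (0 = any free port). */
int open_udp_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(s);
        return -1;
    }
    return s;
}

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Listen on `sock` for `window_sec` seconds. Packets whose id is not
 * ID_ANSWER (or whose size is wrong) are dropped; accepted answers are
 * printed. Returns the number of accepted answers. */
int collect_answers(int sock, int window_sec)
{
    struct packet_data pkt;
    double deadline = now_sec() + window_sec;
    int accepted = 0;

    assert(sock >= 0 && window_sec >= 0);
    for (;;) {
        double left = deadline - now_sec();
        struct timeval tv;
        fd_set rd;
        ssize_t n;

        if (left <= 0.0)
            break;
        tv.tv_sec = (time_t)left;
        tv.tv_usec = (suseconds_t)((left - (double)tv.tv_sec) * 1e6);
        FD_ZERO(&rd);
        FD_SET(sock, &rd);
        if (select(sock + 1, &rd, NULL, NULL, &tv) <= 0)
            break;                       /* timeout or error ends the window */
        n = recvfrom(sock, &pkt, sizeof pkt, 0, NULL, NULL);
        if (n != (ssize_t)sizeof pkt || pkt.id != ID_ANSWER)
            continue;                    /* not an answer: ignore it */
        pkt.name_host[sizeof pkt.name_host - 1] = '\0';
        pkt.data_buffer[sizeof pkt.data_buffer - 1] = '\0';
        printf("%s: %s\n", pkt.name_host, pkt.data_buffer);
        accepted++;
    }
    return accepted;
}
```

With the 5 second window mentioned earlier, the probing child would call `collect_answers(sock, 5)` right after sending its broadcast probe.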

Kill packets are sent every time the user chooses the “Delete local data” or “Change local data” menu options. They are loopback packets intended only to end or change the current state of the child process. The same effect could be sought by killing the child process from the main application; however, one last child would still be alive, waiting for a probing packet, and would return an inaccurate answer to the querying process. Sending a loopback packet changes the child process state immediately.
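A kill packet of this kind might be sent as in the following sketch. The function name `send_kill`, the port parameter, the id encoding and the choice to reuse the full packet structure for kill packets are assumptions; the paper states only that kill packets travel over the loopback interface.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#define INFO_SIZE 4096   /* assumed value */
#define ID_KILL   'K'    /* assumed encoding of the kill id */

struct packet_data {
    char name_host[50];
    char data_buffer[INFO_SIZE];
    char ip[16];
    char id;
};

/* Send a kill packet to the answering child listening on `port` of the
 * local machine. The packet goes to 127.0.0.1, so it never leaves the
 * host. Returns 0 on success, -1 on failure. */
int send_kill(int sock, unsigned short port)
{
    struct packet_data pkt;
    struct sockaddr_in self;
    ssize_t n;

    assert(sock >= 0);
    memset(&pkt, 0, sizeof pkt);
    pkt.id = ID_KILL;

    memset(&self, 0, sizeof self);
    self.sin_family = AF_INET;
    self.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    self.sin_port = htons(port);

    n = sendto(sock, &pkt, sizeof pkt, 0,
               (struct sockaddr *)&self, sizeof self);
    return n == (ssize_t)sizeof pkt ? 0 : -1;
}
```

The child that is blocked in recvfrom wakes up on this packet, sees the kill id, and can then reload or discard its copy of the shared data before waiting again.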

Finally, when the user chooses the “Finish program” menu option, we ensure that not only the main process is terminated but also any answering child processes that may remain. Interrupting the program abruptly is therefore not recommended, since it can leave the shared information inconsistent.

3. APPLIED MECHANISMS FOR UNDESIRABLE BEHAVIORS

The first problem our application had to deal with was building a model for receiving multiple calls and handling them separately in a child process working in parallel with the main process. TCP offers the listen call which, combined with a simple fork, provides communication between concurrent processes. With UDP, a simple fork combined with the recvfrom function call should be sufficient. However, it did not behave the way we expected. Figure 2 shows the behavior we expected, and Figure 3 shows what really happens if no further treatment is done in the child process. To overcome that situation, we built the simple model detailed in Figure 4: every time an identified probing packet arrives, a new process is created to answer it, while the primary child process is forced back to the beginning of its internal loop. The answering child processes are killed as soon as they finish their tasks.
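The model of Figure 4 can be sketched as one listener step that forks a short-lived answering child for each probing packet. This is a sketch under assumptions, not the authors' implementation: the function names, the id encodings, the `reply_port` parameter, and the use of the full packet structure for probes (the paper uses a smaller probing packet) are all choices made to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#define INFO_SIZE 4096   /* assumed value */
enum { ID_PROBING = 'P', ID_ANSWER = 'A', ID_KILL = 'K' };  /* assumed */

struct packet_data {
    char name_host[50];
    char data_buffer[INFO_SIZE];
    char ip[16];
    char id;
};

/* Ask the kernel to reap finished children so that the short-lived
 * answering processes do not accumulate as zombies. */
void auto_reap_children(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigaction(SIGCHLD, &sa, NULL);
}

/* Handle exactly one datagram on `sock`. For a probing packet, fork a
 * child that sends `local` as an answer to the IP carried in the probe
 * (port `reply_port`) and exits; the caller returns at once to its loop.
 * Returns the id of the received packet, or -1 on error. */
int serve_one(int sock, const struct packet_data *local,
              unsigned short reply_port)
{
    struct packet_data in;
    pid_t pid;

    assert(sock >= 0 && local != NULL);
    if (recvfrom(sock, &in, sizeof in, 0, NULL, NULL) != (ssize_t)sizeof in)
        return -1;
    if (in.id != ID_PROBING)
        return in.id;                 /* e.g. ID_KILL: caller decides */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                   /* answering child */
        struct sockaddr_in dst;
        struct packet_data out = *local;

        memset(&dst, 0, sizeof dst);
        dst.sin_family = AF_INET;
        dst.sin_port = htons(reply_port);
        in.ip[sizeof in.ip - 1] = '\0';
        out.id = ID_ANSWER;
        if (inet_pton(AF_INET, in.ip, &dst.sin_addr) == 1)
            sendto(sock, &out, sizeof out, 0,
                   (struct sockaddr *)&dst, sizeof dst);
        _exit(0);                     /* child ends right after answering */
    }
    return ID_PROBING;                /* parent: back to the waiting state */
}
```

The waiting child would call `auto_reap_children()` once and then run `serve_one` in a loop, leaving the loop or reloading its data when the returned id is `ID_KILL`.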

The second issue was a set of unexpected events in the main application when using some stdio library functions, such as getc and getchar, together with the recvfrom function [9]. It is quite intuitive that, when programming with more than one descriptor, one must pay attention to which descriptor is being listened to at a given moment. Despite being careful about that, we could not manage to mix the receiving child process with the user menu prompt and the broadcast querying operation. The trouble occurred after two well defined events: first, the user registers some data (“Input local data” option); then, the user chooses to query the network (“Search Network for Data” option). These two selections triggered an endless loop. To solve that, we did the following: after the querying option is selected, we suspend only the main process, where the user menu runs (keeping the answering child process running in parallel), and start another child process to send the probing packets. In that way, the application keeps running normally without affecting the user.
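One way to obtain this behavior is to run the probing step in its own child and keep the menu process blocked in waitpid until that child ends, so that the menu never reads from stdin while the probe is in progress. The sketch below assumes this reading of the text; `run_blocking_child` and `probe_network_stub` are hypothetical names, and the stub stands in for the real probing routine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the probing routine. A real one would send the broadcast
 * probe and collect the answers for the chosen waiting time. */
int probe_network_stub(void)
{
    return 0;
}

/* Run `task` in a child process and block the caller until it finishes.
 * While the caller sits in waitpid it touches neither stdin nor any
 * socket, so only the child is active on those descriptors.
 * Returns the child's exit status, or -1 on failure. */
int run_blocking_child(int (*task)(void))
{
    pid_t pid;
    int status;

    assert(task != NULL);
    fflush(NULL);                  /* avoid duplicated stdio buffers */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = task();
        fflush(NULL);              /* push out anything the task printed */
        _exit(rc);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The menu loop would call `run_blocking_child(probe_network_stub)` when the search option is selected and redraw the menu only after it returns. The answering child is a separate process and keeps running throughout.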

4. CONCLUSION

We have developed a single fully distributed network application for data sharing over the UDP protocol. We tested the application in a network with four stations running it amid normal daily traffic: machine A was an Athlon 2800 MHz with 1 GB of RAM; machines B and C were two Athlon 2600 MHz machines with 512 MB of RAM; and machine D was a Duron 1000 MHz with 256 MB of RAM. Tests were done as follows: each machine ran the application automatically and shared its table of running processes as a string with a length between 3 and 3.5 KB. The entire data packet has about 5 KBytes, and probing packets have 80 Bytes. Each machine queried the network at intervals of 4 seconds: 3 seconds for the application to listen for all incoming packets and 1 second of waiting time before each new request. The machines’ process tables were updated each time a request was made.

Figure 2: Expected performance

Figure 3: Obtained performance

Figure 4: Performance after treatment

Using the Linux top [10] utility, we were able to analyze each machine’s memory and CPU usage. After 30 minutes we obtained the following means:

Machine   CPU usage (%)   Memory usage (%)
A         0.3             0.08
B         0.3             0.1
C         0.3             0.1
D         0.392           0.228

As expected, machines with faster processors and more memory handled the application better. However, even for machine D both CPU and memory usage were low compared to other processes (such as the top process itself, which used on average more than 1% of CPU and 0.4% of memory on machine D). This indicates that our approach placed little load on the system, even under a stress situation.

Although the application was developed for dynamic sharing of user information, it can be extended to any kind of data sharing. A possible implementation is to start the application with the data buffer already filled with some kind of machine information, as was done in the test. Implementing a timer to update the shared buffer is useful to keep the shared data current with the desired time granularity.
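Such a timer could be as simple as the sketch below, which rebuilds the shared buffer once a chosen period has elapsed. The structure `shared_buffer`, the function `refresh_if_due` and the producer `snapshot_stub` are hypothetical; a real producer would write, for example, the current process table into the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define INFO_SIZE 4096   /* assumed value */

struct shared_buffer {
    char data[INFO_SIZE];
    time_t last_update;   /* time of the last refresh, 0 if never */
};

/* Stand-in producer: writes a fixed string where a real one would write
 * fresh machine information. */
void snapshot_stub(char *dst, size_t cap)
{
    snprintf(dst, cap, "snapshot");
}

/* Refresh the buffer through `produce` when at least `period` seconds
 * have passed since the last refresh. `now` is passed in by the caller
 * (normally time(NULL)). Returns 1 if refreshed, 0 otherwise. */
int refresh_if_due(struct shared_buffer *sb, time_t now, time_t period,
                   void (*produce)(char *dst, size_t cap))
{
    assert(sb != NULL && produce != NULL);
    if (now - sb->last_update < period)
        return 0;
    produce(sb->data, sizeof sb->data);
    sb->last_update = now;
    return 1;
}
```

Calling `refresh_if_due(&sb, time(NULL), 4, snapshot_stub)` before each answer would keep the shared data at most about four seconds old, which matches the query interval used in the test.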

5. REFERENCES

BIRMAN, KENNETH P (1995). Building Secure and Reliable Network Applications. Department of Computer Science Cornell University. Digital version

GALLMEISTER, BILL O (1995). POSIX 4: Programming for the Real World. O'Reilly

L. MAGALHAES AND R. KRAVETS (2001). “MMTP: Multimedia Multiplexing Transport Protocol”. The First Workshop on Data Communications in Latin America and the Caribbean (SIGCOMM-LA 2001).

L. MAGALHAES AND R. KRAVETS (2001). “Transport Level Mechanisms for Bandwidth Aggregation on Mobile Hosts” The 9th International Conference on Network Protocols (ICNP 2001).

ROSS, KEITH W. AND KUROSE, JAMES F (2001). Computer Networking: a top-down approach featuring the Internet. Boston: Addison Wesley, Inc.

STEVENS, W.RICHARD (2001). Unix Network Programming: Interprocess Communications with Advanced Programming in the UNIX Environment. Prentice Hall PTR.

TANENBAUM, ANDREW S (2000). Distributed Operating System. 2ND Edition.

ZHUAN, GEELS, STOICA AND KATZ; SHELLEY, DENNIS, ION AND RANDY (2003). Exploring tradeoffs in failure detection in routing overlays. Report No. UCB/CSD-3-1285. Computer Science Division – University of California.
