Like Hadoop and unlike most DG, PAR is designed to be used exclusively on private resources. PAR’s ideal scale is therefore smaller than what DG systems usually target, but this permits lower latency. For simplicity, PAR uses pull-driven task distribution: idle workers ask the server for work instead of having work assigned to them. This removes the need for a complex software component (a scheduler) and also allows PAR to scale smoothly even in large, dynamic and heterogeneous environments. In addition, PAR never requires administrator privileges and runs only on demand.
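To make pull-driven distribution concrete, here is a minimal single-machine sketch using Python threads and a shared queue. It is not PAR's implementation: the function name `run_pull_driven` and its parameters are hypothetical, and a real deployment would pull tasks over the network rather than from an in-process queue. It only illustrates why no scheduler is needed: each worker takes the next task when it becomes idle.

```python
import queue
import threading


def run_pull_driven(commands, n_workers, execute):
    """Run every command once; idle workers pull the next task themselves.

    Hypothetical helper for illustration only, not part of PAR.
    """
    tasks = queue.Queue()
    for index, command in enumerate(commands):
        tasks.put((index, command))
    results = [None] * len(commands)

    def worker():
        while True:
            try:
                # The worker asks for work; nothing is pushed to it.
                index, command = tasks.get_nowait()
            except queue.Empty:
                return  # no task left, so the worker simply stops
            results[index] = execute(command)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because faster workers come back for tasks more often, the load balances itself across heterogeneous machines without any central planning.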

3 Example use

The first example experiment computes the alpha-carbon root-mean-square deviation after optimal superposition, noted CαRMSDopt hereafter, on one thousand ab initio generated structures for the protein target 256B. Distances between proteins are computed using the software from Zhang and Skolnick (2004). The second experiment performs Molecular Replacement (MR), a method of solving the phase problem in X-ray crystallography using homologous structures, on a set of 192 decoys for the protein target 1m6t. We present the time elapsed with and without PAR. In parallel mode PAR uses several cores of a single computer, while in distributed mode it uses distinct computers. The current implementation of PAR is known to work well with up to 16 CPUs in parallel mode and up to 64 CPUs in distributed mode.

Prior to the timing experiments, the required programs and data were copied to each machine by the user. During the experiments, PAR was started in server mode with a list of commands to execute. Workers were started soon after the server, but could have joined the computation later had we not been interested in the shortest completion time. The Unix ’time’ command was used to measure the real time spent by PAR to complete all tasks, averaged over two trials. Unlike previous job crushers, the PAR server’s life cycle is tied only to the application’s execution time (no Unix daemon is involved) and PAR runs only in user space.
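The measurement protocol (wall-clock time averaged over two trials) can be sketched in Python as follows. This is an illustration only: the experiments used the Unix ’time’ command, whereas this sketch swaps in `time.perf_counter`, and the helper name `average_wall_time` is hypothetical.

```python
import subprocess
import sys
import time


def average_wall_time(command, trials=2):
    """Return the mean wall-clock time, in seconds, of running `command`.

    Hypothetical helper; `command` is an argument list as accepted by
    subprocess.run.
    """
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        subprocess.run(command, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


if __name__ == "__main__":
    # Stand-in workload for illustration: a Python process that exits at once.
    print(average_wall_time([sys.executable, "-c", "pass"]))
```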

Results are shown in Figure 1. The first bar is the real time elapsed without PAR. The second bar is the time spent using PAR in parallel mode; the following bars are durations in distributed mode. On a CPU-intensive task with 16 CPUs, the speedup obtained by PAR can be as high as 14.01 in the parallel case and 15.54 in the distributed one. The lower performance of the parallel version is attributed to Python’s limitation with multithreaded applications (the Python interpreter uses a global lock shared by all threads). The application scales remarkably well: both speedups are close to the ideal value of 16. The overhead due to communications between workers and the master is very small, which allows effective use of the parallel hardware with minimal effort on the user’s side.
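The reported speedups can be turned into parallel efficiency (speedup divided by the number of CPUs) to see how close each mode is to the ideal. The sketch below uses only the two speedups quoted above; the function names are introduced here for illustration, and the underlying run times are not given in the text, so `speedup` is shown without example values.

```python
def speedup(serial_time, parallel_time):
    """Speedup is the serial run time divided by the parallel run time."""
    return serial_time / parallel_time


def parallel_efficiency(measured_speedup, n_cpus):
    """Efficiency is the fraction of the ideal n_cpus-fold speedup achieved."""
    return measured_speedup / n_cpus


if __name__ == "__main__":
    for mode, value in (("parallel", 14.01), ("distributed", 15.54)):
        print(mode, round(parallel_efficiency(value, 16), 3))
```

With 16 CPUs this gives an efficiency of about 88% in parallel mode and about 97% in distributed mode.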

4 Future developments

PAR can be used on a network of Unix-like workstations. It can take advantage of a shared Network File System (NFS). However, because of poor NFS performance, data-intensive tasks should be computed on top of a Distributed File System (DFS). As DFS are still rare even within clusters, we plan to add such functionality to PAR. A prototype has been implemented but is still experimental.

PAR should integrate fault-tolerance policies so that it can be used safely with more workers over longer periods, with minimal overhead.

Furthermore, compression could be added to speed up communications. Encryption would be similarly easy to add and would allow PAR to be used over untrusted networks.
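As an illustration of how message compression might look, the sketch below serializes a message as JSON and compresses it with the standard zlib module. The helper names `pack` and `unpack` and the message layout are assumptions made for this example; they do not describe PAR's actual wire format.

```python
import json
import zlib


def pack(message):
    """Serialize a JSON-compatible message and compress it for sending."""
    return zlib.compress(json.dumps(message).encode("utf-8"))


def unpack(payload):
    """Reverse pack(): decompress, then parse the JSON text."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical message: a list of similar command lines compresses well.
    message = {"commands": ["echo %d" % i for i in range(100)]}
    raw_size = len(json.dumps(message).encode("utf-8"))
    print(raw_size, len(pack(message)))
```

Lists of near-identical command lines are highly redundant, which is the case where compression is most likely to pay off.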

Finally, features can be added for large-scale experiments. For example, requesting groups of jobs instead of one at a time would lower the load on the server. Allowing PAR to run both as a server and as a client would allow it to be deployed in layers, which could be used to connect several clusters together and increase scalability. Requests and contributions from users are also considered.
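Requesting jobs in groups could be sketched as follows. This is a hypothetical illustration, not an existing PAR feature: `take_batch` and `batch_size` are names introduced here, and the pending jobs are modeled as an in-memory deque on the server side.

```python
from collections import deque


def take_batch(pending, batch_size):
    """Remove and return up to batch_size jobs from the front of pending."""
    batch = []
    while pending and len(batch) < batch_size:
        batch.append(pending.popleft())
    return batch


if __name__ == "__main__":
    pending = deque("job%d" % i for i in range(5))
    while pending:
        print(take_batch(pending, 2))
```

With a batch size of k, a worker contacts the server roughly once per k jobs, at the cost of coarser load balancing near the end of the run.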
