Google Summer of Code 2009

OAR

Web Page: http://wiki-oar.imag.fr/index.php/Google_summer_of_code

Mailing List: oar-gsoc@lists.gforge.inria.fr

OAR is a Resource Management System (RMS) for high-performance computing clusters. It is based on an original design that emphasizes low software complexity through the use of high-level components. The global architecture is built on the scripting languages Perl and Ruby, a relational database engine (MySQL or PostgreSQL), and TakTuk (http://taktuk.gforge.inria.fr/), a parallel, scalable remote execution tool for clusters. OAR serves as the base for other sub-projects such as CiGri (http://cigri.imag.fr) and ComputeMode (http://comutemode.imag.fr). CiGri is a simple computing grid for parametric applications that harvests idle CPUs; it plugs into OAR for grid scientific experiments. Moreover, OAR is the RMS of the French experimental grid platform Grid5000 (http://grid5000.fr). The objective of the OAR project is to show that a complex resource management system can be built today with such tools without sacrificing efficiency or scalability.

Projects

  • Advanced scheduler for OAR/CiGri: The current scheduler of OAR/CiGri has two strong limitations: it is FIFO-based and it does not support parallel tasks. When grids run at high load, users may be blocked for days even if they need very few resources. In addition, the lack of support for parallel tasks prevents the execution of traditional HPC applications. The goal of this project is to develop a new configurable and extensible scheduler that supports parallel tasks, fair sharing, several queues with priorities, and job interlacing.
  • OAR Fault tolerance: OAR is an open source batch scheduler that provides simple and flexible use of a cluster. The aim of this project is to make OAR fault tolerant. Two critical components have to be reinforced: the OAR server and the database (MySQL and PostgreSQL). At present, the OAR system has no such protection, which can become a serious problem as installations scale. The idea is to survey, compare, test and implement the best solution for server and database fault tolerance.
  • OAR LiveCD and Virtualization: I propose to create a tool, to be called Kameleon, that generates OAR appliances or ISO images. The OAR project already has a script that creates a LiveCD from the Debian packages. I intend to start from this script and add the following features: support for RPM-based distributions, support for various output formats (cow, iso, tgz), and auto-configuration for virtual clusters (i.e. ZeroConfig).
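
The scheduling problem described in the first project can be illustrated with a small sketch. It is not the OAR or CiGri scheduler and does not use their APIs; the names Job, fifo_order and fair_share_order, and the CPU-hour accounting, are hypothetical and chosen only for illustration. It shows how a small job waits behind large ones under strict FIFO, and how a simple fair-sharing rule (serve the user who has consumed the least so far) lets it run earlier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    cpus: int   # number of CPUs requested
    hours: int  # estimated run time in hours


def fifo_order(jobs):
    """Strict FIFO: jobs run in submission order."""
    return list(jobs)


def fair_share_order(jobs, usage):
    """Repeatedly pick a job from the user with the least consumed CPU-hours.

    `usage` maps user -> CPU-hours already consumed (hypothetical accounting).
    Ties are broken by submission order, since min() returns the first minimum.
    """
    consumed = defaultdict(float, usage)
    pending = list(jobs)
    ordered = []
    while pending:
        job = min(pending, key=lambda j: consumed[j.user])
        pending.remove(job)
        ordered.append(job)
        # Charge the user for the resources this job is expected to consume.
        consumed[job.user] += job.cpus * job.hours
    return ordered


jobs = [Job("alice", 64, 48), Job("alice", 64, 48), Job("bob", 1, 1)]
print([j.user for j in fifo_order(jobs)])            # ['alice', 'alice', 'bob']
print([j.user for j in fair_share_order(jobs, {})])  # ['alice', 'bob', 'alice']
```

With FIFO, bob's one-CPU job waits behind both of alice's large jobs; with the fair-sharing rule it runs as soon as alice has been charged for her first job. A real scheduler would also have to check resource availability, queue priorities and parallel task placement, none of which is modelled here.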