Current short-term plan

This plan will focus on these aspects:

Provide something usable for year 2002-2003

The main focus of this work is to provide a simple tool that can be readily used. Wayne State University (WSU) has been developing a scheduler that builds on some of the functionality of LSF, the system currently used for job submission.

The model for year 2001-2002 has been the following:

The new scheduler developed at WSU does the following:

Therefore the main task is to help manage job submission over multiple files, and to allow the files to reside on the local disks of the farm nodes. This work has been developed at Wayne State University by Vishist Mandapaka, a student, under the supervision of Prof. Claude Pruneau. The scheduler has the following strong point:

Freeze the user interface for job submission for the years to come

The scheduler at WSU has a user interface different from pure LSF. We have been reviewing this interface so that it can remain unchanged even if we change the underlying scheduler (e.g. if we move to Condor or other tools).

Note that changing both the underlying scheduler and the user interface would be traumatic for STAR. STAR users are already submitting jobs, and we cannot expect them to change their scripts and habits in one day. A period of transition has to be allowed. The following has to be taken into consideration:

By changing only the user interface and keeping the underlying scheduling system (the Wayne scheduler runs on LSF, which users currently use directly), farm managers can help users modify their scripts and can tune the system according to the change in usage patterns. Afterwards the actual scheduler can be changed while the interface is kept. This has the following advantages:

Jobs are described through an XML file; the detailed description can be found here.

Define an architecture that allows migration to other GRID tools

The Wayne scheduler is designed as a stand-alone project, and not with the GRID in mind. Since, as we said, this is a temporary solution, we need a software architecture that allows us to integrate other components (e.g. the file catalog, the database catalog, ...) and to change the underlying implementation (e.g. Condor instead of LSF).

In order to do that, we created a modular architecture in which different parts of the scheduler are decoupled, so that those functions can be assigned to different GRID tools.

The scheduling process is divided into the following parts:

The proposed architecture is described in more detail here.
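
The decoupling described in this section can be illustrated with a minimal Java sketch. It is not the actual scheduler code: the names Dispatcher, LsfDispatcher, CondorDispatcher and Scheduler are hypothetical, and the dispatchers only return a string describing what they would do instead of calling LSF or Condor. The point is that the user-facing submit() call stays the same while the backend behind it is swapped.

```java
public class ModularSchedulerSketch {

    /** Hypothetical contract for the part that hands a job to a batch system. */
    public interface Dispatcher {
        String dispatch(String jobScript);
    }

    /** Stand-in for an LSF backend; a real one would call LSF here. */
    public static class LsfDispatcher implements Dispatcher {
        @Override
        public String dispatch(String jobScript) {
            return "LSF <- " + jobScript;
        }
    }

    /** Stand-in for a Condor backend; a real one would call Condor here. */
    public static class CondorDispatcher implements Dispatcher {
        @Override
        public String dispatch(String jobScript) {
            return "Condor <- " + jobScript;
        }
    }

    /** Hypothetical front end: users only ever see submit(). */
    public static class Scheduler {
        private final Dispatcher dispatcher;

        public Scheduler(Dispatcher dispatcher) {
            this.dispatcher = dispatcher;
        }

        public String submit(String jobScript) {
            // Other decoupled parts (e.g. a file catalog lookup) would be
            // consulted here through their own interfaces.
            return dispatcher.dispatch(jobScript);
        }
    }

    public static void main(String[] args) {
        // Same user-facing call, two different backends.
        System.out.println(new Scheduler(new LsfDispatcher()).submit("job.csh"));
        System.out.println(new Scheduler(new CondorDispatcher()).submit("job.csh"));
    }
}
```

Under these assumptions, moving from LSF to Condor only changes which Dispatcher is passed to the front end; user scripts that call submit() are untouched.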

Provide user interface for scheduler policy management

One thing the farm managers will want to be able to change is the policy by which a job is associated with its input files, divided into different jobs, and dispatched to the machines. The policy manager in the future will:

To provide this, the proposed architecture isolates the policy behavior behind a Java interface. Because Java can load a class by its name, the policy can be changed without having to redeploy. This allows the policy manager to test a new policy, change it gradually, and go back to the previous policy without disruption.

The details on the Policy class can be found here.
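
As an illustration of loading a policy by class name, here is a minimal sketch. It does not reproduce the real Policy class: the interface signature, the two example policies, and the system property name scheduler.policy are all assumptions made for this example. Only the standard reflection calls (Class.forName, asSubclass, getDeclaredConstructor, newInstance) are real Java API.

```java
import java.util.ArrayList;
import java.util.List;

public class PolicyLoaderSketch {

    /** Hypothetical policy contract: groups input files into jobs. */
    public interface Policy {
        List<List<String>> assign(List<String> inputFiles);
    }

    /** Hypothetical policy: one job per input file. */
    public static class OneFilePerJobPolicy implements Policy {
        @Override
        public List<List<String>> assign(List<String> inputFiles) {
            List<List<String>> jobs = new ArrayList<>();
            for (String file : inputFiles) {
                jobs.add(List.of(file));
            }
            return jobs;
        }
    }

    /** Hypothetical policy: all input files in a single job. */
    public static class SingleJobPolicy implements Policy {
        @Override
        public List<List<String>> assign(List<String> inputFiles) {
            List<List<String>> jobs = new ArrayList<>();
            jobs.add(new ArrayList<>(inputFiles));
            return jobs;
        }
    }

    /** Instantiates a policy from its fully qualified class name. */
    public static Policy loadPolicy(String className)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        // asSubclass fails with ClassCastException if the class is not a Policy.
        return cls.asSubclass(Policy.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // "scheduler.policy" is an assumed property name, not a real setting.
        String name = System.getProperty(
                "scheduler.policy", OneFilePerJobPolicy.class.getName());
        Policy policy = loadPolicy(name);
        System.out.println(policy.assign(List.of("a.root", "b.root")));
    }
}
```

With this kind of arrangement, switching policy means changing one configured class name; reverting means restoring the old name, and no scheduler code is rebuilt.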


Gabriele Carcassi - page was last modified