Job Scheduling

 

Abstract

Web services are now established as a viable component architecture for constructing large, distributed systems.  Good progress has been made on data grid components, in particular the Storage Resource Manager (SRM) interface specification, for which multiple, interoperable implementations are being developed.  The logical next step is to begin prototyping the deployment of computational components, in particular meta-scheduler Web Services.  Both the STAR experiment and Jefferson Lab are ready to begin this prototyping effort, having already deployed SRMs and other XML technology.  By studying the architectural components of the STAR Scheduler [1] and the JLab Auger system [2], these two groups propose to develop a common meta-scheduler Web Service interface definition, analogous to the SRM data management interface specification, as part of their respective Year 3 PPDG activities.  The prototype implementations will immediately serve their respective communities, and will serve as valuable input to a larger community effort.


Introduction

Web services are typically described using WSDL, with remote service invocations conveyed via XML-encoded messages. These two aspects make them ideal for constructing loosely coupled, independently evolving components.  First, because WSDL describes all aspects of a call without regard to implementation, a specification described in this way is by construction an interoperability specification.  Second, since XML passes data by name and not by position, it supports adding new features without breaking backward compatibility (i.e., schema evolution, with only a few constraints on naming).  In particular, optional arguments or returned values can easily be added without breaking old clients or servers; such changes can be transparent.
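To make the backward-compatibility point concrete, the sketch below (plain Python, with hypothetical element names that are not actual STAR or JLab message formats) shows an old-style client reading an XML reply by name; a newer server can add an optional element without breaking it:

```python
# Sketch: name-based XML messaging tolerates new optional fields.
# All element names here are hypothetical, for illustration only.
import xml.etree.ElementTree as ET

OLD_REPLY = "<jobStatus><id>42</id><state>RUNNING</state></jobStatus>"
# A newer server adds an optional <queue> element; old clients ignore it.
NEW_REPLY = ("<jobStatus><id>42</id><state>RUNNING</state>"
             "<queue>batch</queue></jobStatus>")

def parse_status(xml_text):
    """Old client: looks up fields by name, ignoring unknown elements."""
    root = ET.fromstring(xml_text)
    return {"id": root.findtext("id"), "state": root.findtext("state")}

# Both replies yield the same result for the old client.
assert parse_status(OLD_REPLY) == parse_status(NEW_REPLY)
```

A positional encoding (e.g., a fixed-order argument list) would not have this property; inserting a new field would shift every later one.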

The SRM interface specification, developed collaboratively and now supported by multiple independent implementations, has demonstrated the value of this approach in a real-world data grid component.  Additional data grid components are needed, and are beginning to be targeted.  In the coming year, both STAR and Jefferson Lab will also need to deploy computational grid components, and would like to take the same approach that has worked so well for the SRM.

The STAR Scheduler project is the core software used by the STAR collaboration to submit jobs to a farm, a site, or the Grid. Initially using a simple description with lists of physical files, the tool now interfaces with the STAR File and Replica Catalog, allowing the use and discovery of files or data collections statically distributed on the fabric. Coupled with a converging XML-based U-JDL (user job description language), this tool is part of a broader strategy to gradually move users to a submission model based on Grid resources. The U-JDL approach was chosen to allow users to advertise their intent to the Scheduler (which dataset is to be processed, for example) rather than specifying how the job has to be run; the meta-scheduler therefore serves the function of a basic resource broker and discovery service.
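As an illustration of this intent-based approach, the fragment below sketches what such a U-JDL description might look like (the tag names are invented for this example; the actual U-JDL schema was still converging):

```python
# Sketch of an intent-oriented U-JDL job description (hypothetical tags;
# the real U-JDL schema is still being defined).
import xml.etree.ElementTree as ET

UJDL = """
<job name="dstReco">
  <command>root4star -b -q doEvents.C</command>
  <input dataset="collision_2003_minbias"/>
  <output tag="hist" destination="/star/u/user/results/"/>
</job>
"""

job = ET.fromstring(UJDL)
# The user names *what* is to be processed; the meta-scheduler resolves
# the dataset against the file/replica catalog and decides *where* the
# job actually runs.
dataset = job.find("input").get("dataset")
assert dataset == "collision_2003_minbias"
```

The key design point is that the `<input>` element carries a logical dataset name rather than physical file locations, leaving placement decisions to the scheduler.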

The currently envisioned approach for Grid submission relies on the use of Condor-G (in development). Web-based job monitoring and statistics gathering (including usage, option tracking, etc.) is provided as a first approach to job monitoring. Still in an early development phase, it has lower priority, although the future direction would benefit greatly from design work and strengthening.

While the architecture is currently well defined, it is clear, however, that its separate components will need to be consolidated and re-engineered to follow the road map laid out by emerging technologies such as GT3 and the OGSA effort. Additionally, some effort is required to implement features such as dynamic flow control for a job's required input, and to test interoperability, to name only a few.

Jefferson Lab has recently deployed Auger, a replacement for the JOBS batch job system. Auger is slated to have an XML-based Web services interface this coming year.  Auger provides functionality overlapping that of the STAR Scheduler, but without (yet) the connection to the experiment's data and replica catalogs.

Since both sites are currently engaged in XML and Web services development for meta-schedulers, the JLab and STAR teams therefore propose to work together to achieve this goal.

 

Many of these activities are either already started or part of our mutual plans at this stage (job submission, JDL or U-JDL, the Web service approach, and XML schemas, to name only a few), so working together will reduce the total effort and yield reusable solutions. Moreover, the different computational approaches of the two groups working on a common goal will likely lead to a stronger, better-defined solution.

Specific development activities will become more clearly defined as this activity goes along, but certain components can already be anticipated, and are described in the following section.


Development Goals

The following is a list of development goals leading towards a web services based meta-scheduler system for both STAR and JLab.


   1. Prototype web service component interface (near term)

      a. Meta-job described in XML, with tags sufficient for near-term needs (U-JDL definition)

      b. Inputs and outputs can reference SRM-managed data or RC (replica catalog) entries

      c. Batch capabilities will include all basic operations of PBS, LSF, Condor, … (submit, status, suspend, resume, cancel or kill) and will support both serial and parallel jobs (fixed size, not dynamic, in the first version). We will collaborate as much as possible with similar efforts aimed at a general batch Web Service interface definition.


   2. Implementation for a single site over PBS and/or LSF and/or Condor.

   3. Basic meta-scheduler implementation

      a. Will likely have the same interface as the single-site implementation

      b. Will use the SRM / RC to co-locate job and data, then either dispatch to a site meta-scheduler Web Service or use existing Web Service components as available (the Condor-G Web Service, for example)

      c. Some form of load monitoring will be employed (not the final design in the first version), and multi-site fair share will be attempted. Experience and components from the STAR Scheduler may be employed.
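The basic batch operations listed above can be sketched as a common interface with interchangeable back ends for PBS, LSF, Condor, and so on. The Python below is illustrative only: the method names and the toy in-memory back end are assumptions, standing in for the WSDL interface yet to be defined.

```python
# Sketch of the proposed batch Web Service operations (method names are
# assumptions; the actual interface will be defined in WSDL).
from abc import ABC, abstractmethod

class BatchService(ABC):
    """Common operations over PBS, LSF, Condor, ..."""
    @abstractmethod
    def submit(self, ujdl_xml: str) -> str: ...  # returns a job id
    @abstractmethod
    def status(self, job_id: str) -> str: ...
    @abstractmethod
    def suspend(self, job_id: str) -> None: ...
    @abstractmethod
    def resume(self, job_id: str) -> None: ...
    @abstractmethod
    def cancel(self, job_id: str) -> None: ...

class InMemoryBatch(BatchService):
    """Toy back end, for illustration only; a real implementation
    would translate each call to PBS/LSF/Condor commands."""
    def __init__(self):
        self._jobs = {}
        self._next = 0
    def submit(self, ujdl_xml):
        self._next += 1
        jid = str(self._next)
        self._jobs[jid] = "QUEUED"
        return jid
    def status(self, job_id):
        return self._jobs[job_id]
    def suspend(self, job_id):
        self._jobs[job_id] = "SUSPENDED"
    def resume(self, job_id):
        self._jobs[job_id] = "QUEUED"
    def cancel(self, job_id):
        self._jobs[job_id] = "CANCELLED"
```

Because the site service and the meta-scheduler would share this interface (goal 3a), a meta-scheduler can dispatch to a site exactly as a user client would.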

 


Accepted Constraints on the Prototype
   1. Grid monitoring and job tracking may remain at a very rudimentary level in the first year
   2. Network bandwidth estimates may be poor in the first year
   3. Job graphs (multi-site dependencies) may not be supported in the first year
   4. Proxy credentials may need to be sufficiently long-lived to cover an entire batch job
   5. Executables must be available for target platform (no auto-compilation or bundle approach)

 

Additional external team effort will be sought to consolidate this project, as required and as it generates interest within a larger audience.

 

Possible Future Extensions

·        Job Portal as a client above the web services: interactively aids in generating meta-jobs (multi-file and parameter sweep), remembers the last parameters so that simple variations are easily generated, etc.

 

There exists the possibility that a number of client tools, and ultimately a grid meta-scheduler, could be shared by multiple experiments.

 

Dependencies

This specification, and the corresponding implementations, will build upon (use) the SRM interface for file migration. They will need at least some level of Replica Catalog lookup and publishing; if no standardization has occurred by then, an interim joint standard will be developed for use by STAR and JLab.   Similarly, interim or standard definitions of Web Services based monitoring of batch sites (compute elements) for load will be needed.  All such interfaces will be carefully documented so that the implementations can be adapted to other systems.  Finally, some standardization of a mechanism for multi-site logging of usage is desirable in order to achieve multi-site fair share (a minor or multi-year goal, to be implemented if resources permit).
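The job/data co-location step that ties these dependencies together can be sketched as a simple replica-count heuristic: dispatch the job to the site already holding the most of its input files, and migrate the rest via SRM. The catalog representation below is an assumption, standing in for whatever RC lookup interface is eventually agreed upon.

```python
# Sketch of job/data co-location (heuristic and catalog shape are
# illustrative assumptions, not the agreed RC interface).
def choose_site(input_files, replica_catalog):
    """replica_catalog maps logical file name -> set of sites holding a
    replica.  Returns the site with the most local replicas, or None if
    no input file is catalogued anywhere."""
    scores = {}
    for lfn in input_files:
        for site in replica_catalog.get(lfn, set()):
            scores[site] = scores.get(site, 0) + 1
    # Files not present at the chosen site would be staged in via SRM.
    return max(scores, key=scores.get) if scores else None
```

A real meta-scheduler would weight this by site load and network bandwidth estimates, which, as noted above, may be poor in the first year.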


Milestones