Job Scheduling


Abstract

Web services are now established as a viable component architecture for constructing large, distributed systems. Good progress has been made on data grid components, in particular the Storage Resource Manager interface specification, for which multiple, interoperable implementations are being developed. The logical next step is to begin to prototype the deployment of computational components, in particular batch web services. Both the STAR experiment and Jefferson Lab are ready to begin this prototyping effort, having already deployed SRMs and other XML technology. These two groups propose to develop a common batch web service interface definition, analogous to the SRM data management interface specification, as part of their respective Year 3 PPDG activities. The prototype implementations will immediately serve the respective communities, and will serve as valuable input to a larger community effort.


Introduction

Web services are typically described using WSDL, with remote service invocations being conveyed via XML encoded messages. These two aspects are ideal for constructing loosely coupled, independently evolving components. Firstly, because the WSDL describes all aspects of the call without regard to implementation, a specification described in this way is by construction an interoperability specification. Secondly, since XML passes data by name and not by position, it supports adding new features without breaking backward compatibility (i.e., schema evolution, with only a few constraints on naming). In particular, optional arguments or returned values can be easily added without breaking old clients or servers; changes can be transparent.
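
For example (the element names here are hypothetical, for illustration only), a second version of a job-status reply can add an optional element without disturbing version 1 clients, which simply ignore names they do not recognize:

   <!-- version 1 reply -->
   <jobStatus>
      <jobId>1234</jobId>
      <state>RUNNING</state>
   </jobStatus>

   <!-- version 2 reply: adds an optional element; version 1 clients still parse it -->
   <jobStatus>
      <jobId>1234</jobId>
      <state>RUNNING</state>
      <cpuSecondsUsed>360</cpuSecondsUsed>
   </jobStatus>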

The SRM interface specification, developed collaboratively and now supported by multiple independent implementations, has demonstrated the value of this approach in a real world data grid component. Additional data grid components are needed, and are beginning to be targeted. In the coming year, both STAR and Jefferson Lab will also need to deploy computational grid components, and would like to take a similar approach for these as has worked so well for the SRM.

The STAR Scheduler project1 is the core software used by the STAR collaboration to submit jobs to a farm, a site, or the Grid. Initially using a simple description with lists of physical files, the tool is now interfaced with the STAR File and Replica Catalog, allowing the use of files statically distributed on the fabric and a direct connection to what are defined as logical collections. Coupled with a converging XML-based JDL (job description language), this tool is part of a broader strategy to gradually move users to a submission model based on Grid resources. Its current implementation also includes a basic resource broker and discovery using node-based monitoring information.

The current approach for Grid submission is based on the use of Condor-G. Web-based job monitoring and statistics gathering is provided as a first approach to job monitoring; this will later evolve toward a Web service interface to the Scheduler.

While the architecture is currently well defined, it is clear that its components will need to be consolidated and re-engineered to follow the road map laid out by emerging technologies such as GT3 and the OGSA effort. Additionally, some effort is required to implement features such as dynamic flow control for a job's required input, and to test interoperability.

Jefferson Lab has recently deployed Auger, a replacement for the JOBS batch job system. Auger is slated to have an XML-based web services interface this coming year2. Auger provides functionality overlapping that of the STAR Scheduler, but without the connection to the experiment data and replica catalog (yet).

Since both sites are currently engaged in XML and web services related batch developments, the JLab and STAR teams therefore propose to work together toward this goal in a few steps:


  1. Evaluation of the partnering team's activities, with a view towards developing a common understanding of the problem to be solved

  2. Work together on the definition and strengthening of a high-level user Job Description Language (U-JDL), as needed to describe both multiple parallel jobs (lattice calculations, etc.) and large numbers of serialized jobs (statistically driven data mining); a sketch of such a description follows this list. This work will be presented to PPDG as input to a baseline for a high-level job description language.

  3. Use of Web services as the implementation / interoperability framework for a computational grid (loosely coupled, hence very scalable). A component design will be needed before the migration phase.

  4. Use of SRM as an example of a successful, interface-based approach (one gaining wide support) to dynamic data management
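
As a purely illustrative sketch of what a U-JDL description might look like (the element and attribute names below are hypothetical, not an agreed schema; they are loosely in the spirit of the STAR Scheduler's current XML description), a serialized data-mining job over a logical collection could be expressed as:

   <job name="minbias-mining" maxAttempt="3">
      <!-- executable invoked once per assigned file set -->
      <command>doAnalysis -i $INPUTFILELIST -o $SCRATCH/out.root</command>
      <!-- input resolved through the replica catalog, not physical file names -->
      <input catalog="star.bnl.gov" collection="minbias2003" nFiles="all"/>
      <!-- output published back through SRM -->
      <output from="$SCRATCH/out.root" toURL="srm://srm.bnl.gov/star/user/out.root"/>
   </job>

A parallel (lattice-style) job would instead carry, for example, a fixed process count rather than a file-driven split.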

Many of these activities are already started or are part of our mutual plans at this stage (job submission, JDL or U-JDL, the Web service approach, and XML schemas for the named components), so working together will reduce the total effort and yield re-usable solutions. Moreover, the different computational approaches of the two groups working on a common goal will likely lead to a stronger solution.

Specific development activities will become more clearly defined as this activity goes along, but certain components can already be anticipated, and are described in the following section.


Development Goals

The following is a list of development goals leading towards a web services based batch system for both STAR and JLab.


   1. Prototype batch web service interface (near term)

      1. Job described in XML, with tags sufficient for near-term needs

      2. Inputs and outputs can reference SRM-managed data or RC (replica catalog) entries

      3. Batch capabilities will include all basic operations of PBS and LSF (submit, suspend, resume, cancel or kill), and will support both serial and parallel jobs (fixed size, not dynamic, in the first version); a sketch of this interface follows this list

   2. Implementation for a single site over PBS and/or LSF, and possibly Condor-G

   3. Basic meta-scheduler implementation

      1. Will have the same interface as the site implementation

      2. Will use SRM / RC to co-locate job and data, then dispatch to the site batch web service

      3. Some form of load monitoring will be employed (not the final version at first), and multi-site fair share will be attempted; experience and components used by the STAR Scheduler may be employed
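
As a minimal sketch of what the batch web service interface might expose (the operation and message names below are hypothetical placeholders, pending the actual WSDL design), the abstract portion of the WSDL could declare the basic operations as:

   <portType name="BatchService">
      <!-- submit an XML (U-JDL) job description; returns a job handle -->
      <operation name="submitJob">
         <input message="tns:SubmitRequest"/>
         <output message="tns:SubmitResponse"/>
      </operation>
      <!-- suspend / resume / cancel act on a previously returned handle -->
      <operation name="suspendJob">
         <input message="tns:JobHandle"/>
         <output message="tns:StatusResponse"/>
      </operation>
      <operation name="resumeJob">
         <input message="tns:JobHandle"/>
         <output message="tns:StatusResponse"/>
      </operation>
      <operation name="cancelJob">
         <input message="tns:JobHandle"/>
         <output message="tns:StatusResponse"/>
      </operation>
      <!-- poll job state; optional reply elements leave room for schema evolution -->
      <operation name="getJobStatus">
         <input message="tns:JobHandle"/>
         <output message="tns:StatusResponse"/>
      </operation>
   </portType>

Because the meta-scheduler presents this same portType, a client need not care whether it is talking to a single site or to the dispatcher.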



Accepted Constraints on the Prototype
   1. Grid monitoring may be very rudimentary in the first year
   2. Network bandwidth estimates may be poor in the first year
   3. Job graphs (multi-site dependencies) may not be supported in the first year
   4. Proxy credentials may need to be long-lived enough to cover an entire batch job
   5. Executables must be available for the target platform (no auto-compilation or bundling approach)


Additional external team effort will be sought to consolidate this project, as required and as it generates interest within a larger audience.


Future Extensions

1. Job Portal as a client above the web services: interactively aids in generating meta-jobs (multi-file and parameter sweep), remembers the last parameters so that simple changes are easily generated, etc.


There exists the possibility that a number of client tools, and ultimately a grid meta-scheduler, could be shared by multiple experiments.


Dependencies

This specification, and the corresponding implementations, will build upon (use) the SRM interface for file migration. It will need at least some level of Replica Catalog lookup and publication; if no standardization of that has occurred, an interim joint standard will be developed for use by STAR and JLab. Similarly, interim or standard definitions of web services based monitoring of batch sites (compute elements) for load will be needed. All such interfaces will be carefully documented so that the implementations can be adapted to other systems. Finally, some standardization of a mechanism for multi-site logging of usage would be desirable in order to achieve multi-site fair share (a minor or multi-year goal, to be implemented if resources permit).
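
To make the interim catalog dependency concrete (again with hypothetical element names and endpoints, since no standard has been settled on), a minimal lookup exchange might carry no more than a logical file name in and a list of physical replicas out:

   <!-- request: resolve one logical file name -->
   <rcLookup>
      <lfn>minbias2003/st_physics_example.root</lfn>
   </rcLookup>

   <!-- reply: every known replica, so the meta-scheduler can co-locate job and data -->
   <rcLookupReply>
      <pfn site="BNL">srm://srm.bnl.gov/star/data/st_physics_example.root</pfn>
      <pfn site="JLab">srm://srm.jlab.org/cache/st_physics_example.root</pfn>
   </rcLookupReply>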


Milestones

  1. 10/01/2003: Deliver the requirements and definition for the U-JDL for batch job description; gather feedback from the PPDG collaboration.

  2. 10/20/2003: Implementation design completed, and first version of the WSDL for the batch web service.

  3. 11/15/2003: First implementations of the site (not meta) interface, mapping the WSDL/XML description onto PBS, LSF, or Condor submissions.

  4. 02/15/2004: First implementation of the meta-scheduler.

  5. 04/15/2004: More robust implementation, including: file location using the Replica Catalog interface; file and/or job migration using SRM; output publication using SRM and the Replica Catalog; (at least rudimentary) support for load and file-location based dispatch; and job accounting information flowing to a distributed database (Web Service).

1 http://www.star.bnl.gov/STAR/comp/Grid/scheduler/about1.html


2 See http://cc.jlab.org/announce/newsletter/CCNLJul03.html for recent Auger news.