The Queue Monitoring MonALISA Module
We are developing a Queue Monitoring Module for the MonALISA monitoring framework. This custom module will report aggregate status information for the most popular queuing systems (CONDOR, LSF, PBS, SGE) through a common set of attributes.
We aim to make the provided info compatible with the GLUE Schema. In particular, the Computing Element (CE) of the GLUE Schema represents the entry point to a Queue. There is one Computing Element per Queue.
The following table lists the attributes of the Status object of the CE Element:
Attribute | Description |
---|---|
RunningJobs | Number of currently running Jobs |
WaitingJobs | Number of jobs that are in a state other than running |
TotalJobs | Total Number of Jobs in the CE (Running + Waiting) |
Status | State the Queue is in (Production, Closed, Queueing, ...) |
WorstResponseTime | Worst-case time between job submission and the start of job execution (in seconds) |
EstimatedResponseTime | Estimated time between job submission and the start of job execution (in seconds) |
FreeCPUs | Number of Free CPUs available to the Scheduler |
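
As an illustration of how the attributes in the table relate to each other, the following minimal Python sketch models the Status object and derives the job counts from a list of per-job states. It is not the module's actual code: the names `CEStatus` and `summarize_queue` are hypothetical, and the state labels used in the example (`RUN`, `PEND`, `SUSP`) are placeholders rather than the output of any particular batch system.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CEStatus:
    """Status attributes of one Computing Element (one CE per queue)."""
    running_jobs: int
    waiting_jobs: int
    status: str                                   # e.g. "Production", "Closed"
    free_cpus: int
    worst_response_time: Optional[int] = None     # seconds, if known
    estimated_response_time: Optional[int] = None  # seconds, if known

    @property
    def total_jobs(self) -> int:
        # TotalJobs is defined in the table as Running + Waiting.
        return self.running_jobs + self.waiting_jobs


def summarize_queue(job_states: Iterable[str], running_state: str,
                    status: str, free_cpus: int) -> CEStatus:
    """Count jobs in the running state; every other state counts as waiting."""
    states = list(job_states)
    running = sum(1 for state in states if state == running_state)
    return CEStatus(running_jobs=running,
                    waiting_jobs=len(states) - running,
                    status=status,
                    free_cpus=free_cpus)


# Placeholder job states for a single queue.
ce = summarize_queue(["RUN", "PEND", "RUN", "SUSP"], "RUN", "Production", 4)
print(ce.running_jobs, ce.waiting_jobs, ce.total_jobs)  # prints: 2 2 4
```

Keeping TotalJobs as a derived property, rather than a stored field, means it cannot drift out of sync with the two counts it is defined from.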
A presentation with ideas on Queue Monitoring was given at the PPDG Collaboration Meeting and is available here.
In developing the MonALISA Queue Monitoring Module we will take into account previous work by others. In particular, the EDG has made available an MDS information provider that calculates the same attributes. In a way, our goal is to convert the MDS information provider into a MonALISA monitoring module. There is also existing work on providing attributes of the LSF and PBS systems to MonALISA. Although the attributes provided are not GLUE-like, we will certainly look into this work too.
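Unpack the module archive in the MonALISA user code directory:

    cd $MonaLisa_HOME/Service/usr_code
    tar -xvf queueMonitor.tar

Add the module directory to the lia.Monitor.CLASSURLs property:

    lia.Monitor.CLASSURLs=file:${MonaLisa_HOME}/Service/usr_code/LSF/,file:${MonaLisa_HOME}/Service/usr_code/queueMonitor/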
Enable the module with a configuration entry such as:

    *QueueMonitor{queueMon, localhost, LSF,CONDOR}%30

The last two entries in the curly braces (LSF,CONDOR) are the batch systems whose queues you want to monitor. The available options are LSF, CONDOR, PBS and SGE. Note, however, that queue monitoring is currently implemented only for LSF and CONDOR. The names of the batch systems are NOT case sensitive and, if you use more than one, they must be separated by commas.
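
The rules for the batch system list (case-insensitive names, comma-separated, four recognised options of which two are implemented) can be sketched in Python as below. This is only an illustrative parser, not MonALISA's own: the function and field names are hypothetical, the roles assigned to the first two entries in the braces (`module`, `host`) are assumptions, and so is reading the number after `%` as an interval, since the text above does not describe it.

```python
import re
from typing import NamedTuple, Tuple

SUPPORTED = ("LSF", "CONDOR", "PBS", "SGE")   # options named in the text
IMPLEMENTED = ("LSF", "CONDOR")               # currently implemented

# Shape of the entry: *Name{arg, arg, ...}%number
ENTRY_RE = re.compile(r"^\*(?P<name>\w+)\{(?P<args>[^}]*)\}%(?P<interval>\d+)$")


class QueueMonitorEntry(NamedTuple):
    name: str
    module: str                      # assumed role of the first entry
    host: str                        # assumed role of the second entry
    batch_systems: Tuple[str, ...]   # normalised to upper case
    interval: int                    # assumed meaning of the trailing number


def parse_entry(line: str) -> QueueMonitorEntry:
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {line!r}")
    fields = [field.strip() for field in match.group("args").split(",")]
    if len(fields) < 3:
        raise ValueError("expected two leading entries and a batch system")
    module, host, *systems = fields
    # Names are not case sensitive, so compare them in upper case.
    normalised = tuple(system.upper() for system in systems)
    for system in normalised:
        if system not in SUPPORTED:
            raise ValueError(f"unknown batch system: {system!r}")
    return QueueMonitorEntry(match.group("name"), module, host,
                             normalised, int(match.group("interval")))


def not_yet_implemented(entry: QueueMonitorEntry) -> Tuple[str, ...]:
    """Return requested batch systems that have no queue monitoring yet."""
    return tuple(s for s in entry.batch_systems if s not in IMPLEMENTED)


entry = parse_entry("*QueueMonitor{queueMon, localhost, lsf,Condor}%30")
print(entry.batch_systems)  # prints: ('LSF', 'CONDOR')
```

A check like `not_yet_implemented` could be used to warn when an entry requests PBS or SGE before support for them exists.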
Restart the MonALISA service:

    cd $MonaLisa_HOME/Service/CMD
    ./ML_SER restart
Stratos Efstathiadis - page was last modified