1.10.50
- Upgraded log4j libs to 2.17.0 (LH)
1.10.49
- Bug fix for reading event numbers when input is given as a file list
- Dynamic config. class for different hosts
- Accounting groups added to config.
1.10.46
- Bug fix for NULL pointer crash in CondorOSGDispatcher when sandbox is not used(LH)
- Bug fix for CondorOSGDispatcher naming convention of wrapper(LH)
1.10.45
- Normalized naming of output files, which can be adjusted with the 'name' attribute of the job element(LH)
- Improved job wrapper process watcher, with regular expression to capture process(LH)
1.10.42
- Allow <shell> to use "env" to allow setting up pre-activation variables (JL/LH)
- Example: <shell>SINGULARITY_SHELL=/bin/tcsh /bin/sh -c 'exec singularity exec -e -B /direct -B /star -B /afs -B /gpfs /cvmfs/sdcc.bnl.gov/singularity/rhic_sl6_gcc48</shell>
1.10.43
- Added a dispatcher for native file transfer via condor(LH)
- Updated CSH application for native transfer
- Reinterpreted <input> <SandBox> <output> elements for condor native transfer dispatcher
- Added script box to check $JAVA_ROOT is set(JL)
- Added script to find the original SUMS version which submitted a session file(LH)
1.10.40
- Rebranding SUMS to "Simple Unified Meta Scheduler" for a wider user base(LH)
- Added a site in the config for EIC users(LH)
- Update jar, install script, and jar wrapper with the new name(LH)
1.10.39
- Added support for arbitrary shell execution of the generated script via the <shell> element(LH)
1.10.38
- Writing condor.log to /tmp/$USER of the submitter node(LH)
- Reshape of default policy's queues at BNL to short/med./long (5hours/3days/5days)(LH)
1.10.37
- Bug fix for infinite loop when nested in another script and domain is not defined(LH)
1.10.36
- Dropping of HPSS files when using Xrootd syntax(LH)
1.10.34
- Xrootd files also copied to $SCRATCH (LH)
- Reshape of copy back block for clearer errors (LH)
1.10.33
- Added attribute "restartable" to have the batch system auto resubmit evicted jobs (LH)
1.10.32
- Limit on loop asking for user feedback to keep files that don't exist (LH)
1.10.31
- Added copyInputLocally attribute of Job element for copying the job input to $SCRATCH and rewrite file list(JL/LH)
- Changed to making only one condor file per submission to reduce open streams(LH)
- Deprecation of the "minMemory" and "maxMemory" <Job> element attributes(JL)
- Added policy for nightly tests; the policy name is "bnl_condor_nightlyTest"(JL/LH)
1.10.30
- Changed condor submission format moving csh to arguments of /bin/sh(LH)
- Changed parsing of condor job IDs from condor_submit(LH)
1.10.28
- Added sdcc nodes to cross experiment queue(LH)
- Added default memory limit for Slurm dispatcher(LH)
- Added slurm slr file initialization commands option "slrInitializationCommands" in config of slurm dispatcher(LH)
- Added slurm args option in config of slurm dispatcher(LH)
1.10.27
- Removed queue object memory limit(LH)
- Updated online site from star.bnl.gov to starp.bnl.gov
- Removed linking patch for sandboxed libs (JL)
1.10.26
- Add SLURM dispatcher (LH)
- Changed default policy for PDSF and added new policies for other SLURM queues (LH)
1.10.22
- Bug fix for using $SUBMITTINGDIRECTORY in output element toURL attribute (LH)
- Changed SGE gatekeeper to native condor access(LH)
1.10.21
- Improvement to make the SGE Array dispatcher work with the Generator element(LH)
1.10.20
- Printing the SUMS version number in the csh script(JL)
- Implemented resubmission for the SGE Array dispatcher(LH)
1.10.19
- Added a new dispatcher for SGE Arrays(LH)
1.10.18
- Updated condor dispatcher memory and disk resource requirement syntax for newer condor versions(LH)
- Added wrapper to clean up sandbox before copy back, so that sandbox files are not accidentally copied back (LH/JL)
Version 1.10.17
- Bug fix for using "paths" fileListSyntax with splitBy keyword (LH)
- Bug fix in parser for "file" element in Sandbox when used with template(LH)
- New policy for benchmarking CPU time at BNL called "bnl_condor_benchmark" sends all jobs to the same type of CPU(LH)
Version 1.10.16
- Added temp fix for gcc 4.8.2(JL\LH)
Version 1.10.15
- Added "splitBy" JDL attribute to <job> tag(LH)
- Added grid access policy to BNL online cluster(LH)
Version 1.10.14
- Allows star-submit-template to pass -u flags to star-submit using the -use option(JL)
- Bug fix to allow stdout and stderr to come back if root4star is killed(JL\LH)
- Added timing marks for input-file processing (LH)
- Added -v, -V, -version option to display version number(LH)
- Added -use option to star-submit, which works like -u(LH)
Version 1.10.12
- Removed dropping of non-MuDst files from xrootd fileListSyntax(LH)
Version 1.10.11
- Detached script pid so that signals will propagate to child processes such as root4star, giving a chance to close out open files.(JL/LH/DA/UF)
- Added streaming in SUMS wrapper script so users will not have to take care of this for themselves (LH/JL)
Version 1.10.10
- Updated c-code in the csh for getting $HOME to the latest code du jour.(JL)
Version 1.10.9
- In test for source of .cshrc, changed -e (for exists) to -r because in some cases the file exists but cannot be read
- Changed attribute in condorG file from "globusscheduler" to "grid_resource" for new version.
- Hardened local copy back of user output files by trying over again if the user copy back fails the first time (JL\LH)
Version 1.10.8
- Bug fix for using local <output fromScratch=... without having a grid GK set in the config file(LH)
- Bug fix for nFiles not respected when using input URL with wildcard(LH)
- New keyword "datasetSplitting" for splitting a dataset at a lower granularity than the file level (LH)
- Replaced comparator used for "inputOrder" attribute sorting with numeric-lexicographic comparator (LH)
- Bug fix for NFS file copy selected before local file copy (LH)
- A policy was added to be used with the online nodes
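Note on the "inputOrder" comparator change above: a numeric-lexicographic ordering compares runs of digits by value and everything else as text. The Python sketch below only illustrates that idea; it is not the SUMS comparator, and the function name and file names are hypothetical.

```python
import re

def natural_key(name):
    """Build a sort key in which digit runs compare by numeric value."""
    chunks = [c for c in re.split(r"(\d+)", name) if c != ""]
    # Tag each chunk so numbers and text are never compared with each other.
    return [(0, int(c), "") if c.isdecimal() else (1, 0, c) for c in chunks]

# Hypothetical input file names, used only for illustration.
files = [
    "st_physics_10.MuDst.root",
    "st_physics_2.MuDst.root",
    "st_physics_1.MuDst.root",
]

plain_order = sorted(files)                     # ..._1, ..._10, ..._2
natural_order = sorted(files, key=natural_key)  # ..._1, ..._2, ..._10
print(plain_order)
print(natural_order)
```

With a plain string sort, "10" lands before "2" because the comparison stops at the first differing character; the keyed sort keeps the files in numeric sequence.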
Version 1.10.7
- Adjusted regular-expression which hangs on some catalog request strings(LH)
- Turned off printing stack trace in StatisticsRecorder if user's string is too long(LH)
Version 1.10.6
- Added option in Statistics-Recorder not to try and create the tables if they don't exist. (LH)
- Added object reflection method to queues and dispatchers to remove duplicate data and make the configuration smaller. (LH)
- Added support of the MinMemory keyword in the condor dispatcher (LH)
- JDL has a new keyword, softLimits, to make the min/max event limits hard or soft (switch between two algorithms) (LH)
- When no non-local queues are used in the policy and no query has a site keyword, input files not matching the canonical name of the site will be dropped. (LH)
- JDL has added keywords in Job element (maxEvents, minEvents, and eventsPerHour) (JL/LH)
- Session file is now written as [jobID].session.xml~ and then renamed without the ~, in case the disk gets full. (LH)
- Queues now can adjust priority of job (long queue at BNL has higher priority) (LH)
- Added wait command to end of eviction function in script (Yuri F.)
- Changed input tag so you can run command to make an input list (LH/Yuri)
- Added action capability in condorDAG dispatcher for values frequency='1' and position='BEFORE'
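Note on the session-file change above: writing to a "~" name first and renaming afterwards means a full disk leaves at most a broken temporary file, never a truncated session file. The Python sketch below shows the general write-then-rename pattern under that assumption; it is not the SUMS code, and the function and file names are hypothetical.

```python
import os
import tempfile

def write_session(path, text):
    """Write text to path + '~', then rename it over path in one step."""
    tmp_path = path + "~"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(text)
        fh.flush()
        os.fsync(fh.fileno())   # push the data to disk before renaming
    # If the write above fails (e.g. disk full), this line is never reached
    # and any existing file at 'path' is left untouched.
    os.replace(tmp_path, path)

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "example.session.xml")
    write_session(target, "<session/>")
    names = sorted(os.listdir(workdir))
    with open(target, encoding="utf-8") as fh:
        content = fh.read()

print(names, content)
```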
Version 1.10.4
- Bug fix for preferred copy when duplicate files are found (LH)
- Bug fix for case where job's $SCRATCH dir was already created by another user (JL/LH)
- RDL support for events per job (LH)
Version 1.10.3
- The file list can now use string:"blabla" syntax which will let any input string be used
- Help from star-submit command line gives all command line options (same as typing star-submit with no args)
- -print options to print out configuration tree
- Error trap for condor_submit : Permission denied.
- Bugs in the .report have been fixed (wrong run time calc for jobs without input files / queue names not shown)
- star-submit script now picks between 32 and 64 bit native libs
- Added UCM collector string to config file (set by site)
- Added logging of hostname to grid and local jobs
- Changed PeriodicRemove so long jobs don't start over and over again
- Changed dataset manipulation text.
- Added CondorFastDispatcher for grouping jobs together in one .condor file (submits faster).
Version 1.10.1
- Sums will not stop if session file and version do not match, it will try and make backup(LH)(pushed back to version 1.10.0)
- Sums MySQL driver updates(LH)(pushed back to version 1.10.0)
- Bug fix for <Action> <Exec> having more than one line (LH)
- Added new DBII (LDAP lookup) information service (LH)
- Added a generic buffered information service to wrap another information service (LH)
- InputFiles table in statistics recorder name changed to InputFiles_[yy](LH)
- All year tables for the current year will be automatically created if they do not already exist.(LH)
- Allowed file list to accept non-url formatted files in format: \dir\dir\file (LH)
- Speed-up in queue test for condor when using more than one queue.
- New Priority element in the JDL for condor dispatcher (LH/JL)
Version 1.10.0
- New <ResourceUsage> tag which moves (Min /Max) Memory, (Min /Max) Storage, and (Min /Max) WallTime out in the job tags as attributes.
- Added generator tag, to place SUMS generated files in directory other than pwd.(LH)
- Bug fix for when job has no input file and filesPerHour is set, the run time should then be filesPerHour * 60min
- Implemented the BEFORE action tag.
- Added [logControl] attribute to enable UCM logger (logger modules are delayed).
- Implemented default filelistsyntax on a site-by-site basis in config file.
- Added one big file list for pre-staging files.
- Added queue test, only at PDSF, that checks if user has rights to submit to the batch system.(LH)
Version 1.8.10
- Made condor the default dispatcher at BNL (LH)
- Reshaped configuration format around gatekeeper (LH)
- Added a class of policy that will dynamically configure itself from an information service (LH)
- Added proxy setup for small site config (LH)
- Started running the grid setup.csh script by default for all grid jobs.(LH\JL)
- Added buffer to vors information service in case of outage.(LH)
- Changed recovery of OSG variables to check local environment first.(LH)
- Made all local jobs run the user's .cshrc.(LH)
- Modification to file splitting algorithm to get closer to optimal job size.(LH)
- Started dropping non-MuDst files from jobs with xrootd fileListSyntax (LH)
- Bug fix: input from catalog and filelist at the same time could cause program to exit.(LH)
- Added xrootddev file syntax, reshaped file syntax handling by site in config file.(JL/LH/PJ)
- Small change to virtual resources for BNL LSF cluster (LH\JL).
- Changed the ordering of the INPUTFILE[n] variables in SUMS from a lexicographic sorting order to a numeric sort.(LH\GV)
- Bug fix for SUMS putting one too many slashes in xrootd url if it comes from the filelist input.(LH)
- Bug fix for min and max memory getting flipped when a job is resubmitted.(LH)
- Added CSH variable to display the number of times a job was resubmitted.(LH)
- Modified queue testing to work with new configuration(LH)
Version 1.8.9
- Complete ground up redesign of catalog and dataset processing (LH)
- Allowed xrootd to take HPSS files (PJ/LH)
- Added opportunistic condor policy bnl_condor_rcf (LH)
- Reshaped the bnl_condor_xxx policies to reflect changes done by the RCF to the pools (LH)
- Printed out location of log file, so users can find it faster (LH)
- Removed -m option when submitting with bnl_lsf_prod (LH)
- Added -debug [level] option to star-submit command (LH)
- Added more pie charts for catalog queries on the statistics page (LH)
- Reshaped local sandbox to mirror directories and link only files (LH)
- Added "-u ie" option to ignore most errors when trying to submit a job (LH)
- Allowed the use of xrootd type syntax of filelist (PJ/LH)
- The default file name $JOBID.out or $JOBID.err will be added to the output or error if the user only specifies the path without the file name (LH)
- Changed the Process ID environment variable $PROCESSID to $JOBINDEX so it is not confused with the process ID on the OS or the LSF process ID (LH)
- Added environment variables for submitting node, submitting path, submitting time(GMT) and grid cert (LH)
- Added SUMS version information to session file, so that user must use same version to resubmit (LH)
Version 1.8.8
- Changed logger for java sdk logger to log4J (LH)
- Added several new policies (LH)
- Added a queue checking mechanism (LH)
- Changed passive policy to send rootd and Xrootd jobs to the local queue (LH)
- The resubmit command now works with just the job ID (no need to type full file name) but must still be used in the same directory (LH)
Version 1.8.7
- U-JDL change (SandBox added) (JL)
- Implementation of zip / local / packman sandboxes (LH)
- Added allowance for both rhic.bnl.gov and rcf.bnl.gov to be used as the site identifier. (JL \ LH)
- Changed FILEBASENAME to blank when not initialized. (LH)
- Multiple bug fixes to stop file splitting algorithm from hanging.(LH)
- Catalog queries are collected as statistics information.(LH)
- Added a -u grid option to force grid dispatching even when the destination is local.(LH)
- The -h option needed to be fixed (JL)
- The number of retries for submitting a job on the RCAS cluster at BNL has been dropped to zero.(LH)
Version 1.8.6
- Updated MySQL drivers so they are compatible with MySQL version 4.1 (LH)
- A -site [domainName] switch is now required to launch the scheduler jar (see star-submit script)(LH)
- Reshape of layout of config file, changed much logic, converted old config files, and added new classes (site, GateKeeper, BatchSystem, ConfigToolkit)(LH)
- The elements\objects logConf, statisticsConf, programLocations, and defaultPolicy have been moved inside the site object, see set/get function.
- The $SCRATCH is now located in programLocations. The job references it via queue --> batchSystem --> site --> ProgramLocations --> $SCRATCH
- The Condor-g and Condor_gLSF dispatchers have their globusscheduler priority set indirectly (not with set and get functions) by parsing the site structure.
- Improved the way domainName name is acquired in star-submit script (JL)
- Fix div by zero error (LH)
- Added an xrootd fileListSyntax option. (PJ)
- Set the default startup directory to $SCRATCH, with all dispatchers.(LH)
- Unified CondorGDispatcher and CondorGLSFDispatcher into the CondorGRSLDispatcher. (LH)
- Changed default limitation of the number of INPUTFILE(n) environment variables to 200 and added warning to outputstream.(LH)
- From UNIX the path is now recovered using the "echo $PWD" command instead of the java.io.File.getAbsolutePath() method. (JL + LH)
- The filelist syntax and the INPUTFILE(n) environment variables syntax have been made to be identical all the time, and jobs that use the rootd syntax are no longer sent to a specific node.(LH)
- Fixed bug in syntax of memory requirement for LSF dispatcher.(LH)
Version 1.8.2
- Fixed bug : nFiles ignored when used with the file list in the input tag. (LH)
- Extended the fromScratch syntax with copy, link and register; these are mostly intended for grid use. (LH)
- U-JDL Schema changes for accommodating fromScratch syntax expansion. (JL)
- Changed the behavior of the SGE, LSF and PBS dispatchers so that if the queue name is null or of zero length no queue option is used in the submit command.(LH)
- Stripped directory out of executable, because of inconsistency of directory path on some nodes.
- Modified the statistics collection module to send the queue ID instead of the queue name.
Version 1.8.0
- Grid config file has been reshaped with policies (Step1Policy, Step2Policy, Step3Policy) to better reflect our 3 step process for running on the grid.(LH)
- Object names in the RCAS config have been changed for less ambiguity.(LH)
- Gave each dispatcher its own set and get functions from their CSHApplication objects.(LH)
- Bug fix for: fromScratch tag was ignored during resubmission of jobs. (LH)
- Behavior of fromScratch modified: a "./" before the file name means look in `pwd` and not in $SCRATCH for the files. (LH)
- This is the first deployed version of SUMS to include the RDL framework. (DA,PH,JL,LB)
- A new Resource Strategy class working off of Resource Strategies was added, plus a base interface for Resource Strategies. (LH)
Version 1.7.9
- Filelist now supports files on distributed disk(LH)
- Bug fix for using quotes (&quot;) in catalog Queries (LH) (This patch failed and had to be pulled from the code.)
- In PDSF configuration, SGE dispatcher has been set as default (LH)
Version 1.7.7
- More reshaping of threads and streams used for dispatching. Fixes above-normal rate of multiple dispatching seen in version 1.7.6. (JL)
Version 1.7.6
- Moved chmod to config file instead of hardcoded (+ full path specified) (LH)
- Logic reshape for Runtime().exec(). Especially, waitFor()
fixed and hack removed (JL)
Version 1.7.5
- star-submit-template now accepts documents starting with comments
or blank lines (JL)
- Our usage and monitoring JSP's / web pages and db still need updating.
- Fixed globus-url-copy and undid hard coding of globus-url-copy and cp (This may be modified yet again.)(LH)
- Added maxWallTime to the PBS Dispatcher (LH)
- Added SubmitTime, DispatchTime, and SubmitSuccessful(y/n) to report file. The report file has new layout with a global view of the scheduler. (LH)
- Added a new subsystem for killing and resubmitting jobs.(LH)
- Modified star-submit to work with resubmission syntax(LH)
- Modified star-submit-template to accept multiple -entities. (JL)
Added -simulate and -debug (JL)
- Added SGE Dispatcher (LH)
- Added a max elapsed time to the LSF and SGE Dispatchers (at PDSF the default was moved 5000ms -> 9000ms ) (LH)
Version 1.7.0
- WARNING: The statistic table has a few more columns. Please,
check res/ext/createStatDB for more information (JL)
- Extend report file information (LH)
- Modified JobID scheme for better uniqueness (LH/JL)
- Modified CSH wrapper for other experiment support (JL)
- Restructured source code for more efficient development (LH)
- Swapped schema validation class with a class based on the sun multi schema validator (LH)
- Extended schema / code with minMemory, maxMemory, minStorageSpace, maxStorageSpace (JL/LH)
- Added new Queue objects, for more detailed definition of queues (LH)
- Modified PassivePolicy to take advantage of new queue objects and extended schema (LH)
- Changed PassivePolicy queue assignment algorithm (LH)
- Modified CondorGLSFDispatcher to take advantage of new schema (LH)
- Modified LSFDispatcher to take advantage of new schema (LH)
- Added the install.pl script (JL)
- Added CondorGPBSDispatcher (AW/JL)
- Added version number and date of submits to the log file (LH)
Version 1.6.2
- Display submit error to STDOUT (user feedback)
- Added XML error checking that works off of an internal xml schema.
- Added environment variable FILEBASENAME
- Changes EnvVariableLimit (still unused for now)
- Changes to pdsfConfig.xml
- Developer issue. Fixes in the Ant script now take care of the memory
leak and an issue with installing script files from Windows to
Unix.
Version 1.6.1
- Extended with a new mode nameEqualValueColumnSeparated for
handling PDSF style resources.
- Added EnvVariableLimit (unused for now)
- Added message when files are not found (was confusing to users when a list
was passed)
- Fixed a few message typos, updated URL
Version 1.6
- First GRID implementation ready
- Review of policies and dispatchers
- Enables submitting multiple processes with no inputs (see nProcesses)
- BUGFIX: the variable substitution didn't work when the variable was at the
end of the filenames
Version 1.5.2
- Enables trim of all letters of disk vault names (fixes PDSF resources)
- Allows specifying a directory as an output
Version 1.5.1
- Enables trim of the decimal part of disk vault names (fixes PDSF resources)
Version 1.5.0_01
User visible changes:
- Fixed a bug with LSF resources that would give: String index out of range:
-1
Version 1.5.0
User visible changes:
- The scripts produced by the scheduler will use the full path for /bin/cp
and /bin/rm
- Condor submission has been fixed
- Filelists are enabled within the request, by using the filelist:/path/name
URL
- For the queries, preferStorage is set automatically to local for big
requests, and to NFS for small requests (<= 100 files)
- Added orderInput keyword to order the input files within a list
Developer visible changes:
- XML scheduler configuration based on java bean XML serialization
- Source added in the deployed jar
Version 1.3.2
- Refactoring code for the CSH application.
- CondorGLSFDispatcher added: uses CondorG to dispatch on LSF, with some
extra variable added to the globus lsf-jobmanager to handle extended options.
- CondorGDispatcher added: to be used on the RCF Condor pool.
- LSF Resources revised.
- Bugfix - Checks whether the output directory exists before dispatching.
Version 1.3
- filesPerHour tag added: the scheduler can now decide on which queue to dispatch.
It will try to dispatch on a short queue, if minFilesPerProcess allows it.
- fileListSyntax tag added: we call "paths" the syntax the scheduler has been
using; a new one "rootd" is available. This is recognized by the MuDST maker,
and allows some advanced features. Consult the manual for details.
- Bugfix - the wildcard resolution wouldn't work if ls was aliased to something
else.
Version 1.2
- It is possible to discard the stdout and the stderr, by using discard="true"
- Entities are now allowed in the command line.
- Writes a report containing all the nodes, the number of files assigned to
each location and the number of processes
- Added minFilesPerProcess: it doesn't provide the perfect solution, but it
might be sufficient.
- Bugfix - statistics are recorded in chunks (no more OutOfMemoryError at the
end)
- Bugfix - simulated submission wasn't formatted properly
- ADMIN - more data is being reported through the statistics
- ADMIN - file catalog implementation is set through the configuration file
- ADMIN - different LSF resource strategies for RCF and PDSF
- ADMIN - jobs can be dispatched on different queues depending on whether they
are using local files
Version 1.1
- Retries bsub up to 5 times. If it doesn't succeed goes ahead to the next process
- "name" property in the job tag (mapped to LSF Job name "-J")
- Bugfix - Catalog queries with < or > were not possible
- ADMIN - Log level in properties
- ADMIN - Added LSF resource usage
- ADMIN - New bsub retry configuration properties
- DEV - Process can have a different command line than the job request
- DEV - CondorG dispatching (experimental)
- DEV - SiteForwardPolicy
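Note on the bsub retry entry above: the behavior described is "try a submission a bounded number of times, then move on to the next process". The Python sketch below illustrates that control flow only. It does not call bsub, it assumes five attempts in total per process (the entry does not say whether the first try counts), and all function names are hypothetical.

```python
def submit_with_retries(submit, max_attempts=5):
    """Call submit() until it returns True; return the attempt number or None."""
    for attempt in range(1, max_attempts + 1):
        if submit():
            return attempt
    return None  # every attempt failed

def dispatch_all(processes, submit_one, max_attempts=5):
    """Submit each process in turn; a process that keeps failing is skipped."""
    failed = []
    for proc in processes:
        if submit_with_retries(lambda: submit_one(proc), max_attempts) is None:
            failed.append(proc)  # give up on this one, go on to the next
    return failed

# Stand-in for a real submit command: "a" succeeds on its third try,
# "b" never succeeds, "c" succeeds immediately.
needed_tries = {"a": 3, "b": 99, "c": 1}
calls = {}

def fake_submit(proc):
    calls[proc] = calls.get(proc, 0) + 1
    return calls[proc] >= needed_tries[proc]

failed = dispatch_all(["a", "b", "c"], fake_submit)
print(failed, calls)
```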
Version 1.0 RC 2
User
- Message displayed when directory for stderr doesn't exist
- Fixed a bug that prevented specifying the stdin
Version 1.0 RC 1
User
- Reversed order for process submission: displays the biggest number first and
counts to 0
- Allows preferStorage: when multiple copies of the same files are found, you
can specify which copy you prefer depending on its storage type; if more than
one file is found in the preferred storage, a random one is chosen. (Consult
the manual at "input" element "preferStorage" attribute)
- Better comments in the script: each script also contains the bsub command used
to execute it; makes it easier to resubmit in case of problems.
- Job output specification and scratch space: each process will have a local
scratch directory to work in ($SCRATCH) and in the XML file you can use the
<output> tag to specify which files to bring back after the process is finished.
(Consult the manual on the "output" element)
- Name scheme changed: sched$JOBID.csh for the script and sched$JOBID.list for
the file list.
Developer
- Revised build and development environment
- Better separation of star catalog specifics
- LSFDispatcher refactoring
- Old code eliminated
- Initializer, Policy and Dispatcher are set from the properties file
- Log directory set in the properties file
- $SCRATCH directory set in the properties file
Version 1.0 beta 9
- LSF queue name and bsub extra option can be set through scheduler.properties
- Timeout on the bsub command
Version 1.0 beta 8
- job not submitted if queries and wildcards return no input
- Wildcard added for files on AFS/NFS
- attribute simulation changed to simulateSubmission
Version 1.0 beta 7
- Dispatching failures are now reported
- Dots are displayed while bsub is being called
- Dots are displayed while the catalog query is being executed
Version 1.0 beta 6
- changed filename for script and fileList
- nFiles tag added
- singleCopy tag added
Version 1.0 beta 5
- logging revised
- added the tag maxFilesPerProcess
- changed the queue on which to submit
Version 1.0 beta 4
- checks whether input files and output directories exist
- enforced XML grammar
Version 1.0 beta 3
- mail attribute added
- output stream declaration is enforced
- variable substitution for the I/O stream file names
- added exception logging
- added simulated submission
Levente Hajdu - page was last modified