Scheduler FAQ

Questions

  1. How do I submit a job?
  2. What is the quickest way to learn how to use the scheduler?
  3. I submitted a job, and it created lots of files!
  4. It's not working! My jobs! What do I do!!! ARGH!
  5. How do I re-submit a job that died?
  6. How do I pass MY job name?
  7. How do I control which library version I use in the scheduler?
  8. How do I use the file catalog query? What is the syntax to get these or those files?
  9. How do I use macros like doEvents.C or bfc.C with the scheduler?
  10. In which directory do I run?
  11. How do I tell the scheduler which queue to use?
  12. I am having weird problems (e.g. no output from the scheduler, no scripts are created, no jobs are submitted)
  13. How do I know the options of a particular keyword in the catalog query? How do I know which "filetype" values are available?
  14. How do I specify parameters while submitting a job?
  15. I'm getting all sorts of illegal character errors. How can I use '&', '<' and '>'?
  16. How can I learn a little bit about XML?
  17. I have heard that I can use different policies... Where are they listed?
  18. How do I give my output file the same base-name as my input file?
  19. How do I get more information about crashes from my log file?
  20. How can I resubmit jobs?
  21. How do I kill a request or a job?
  22. How can I find out if any of my job submissions failed?
  23. Why is the number of files selected smaller than the number of files returned by the file catalog?
  24. How can I switch from rootd to xrootd (eXtended rootd)?
  25. Why do I get (/bin/cp: no match) at the end of my job and no output files back?
  26. Is it guaranteed that one gets each file at most only once (No duplicates)?
  27. How do I delete the large number of files the scheduler produced when rm fails?
  28. What does it mean when the scheduler tells me: The configuration requested a limitation of the number of INPUTFILE(n) environment variables at 200, the rest will be commented out?
  29. When using the STAR file catalog, when should the keyword "storage=xxx" be used?
  30. When running over the same files over and over again, the catalog slows down. Can I submit my jobs without re-querying the catalog?
  31. How do I better manage the thousands of files the scheduler writes?
  32. What is this !!!You need to update your XML!!! message?
  33. How do I run simulation jobs which do not have input files?
  34. How do I avoid stressing the file system with my jobs?

1. How do I submit a job?

First, you need to prepare your XML job description. Once the file is ready, type:

star-submit jobDescription.xml

where jobDescription.xml is the name of your job description file.
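
For reference, a minimal description might look like the sketch below; the macro name and the output path are placeholders, and every element shown also appears in the examples later in this FAQ:

<?xml version="1.0" encoding="utf-8" ?>
<job maxFilesPerProcess="20">
    <!-- run the macro over the file list assigned to this process -->
    <command>root4star -b -q myMacro.C\(\"$FILELIST\"\)</command>
    <!-- which files to run over, taken from the STAR file catalog -->
    <input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst,storage=local" nFiles="100"/>
    <stdout URL="file:./sched$JOBID.out" />
    <!-- copy the ROOT files produced in $SCRATCH back to a safe place -->
    <output fromScratch="*.root" toURL="file:/star/u/myuser/output/" />
</job>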

2. What is the quickest way to learn how to use the scheduler?

You can use one of the "cut and paste" examples. Of course, you still have to change the command line and the input files. I am sorry I couldn't prepare the exact file for you... :-)

3. I submitted a job, and it created lots of files!

Yes, this is normal. For every process, a script and a file list are created. This is something that will be fixed in the final version. You can delete them easily, since their names all start with script and fileList. Remember, though, to delete them only _AFTER_ the jobs have finished.
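
For example, once all of your jobs have finished, something like the following should clean them up (assuming the default script* and fileList* names mentioned above):

rm script* fileList*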

4. It's not working! My jobs! What do I do!!! ARGH!

Well, you shouldn't panic like that! You can send an e-mail to the scheduling mailing list, and somebody will help you.

5. How do I re-submit a job that died?

In the comment at the beginning of each script there is the bsub command used to submit the job. You can copy and paste it on the command line and execute it. Be sure you are in the same directory as the script.

6. How do I pass MY job name?

You can use the name attribute for job like this:

<job ... name="My name" ... >

7. How do I control which library version I use in the scheduler?

In the command section you can put more than one command. You can actually put a csh script. So you can write:

<command>
starver SL02i
root4star mymacro.C
</command>

8. How do I use the file catalog query? What is the syntax to get these or those files?

The file catalog is actually a separate tool from the scheduler. When you write a query, the get_file_list.pl command is used. So, the full documentation for the query is available in the file catalog manual. You will be interested in the -cond parameter, which is the one you are going to specify in the scheduler.
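
For example, a quick sketch (the keys and filetype here are only illustrations; see the catalog manual for the full keyword list): on the command line you could test a condition with

get_file_list.pl -keys filename -cond filetype=daq_reco_MuDst,storage=local

and the same condition string is what goes after the "?" in the scheduler's <input> element:

<input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst,storage=local" nFiles="100"/>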

9. How do I use macros like doEvents.C or bfc.C with the scheduler?

If you are asking this question, it's because you have been trying to submit something like:

<command>root4star -b -q doEvents.C\(9999,\"$FILELIST\"\)</command>

This won't work because doEvents interprets $FILELIST as an input file and not as a file list. But if you put @ before the filename, doEvents (and bfc.C, ...) will interpret it correctly as a list. So you should have something like:

<command>root4star -b -q doEvents.C\(9999,\"@$FILELIST\"\)</command>

10. In which directory do I run?

Before version 1.8.6, the job starts in the default location for the particular batch system.

If you are using LSF, jobs will execute in the directory from which you submitted them, which is the same directory where the scripts are created, and also the directory you should be in when resubmitting jobs.

In version 1.8.6 and above, the default directory in which the job starts is defined by the environment variable $SCRATCH. This will most likely be a directory local to the worker node. The base directory path will be different for every site. The directory and all its contents will be deleted as soon as the job is finished; for this reason, do not ever attempt to change this directory. All files that need to be saved must be copied back to some other directory.

11. How do I tell the scheduler which queue to use?

You don't, but you can tell SUMS (a.k.a. the STAR Scheduler) about your job, so that SUMS can choose the correct queue or modify the job to fit into an existing queue.
Remember: SUMS has to work at different sites, where the queue names, the number of queues, and their limitations all differ. SUMS knows these values through a configuration file maintained by facility or team experts knowledgeable of such limitations. Also note that if queue characteristics are modified or new queues open up, these changes are made transparently to you, while still allowing your jobs to take advantage of them immediately.

If you do not give SUMS any hint, it will be forced to make assumptions based on the provided properties of your job and hence choose a queue based on those assumptions. Queues typically have limitations on run time (CPU or wall clock), memory, and storage space. If your job exceeds any of these limitations, it may be killed by the batch system.

SUMS uses the filesPerHour attribute, which can be defined in the job tag, to estimate how long a job will run. Here is an example of how you can define this:
<job filesPerHour="25">
This is a way of specifying that your job can process, on average, 25 files in one hour. So a job with 50 files is assumed to run for two hours and a job with one file is assumed to run for 2.4 minutes. If a smaller number is used, like filesPerHour="0.25", a job with one file will run for four hours and a job with two files will run for eight hours.

If the distribution of events per file is wide, making the run times of the jobs very different (e.g. some jobs run for one hour and others for over ten hours), it may be more desirable to switch over to the events keyword group, which allows a more even number of events in each job. So instead of using filesPerHour, minFilesPerProcess, and maxFilesPerProcess, you use eventsPerHour, minEvents, and maxEvents. The events keywords will drop files in order to meet the requirement. Since most users want as many statistics as possible, even at the cost of more variation in run time, it is recommended that the softLimits keyword be set to true: the scheduler will still try to meet the requirements, but if it can't, it will get each job as near to the requested min and max as it can. For example, if you set maxEvents to 1,000 and a single file has 5,000 events in it, then with softLimits not set or set to false the file with 5,000 events will be dropped; with softLimits set to true, the file will be given its own job, because that is the best that can be done to get it as close to the requested maximum of 1,000 as possible.

Here is an example of how it would be used:
<job eventsPerHour="2300" minEvents="2000" maxEvents="5000" softLimits="true" >

SUMS has a preferred queue, usually the shortest, highest-priority queue. So as the value of filesPerHour gets smaller, SUMS will try to group fewer files into one job to keep it under the time limit of its preferred queue, which results in a greater number of jobs. If for some reason you want to force SUMS to send your jobs to a queue with a longer time limit, you can do this by setting limits on the number of files per job with minFilesPerProcess and maxFilesPerProcess. In particular, a combination of minFilesPerProcess and filesPerHour clearly indicates the minimum time for a job. If the job's time requirement exceeds the limit of the preferred queue, SUMS will try the second most preferred queue, and so on until it finds a queue that meets the defined requirements. If no such queue exists, for example if the longest queue has a time limit of a week and your job requests 12 days, the job will not be submitted and SUMS will print a warning. It is recommended to allow a little wiggle room between minFilesPerProcess and maxFilesPerProcess, because SUMS will complain if, for example, you ask for exactly N files in each job and the number of files returned by the catalog is not divisible by N.
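
For example, a job that processes roughly two files per hour and is grouped into bundles of 10 to 20 files (and therefore needs a queue allowing at least about five hours) could be declared as:

<job filesPerHour="2" minFilesPerProcess="10" maxFilesPerProcess="20">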

There will always be some variation in run time because of differences in worker node hardware, load, variation in event complexity, file system load, and so on. So it is recommended to provide a little slack in filesPerHour or eventsPerHour (in other words, make the number a little lower). If you guess that your job can process 100 files per hour when in reality it can only process 10, and you ask for a maximum of 100 files per job, the job will actually run for 10 hours while the scheduler assumes a maximum of 1 hour. With a real run time of 10 hours and an assumed run time of 1 hour, the scheduler could submit the job to a queue with a run-time limit of 2, 3, 4, or 5 hours, depending on what your site offers. Once the job exceeds the hard run-time limit of the queue, it will be forcibly stopped even if it has not finished. This is why it is important to get this number close to reality, or at least not off by huge factors. Presumably, before running your script over hundreds or thousands of files you have tested it with one or two files; use that test to estimate filesPerHour or eventsPerHour. It is not an exact science, but close is good enough.

Now, if you're having a hard time guessing what the scheduler thinks your run time is, why it sent your job to a particular queue, or even which queue it sent it to, look at the *.report file. This is a text file with a table of your jobs and some key parameters (stretch the window wide, or the text will wrap around and distort the table). Below the job table is a smaller table showing the queues, defined in your site's scheduler policy, that the scheduler had to select from. If you want to view all of this before submitting your jobs, you can use the simulateSubmission keyword.
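
For example, to generate the scripts and the *.report file without actually submitting anything, you can add the keyword to the job tag (assuming, as in the manual, that it takes a true/false value):

<job simulateSubmission="true" filesPerHour="10" maxFilesPerProcess="20">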


Please refer to the SUMS manual for defining other properties of your job such as memory and storage space.

You can look at this example.

12. I am having weird problems (e.g. no output from the scheduler, no scripts are created, no jobs are submitted)

Check your .cshrc file. First of all, as Jerome says:

***********************************************
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
**** DO NOT SET LSF ENVIRONMENTS YOURSELF *****
***********************************************

Furthermore, you may want to have a look at this page.

13. How do I know the options of a particular keyword in the catalog query? How do I know which "filetype" values are available?

You can use get_file_list.pl to find out which values a particular keyword can have. For example:

[rcas6023] ~/> get_file_list.pl -keys filetype
daq_reco_dst
daq_reco_emcEvent
daq_reco_event
daq_reco_hist
daq_reco_MuDst
daq_reco_runco
daq_reco_tags
dummy_root
...

You can do the same for any keyword.

14. How do I specify parameters while submitting a job?

You might want to have a look at star-submit-template. Here is an example.
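
A rough sketch of the usual pattern (the option names and the &runnumber; entity here should be checked against the SUMS manual): parameters are declared as XML entities in the template and given values on the command line, for example

star-submit-template -template myTemplate.xml -entities runnumber=2280003

and inside myTemplate.xml the parameter is referenced as &runnumber; wherever its value is needed.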

15. I'm getting all sorts of illegal character errors. How can I use '&', '<' and '>'?

This is an XML problem: '<', '&' and '>' are reserved in XML, so you can't use them directly; you have to use something else in their place. Use the following table for the conversion:

< &lt;
> &gt;
& &amp;

So, for example, if you need to put the following in a query:

runnumber<>2280003&&2280004

you would have to write it as:

runnumber&lt;&gt;2280003&amp;&amp;2280004

Yes, it doesn't look so nice... :-( There is unfortunately not much I can do...

16. How can I learn a little bit about XML?

I suggest you have a look at this site: it has a lot of tutorials about XML, HTML, and other web technologies. It is at an entry level and has a lot of examples.

17. I have heard that I can use different policies... Where are they listed?

You can find a list of the installed policies here.

18. How do I give my output file the same base-name as my input file?

Use $FILEBASENAME as the stdout file name. This only works in version 1.6.2 and up, and only if there is one output file or one file per process.

Consult the manual for more details.
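
For illustration, a minimal sketch that names the log file after the input file:

<stdout URL="file:./$FILEBASENAME.log" />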

19. How do I get more information about crashes from my log file?

Emphasis is always placed on generating detailed and meaningful user feedback at the prompt when errors occur. However, more information can be found in the user's log file. If the scheduler did not crash altogether, it appends to the user's log file, where the most detailed record of its internal workings can be found.

Every user that has ever used the scheduler, even just once, has a log file. It holds information about what the scheduler did with your jobs. To find out more about reading your log file, click here.

20. How can I resubmit jobs?

Note: This only refers to version 1.7.5 and above, as resubmission was only built in starting with this version. If you submitted via an older scheduler version, the resubmit syntax will not work and you will not have a session file.

Note: When a job is resubmitted, the file query is not redone; the scheduler uses the exact same files as the original job. Be careful when resubmitting old jobs, as the path to the files may have changed.

When a request is submitted by the scheduler, it produces a file named [jobID].session.xml. This is a semi-human-readable file that contains everything you need to regenerate the .csh and .list files and to resubmit all or part of your job.

If you wish to resubmit the whole job, the command is (replace [jobID] with your job Id):  star-submit -r all [jobID].session.xml

Example: star-submit -r all 08077748F46A7168F5AF92EC3A6E560A.session.xml

If you wish to resubmit a particular job number, the command is (where n is the job number): star-submit -r n [jobID].session.xml

To resubmit all of the failed jobs, the command is: star-submit -f all [jobID].session.xml

Type star-submit -h for more options and for help. There are many more options available; this is only a very short overview of the resubmission options.

21. How do I kill a request or a job?

Note: This only refers to version 1.7.5 and above, as resubmission was only built in starting with this version. If you submitted via an older scheduler version, the resubmit syntax will not work and you will not have a session file. It is also recommended that you read question 20 (How can I resubmit jobs?), as there is more information about the session file there.

The command to kill all the jobs in the submission is (replace [jobID] with your job Id):  star-submit -k all [jobID].session.xml

If you wish only to kill some of the jobs, replace the word all with a single job number (for job 08077748F46A7168F5AF92EC3A6E560A_4 the number is 4). A comma-delimited list may also be used (example: star-submit -k 5,6,10,22 [jobID].session.xml), or a range (example: star-submit -k 4-23 [jobID].session.xml).

22. How can I find out if any of my job submissions failed?

Note: This only refers to version 1.7.5 and above.

This information is stored in the [jobID].report file in a nice, neat table (I recommend you turn off word wrap to view this file). We are trying to put more and more information for users in this file with every new version. The file also stores information about queue selection, so it will probably answer questions such as "Why did my job have to go into the long queue as opposed to the short queue?"

23. Why is the number of files selected smaller than the number of files returned by the file catalog?

This is because some files may not be accessible, such as files on HPSS when not using xrootd. Duplicate files are also dropped, so two or more copies of the same file returned at more than one location are not counted twice.

24. How can I switch from rootd to xrootd (eXtended rootd)?

Switching between these two systems is done by specifying the requested system in the fileListSyntax attribute of the job element. See the job element section and example.
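
For example, a minimal sketch:

<job fileListSyntax="xrootd" maxFilesPerProcess="20">

The other syntaxes mentioned in this FAQ are "paths" and "rootd".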


Note: In the STAR framework (root4star), xrootd syntax is supported for libraries SL05f and above. Please be aware of this restriction: SUMS will generate and submit the jobs, but they will not succeed at runtime with an older library.

25. Why do I get (/bin/cp: no match) at the end of my job and no output files back?

In a non-grid context, when you ask for data to be moved back from $SCRATCH using a tag like this one:

<output fromScratch="*.root" toURL="file:/star/u/lbhajdu/temp/" />

it is translated into a cp command like the one below:

/bin/cp -r $SCRATCH/*.root /star/u/lbhajdu/temp/

If the cp command returns "/bin/cp: no match", it means the pattern did not match anything, because your macro did not generate any files for it to copy. You can verify this by adding an "ls $SCRATCH/*" to the command block of your job right after your macro finishes, to list what files it has produced in $SCRATCH, as in the sketch below.
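
A minimal sketch (myMacro.C is a placeholder for your own macro):

<command>
root4star -b -q myMacro.C\(\"$FILELIST\"\)
ls -l $SCRATCH
</command>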

Examine your macro carefully to see what it's doing. It could be writing your files to a different directory, or not writing any files at all or crashing before it gets a chance to write anything.

26. Is it guaranteed that one gets each file at most only once (No duplicates)?

You are guaranteed that duplicate files are dropped as long as you get them from the catalog. Duplicate files are not dropped from a file list. The dropping of duplicates is based on the fdid of the file. The .dataset file holds the fdid of all the files, so you can verify that it did in fact work.

The scheduler's output to the screen tells you how many files were dropped because they were duplicates. The scheduler selects between duplicates to pick the one that can be accessed the fastest, based on the storage type (there is a ranking). The user can override this with the preferStorage keyword (see the user manual). Xrootd may recover the file from a storage type other than the one stated.
 

27. How do I delete the large number of files the scheduler produced when rm fails?

Use this command:
delete '*.*'
The single quotes are needed to prevent shell expansion.


28. What does it mean when the scheduler tells me: The configuration requested a limitation of the number of INPUTFILE(n) environment variables at 200, the rest will be commented out?

There are two ways people pass input files to their macro:

1) The file list (.list file) created with every job, passed via the $FILELIST environment variable.
2) The variables $INPUTFILE[n] defined in the csh script.

If you're using the $FILELIST environment variable to pass the input file name(s) and location(s) to your executable, this message has no effect on you and you can ignore it. Example:
root4star -q -b myMacro.C\(\"$FILELIST\"\)

On the other hand, if you are using the $INPUTFILE0, $INPUTFILE1, $INPUTFILE2, ..., $INPUTFILE[n] environment variables to pass the input file name(s) and location(s) to your executable, be very afraid. There are only so many characters that will fit into the memory of a csh script, so we limit this to 200 files ($INPUTFILE199, typically) to avoid memory overflow. So even though the job may have more files in its file list, the job's environment will only have environment variables for the first 200.

If you want to use the $INPUTFILE[n] environment variables for passing the location and name of the input files, it is recommended that you keep the number of input files per job below the limit (200) by using the maxFilesPerProcess attribute, like this:

<job maxFilesPerProcess="50" ...

29. When using the STAR file catalog, when should the keyword "storage=xxx" be used?

The "paths" and "rootd" file list syntaxes should use "storage!=hpss", because HPSS files are not mounted: these files will be dropped by the scheduler, and even if they were not, they would not be accessible.

If you are using "fileListSyntax=xrootd" (currently only available at RCF), xrootd will determine how to give you access to the file, so the storage element is less critical. Using "storage!=local" would pick files from HPSS and from NFS if they exist. If files are available on NFS, you will get much faster access than to the files on HPSS. To prevent excessive dropping of duplicates (load on the scheduler and the file catalog), you can use "storage=HPSS", because only one copy of each file exists on HPSS.
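
For example (the filetype is only an illustration):

<input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst,storage!=hpss" nFiles="100"/>

for the "paths" or "rootd" syntax, and

<input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst,storage=HPSS" nFiles="100"/>

when using the xrootd file list syntax.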

30. I'm running over the same files over and over again, and the catalog slows down. Can I submit my jobs without re-querying the catalog?

If you have recently submitted jobs with the scheduler, you still have your .session.xml and your .dataset file, and you don't need to change anything in your XML request (input, output, command), you can recompile your macro and resubmit the jobs without doing another catalog query.

To resubmit in this mode, cd back to the folder you submitted the jobs from and use:

star-submit -r all 2596CF4AB570DE769AC325EB21864616.session.xml

Of course, replace the .session.xml file above with your own .session.xml file, generated for you by the scheduler right after it submitted all of your jobs. The .csh files will be overwritten or regenerated from the information in the .session.xml, and the .list files will be rewritten with the information from the .dataset file, so any changes you make to these files by hand will be lost.

31. How do I better manage the thousands of files the scheduler writes?

In scheduler version 1.10.0c and above there is an option to have the scheduler write the files somewhere other than your current working directory. A basic example of this capability would be to create a subdirectory of your current working directory and have the scheduler copy everything there. To do this, first create the directory (mkdir ./myFiles). Then add these tags inside the body of your XML file's job tag (that is, at the same level as the command tag):

<Generator><Location>./myFiles</Location></Generator>

These tags have far more fine-grained options, fully documented in section 5.1 of the manual.

32. What is this !!!You need to update your XML!!! message?

In order to expand the scheduler's functionality further and implement many of the changes people have been asking for, we need to expand the XML job description. This means turning some of the XML attributes into <elements> so that sub-elements can be added later. We will try to be gentle: in version 1.10.0 the scheduler tells you how to upgrade your XML and make it functionally equivalent.

33. How do I run simulation jobs which do not have input files?

Use the nProcesses option to set the number of jobs you want to submit. Use filesPerHour to set the expected run time of the jobs. Note that because there are no input files, filesPerHour is interpreted as the number of hours for which the job will run.
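
For illustration, a minimal sketch of a job with no input files (the macro name, the output path, and the filesPerHour value are placeholders):

<job nProcesses="100" filesPerHour="2">
    <command>root4star -b -q mySimulation.C\(\"$JOBID\"\)</command>
    <stdout URL="file:./sched$JOBID.out" />
    <output fromScratch="*.root" toURL="file:/star/u/myuser/simOutput/" />
</job>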

34. How do I avoid stressing the file system with my jobs?

Here at BNL, when running under the scheduler, it is more efficient to write your output files to $SCRATCH (the area where your job starts) and then copy them to your final location once, as opposed to having your job open files and write little chunks at a time to a networked space such as /star/u/*, /star/scratch or /star/institutions (to name only those). The storage itself will be more stable and your job won’t die from little network hiccups.

For this, the STAR Scheduler (SUMS) provides a variable defined in your jobs called $SCRATCH, which points to local storage. There are several benefits to using it:

  1. Since this storage space is local to the node, reads and writes won’t go over the network, and so will not stall or time out because of network hiccups.
  2. You don’t have to worry about cleaning this space up when you’re done; let SUMS take care of that.
  3. The space (directory) is unique for each job, so files with the same name won’t collide.
  4. Your job starts up in that location by default, so you don’t need to do any magic to get to $SCRATCH, and it’s defined in your running job’s environment (like $USER is, for example).
  5. In /star/u/$USER your space is finite: if you’re creating a lot of intermediate files, you must have available to you the peak space your jobs need. With $SCRATCH each job gets its very own few gigabytes of space, regardless of how many jobs are running at once. All you need is enough space to copy back the final output you want to keep (usually not on /star/u but on an institution disk or a global user area).
  6. Ever have a job go crazy, get stuck in a loop and start writing ginormous output files? Instead of filling up all your free space and causing other jobs to crash, the one rogue job will get killed after it has reached some reasonable output limit, leaving your precious output space alone.

The number one reason users cd back to their home directories is that they don’t know how to bundle up their work and move it with their jobs. This section aims to instruct/remind you how to do that.

1) First, write and debug your macro (obviously this will be different for everyone). For the purpose of this explanation, let us assume you have written a macro named findTheQGP.C.

2) Start preparing your scheduler XML. Now, instead of moving back to your home directory, write your command block as if everything you need is available right where your job starts up (this area will be $SCRATCH).

<command>
      stardev
      root4star -q -b findTheQGP.C\(234,\"b\",\"$JOBID\",\"$FILELIST\"\)
</command>
3) Now think about what files your job needs to run. We are going to add a new block called <SandBox>, which will bring our macro and any additional files it depends on with us to $SCRATCH. Here is an example:

<SandBox installer="ZIP"> 
      <Package name="myQGPMacroFile">
          <File>file:./findTheQGP.C</File> 
          <File>file:./StarDb/</File> 
          <File>file:./.sl53_gcc432/</File>
      </Package>
</SandBox>
This block tells our job that we need to take with us the macro findTheQGP.C and the folder StarDb, as well as the whole tree ./.sl53_gcc432/ (which would likely contain local STAR code you may need and have compiled).

We did not use the full path, although we could have. The ./ here means these files are in the directory we are submitting from (the paths in the submission description are relative to where you submit your jobs). The installer="ZIP" attribute tells the scheduler to zip up all these files and unzip them before our job starts. You can add as many files this way as you want. Don’t forget to add any hidden files your job depends on; you can check your working directory for these with the command ls -a.

When submitting your jobs you will notice the message:
Zip Sandbox package myQGPMacroFile.zip created
where the name comes from the name="myQGPMacroFile" attribute.

You will also notice the zip file in your submitting directory:

  [rcas6019] ~/temp/> ls -l myQGPMacroFile.zip
  -rw-r--r-- 1 lbhajdu rhstar 3342 May 9 11:57 myQGPMacroFile.zip


It should be noted that after this point, any change made to the original macro, the code in your tree, etc. will NOT affect the running jobs. In other words, the package will not be zipped again for the set of jobs you have just submitted (think of it as a container preserving your work). So if you made an error, you should delete the zip file before trying another submission. The zip file is created in the directory from which you submit your jobs.

You can put an ls command in your command block to verify that all of your files have been moved. You will see the scheduler wrapper placing your files in the job's log, and the ls command will show you the files, just in case you’re not sure of the final configuration:

Archive: myQGPMacroFile.zip
inflating: /tmp/lbhajdu/FC770BA9B391C27970BDE1312D8BA50D_0/numberOfEventsList.C
creating: /tmp/lbhajdu/FC770BA9B391C27970BDE1312D8BA50D_0/StarDb/
creating: /tmp/lbhajdu/FC770BA9B391C27970BDE1312D8BA50D_0/StarDb/Calibrations/
creating: 
… /tmp/lbhajdu/FC770BA9B391C27970BDE1312D8BA50D_0/StarDb/Calibrations/tpc/tpcGridLeak.20060509.120001.C
--------------------------------
STAR Unified Meta Scheduler 1.10.8 we are starting in $SCRATCH : /home/tmp/lbhajdu/FC770BA9B391C27970BDE1312D8BA50D_0
--------------------------------
total 8
-rw-r--r-- 1 lbhajdu rhstar 2689 Jul 6 2006 numberOfEventsList.C
drwxr-xr-x 3 lbhajdu rhstar 4096 May 9 11:57 StarDb
/tmp/lbhajdu/FC770BA9B391C27970BDE1312D8BA50D_0
total 8

4) There is one final item to take care of: getting our output files back from $SCRATCH. But don’t worry, there is a block for that as well. Let us assume we only want ROOT files back (picoDST, histograms, whatever the job creates); then we could add something like this to our XML (and yes, you can add as many of these as you like):

<output fromScratch="*.root" toURL="file:/star/data05/scratch/inewton/temp/" />


5) Besides the stream to the output files, there is another stream open for the log files. Redirecting it to the local disk provides the same benefits as writing the output files locally. 99% of the output comes from root4star, or whatever program is actually doing the work (an event generator, for example). Simple Linux redirection is all that is needed here.

<command>
  ls
  stardev
  root4star -q -b findTheQGP.C\(234,\"b\",\"$JOBID\",\"$FILELIST\"\) &gt;&amp; ${JOBID}.log
</command>

Note the redirection at the end of the root4star command. This takes all the output from root4star's output and error streams and redirects it to the file ${JOBID}.log, which is located in the current working directory, in this case $SCRATCH. So this file also has to be copied back once the job is done, and we can do that the same way as before:

<output fromScratch="*.log" toURL="file:/star/data05/scratch/inewton/" />


So let’s put this all together to see what the final XML might look like:

<?xml version="1.0" encoding="utf-8" ?>
<job maxFilesPerProcess="50">
    <command>
        ls
        stardev
        root4star -q -b findTheQGP.C\(234,\"b\",\"$JOBID\",\"$FILELIST\"\) &gt;&amp; ${JOBID}.log
    </command>
    <SandBox installer="ZIP">
        <Package name="myQGPMacroFile">
            <File>file:./findTheQGP.C</File>
            <File>file:./StarDb/</File>
        </Package>
    </SandBox>
    <input URL="catalog:star.bnl.gov?filetype=daq_reco_MuDst,storage=local" nFiles="400"/>
    <stderr URL="file:./shed$JOBID.error.out" />
    <stdout URL="file:./shed$JOBID.out" />
    <output fromScratch="*.MuDst.root" toURL="file:/star/data05/scratch/inewton/"/>
    <output fromScratch="*.log" toURL="file:/star/data05/scratch/inewton/"/>
</job>

