<?xml version="1.0" encoding="UTF-8"?>
<!-- edited by Jerome LAURET (Brookhaven National Laboratories) -->
<!-- W3C Schema for the Star Unified Meta Scheduler             -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
	<xs:complexType name="stdType">
		<xs:annotation>
			<xs:documentation>I/O streams: "stdin" "stdout" "stderr" elements.</xs:documentation>
		</xs:annotation>
		<xs:attribute name="URL" type="xs:anyURI" use="required">
			<xs:annotation>
				<xs:documentation>
This attribute tells the scheduler which file to redirect the standard input, output 
or error to. The URL must use the file protocol (that is, it must be a local file, accessible 
via the file system), and it should be visible on all machines (for example, a file on NFS or 
AFS).

Remember that stdout and stderr must be different for every process; 
otherwise all the processes that the scheduler divides your job into will overwrite the 
same file. To achieve that, you can use the $JOBID environment variable.

For the input element, the URL tells the scheduler which input files to associate with the processes.

Network file. To specify a file that is accessible on all machines on the same file path, 
you should write "file:/path/name".

File on local disk. To specify a file that is resident on a target machine, that is a 
machine on which the scheduler is allowed to submit the job, you should write 
"file://host/path/name".

Filelist on network disk. You can specify a text file that contains a list 
of files on which to run your analysis. You should write "filelist:/path/name".

Catalog query. To specify a query to the file catalog, you should write 
"catalog:star.bnl.gov?query". Here, catalog: indicates that the URL is a catalog query; 
star.bnl.gov indicates that you are querying the catalog for STAR at BNL; and query is 
the actual query. The query is a comma-separated list of keyword=value pairs
("keyword1=value1, keyword2=value2") that will be forwarded to the file 
catalog. The syntax is the same as that accepted by the -cond parameter of the 
file catalog's command line interface.</xs:documentation>
			</xs:annotation>
		</xs:attribute>
		<xs:attribute name="discard" type="xs:boolean" use="optional">
			<xs:annotation>
				<xs:documentation>
The discard attribute tells the scheduler to discard the stream, that is, to get 
rid of it. This attribute is meaningful only for stdout and stderr (and will be ignored 
otherwise).

Be careful when using this option: when using the GRID you don't know where 
your job is going to run, and the standard output/error are crucial for understanding 
what went wrong.</xs:documentation>
			</xs:annotation>
		</xs:attribute>
	</xs:complexType>
	<xs:complexType name="mapType">
		<xs:annotation>
			<xs:documentation>
Describes any action to be done on an input/output. This complex type has a reference (ref) 
and a pointer to a reference (idref). Those references should be viewed (and used) as references 
to the URI attribute. For example, output has an ID (attribute name ref), copy may refer to its 
value via an IDREF (attribute idref) and define a final target via the URI, which becomes the 
reference (ref) for that action.</xs:documentation>
		</xs:annotation>
		<xs:attribute name="ref" type="xs:ID" use="optional">
			<xs:annotation>
				<xs:documentation>The reference for this object</xs:documentation>
			</xs:annotation>
		</xs:attribute>
		<xs:attribute name="idref" type="xs:IDREF" use="optional">
			<xs:annotation>
				<xs:documentation>A reference to another object</xs:documentation>
			</xs:annotation>
		</xs:attribute>
		<xs:attribute name="URI" type="xs:anyURI" use="optional">
			<xs:annotation>
				<xs:documentation>A URI describing the final product.</xs:documentation>
			</xs:annotation>
		</xs:attribute>
	</xs:complexType>
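	<!--
A minimal sketch of how ref and idref might be combined by elements based on mapType. The
identifier "copy1", the wildcard and the URI are hypothetical, and how the scheduler acts on
them is not specified by this schema:
	<output fromScratch="*.root">
		<copy ref="copy1" URI="file:/star/u/user/results/"/>
		<register idref="copy1"/>
	</output>
Here the register action points back at the copy action through idref="copy1". Because ref is
an xs:ID, its value must be unique within the document; because idref is an xs:IDREF, it must
match an ID that exists in the same document.
	-->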
	<xs:element name="job">
		<xs:annotation>
			<xs:documentation>
The top of the Scheduler submission description schema is the "job" element. This element MUST be present, and 
all other specifications relate to it. The "job" element has many characteristics defined via the attributes 
documented herein.</xs:documentation>
		</xs:annotation>
		<xs:complexType>
			<xs:choice maxOccurs="unbounded">
				<xs:annotation>
					<xs:documentation>
The previous schema used a sequence, but this broke users' old XML files that did not respect strict ordering. 
To move forward, the model was changed from a sequence to a multiple choice. All optional
elements were reverted to non-optional (which makes no difference within a multiple choice
model).</xs:documentation>
				</xs:annotation>
				<xs:element ref="command"/>
				<xs:element name="stdin" type="stdType">
					<xs:annotation>
						<xs:documentation>Standard input</xs:documentation>
					</xs:annotation>
				</xs:element>
				<xs:element name="stdout">
					<xs:annotation>
						<xs:documentation>Standard output</xs:documentation>
					</xs:annotation>
					<xs:complexType>
						<xs:complexContent>
							<xs:extension base="stdType"/>
						</xs:complexContent>
					</xs:complexType>
				</xs:element>
				<xs:element name="stderr" type="stdType">
					<xs:annotation>
						<xs:documentation>Standard error</xs:documentation>
					</xs:annotation>
				</xs:element>
				<xs:element ref="input" maxOccurs="unbounded"/>
				<xs:element ref="output" maxOccurs="unbounded"/>
			</xs:choice>
			<xs:attribute name="simulateSubmission" type="xs:boolean" use="optional">
				<xs:annotation>
					<xs:documentation>
Tells the scheduler whether to dispatch the actual jobs. If true, the file scripts are 
created, but they are not actually submitted. This is useful to check whether everything 
is functioning correctly.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="name" type="xs:string" use="optional">
				<xs:annotation>
					<xs:documentation>
Gives the job a name by which it can be identified in the underlying batch system.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="mail" type="xs:boolean" use="optional" default="false">
				<xs:annotation>
					<xs:documentation>
Tells the scheduler whether to allow the submission of a job that will return its output 
by mail. If this is not set, or is not equal to true, the scheduler will fail if no stdout 
was specified. This option is here to prevent users from accidentally sending themselves all 
the output by mail.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="nProcesses" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>[New field]</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="minFilesPerProcess" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>
Tells the scheduler the minimum number of files each process should run on. The 
scheduler will do its best to meet this requirement, but it is not guaranteed to succeed. 
If a correct distribution is not found, the user will be asked to validate it.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="maxFilesPerProcess" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>
Tells the scheduler the maximum number of input files to assign to each process. 
This number should represent the number of files that your program, by design, must 
not exceed (e.g. after 150 files memory use has increased too much due to 
a memory leak). The actual number of files dispatched to the process is decided by 
the scheduler, which takes into account user requirements (i.e. minFilesPerProcess, 
maxFilesPerProcess and filesPerHour) and farm resources (i.e. the length of the different queues).</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="filesPerHour" type="xs:double" use="optional">
				<xs:annotation>
					<xs:documentation>
Tells the scheduler how many input files per hour the job is going to analyze. 
This information is used by the scheduler to estimate the job 
execution time, which is necessary to determine the correct usage of resources 
(e.g. use the long or short queue). By combining filesPerHour and 
minFilesPerProcess, you can basically tell the scheduler the minimum time 
required by your job, and force the use of long queues. If this attribute is not 
provided, the job is assumed to be instantaneous (i.e. the processes will be 
dispatched to the short queue no matter how many input files they have).</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="fileListSyntax" use="optional" default="paths">
				<xs:annotation>
					<xs:documentation>
This attribute tells the scheduler which syntax to use for the file list. The schema 
allows only the following values: paths (the default) and rootd.

"paths" syntax returns both files on local disk and on distributed disk as a normal 
path used by the filesystem. This syntax is useful within scripts. The "paths" 
syntax looks like this /path1/file1 /path2/file2 /path3/file3 ...

"rootd" syntax returns files on distributed disk with paths, and files on local disk 
with the rootd syntax. It also appends the number of events contained in each 
file. This file syntax is designed to work with the MuDST makers, and has two 
advantages:   
(1) It allows root to access files that are on the local disk of a different node, making 
      it possible to guarantee the minFilesPerProcess    
(2) By giving the number of events in the files, the MuDST maker doesn't have to 
      pre-scan the files, slightly improving performance.
The "rootd" syntax looks like this /NFSpath1/file1 nEvents1 /NFSpath2/file2 nEvents2 
root://machine//path3/file3 nEvents3 root://machine//path4/file4 nEvents4 ...</xs:documentation>
				</xs:annotation>
				<xs:simpleType>
					<xs:restriction base="xs:NMTOKEN">
						<xs:enumeration value="paths"/>
						<xs:enumeration value="rootd"/>
					</xs:restriction>
				</xs:simpleType>
			</xs:attribute>
			<xs:attribute name="inputOrder" type="xs:string" use="optional">
				<xs:annotation>
					<xs:documentation>
This attribute tells the scheduler that you want your input files ordered according 
to the value of some catalog attribute. This does not guarantee that the file lists are always 
in sequence: there can always be gaps. It only reorders the file lists after 
they are produced. This option is available only if all the inputs are catalog queries.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="minStorageSpace" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>Tells the scheduler the minimal storage space (disk most likely) a job will 
need to run. A job should not be scheduled on a node having less space
than this specified number.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="maxStorageSpace" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>Tells the scheduler the maximum storage space (disk most likely) a job will 
need to run. If not specified, the job may fail if it does not have enough space.
This value may be used for advance reservation of storage space.
This is necessary to determine the correct usage of resources.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="minMemory" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>Minimum memory expected for an individual job (in MB). 
Setting this value will affect the scheduling priority.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="maxMemory" type="xs:long" use="optional">
				<xs:annotation>
					<xs:documentation>Maximum memory an individual job is expected to use (in MB). 
Setting this value will affect the scheduling priority.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
		</xs:complexType>
	</xs:element>
	<!--
Here is an example of the <job> element with two of its attributes:
	<job simulateSubmission="true" mail="true">
	....
	</job>
	-->
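	<!--
A fuller sketch of a <job> description that should validate against this schema. The job
name, paths, macro name and numeric values are illustrative assumptions, not recommendations:
	<job name="pionAnalysis" simulateSubmission="true" minFilesPerProcess="20"
	     maxFilesPerProcess="150" filesPerHour="10" fileListSyntax="paths">
		<command>root4star -q -b myMacro.C\(\"$FILELIST\"\)</command>
		<stdout URL="file:/star/u/user/pion/$JOBID.out"/>
		<stderr URL="file:/star/u/user/pion/$JOBID.err"/>
		<input URL="catalog:star.bnl.gov?production=P02gd,filetype=daq_reco_mudst" nFiles="2000"/>
		<output fromScratch="*.root" toURL="file:/star/u/user/pion/"/>
	</job>
With these values, each process is declared to need at least 20 files at 10 files per hour,
that is, roughly 20 / 10 = 2 hours; how the scheduler maps that estimate onto a queue depends
on the farm configuration. Because the content model is a repeated choice, the child elements
may appear in any order.
	-->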
	<xs:element name="output">
		<xs:annotation>
			<xs:documentation>
The output element is used to specify the output produced by your code. With this tag, you can write 
your output to a local scratch directory on the node the job is dispatched to, and the scheduler will copy 
your output back at the end of the job. This makes better use of I/O resources. The environment variable $SCRATCH 
contains a path available to your job. This space is unique for each process your job is divided into, and 
will be deleted after your job ends. With the output element you specify which files you want to bring 
back. You don't need to bring everything back, of course, but whatever is left behind won't be available later. 

Remember that your job will be divided into different processes, and that all the processes should use different 
output filenames; otherwise they will overwrite each other's outputs. You can always use $JOBID to create unique 
filenames.</xs:documentation>
		</xs:annotation>
		<xs:complexType>
			<xs:sequence>
				<xs:element name="copy" minOccurs="0" maxOccurs="unbounded">
					<xs:annotation>
						<xs:documentation>
A service performing a physical file copy from A to B.</xs:documentation>
					</xs:annotation>
					<xs:complexType>
						<xs:complexContent>
							<xs:extension base="mapType">
								<xs:attribute name="storageService" type="xs:string" use="optional"/>
							</xs:extension>
						</xs:complexContent>
					</xs:complexType>
				</xs:element>
				<xs:element name="register" type="mapType" minOccurs="0" maxOccurs="unbounded">
					<xs:annotation>
						<xs:documentation>
A registration service for datasets or files (physical or logical)</xs:documentation>
					</xs:annotation>
				</xs:element>
			</xs:sequence>
			<xs:attribute name="fromScratch" type="xs:string" use="optional">
				<xs:annotation>
					<xs:documentation>
With this attribute you specify either a file, a wildcard or a directory to be copied 
back. The file, wildcard or directory must be expressed relative to the $SCRATCH 
directory.

For example, to retrieve all the .root files your job saved in $SCRATCH, simply use 
*.root in this attribute.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="toURL" type="xs:anyURI" use="optional">
				<xs:annotation>
					<xs:documentation>
Tells the scheduler where to copy the output. The URL must represent either a 
network file or directory.

Network file. To specify a file, you should write "file:/path/name". You can specify 
a file here only if the output you specified is a file (you are not allowed to copy a 
directory into a single file). You can specify a different name, in which case the file will be 
brought back under that name.

Network directory. To specify a directory, you should write "file:/path/".</xs:documentation>
				</xs:annotation>
			</xs:attribute>
		</xs:complexType>
	</xs:element>
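	<!--
Some examples of the output element (the paths and file names are illustrative assumptions):
	<output fromScratch="*.root" toURL="file:/star/u/user/pion/central/" />
	<output fromScratch="histos.root" toURL="file:/star/u/user/pion/central/$JOBID.histos.root" />
The first form copies every .root file found in $SCRATCH into a network directory. The second
copies a single file under a different name; it assumes that $JOBID is substituted in toURL
in the same way as in the stdout and stderr URLs, which keeps the name unique per process.
	-->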
	<!--
Some examples are:
	<stdin URL="file:/star/u/user/pion/central/myinput.in"/>
	<stdout URL="file:/star/u/user/pion/central/$JOBID.out"/>
	<stderr URL="file:/star/u/user/pion/central/$JOBID.err"/>
	<stdin URL="file:/star/u/user/scheduler/inputs/goldFullField.param"/>
	<stdout URL="file:/star/u/user/scheduler/gold/fullField/$JOBID.out"/>
	<stderr URL="file:/star/u/user/scheduler/err/goldFullField$JOBID.err"/>
	<stdout discard="true" />
	-->
	<xs:element name="input">
		<xs:annotation>
			<xs:documentation>
The input element is used to specify data input files. An input can be a path and filename 
resident on network-mounted disks, such as AFS or NFS; a file on a local disk; or a query to the 
file catalog. We suggest that you use the latter, because it gives the system more flexibility in how to allocate 
your processes. You can specify more than one input element. You can mix NFS files with local files and catalog 
queries. You can have more than one catalog query. In all cases, you specify the location of the input files with a 
URL.</xs:documentation>
		</xs:annotation>
		<xs:complexType>
			<xs:attribute name="URL" type="xs:string" use="required"/>
			<xs:attribute name="nFiles" type="xs:string" use="optional" default="100">
				<xs:annotation>
					<xs:documentation>The number of files returned by the query</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="singleCopy" type="xs:boolean" use="optional" default="true">
				<xs:annotation>
					<xs:documentation>
Specify if the query should return one copy for each file, or it should return multiple 
copies if they are available.

For example, suppose one file has two copies: one on rcas6060.star.bnl.gov and 
one on NFS. By selecting "true", only one of them is returned. By selecting "false", 
both of them can be returned. In the second case, your job will actually run on two 
copies of the same file.

By default only one copy of the same file is returned.</xs:documentation>
				</xs:annotation>
			</xs:attribute>
			<xs:attribute name="preferStorage" use="optional">
				<xs:annotation>
					<xs:documentation>
When multiple copies are available for one file, this attribute is used to choose which 
particular copy to get. This attribute is meaningful only if singleCopy is not set to false.
If more than one copy is available with the preferred storage (for example, a file
is available on two different machines), one copy is chosen at random.

This attribute was introduced because small jobs on small sets of files are penalized 
when dispatched on local files: they have to wait for a particular machine to free 
up, and that might take a long time even if the rest of the farm is free. Executing on 
local files makes each job faster, but it might increase the waiting time before the job 
gets executed. Therefore, NFS is recommended only for testing your analysis on a 
small set, and local storage when you run on the entire set.

Remember that the query might return local files even if you chose NFS. If you want 
_only_ NFS or _only_ local files, then put "storage=NFS" or "storage=local" inside your query.
					  </xs:documentation>
				</xs:annotation>
				<xs:simpleType>
					<xs:restriction base="xs:NMTOKEN">
						<xs:enumeration value="local"/>
						<xs:enumeration value="NFS"/>
						<xs:enumeration value="HPSS"/>
					</xs:restriction>
				</xs:simpleType>
			</xs:attribute>
		</xs:complexType>
	</xs:element>
	<!-- 
Some more examples of inputs are
	<input URL="file:/star/data15/reco/productionCentral/FullField/P02ge/2001/322/st_physics_2322006_raw_0016.MuDst.root" />
	<input URL="file:/star/data15/reco/productionCentral/FullField/P02ge/2001/*/*.MuDst.root" />
	<input URL="file://rcas6078.rcf.bnl.gov/home/starreco/reco/productionCentral/FullField/P02gd/2001/279/st_physics_2279005_raw_0285.MuDst.root" />
	<input URL="filelist:/star/u/user/username/filelists/mylist.list" />
	<input URL="catalog:star.bnl.gov?production=P02gd,filetype=daq_reco_mudst,storage=local" nFiles="2000" />
The file catalog is actually a separate tool from the scheduler. You can find the documentation 
for the file catalog, specifying all the keywords and options, at
	 http://www.star.bnl.gov/comp/sofi/FileCatalog.html.
	 
If the URL represents a file catalog, more attributes are available to better specify the query.
	-->
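	<!--
A sketch of a catalog query combined with the catalog-specific attributes. The query keywords
are reused from the examples above; the attribute values are illustrative assumptions:
	<input URL="catalog:star.bnl.gov?production=P02gd,filetype=daq_reco_mudst" nFiles="500" singleCopy="true" preferStorage="NFS" />
This requests 500 files, one copy per file, preferring the NFS copy when a file has several.
As preferStorage is only a preference, the result may still contain local files; adding
"storage=NFS" to the query itself is the way to restrict the result to NFS.
	-->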
	<xs:element name="command" type="xs:string">
		<xs:annotation>
			<xs:documentation>
The command element doesn't have any attributes, and the data it contains is the actual command script, which 
will be submitted using a csh script. You can use environment variables to retrieve special information, such as the 
JOBID, the FILELIST, and more. But remember that the command line is passed as is; therefore, if csh doesn't 
perform the substitution (for example, because the part of your command containing the variable is between '...'), the 
scheduler won't either. Refer to the csh man pages. If you have doubts about the correct execution of your command, you can 
simulate the submission and manually check the script.</xs:documentation>
		</xs:annotation>
	</xs:element>
	<!--
Some examples of the command element are:
	<command>echo test</command>
	<command>root4star -q -b numberOfEventsList.C\(\"$FILELIST\"\)</command>
	<command>
	stardev
	root4star -q -b findTheHiggs.C\(234,\"b\",\"$JOBID\",\"$FILELIST\"\)
	</command>

//TODO add loop example
	-->
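	<!--
A possible loop example, sketched under two assumptions: that $FILELIST names a text file
containing whitespace-separated paths (the "paths" syntax), and that the command is run by
csh as described above. The echo line is a placeholder for the real per-file work:
	<command>
	foreach file (`cat $FILELIST`)
		echo "Processing $file"
	end
	</command>
With the "rootd" syntax each path is followed by an event count, so this simple loop would
also iterate over those numbers and would need to be adapted.
	-->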
</xs:schema>
