RDL Comparison Homework

Here is a comparison between the U-JDL and the new RDL.

Example 1: simple shell script task. RDL 0.51 and comments.

Note that a shell script can be given either as the name of a shell script or as a list of shell commands (it would be nice if the latter form were understood as such).
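
One way to treat both forms uniformly is to wrap whatever appears in the command body into a small csh script, since a script name on a line by itself is also a valid shell command. The sketch below is only an illustration under that assumption: the function name `build_wrapper` and the interpreter path `/bin/csh` are hypothetical and are not part of U-JDL or RDL.

```python
import textwrap


def build_wrapper(command_body: str, shell: str = "/bin/csh") -> str:
    """Turn a command body into the text of a wrapper script.

    The body may be a single script name or several shell commands;
    both are written out unchanged, one per line, below a shebang line.
    The default shell path is an assumption made for this sketch.
    """
    # Remove the common indentation left over from the XML layout.
    body = textwrap.dedent(command_body).strip()
    return f"#!{shell}\n{body}\n"


# A list of commands, as in the example job.
inline = build_wrapper("""
    echo This is a test
    date > $JOBID.root
    java load 100
""")

# A bare script name (hypothetical) goes through the same path.
by_name = build_wrapper("myscript.csh")
print(inline)
print(by_name)
```

With this approach no guess is needed about which of the two forms the user meant; variables such as $JOBID are left untouched for the scheduler or the shell to resolve.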

<?xml version="1.0" encoding="UTF-8"?>
<job name="Test1" maxFilesPerProcess="50" minFilesPerProcess="25" minMemory="5" maxMemory="10" maxStorageSpace="40" minStorageSpace="30" simulateSubmission="true" filesPerHour="4" fileListSyntax="rootd" minWallTime="12" maxWallTime="30" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="SUMS_UJDL.xsd">
        <command>
                 echo This is a test
                 date > $JOBID.root
                 java load 100
        </command>
        <stdout URL="file:/star/u/xxx/temp/shed$JOBID.out"/>
        <input URL="file://rcas6181.rcf.bnl.gov/data1/starlib/reco/dAuMinBias/ReversedFullField/P03ia/2003/049/st_physics_4049029_raw_0040079.MuDst.root"/>
        <output fromScratch="*.root" toURL="file:/star/u/xxx/temp"/>
        <input URL="catalog:star.bnl.gov?collision=dAu200,generator=hijing,trgsetupname=minbias,filetype=MC_reco_MuDst,storage!=hpss" nFiles="10"/>
</job>

Using the new RDL, this would become something like the following. Note the comments: the schema had to be modified to make the example fit.

<?xml version="1.0" encoding="UTF-8"?>
<AbstractRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="RDL0.52.xsd">

        <!-- It is unclear how this RequestName will be used with -->
        <!-- regard to multiple jobs (splitting). This may be a    -->
        <!-- question for the implementation to answer             -->
        <Request taskRef="Simple" appRef="csh" datasetRef="DAll" resourceRef="R1">
                <RequestName>Test1</RequestName>
                <OutputHandling>
                        <Copy>
                                <Source>*.root</Source>
                                <Destination>file:/star/u/xxx/temp</Destination>
                        </Copy>
                </OutputHandling>
        </Request>

        <ScriptTask ID="Simple">
                <!-- The $JOBID mechanism would not change compared to the current scheme                -->
                <STDOUT>file:/star/u/xxx/temp/shed$JOBID.out</STDOUT>
                <Script>
                        echo This is a test
                        date > $JOBID.root
                        java load 100
                </Script>
        </ScriptTask>

        <!-- It seems a shame to have to specify this every time. After all, it is csh               -->
        <!-- Issue 1: where is the program executable specified?                                     -->
        <Application ID="csh" name="csh"/>


        <!--  Data sets definition starts here  -->
        <GenericDataset ID="DAll" type="Physical">
                <DataSetIDRef>Data1</DataSetIDRef>
                <DataSetIDRef>Data2</DataSetIDRef>
        </GenericDataset>

        <!-- First is a Catalog query with missing implementation                                    -->
        <GenericDataset ID="Data1" type="Physical">
                <Catalog>
                        <Type>MySQL</Type>
                        <Name>star.bnl.gov</Name>
                        <Query>collision=dAu200,generator=hijing,trgsetupname=minbias,filetype=MC_reco_MuDst,storage!=hpss
                        </Query>
                </Catalog>
        </GenericDataset>

        <!-- Then a single file or a sequence of files                                               -->
        <GenericDataset ID="Data2" type="Physical">
                <Name>file://rcas6181.rcf.bnl.gov/data1/starlib/reco/dAuMinBias/ReversedFullField/P03ia/2003/049/st_physics_4049029_raw_0040079.MuDst.root
                </Name>
        </GenericDataset>

        <!-- Resource specification will require providing assistance to users -->
        <ResourceSpecifications ID="R1">
                <MemoryPerProcess>
                        <MininumUnitsRequired>5</MininumUnitsRequired>
                        <MaximumUnitsEstimated>10</MaximumUnitsEstimated>
                        <AverageSizeOfUnit>1</AverageSizeOfUnit>
                </MemoryPerProcess>
                <InputFiles>
                        <MininumUnitsRequired>25</MininumUnitsRequired>
                        <MaximumUnitsEstimated>50</MaximumUnitsEstimated>
                        <AverageSizeOfUnit/>
                </InputFiles>
                <OutputResource>
                        <!-- This is clearly bogus as per the definition -->
                        <MininumUnitsRequired>30</MininumUnitsRequired>
                        <MaximumUnitsEstimated>40</MaximumUnitsEstimated>
                        <!-- Approximately 4 files per hour -->
                        <UnitsPerSeconds>0.0011</UnitsPerSeconds>
                </OutputResource>
                <Time>12</Time>
        </ResourceSpecifications>
</AbstractRequest>
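
The resource part of the translation above is mostly mechanical, so it can be sketched in a few lines. The following is a rough illustration of the attribute mapping used in this example, not a definitive converter: the function names are hypothetical, the choice of which U-JDL attribute feeds which RDL element simply mirrors the example (including the questionable OutputResource mapping), and the constant AverageSizeOfUnit of 1 is taken from the example as-is. It also makes the rate conversion explicit: 4 files per hour is 4/3600, or about 0.0011 units per second (0.067 would be the per-minute figure).

```python
import xml.etree.ElementTree as ET


def rate_per_second(files_per_hour: float) -> float:
    """Convert a files-per-hour rate into units per second."""
    return files_per_hour / 3600.0


def _range(parent: ET.Element, tag: str, minimum: str, maximum: str) -> ET.Element:
    """Add a min/max resource element using the element names from the example."""
    node = ET.SubElement(parent, tag)
    ET.SubElement(node, "MininumUnitsRequired").text = minimum
    ET.SubElement(node, "MaximumUnitsEstimated").text = maximum
    return node


def resources_from_ujdl(job: ET.Element, res_id: str = "R1") -> ET.Element:
    """Build a ResourceSpecifications element from U-JDL job attributes."""
    spec = ET.Element("ResourceSpecifications", ID=res_id)

    mem = _range(spec, "MemoryPerProcess", job.get("minMemory"), job.get("maxMemory"))
    ET.SubElement(mem, "AverageSizeOfUnit").text = "1"  # constant copied from the example

    files = _range(spec, "InputFiles",
                   job.get("minFilesPerProcess"), job.get("maxFilesPerProcess"))
    ET.SubElement(files, "AverageSizeOfUnit")  # left empty, as in the example

    out = _range(spec, "OutputResource",
                 job.get("minStorageSpace"), job.get("maxStorageSpace"))
    rate = rate_per_second(float(job.get("filesPerHour")))
    ET.SubElement(out, "UnitsPerSeconds").text = f"{rate:.4f}"

    # The example carries only the minimum wall time over into Time.
    ET.SubElement(spec, "Time").text = job.get("minWallTime")
    return spec


job = ET.fromstring(
    '<job name="Test1" maxFilesPerProcess="50" minFilesPerProcess="25" '
    'minMemory="5" maxMemory="10" maxStorageSpace="40" minStorageSpace="30" '
    'filesPerHour="4" minWallTime="12" maxWallTime="30"/>'
)
spec = resources_from_ujdl(job)
print(ET.tostring(spec, encoding="unicode"))
```

Note that maxWallTime has no counterpart in this mapping, which is one of the places where information from the U-JDL is lost.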

Several comments are needed immediately, pointing at weaknesses.

The proposed RDL is available in HTML form; here is the XSD.

 

Example 2: A more complex and full example

It does not seem worthwhile to move on to a more complex example at this stage: the first one already requires schema changes.

