gov.bnl.star.offline.scheduler.dataset.datasetManipulators
Class SortByRegX

java.lang.Object
  extended by gov.bnl.star.offline.scheduler.dataset.datasetManipulators.SortByRegX
All Implemented Interfaces:
DatasetManipulator

public class SortByRegX
extends java.lang.Object
implements DatasetManipulator

This is a sorter for large datasets that do not fit into memory. In step one, it sorts subsets of the main dataset and writes each sorted subset to a file. In step two, these files are read back and merged into the final sorted dataset.
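
The two steps described above follow the general pattern of an external merge sort. The following is a minimal sketch of that pattern, not the actual SortByRegX implementation. The class name ExternalSortSketch, its method names, and the use of plain string ordering are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration class; not part of the scheduler API.
class ExternalSortSketch {

    // One open run file together with the line currently at its head.
    private static final class Head {
        final String line;
        final BufferedReader reader;
        Head(String line, BufferedReader reader) { this.line = line; this.reader = reader; }
    }

    // Step one: sort buffer-sized subsets in memory and write each one to its own temp file.
    static List<Path> writeSortedRuns(Iterator<String> entries, int maxBufferSize) throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        while (entries.hasNext()) {
            buffer.add(entries.next());
            if (buffer.size() == maxBufferSize || !entries.hasNext()) {
                Collections.sort(buffer);
                Path run = Files.createTempFile("sortRun", ".txt");
                Files.write(run, buffer); // one entry per line
                runs.add(run);
                buffer.clear();
            }
        }
        return runs;
    }

    // Step two: read the runs back and merge them, always taking the smallest head line.
    // The result is collected in a list only to keep the sketch short; a real sorter
    // for large datasets would stream it out instead.
    static List<String> mergeRuns(List<Path> runs) throws IOException {
        PriorityQueue<Head> heads = new PriorityQueue<>(Comparator.comparing((Head h) -> h.line));
        for (Path run : runs) {
            BufferedReader reader = Files.newBufferedReader(run);
            String first = reader.readLine();
            if (first != null) heads.add(new Head(first, reader)); else reader.close();
        }
        List<String> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            Head smallest = heads.poll();
            merged.add(smallest.line);
            String next = smallest.reader.readLine();
            if (next != null) heads.add(new Head(next, smallest.reader)); else smallest.reader.close();
        }
        for (Path run : runs) Files.deleteIfExists(run);
        return merged;
    }

    // Convenience wrapper that runs both steps.
    static List<String> sort(List<String> entries, int maxBufferSize) {
        try {
            return mergeRuns(writeSortedRuns(entries.iterator(), maxBufferSize));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch only one buffer of entries, plus one line per run file, is held in memory at a time during sorting, which is what makes the approach suitable for datasets larger than memory.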

Author:
Levente B. Hajdu

Constructor Summary
SortByRegX(java.lang.String sortByCaptureGroup)
          This object is used to sort datasets by a parameter of the values in the dataset.
SortByRegX(java.lang.String sortByCaptureGroups, java.lang.String captureGroupsOrder)
          Sorts entries by a set of capture groups.
          Example:
          Typical entry: 9404397::NFS::BNL::localhost::/star/data41/reco/productionMinBias/ReversedFullField/P05ic/2004/023::st_physics_adc_5023001_raw_1050014.MuDst.root::155
          Regular expression for an entry: "[0-9]*::[a-zA-Z]*::[a-zA-Z]*::[a-zA-Z0-9:.]*::[^:]*::[^:/]*::[0-9]*.*"
          Sort by storage, then by path, then by file name: SortByRegX("[0-9]*::([a-zA-Z]*)::[a-zA-Z]*::[a-zA-Z0-9:.]*::([^:]*)::([^:/]*)::[0-9]*.*", "$1$2$3")
 
Method Summary
 int getMaxBufferSize()
           
 void modify(Dataset dataset, Request request)
          Used to pass the dataset to the dataset manipulator
 boolean requirementsSatisfied()
           
 void setMaxBufferSize(int maxBufferSize)
          Sets the maximum size of the in-memory buffer.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SortByRegX

public SortByRegX(java.lang.String sortByCaptureGroup)
This object is used to sort datasets by a parameter of the values in the dataset. To create the object, you must supply the parameter sortByCaptureGroup. This is a regular expression string containing a capture group. The dataset will be sorted on the text matched by the capture group.
Examples:
"(.*)" sorts by the whole string, lexicographically.
".*:{2}.*:{2}.*:{2}(.*):{2}.*:{2}.*:{2}.*" sorts by the host parameter.

Parameters:
sortByCaptureGroup - A regular expression containing a capture group.
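
To illustrate how a capture group can act as a sort key, here is a minimal in-memory sketch using java.util.regex. It is not the SortByRegX implementation. The class CaptureGroupSortSketch, its methods, and the choice to give non-matching entries an empty key are assumptions; the shortened entries in the usage comment are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the scheduler API.
class CaptureGroupSortSketch {

    // Returns the text matched by capture group 1, or "" if the entry does not match.
    // (How the real class treats non-matching entries is not documented here.)
    static String sortKey(Pattern pattern, String entry) {
        Matcher m = pattern.matcher(entry);
        return m.matches() ? m.group(1) : "";
    }

    // Sorts a copy of the entries lexicographically by the captured text.
    static List<String> sortByGroup(String regex, List<String> entries) {
        Pattern pattern = Pattern.compile(regex);
        List<String> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((String e) -> sortKey(pattern, e)));
        return sorted;
    }
}

// Usage with the host expression from the examples above (entries are made up):
//   sortByGroup(".*:{2}.*:{2}.*:{2}(.*):{2}.*:{2}.*:{2}.*",
//       Arrays.asList("1::NFS::BNL::hostB::/p::f1.root::10",
//                     "2::NFS::BNL::hostA::/p::f2.root::20"))
//   places the hostA entry before the hostB entry.
```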

SortByRegX

public SortByRegX(java.lang.String sortByCaptureGroups,
                  java.lang.String captureGroupsOrder)
Sorts entries by a set of capture groups.
Example:
Typical entry: 9404397::NFS::BNL::localhost::/star/data41/reco/productionMinBias/ReversedFullField/P05ic/2004/023::st_physics_adc_5023001_raw_1050014.MuDst.root::155
Regular expression for an entry: "[0-9]*::[a-zA-Z]*::[a-zA-Z]*::[a-zA-Z0-9:.]*::[^:]*::[^:/]*::[0-9]*.*"
Sort by storage, then by path, then by file name: SortByRegX("[0-9]*::([a-zA-Z]*)::[a-zA-Z]*::[a-zA-Z0-9:.]*::([^:]*)::([^:/]*)::[0-9]*.*", "$1$2$3")
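
One plausible reading of captureGroupsOrder is that it is a replacement template in the java.util.regex style, so that "$1$2$3" concatenates the captured storage, path, and file name into a single composite sort key. The sketch below works under that assumption and is not the actual implementation. The class MultiGroupKeySketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the scheduler API.
class MultiGroupKeySketch {

    // Builds a composite key by expanding the order template ("$1$2$3") against the entry.
    // Assumption: entries that do not match the expression get an empty key.
    static String compositeKey(Pattern pattern, String order, String entry) {
        Matcher m = pattern.matcher(entry);
        return m.matches() ? m.replaceFirst(order) : "";
    }

    // Sorts a copy of the entries lexicographically by their composite key.
    static List<String> sortByGroups(String regex, String order, List<String> entries) {
        Pattern pattern = Pattern.compile(regex);
        List<String> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((String e) -> compositeKey(pattern, order, e)));
        return sorted;
    }
}
```

Under this reading, the typical entry shown above would get the key "NFS" followed by the path and then the file name. Changing the template to "$2$1$3" would sort by path first. Because the groups are concatenated without a separator, two different combinations of group values can in principle produce the same key.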

Method Detail

modify

public void modify(Dataset dataset,
                   Request request)
Used to pass the dataset to the dataset manipulator

Specified by:
modify in interface DatasetManipulator
Parameters:
dataset - The dataset to be modified
request - The request object of the current request that will use the dataset

setMaxBufferSize

public void setMaxBufferSize(int maxBufferSize)
Sets the maximum size of the in-memory buffer. The minimum value must be greater than 1,000. The larger the value, the faster the list will be sorted. The default value is 1,000 entries.
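
A rough way to see why a larger buffer helps: if each filled buffer is written to its own file, as the class description suggests, then the number of files to merge is the dataset size divided by the buffer size, rounded up. The helper below is a hypothetical sketch of that arithmetic and is not part of the scheduler API.

```java
// Hypothetical illustration class; not part of the scheduler API.
class BufferSizeSketch {

    // Number of sorted subset files needed, assuming one file per filled buffer.
    static long runCount(long datasetEntries, int maxBufferSize) {
        if (maxBufferSize <= 0) {
            throw new IllegalArgumentException("maxBufferSize must be positive");
        }
        // Integer ceiling division.
        return (datasetEntries + maxBufferSize - 1) / maxBufferSize;
    }
}
```

For a dataset of 1,000,000 entries, a buffer of 1,000 entries gives 1,000 files to merge, while a buffer of 50,000 entries gives 20.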


getMaxBufferSize

public int getMaxBufferSize()

requirementsSatisfied

public boolean requirementsSatisfied()
Specified by:
requirementsSatisfied in interface DatasetManipulator


Copyright © 2002-2004 STAR collaboration - Brookhaven National Laboratory. All Rights Reserved.