Persistent HBT microDST
This page describes the upgrades to the StHbt software to allow flexible
microDST creation and use by the HBT PWG, as well as a first implementation
of a specific microDST.
Background
Pretty soon we (the HBT PWG) are going to be facing the issue
of the definition and creation of persistent (i.e. file-based) microDSTs.
These would be files that we could analyze at RCF, or take home to our
institutions to play with (or both). They would presumably be smaller than
the STAR DSTs, and the data stored in them may already have undergone user-defined
cuts (e.g. relatively rare high-pT kaons may be selected, in order
to speed up a later HBT analysis).
At an early meeting in July at Brookhaven, we decided two main things
regarding the HBT microDST:
- The file format (ASCII, root Ntuple, binary...) would not be fixed.
We should maintain flexibility to generate various file
types as personal tastes / computing environments dictate.
- The transient microDST structure and content would be fixed.
This allows for a stable HBT framework, and for interchangeable files
among HBT PWG members. The transient microDST structures are currently
(Sept 1999) being defined and are approaching a stable state. They
are described here.
In StHbt, the interface between these two goals is through the
StHbtEventReader classes. For any given data source, one of these
classes must be written to read from the source and build a StHbtEvent
object for processing. This is done through the StHbtEventReader::ReturnHbtEvent()
method.
If you have used the StHbt software, none of this is news.
Extension of StHbtEventReader functionality
In order to allow the writing of data, the StHbtEventReader
classes have been extended. You can look in the Base/ directory to see the methods.
- All StHbtEventReader classes must implement a Report()
method, similar to the various Cut and the CorrFctn classes.
The Report returns a StHbtString which should identify the Reader and
include the Reports of any cuts being done on the data.
- As alluded to above, the StHbtEventReader classes may perform cuts on the
data, even before a StHbtEvent is sent to the Analyses. This has not
been done until now, but has always been in the plans since our 13july99
meeting at BNL. This should increase
efficiency, but should be used carefully.
So, the reader may throw away tracks (and V0's) and
not even give the Analyses a chance to cut on them. Indeed, a Reader
may decide to throw away the whole event, and not even construct a
StHbtEvent from it!
- The above possibility, along with the new possibility of hitting an End-Of-File
in a microDST file, requires something else of the Reader. All Readers
(it is in the base class StHbtEventReader) now have a member datum
mReaderStatus and a method Status() to access it. mReaderStatus=0
means "good status". Here's how it works:
- If the StHbtEventReader::ReturnHbtEvent() method decides, based
on some internal cuts, to throw away the whole event, it returns
a null pointer (return 0), and leaves mReaderStatus=0. That way,
the HbtManager can tell that there is no event this
time, but it should continue to ask for one in the future.
- If the StHbtEventReader::ReturnHbtEvent() method decides that
there are not going to be any more events (like, it hits the end
of file), it returns a null pointer, but sets mReaderStatus!=0.
Then, the HbtManager knows to stop trying to process events.
- There is now a method int StHbtEventReader::WriteHbtEvent(StHbtEvent*).
It should return zero if there was no problem with the write. The
HbtManager now invokes this method right after getting the HbtEvent
from its Reader (see below).
(Implementation of this method is optional,
in contrast to the mandatory Report() and ReturnHbtEvent() methods.)
- Finally, two more optional methods of StHbtEventReader are provided and
invoked by the HbtManager at appropriate times.
- int StHbtEventReader::Init(const char* ReadWrite, StHbtString Message=" ")
Here is where one might open a file, if appropriate. The first argument will
be "r" for a Reader and "w" for a Writer. (They are two aspects of the
StHbtEventReader classes now.) The second argument is relevant for Writers.
It passes the result of Report() from the Reader of the
Manager. The purpose of this is to allow the inclusion of a Report of any
pre-performed cuts at the beginning of any microDST written out. We will
thank ourselves for this, if we use it well.
- void StHbtEventReader::Finish()
Typically, one would close files here.
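The status protocol above can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual StHbt code: EventSketch, ReaderSketch, VectorReaderSketch and CountAcceptedEvents are hypothetical names, std::string stands in for StHbtString, and the default bodies of the optional methods are guesses.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for StHbtEvent: only a multiplicity, for illustration.
struct EventSketch {
    int multiplicity;
};

// Simplified stand-in for the StHbtEventReader base class described above.
// std::string replaces StHbtString; the default bodies are assumptions.
class ReaderSketch {
public:
    virtual ~ReaderSketch() {}
    virtual std::string Report() = 0;                       // mandatory
    virtual EventSketch* ReturnHbtEvent() = 0;              // mandatory
    virtual int WriteHbtEvent(EventSketch*) { return 0; }   // optional
    virtual int Init(const char* /*ReadWrite*/, std::string /*Message*/ = " ") {
        return 0;                                           // optional
    }
    virtual void Finish() {}                                // optional
    int Status() const { return mReaderStatus; }
protected:
    int mReaderStatus = 0;  // 0 means "good status"
};

// A Reader over an in-memory list of multiplicities, with one front-loaded cut.
class VectorReaderSketch : public ReaderSketch {
public:
    VectorReaderSketch(const std::vector<int>& mults, int minMult)
        : mMults(mults), mMinMult(minMult), mNext(0) {}

    std::string Report() override {
        return "VectorReaderSketch, cut: multiplicity >= " + std::to_string(mMinMult);
    }

    EventSketch* ReturnHbtEvent() override {
        if (mNext >= mMults.size()) {
            mReaderStatus = 1;  // no more events: null pointer AND non-zero status
            return nullptr;
        }
        int mult = mMults[mNext++];
        if (mult < mMinMult) {
            return nullptr;     // event cut away: null pointer, status stays 0
        }
        return new EventSketch{mult};
    }

private:
    std::vector<int> mMults;
    int mMinMult;
    std::size_t mNext;
};

// Caller-side loop: a null pointer alone does not end the run, only a
// null pointer together with a non-zero Status() does.
int CountAcceptedEvents(ReaderSketch& reader) {
    int accepted = 0;
    while (true) {
        EventSketch* event = reader.ReturnHbtEvent();
        if (!event) {
            if (reader.Status() != 0) break;  // end of input
            continue;                         // this event was cut; ask again
        }
        ++accepted;
        delete event;
    }
    return accepted;
}
```

The point of keeping the two signals separate is that a Reader with tight cuts may return many null pointers in a row without the caller giving up early.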
Extension of StHbtManager::ProcessEvent() activity
Several small changes were done to StHbtManager (like invoking Init() and Finish()
for its Reader and Writer), but the most important one is that
the Manager now has a Reader (which it has always had) and a Writer (which is new).
Note: Both the Reader and the Writer are of type StHbtEventReader.
If a Writer is not plugged in to the Manager (see below), event writing is skipped.
Here is what the Manager::ProcessEvent() now does (the null-pointer check and the Writer step are new):
- Requests the next StHbtEvent from the Reader it is pointing to, via
ReturnHbtEvent().
- If a null pointer is returned, the Status() of the Reader is checked,
and appropriate action taken (see above).
- If a Writer has been plugged in, its WriteHbtEvent(StHbtEvent*)
method is invoked.
- It then loops over all Analyses in its AnalysisCollection, and does lots of
stuff for each.
For details, see
http://duvall.star.bnl.gov/STAR/comp/pkg/dev/StRoot/StHbtMaker/doc/#ManagerProcess
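The sequence of steps in ProcessEvent() can be sketched as follows. This is a simplified model, not the real StHbtManager: ManagerSketch, ReaderSketch, EventSketch, CountdownReader, CountingWriter and AddAnalysis are hypothetical names, Analyses are reduced to plain callbacks, and both the return convention (0 means "call again") and the event ownership are assumptions.

```cpp
#include <functional>
#include <vector>

struct EventSketch { int multiplicity; };  // hypothetical stand-in for StHbtEvent

// Hypothetical stand-in for StHbtEventReader, reduced to what ProcessEvent needs.
struct ReaderSketch {
    virtual ~ReaderSketch() {}
    virtual EventSketch* ReturnHbtEvent() = 0;
    virtual int WriteHbtEvent(EventSketch*) { return 0; }
    virtual int Status() const { return 0; }
};

class ManagerSketch {
public:
    void SetEventReader(ReaderSketch* reader) { mReader = reader; }
    void SetEventWriter(ReaderSketch* writer) { mWriter = writer; }
    void AddAnalysis(std::function<void(const EventSketch&)> analysis) {
        mAnalyses.push_back(analysis);
    }

    // Assumed convention: returns 0 for "call me again", non-zero for "stop".
    int ProcessEvent() {
        if (!mReader) return 1;                          // nothing to read from
        EventSketch* event = mReader->ReturnHbtEvent();
        if (!event) {
            return mReader->Status();  // 0: event was cut; non-zero: no more events
        }
        if (mWriter) mWriter->WriteHbtEvent(event);      // skipped if no Writer plugged in
        for (auto& analysis : mAnalyses) analysis(*event);
        delete event;                  // assumption: the Manager owns the event
        return 0;
    }

private:
    ReaderSketch* mReader = nullptr;
    ReaderSketch* mWriter = nullptr;
    std::vector<std::function<void(const EventSketch&)>> mAnalyses;
};

// Toy Reader: hands out a fixed number of events, then signals end of input.
struct CountdownReader : ReaderSketch {
    explicit CountdownReader(int n) : remaining(n), status(0) {}
    EventSketch* ReturnHbtEvent() override {
        if (remaining == 0) { status = 1; return nullptr; }
        --remaining;
        return new EventSketch{100};
    }
    int Status() const override { return status; }
    int remaining;
    int status;
};

// Toy Writer: counts the events it is asked to write.
struct CountingWriter : ReaderSketch {
    EventSketch* ReturnHbtEvent() override { return nullptr; }
    int WriteHbtEvent(EventSketch*) override { ++written; return 0; }
    int written = 0;
};
```

Because the Writer is called before the Analyses, every event that survives the Reader's own cuts is written out, whether or not any Analysis later accepts it.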
Plugging in a StHbtEventReader object as the Writer at run-time is very similar to
plugging in the Reader. Here is a snippet from a macro using StHbtAsciiReader (see below). The new
parts are the three lines that create the Writer, set its file name, and hand it to the Manager.
...
cout << "StHbtMaker::Init - setting up Reader and Analyses..." << endl;
StHbtManager* TheManager = hbtMaker->HbtManager();
// here, we instantiate the appropriate StHbtEventReader
// for STAR analyses in root4star, we instantiate StStandardHbtEventReader
StStandardHbtEventReader* Reader = new StStandardHbtEventReader;
Reader->SetTheEventMaker(eventMaker); // gotta tell the reader where it should read from
// here would be the place to plug in any "front-loaded" Event or Particle Cuts...
TheManager->SetEventReader(Reader);
StHbtAsciiReader* Writer = new StHbtAsciiReader;
Writer->SetFileName("FirstMicroDst.asc");
TheManager->SetEventWriter(Writer);
// 0) now define an analysis...
StHbtAnalysis* anal = new StHbtAnalysis;
// 1) set the Event cuts for the analysis
mikesEventCut* evcut = new mikesEventCut; // use "mike's" event cut object
evcut->SetEventMult(0,10000); // selected multiplicity range
...
StHbtAsciiReader - Reader/Writer for ASCII-based microDSTs
For some time, there have been two StHbtEventReader classes in the Reader/ subdirectory.
- StStandardHbtEventReader - designed to read STAR DSTs, and takes its input from StEvent.
- StHbtMcEventReader - a very useful tool written by Frank Laue, takes its input from StMcEvent (geant info).
It is unlikely that a WriteHbtEvent() method would be written for these classes.
A new StHbtEventReader has recently been committed, called StHbtAsciiReader. It reads and
writes StHbtEvent objects to/from ASCII-based files. The objects are read/written by overloading
the ">>" and "<<" operators in the file StHbtMaker/Infrastructure/StHbtIO.cc.
This makes reading/writing the StHbtEvent (and its StHbtTrackCollection and StHbtV0Collection) trivial!
Aside from details of checking the health of the I/O stream, here is the meat of the WriteHbtEvent()
method:
(*mOutputStream) << (*event);
When writing a persistent microDST, StHbtAsciiReader puts the Report of the Manager's Reader
at the beginning of the file. That way, if any cuts were imposed by the Reader (like, maybe it took
only kaons from central collisions), you will know it. Upon reading the microDST, the Report is
extracted from the file and printed to the screen.
The I/O operators "<<" and ">>" are defined to write things in space-delimited ASCII format. This makes the
created microDSTs human-readable (more or less) as well as platform-independent and root-independent.
It is a major step in getting the StHbt to run on a Macintosh, as Frank once said he wanted to do!
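To make the operator-overloading idea concrete, here is a minimal sketch of space-delimited ASCII I/O for a toy event. It does not reproduce StHbtIO.cc: TrackSketch, EventSketch, RoundTrip and the choice of fields are hypothetical, and the charge is stored as an int here (an assumption, made so that it prints as a number).

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical, heavily reduced stand-ins for StHbtTrack and StHbtEvent.
struct TrackSketch { int charge; int nHits; double px, py, pz; };
struct EventSketch { int eventNumber; std::vector<TrackSketch> tracks; };

// One track per line, fields separated by single spaces.
std::ostream& operator<<(std::ostream& out, const TrackSketch& t) {
    return out << t.charge << " " << t.nHits << " "
               << t.px << " " << t.py << " " << t.pz;
}

std::istream& operator>>(std::istream& in, TrackSketch& t) {
    return in >> t.charge >> t.nHits >> t.px >> t.py >> t.pz;
}

// Event-level line first, then the number of tracks, then the tracks.
std::ostream& operator<<(std::ostream& out, const EventSketch& e) {
    out << e.eventNumber << "\n" << e.tracks.size() << "\n";
    for (const TrackSketch& t : e.tracks) out << t << "\n";
    return out;
}

std::istream& operator>>(std::istream& in, EventSketch& e) {
    std::size_t nTracks = 0;
    in >> e.eventNumber >> nTracks;
    e.tracks.clear();
    for (std::size_t i = 0; i < nTracks && in; ++i) {
        TrackSketch t{};
        if (in >> t) e.tracks.push_back(t);
    }
    return in;
}

// With the operators in place, writing and reading are one line each.
EventSketch RoundTrip(const EventSketch& event) {
    std::stringstream buffer;
    buffer << event;
    EventSketch copy{};
    buffer >> copy;
    return copy;
}
```

Note that in this sketch doubles are written with the stream's default precision, so values with many significant digits will not survive the round trip bit for bit.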
Here are the first few lines of the first microDST I made:
This is the StStandardHbtEventReader
---> EventCuts in Reader: NONE
---> ParticleCuts in Reader: NONE
-*-*-*-* End of Input Reader Report
42280 16623 35656 2519 93 8615 8615 1.33123e-43 1.34525e-43 -0.00882977 -0.0195637 -29.7065
7845
36 158 -2.35927 -0.704536 -0.765364 1.15801e-06 0.00999201 -0.00379657 4.13801 2.40085 -0.38842 0.777676 2.26078 0.00172437 1.20372 -2.69147 -3.3598 6.80424 -9.96443 -1
36 205 0.989362 -3.3903 -7.98776 1.31375e-06 0.136461 -0.00901088 0.17658 3.41094 -0.0965261 0.181955 0.508187 0.00727746 1.18572 -2.70862 -3.48132 6.67857 -11.1471 -1
ÿ 39 244 -0.397498 1.03464 -0.740154 1.31471e-06 0.146101 -0.0370312 1.26072 1.76109 -0.458401 -0.343533 1.45763 0.00261672 1.19634 2.24284 -8.8974 -6.74094 -1.62357 1
37 314 -0.362025 -2.77058 -7.14638 1.19593e-06 0.195594 -0.0112428 1.67892e-11 -24.2164 -0.201485 -0.147778 0.642532 0.00599899 1.19991 -0.981633 -6.1067 -4.05513 -11.0046 -1
36 374 8.44525 7.45611 0.818802 2.1435e-06 0.134382 -0.0231064 3.59706 4.90645 0.238941 -0.235346 0.866927 0.00446943 1.20167 0.761155 5.11064 -5.05205 -11.3244 -1
29 464 3.66598 -4.32868 -8.25992 1.64661e-06 0.147011 0.0281962 5.83549 3.9979 -0.0844109 -0.105444 0.344877 0.0110978 1.19752 -0.75574 -4.6307 -5.52921 -11.1167 -1
28 581 -0.812219 0.356133 -0.989622 1.25606e-06 0.0447296 -0.0117595 2.10197 7.70046 -0.495659 0.279654 1.48808 0.00263388 1.20552 -2.10403 -6.47878 3.67679 -10.3141 -1
What do you see?
- First, you see the (rather dull) report from the Manager's Reader,
which was just the STAR DST Reader in this case.
In this case, there were no EventCuts or ParticleCuts attached to the
Reader, but if there were, the Report() of each of them would be included
in the overall Report().
- Then, you see a line
-*-*-*-* End of Input Reader Report
(recognized by the StHbtAsciiReader) that denotes the
end of the Reader's Report() and the beginning of data.
- Now comes the first event. You would have to look at the StHbtIO.cc file
for specifics, but the first line of the event has the reaction-plane orientation,
and the number of TPC hits in the event, etc... All the stuff in our
transient microDST definition.
- After that stuff we read in the StHbtTrackCollection. The first line
(which says 7845 in the example above) tells how many StHbtTracks are in this
event. It is a lot because no cuts were done!
- Next follows the track-wise information for each of the 7845 StHbtTracks, one
per line. Again, you would have to look at the StHbtIO.cc for details. However,
the first entry in each row is the particle's charge. It is defined as type char,
and prints funny: ÿ (the byte 0xFF, i.e. -1) for negative tracks, and unprintable (ctrl-A I think) for positive.
- Now it goes off the screen, but what will follow next is the StHbtV0Collection,
once it is defined. Tom Humanic and Helen Caines are working on that.
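The layout just described (Report, marker line, event line, track count) can be read back with very little code. The following is a sketch only: MicroDstHeaderSketch and ReadHeaderSketch are hypothetical names, the event-level information is assumed to occupy exactly one line, and that line is kept as raw text because the field order is defined in StHbtIO.cc and not reproduced here.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical container for the start of an ASCII microDST.
struct MicroDstHeaderSketch {
    std::string readerReport;  // everything before the marker line
    std::string eventLine;     // first line of the first event, left unparsed
    int nTracks;               // size of the track collection that follows
};

// Returns true if the marker line and the track count were both found.
bool ReadHeaderSketch(std::istream& in, MicroDstHeaderSketch& header) {
    const std::string marker = "-*-*-*-* End of Input Reader Report";
    header = MicroDstHeaderSketch{"", "", 0};

    // 1) Collect the Reader's Report until the marker line shows up.
    std::string line;
    bool foundMarker = false;
    while (std::getline(in, line)) {
        if (line.find(marker) != std::string::npos) {
            foundMarker = true;
            break;
        }
        header.readerReport += line + "\n";
    }
    if (!foundMarker) return false;

    // 2) The next line holds the event-level quantities.
    if (!std::getline(in, header.eventLine)) return false;

    // 3) Then comes the number of tracks in the event.
    return static_cast<bool>(in >> header.nTracks);
}
```

Applied to the first six lines of the example file above, this would give a three-line Report, the event line starting with 42280, and a track count of 7845.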
Savings in disk space and cpu
The HBT microDST can be read with or without the star root environment, and is immune to
changes in the star software (making "us" our own worst enemy :). However, one
would hope for further advantages, including reduced storage and cpu requirements.
The persistent microDST described here does indeed offer these advantages, but they are difficult
to quantify in the absence of context.
For example, the StHbt framework allows for several Analyses to run simultaneously, and for several
CorrelationFunctions to run simultaneously within each Analysis. If one has a large number of
Analyses/CorrelationFunctions running, the cpu overhead associated with I/O is negligible, and so will
be any gain from a faster I/O.
Similarly, if one writes all particles from all events into the
microDST (as is done below), then the gains in reduced storage and cpu requirements will not be as great
as if one wrote out only high-pT kaons with a small DCA from mid-central collisions, assuming
that this is the type of analysis you wish to do.
That said, here are some numbers for our "standard" example of running two simultaneous Analyses.
(This is the example from the StHbtExample.C macro.)
The first Analysis selects on negative pions, with no real event cut, and loose phase-space cuts. It constructs
two simultaneous 1-D correlation functions. The second Analysis looks at correlations between positive and
negative kaons with stricter cuts. It constructs one 1-D correlation function. Each Analysis mixes 5 previous
events.
First, space.
The test is to write a persistent ASCII-based microDST with no pre-selections on events or particles (all
are written out). The input (STAR DST) file is
/disk00000/star/test/dev/tfs_Solaris/Wed/year_2a/psc0208_01_40evts.dst.root, which has 8 central Venus events.
| STAR DST file | HBT microDST |
|---|---|
| 64.0 MB | 9.9 MB |
So, a modest size savings of a factor of 6.5. This will get worse once we start saving
V0's in our microDST, and would get better if we made some cuts (like track DCA) on particles
we write out.
This is not unreasonable: it is 1.24 MB/event, with each event having on the order of 8000 particles, so it
is about 155 bytes/particle. If we want fewer bytes, we will cut away more particles.
Now, time. On solaris (rmds03) and linux (rcas0222), using current /dev (99g), I ran 3, 5, and 8 events.
As mentioned above, two Analyses were run simultaneously. In this environment, I found:
rmds03 (solaris) (cpu sec)
| | STAR DST file input | HBT microDST input | diff |
|---|---|---|---|
| "startup time" | negligible | negligible | 0 |
| time/non-mixing event | 19.6 | 5.3 | 14.3 |
| time/mixing event | 60.0 | 44.9 | 15.1 |
So on rmds03, we save about 15 sec/event, which means cpu time falls by a factor of 3.7 for non-mixing
events, and by ~25% for mixing events. This is only a small gain, since most events will be mixing.
However, large gains can be made by making pre-cuts (see above).
rcas0222 (linux) (cpu sec)
| | STAR DST file input | HBT microDST input | diff |
|---|---|---|---|
| "startup time" | negligible | negligible | 0 |
| time/non-mixing event | 8.9 | 3.0 | 5.9 |
| time/mixing event | 26.8 | 21.4 | 5.4 |
So on rcas0222, we save about 5.5 sec/event, which means cpu time falls by a factor of 3 for non-mixing
events, and by ~20% for mixing events.
Example macros to read and write persistent HBT microDSTs
StHbtWriteDst.C is a slightly modified version of StHbtExample.C. It still
runs the two "standard" Analyses, but there are two modifications (both can be found just after the instantiation
of the StStandardHbtEventReader which will read the data from StEvent):
- The StStandardHbtEventReader (which reads data from StEvent) has an EventCut and a ParticleCut
associated with it. Note that only "reasonably central" events make it into the microDST, and that
only negative particles that are "roughly" pions with a not-too-big DCA make it in.
- The Manager has a Writer, of type StHbtAsciiReader, associated with it. This is what writes out the
microDST.
Note that the microDST made by this macro (which has some cuts) is 10X smaller than that made without cuts
(which means ~76X smaller than the STAR DST), and so is an example of the big gains possible (alluded to above)
if front-loaded cuts are performed.
Note as well that the EventCuts and ParticleCuts should be looser than the cuts imposed
by Analyses taking this microDST as input.
The first few lines of the microDST file generated are:
This is the StStandardHbtEventReader
---> EventCuts in Reader: Multiplicity: 1000-100000
Vertex Z-position: -3.500000E+01-3.500000E+01
Number of events which passed: 0 Number which failed: 0
---> ParticleCuts in Reader: Particle charge: -1
Particle Nsigma from pion: -1.000000E+01 - 1.000000E+01
Particle Nsigma from kaon: -1.000000E+03 - 1.000000E+03
Particle Nsigma from proton: -1.000000E+03 - 1.000000E+03
Particle #hits: 10 - 50
Particle pT: 5.000000E-02 - 1.000000E+00
Particle rapidity: -5.000000E+00 - 5.000000E+00
Particle DCA: 0.000000E+00 - 5.000000E-01
Number of tracks which passed: 0 Number which failed: 0
-*-*-*-* End of Input Reader Report
42280 16623 36736 2519 93 8615 8615 1.33123e-43 1.34525e-43 -0.00882977 -0.0195637 -29.7065
468
ÿ 39 158 -0.397498 1.03464 -0.740154 1.31471e-06 0.146101 -0.0370312 1.26072 1.76109 -0.458401 -0.343533 1.45763 0.00261672 1.19634 2.24284 -8.8974 -6.74094 -1.62357 1
ÿ 36 205 0.896414 -2.58651 -7.39043 1.31001e-06 0.32631 0.126736 0.00565092 4.17321 -0.193334 -0.136038 0.571922 0.00634082 1.17884 2.25654 -8.93645 -6.67261 -1.88712 1
ÿ 17 244 -1.63562 -5.51637 -6.76244 1.01184e-06 0.449883 -0.127745 1.15457 3.25323 0.226628 0.0445647 -0.169943 0.0064899 -0.634341 -0.981514 55.7191 23.4867 -74.6302 1
So, you can see that the cuts performed in the creation of the persistent microDST are clearly labelled within the
file.
ReadMicroDST.C is a macro that will read the persistent microDST written out
by the previous macro, and run the "standard" Analyses.
Note that in this case no Maker other than StHbtMaker need be instantiated; our Maker is the whole chain. If you request
"too many" events, the End-of-file behaviour should take care of things.
But note that Frank is out of luck with his phi (K+K-)
correlation function. The ParticleCut in the Reader
used when creating the microDST selected and wrote out only negative particles!! Thus we see the power and the danger
of the "front-loaded" cuts.
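For simple range cuts, the requirement that the Reader's cuts be looser than those of any Analysis using the microDST can be checked mechanically. Here is a minimal sketch under stated assumptions: RangeCutSketch and ReaderCutIsLooser are hypothetical names, nothing like them is claimed to exist in StHbt, and the range boundaries are taken to be inclusive.

```cpp
#include <string>

// Hypothetical description of a one-dimensional range cut, e.g. on pT or charge.
struct RangeCutSketch {
    std::string name;
    double lo;
    double hi;
    bool Pass(double value) const { return value >= lo && value <= hi; }
};

// True if every value accepted by the Analysis cut is also accepted by the
// Reader cut that was applied when the microDST was written.
bool ReaderCutIsLooser(const RangeCutSketch& readerCut,
                       const RangeCutSketch& analysisCut) {
    return readerCut.lo <= analysisCut.lo && readerCut.hi >= analysisCut.hi;
}
```

With the Reader cut from the file above (charge from -1 to -1), an Analysis asking for positive kaons (charge from 1 to 1) fails this check, which is exactly Frank's problem. The Report at the top of the microDST is what makes such a check possible after the fact.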
Other persistent DSTs wanted!!
Surely, improvements in terms of storage and cpu performance can be made in this persistent microDST
model. I ask software experts in the HBT PWG to please take a look!
Even besides the ASCII-based microDST, however, we need more microDST formats, and Readers! Dave
Hardtke, for example, has proposed writing a root-ntuple-based persistent HBT microDST. This would
be great. If there were a universally accessible Reader for this, we could all take advantage of
this fast and efficient format. The flexibility in the StHbt framework is meant to be utilized!
Michael A. Lisa
Last modified: Wed Sep 8 23:08:43 EDT 1999