Persistent HBT microDST

This page describes the upgrades to the StHbt software to allow flexible microDST creation and use by the HBT PWG, as well as a first implementation of a specific microDST.


Background

Pretty soon we (the HBT PWG) are going to be facing the issue of the definition and creation of persistent (i.e. file-based) microDSTs. These would be files that we could analyze at RCF, or take home to our institutions to play with (or both). They would presumably be smaller than the STAR DSTs, and the data stored in them may already have undergone user-defined cuts (e.g. relatively rare high-pT kaons may be selected, in order to speed up a later HBT analysis).

At an early meeting in July at Brookhaven, we decided two main things regarding the HBT microDST:

  1. The file format (ASCII, root Ntuple, binary...) would not be fixed. We should maintain flexibility to generate various file types as personal tastes / computing environments dictate.
  2. The transient microDST structure and content would be fixed. This allows for a stable HBT framework, and for interchangeable files among HBT PWG members. The transient microDST structures are currently (Sept 1999) being defined and are approaching a stable state. They are described here.

In StHbt, the interface between these two goals is the StHbtEventReader classes. For any given data source, one of these classes must be written to read from the source and build a StHbtEvent object for processing. This is done through the StHbtEventReader::ReturnHbtEvent() method.

If you have used the StHbt software, none of this is news.


Extension of StHbtEventReader functionality

In order to allow the writing of data, the StHbtEventReader classes have been extended. You can look in the Base/ directory to see the methods.

Extension of StHbtManager::ProcessEvent() activity

Several small changes were made to StHbtManager (like invoking Init() and Finish() for its Reader and Writer), but the most important one is that the Manager now has a Reader (which it has always had) and a Writer (which is new).
Note: Both the Reader and the Writer are of type StHbtEventReader.
If a Writer is not plugged in to the Manager (see below), event writing is skipped.

Here is what Manager::ProcessEvent() now does (the Writer step, item 3, is the new part):

  1. Requests the next StHbtEvent from the Reader it is pointing to, via ReturnHbtEvent().
  2. If a null pointer is returned, the Status() of the Reader is checked, and appropriate action taken (see above).
  3. If a Writer has been plugged in, its WriteHbtEvent(StHbtEvent*) method is invoked.
  4. It then loops over all Analyses in its AnalysisCollection, and does lots of stuff for each. For details, see http://duvall.star.bnl.gov/STAR/comp/pkg/dev/StRoot/StHbtMaker/doc/#ManagerProcess
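
The sequence above can be sketched in a few lines of C++. This is not the StHbt source: SketchEvent, SketchEventReader, MemoryReader, CountingWriter and SketchManager are hypothetical stand-ins, and the convention that Status() returns 0 for "keep going" is an assumption made for the illustration. Only the method names ReturnHbtEvent(), WriteHbtEvent(), Status(), SetEventReader() and SetEventWriter() come from this page.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, much-reduced stand-in for StHbtEvent.
struct SketchEvent { int id; };

// One base class serves as both Reader and Writer, as described above.
class SketchEventReader {
public:
    virtual ~SketchEventReader() {}
    virtual SketchEvent* ReturnHbtEvent() = 0;             // nullptr: no event available
    virtual int WriteHbtEvent(SketchEvent*) { return 1; }  // default: writing not supported
    virtual int Status() const { return 0; }               // assumed: 0 = OK, nonzero = stop
};

// Toy Reader that serves n events from memory and owns them.
class MemoryReader : public SketchEventReader {
public:
    explicit MemoryReader(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            SketchEvent ev = { static_cast<int>(i) };
            mEvents.push_back(ev);
        }
    }
    SketchEvent* ReturnHbtEvent() override {
        if (mNext >= mEvents.size()) { mStatus = 1; return nullptr; }
        return &mEvents[mNext++];
    }
    int Status() const override { return mStatus; }
private:
    std::vector<SketchEvent> mEvents;
    std::size_t mNext = 0;
    int mStatus = 0;
};

// Toy Writer that only counts what it is asked to write.
class CountingWriter : public SketchEventReader {
public:
    SketchEvent* ReturnHbtEvent() override { return nullptr; }  // write-only here
    int WriteHbtEvent(SketchEvent*) override { ++written; return 0; }
    int written = 0;
};

class SketchManager {
public:
    void SetEventReader(SketchEventReader* r) { mReader = r; }
    void SetEventWriter(SketchEventReader* w) { mWriter = w; }
    // Returns 0 to keep going, nonzero once the Reader reports it is done.
    int ProcessEvent() {
        SketchEvent* ev = mReader->ReturnHbtEvent();   // 1. ask the Reader
        if (!ev) return mReader->Status();             // 2. null: consult Status()
        if (mWriter) mWriter->WriteHbtEvent(ev);       // 3. optional Writer
        ++analysed;                                    // 4. stands in for the Analysis loop
        return 0;
    }
    int analysed = 0;
private:
    SketchEventReader* mReader = nullptr;
    SketchEventReader* mWriter = nullptr;
};
```

Calling ProcessEvent() in a loop until it returns nonzero drains the Reader. Leaving the Writer unset simply skips step 3, which matches the behaviour described for a Manager with no Writer plugged in.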

Plugging in a StHbtEventReader object as the Writer at run-time is very similar to plugging in the Reader. Here is a snippet from a macro using StHbtAsciiReader (see below). The new parts are the three lines that create the Writer and hand it to SetEventWriter().

    ...
    cout << "StHbtMaker::Init - setting up Reader and Analyses..." << endl;
 
    StHbtManager* TheManager = hbtMaker->HbtManager();
 
    // here, we instantiate the appropriate StHbtEventReader
    // for STAR analyses in root4star, we instantiate StStandardHbtEventReader
    StStandardHbtEventReader* Reader = new StStandardHbtEventReader;
    Reader->SetTheEventMaker(eventMaker);     // gotta tell the reader where it should read from
    // here would be the place to plug in any "front-loaded" Event or Particle Cuts...
    TheManager->SetEventReader(Reader); 
 
    StHbtAsciiReader* Writer = new StHbtAsciiReader;
    Writer->SetFileName("FirstMicroDst.asc");
    TheManager->SetEventWriter(Writer);
 
    // 0) now define an analysis...
    StHbtAnalysis* anal = new StHbtAnalysis;
    // 1) set the Event cuts for the analysis
    mikesEventCut* evcut = new mikesEventCut;  // use "mike's" event cut object
    evcut->SetEventMult(0,10000);      // selected multiplicity range

    ...

StHbtAsciiReader - Reader/Writer for ASCII-based microDSTs

For some time, there have been two StHbtEventReader classes in the Reader/ subdirectory. It is unlikely that a WriteHbtEvent() method will be written for these classes.

A new StHbtEventReader has recently been committed, called StHbtAsciiReader. It reads and writes StHbtEvent objects to/from ASCII-based files. The objects are read/written by overloading the ">>" and "<<" operators in the file StHbtMaker/Infrastructure/StHbtIO.cc.

This makes reading/writing the StHbtEvent (and its StHbtTrackCollection and StHbtV0Collection) trivial! Aside from details of checking the health of the I/O stream, here is the meat of the WriteHbtEvent() method:

  (*mOutputStream) << (*event);

When writing a persistent microDST, StHbtAsciiReader puts the Report of the Manager's Reader at the beginning of the file. That way, if any cuts were imposed by the Reader (like, maybe it took only kaons from central collisions), you will know it. Upon reading the microDST, the Report is extracted from the file and printed to the screen.
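
A minimal sketch of that header handling, assuming the report is plain text terminated by the marker line that appears in the example files on this page. The function names WriteReportHeader and ReadReportHeader are made up for the illustration and are not the StHbtAsciiReader methods.

```cpp
#include <istream>
#include <ostream>
#include <sstream>   // in-memory streams, handy for trying this out
#include <string>

// Marker line copied from the example microDST files shown on this page.
const std::string kEndOfReport = "-*-*-*-* End of Input Reader Report";

// Write the Reader's report, then the marker that separates it from the data.
void WriteReportHeader(std::ostream& out, const std::string& report) {
    out << report << '\n' << kEndOfReport << '\n';
}

// Consume lines up to and including the marker and return the report text.
// After the call the stream is positioned at the first event.
std::string ReadReportHeader(std::istream& in) {
    std::string report;
    std::string line;
    while (std::getline(in, line) && line != kEndOfReport) {
        report += line;
        report += '\n';
    }
    return report;
}
```

Because the marker is compared against a whole line, a report may contain any text except that exact line.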

The I/O operators "<<" and ">>" are defined to write things in space-delimited ASCII format. This makes the created microDSTs human-readable (more or less) as well as platform-independent and root-independent. It is a major step in getting the StHbt to run on a Macintosh, as Frank once said he wanted to do!
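
To illustrate the idea (this is not the content of StHbtIO.cc), here is a sketch of space-delimited stream operators for a drastically reduced event. SketchTrack and SketchEvent and their fields are hypothetical; the layout (an event header line, then the track count, then one track per line) follows the example file shown later on this page.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>   // in-memory streams, handy for trying this out
#include <vector>

// Hypothetical, heavily reduced track and event.
struct SketchTrack { int nHits; double px, py, pz; };
struct SketchEvent { int runNumber; std::vector<SketchTrack> tracks; };

std::ostream& operator<<(std::ostream& out, const SketchTrack& t) {
    return out << t.nHits << ' ' << t.px << ' ' << t.py << ' ' << t.pz;
}

std::istream& operator>>(std::istream& in, SketchTrack& t) {
    return in >> t.nHits >> t.px >> t.py >> t.pz;
}

// Event header, then the number of tracks, then one track per line.
std::ostream& operator<<(std::ostream& out, const SketchEvent& ev) {
    out << ev.runNumber << '\n' << ev.tracks.size() << '\n';
    for (const SketchTrack& t : ev.tracks) out << t << '\n';
    return out;
}

std::istream& operator>>(std::istream& in, SketchEvent& ev) {
    std::size_t n = 0;
    in >> ev.runNumber >> n;
    ev.tracks.clear();
    for (std::size_t i = 0; i < n && in; ++i) {
        SketchTrack t = SketchTrack();
        if (in >> t) ev.tracks.push_back(t);   // stop quietly on a damaged stream
    }
    return in;
}
```

With these in place, writing an event really is one line, `out << ev;`. Note that the default stream precision is 6 significant digits, so values read back are only as precise as what was printed.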


Here are the first few lines of the first microDST I made:
 This is the StStandardHbtEventReader
---> EventCuts in Reader: NONE
---> ParticleCuts in Reader: NONE

-*-*-*-* End of Input Reader Report
42280 16623 35656 2519 93 8615 8615 1.33123e-43 1.34525e-43 -0.00882977 -0.0195637 -29.7065
7845
 36 158 -2.35927 -0.704536 -0.765364 1.15801e-06 0.00999201 -0.00379657 4.13801 2.40085 -0.38842 0.777676 2.26078 0.00172437 1.20372 -2.69147 -3.3598 6.80424 -9.96443 -1
 36 205 0.989362 -3.3903 -7.98776 1.31375e-06 0.136461 -0.00901088 0.17658 3.41094 -0.0965261 0.181955 0.508187 0.00727746 1.18572 -2.70862 -3.48132 6.67857 -11.1471 -1
ÿ 39 244 -0.397498 1.03464 -0.740154 1.31471e-06 0.146101 -0.0370312 1.26072 1.76109 -0.458401 -0.343533 1.45763 0.00261672 1.19634 2.24284 -8.8974 -6.74094 -1.62357 1
 37 314 -0.362025 -2.77058 -7.14638 1.19593e-06 0.195594 -0.0112428 1.67892e-11 -24.2164 -0.201485 -0.147778 0.642532 0.00599899 1.19991 -0.981633 -6.1067 -4.05513 -11.0046 -1
 36 374 8.44525 7.45611 0.818802 2.1435e-06 0.134382 -0.0231064 3.59706 4.90645 0.238941 -0.235346 0.866927 0.00446943 1.20167 0.761155 5.11064 -5.05205 -11.3244 -1
 29 464 3.66598 -4.32868 -8.25992 1.64661e-06 0.147011 0.0281962 5.83549 3.9979 -0.0844109 -0.105444 0.344877 0.0110978 1.19752 -0.75574 -4.6307 -5.52921 -11.1167 -1
 28 581 -0.812219 0.356133 -0.989622 1.25606e-06 0.0447296 -0.0117595 2.10197 7.70046 -0.495659 0.279654 1.48808 0.00263388 1.20552 -2.10403 -6.47878 3.67679 -10.3141 -1
What do you see?
  1. First, you see the (rather dull) report from the Manager's Reader, which was just the STAR DST Reader in this case.
    In this case, there were no EventCuts or ParticleCuts attached to the Reader, but if there were, the Report() of each of them would be included in the overall Report().
  2. Then, you see a line -*-*-*-* End of Input Reader Report (recognized by the StHbtAsciiReader) that denotes the end of the Reader's Report() and the beginning of data.
  3. Now comes the first event. You would have to look at the StHbtIO.cc file for specifics, but the first line of the event has the reaction-plane orientation, the number of TPC hits in the event, etc... All the stuff in our transient microDST definition.
  4. After that stuff we read in the StHbtTrackCollection. The first line (which says 7845 in the example above) tells how many StHbtTracks are in this event. It is a lot because no cuts were done!
  5. Next follows the track-wise information for each of the 7845 StHbtTracks, one per line. Again, you would have to look at the StHbtIO.cc for details. However, the first entry in each row is the particle's charge. It is defined as type char, and prints funny: ÿ (byte 0xFF, i.e. -1) for negative tracks, and an unprintable character (ctrl-A, i.e. +1) for positive tracks.
  6. Now it goes off the screen, but what will follow next is the StHbtV0Collection, once it is defined. Tom Humanic and Helen Caines are working on that.
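
The odd-looking charge column comes from streaming a char as a character rather than as a number. A small sketch of the effect, using signed char so the result does not depend on the platform's plain-char signedness; the helper names are made up and the actual StHbtIO.cc code may differ.

```cpp
#include <sstream>
#include <string>

// Streaming a (signed) char writes the raw byte, not the digits of its value.
std::string ChargeAsChar(signed char q) {
    std::ostringstream out;
    out << q;
    return out.str();
}

// Widening to int first gives a human-readable number.
std::string ChargeAsInt(signed char q) {
    std::ostringstream out;
    out << static_cast<int>(q);
    return out.str();
}
```

A charge of -1 stored in 8 bits is the byte 0xFF, which a Latin-1 display shows as ÿ, while +1 is the byte 0x01 (ctrl-A). Writing the charge through an int cast would keep the file fully printable, at the price of changing the file format.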

Savings in disk space and cpu

The HBT microDST can be read with or without the star root environment, and is immune to changes in the star software (making "us" our own worst enemy :). However, one would hope for further advantages, including reduced storage and cpu requirements.

The persistent microDST described here does indeed offer these advantages, but they are difficult to quantify in the absence of context.
For example, the StHbt framework allows for several Analyses to run simultaneously, and for several CorrelationFunctions to run simultaneously within each Analysis. If one has a large number of Analyses/CorrelationFunctions running, the cpu overhead associated with I/O is negligible, and so will be any gain from a faster I/O.
Similarly, if one writes all particles from all events into the microDST (as is done below), then the gains in reduced storage and cpu requirements will not be as great as if one wrote out only high-pT kaons with a small DCA from mid-central collisions, assuming that this is the type of analysis you wish to do.

That said, here are some numbers for our "standard" example of running two simultaneous Analyses. (This is the example from the StHbtExample.C macro.) The first Analysis selects on negative pions, with no real event cut, and loose phasespace cuts. It constructs two simultaneous 1-D correlation functions. The second Analysis looks at correlations between positive and negative kaons with stricter cuts. It constructs one 1-D correlation function. Each Analysis mixes 5 previous events.

First, space. I wrote a persistent ASCII-based microDST with no pre-selections on events or particles (all are written out). The input (STAR DST) file is /disk00000/star/test/dev/tfs_Solaris/Wed/year_2a/psc0208_01_40evts.dst.root, which has 8 central Venus events.
    STAR DST file    HBT microDST
    64.0 MB          9.9 MB
So, a modest size savings of a factor of 6.5. This will get worse once we start saving V0's in our microDST, and would get better if we made some cuts (like track DCA) on particles we write out.
This is not unreasonable: at 1.24 MB/event, with each event having on the order of 8000 particles, it comes to about 155 bytes/particle. If we want fewer bytes, we will cut away more particles.
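
The arithmetic behind these numbers, as a small sketch. The helper names are illustrative, and 1 MB is taken to be 10^6 bytes, which is an assumption about how the file sizes were quoted.

```cpp
// Ratio of STAR DST size to microDST size.
double SizeRatio(double dstMB, double microMB) {
    return dstMB / microMB;
}

// Average bytes stored per particle (assumes 1 MB = 10^6 bytes).
double BytesPerParticle(double fileMB, int nEvents, double particlesPerEvent) {
    return fileMB * 1.0e6 / nEvents / particlesPerEvent;
}
```

SizeRatio(64.0, 9.9) gives about 6.5, and BytesPerParticle(9.9, 8, 8000.0) gives about 155.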

Now, time. On solaris (rmds03) and linux (rcas0222), using current /dev (99g), I ran 3, 5, and 8 events. As mentioned above, two Analyses were run simultaneously. In this environment, I found:
rmds03 (solaris) (cpu sec)
                             STAR DST file input    HBT microDST input    diff
    "startup time"           negligible             negligible            0
    time/non-mixing event    19.6                   5.3                   14.3
    time/mixing event        60.0                   44.9                  15.1
So on rmds03, we save about 15 sec/event, which means cpu time falls by a factor of 3.7 for non-mixing events, and by ~25% for mixing events. This is only a small gain, since most events will be mixing. However, large gains can be made by making pre-cuts (see above).

rcas0222 (linux) (cpu sec)
                             STAR DST file input    HBT microDST input    diff
    "startup time"           negligible             negligible            0
    time/non-mixing event    8.9                    3.0                   5.9
    time/mixing event        26.8                   21.4                  5.4
So on rcas0222, we save about 5.5 sec/event, which means cpu time falls by a factor of 3 for non-mixing events, and by ~20% for mixing events.
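
For reference, the quoted factors and percentages follow from the table entries as sketched here. The Timing struct and function names are illustrative only.

```cpp
// CPU seconds per event for the two kinds of input.
struct Timing { double dstSec; double microSec; };

// How many times faster the microDST input is.
double SpeedupFactor(const Timing& t) {
    return t.dstSec / t.microSec;
}

// Fraction of the STAR DST cpu time that is saved.
double FractionSaved(const Timing& t) {
    return (t.dstSec - t.microSec) / t.dstSec;
}
```

For example, the rmds03 non-mixing entry {19.6, 5.3} gives a speedup of about 3.7, and the rcas0222 mixing entry {26.8, 21.4} gives a saving of about 20%.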


Example macros to read and write persistent HBT microDSTs

StHbtWriteDst.C is a slightly modified version of StHbtExample.C. It still runs the two "standard" Analyses, but there are two modifications (both can be found just after the instantiation of the StStandardHbtEventReader which will read the data from StEvent):
  1. The StStandardHbtEventReader (which reads data from StEvent) has an EventCut and a ParticleCut associated with it. Note that only "reasonably central" events make it into the microDST, and that only negative particles that are "roughly" pions with a not-too-big DCA make it in.
  2. The Manager has a Writer, of type StHbtAsciiReader, associated with it. This is what writes out the microDST.

Note that the microDST made by this macro (which has some cuts) is 10X smaller than that made without cuts (which means ~76X smaller than the STAR DST), and so is an example of the big gains possible (alluded to above) if front-loaded cuts are performed.

Note as well that the EventCuts and ParticleCuts should be looser than the cuts imposed by Analyses taking this microDST as input.
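
One way to picture "looser": every window accepted by the Analysis cut must lie inside the window used when the microDST was written. A hedged sketch for a single range-type cut follows; Range and WriterCutIsLooser are hypothetical names, not part of StHbt.

```cpp
// A closed [lo, hi] acceptance window, e.g. for pT or multiplicity.
struct Range { double lo; double hi; };

// True if everything the analysis cut accepts was also accepted
// when the microDST was written.
bool WriterCutIsLooser(const Range& writerCut, const Range& analysisCut) {
    return writerCut.lo <= analysisCut.lo && analysisCut.hi <= writerCut.hi;
}
```

With the pT window 0.05 - 1.0 from the report shown further down, an Analysis cut of 0.1 - 0.8 is fine, while 0.1 - 2.0 would silently miss every particle above 1.0.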

The first few lines of the microDST file generated are:

 This is the StStandardHbtEventReader
---> EventCuts in Reader: Multiplicity:  1000-100000
Vertex Z-position:       -3.500000E+01-3.500000E+01
Number of events which passed:  0  Number which failed: 0

---> ParticleCuts in Reader: Particle charge:   -1
Particle Nsigma from pion:      -1.000000E+01 - 1.000000E+01
Particle Nsigma from kaon:      -1.000000E+03 - 1.000000E+03
Particle Nsigma from proton:    -1.000000E+03 - 1.000000E+03
Particle #hits: 10 - 50
Particle pT:    5.000000E-02 - 1.000000E+00
Particle rapidity:      -5.000000E+00 - 5.000000E+00
Particle DCA:   0.000000E+00 - 5.000000E-01
Number of tracks which passed:  0  Number which failed: 0


-*-*-*-* End of Input Reader Report
42280 16623 36736 2519 93 8615 8615 1.33123e-43 1.34525e-43 -0.00882977 -0.0195637 -29.7065
468
ÿ 39 158 -0.397498 1.03464 -0.740154 1.31471e-06 0.146101 -0.0370312 1.26072 1.76109 -0.458401 -0.343533 1.45763 0.00261672 1.19634 2.24284 -8.8974 -6.74094 -1.62357 1
ÿ 36 205 0.896414 -2.58651 -7.39043 1.31001e-06 0.32631 0.126736 0.00565092 4.17321 -0.193334 -0.136038 0.571922 0.00634082 1.17884 2.25654 -8.93645 -6.67261 -1.88712 1
ÿ 17 244 -1.63562 -5.51637 -6.76244 1.01184e-06 0.449883 -0.127745 1.15457 3.25323 0.226628 0.0445647 -0.169943 0.0064899 -0.634341 -0.981514 55.7191 23.4867 -74.6302 1
So, you can see that the cuts performed in the creation of the persistent microDST are clearly labelled within the file.

ReadMicroDST.C is a macro that will read the persistent microDST written out by the previous macro, and run the "standard" Analyses.

Note that in this case no Maker other than StHbtMaker need be instantiated; our Maker is the whole chain. If you request "too many" events, the End-of-file behaviour should take care of things.

But note that Frank is out of luck with his phi (K+K-) correlation function. The ParticleCut in the Reader used when creating the microDST selected and wrote out only negative particles!! Thus we see the power and the danger of the "front-loaded" cuts.


Other persistent DSTs wanted!!

Surely, improvements in terms of storage and cpu performance can be made in this persistent microDST model. I ask software experts in the HBT PWG to take a look please!
Beyond the ASCII-based microDST, however, we need more microDST formats, and Readers! Dave Hardtke, for example, has proposed writing a root-ntuple-based persistent HBT microDST. This would be great. If there were a universally accessible Reader for this, we could all take advantage of this fast and efficient format. The flexibility in the StHbt framework is meant to be utilized!
Michael A. Lisa
Last modified: Wed Sep 8 23:08:43 EDT 1999