Persistent HBT microDST

This page describes the upgrades to the StHbt software to allow flexible microDST creation and use by the HBT PWG, as well as a first implementation of a specific microDST.

Please read the following topics:

Background - what is the idea of a microDST?
Expanded StHbtEventReader - what has been done to StHbtEventReader to allow microDST usage?
New activities of Manager - what has been done to the StHbtManager to allow microDST usage?
The ASCII-based HBT microDST Reader - an existence proof!
The ASCII persistent microDST - actual HBT files!
Example Macros to read and write persistent HBT microDSTs
Other (future) persistent microDSTs

Background

Pretty soon we (the HBT PWG) are going to be facing the issue of the definition and creation of persistent (i.e. file-based) microDSTs. These would be files that we could analyze at RCF, or take home to our institutions to play with (or both). They would presumably be smaller than the STAR DSTs, and the data stored in them may already have undergone user-defined cuts (e.g. relatively rare high-p_T kaons may be selected, in order to speed up a later HBT analysis).

At an early meeting in July at Brookhaven, we decided two main things regarding the HBT microDST:

The file format (ASCII, root Ntuple, binary...) would not be fixed. We should maintain flexibility to generate various file types as personal tastes / computing environments dictate.
The transient microDST structure and content would be fixed. This allows for a stable HBT framework, and for interchangeable files among HBT PWG members. The transient microDST structures are currently (Sept 1999) being defined and are approaching a stable state. They are described here.

In StHbt, the interface between these two goals is provided by the StHbtEventReader classes. For any given data source, one of these classes must be written to read from the source and build a StHbtEvent object for processing. This is done through the StHbtEventReader::ReturnHbtEvent() method.

If you have used the StHbt software, none of this is news.

Extension of StHbtEventReader functionality

In order to allow the writing of data, the StHbtEventReader classes have been extended. You can look in the Base/ directory to see the methods.

All StHbtEventReader classes must implement a Report() method, similar to the various Cut and the CorrFctn classes. The Report returns a StHbtString which should identify the Reader and include the Reports of any cuts being done on the data.
As alluded to above, the StHbtEventReader classes may perform cuts on the data, even before a StHbtEvent is sent to the Analyses. This has not been done until now, but has been in the plans ever since our 13 July 1999 meeting at BNL. This should increase efficiency, but should be used carefully.
So, the reader may throw away tracks (and V0's) and not even give the Analyses a chance to cut on them. Indeed, a Reader may decide to throw away the whole event, and not even construct a StHbtEvent from it!
The above possibility, along with the new possibility of hitting an End-Of-File in a microDST file, requires something else of the Reader. All Readers (it is in the base class StHbtEventReader) now have a member datum mReaderStatus and a method Status() to access it. mReaderStatus=0 means "good status". Here's how it works:
- If the StHbtEventReader::ReturnHbtEvent() method decides, based on some internal cuts, to throw away the whole event, it returns a null pointer (return 0), and leaves mReaderStatus=0. That way, the HbtManager can tell that there is no event this time, but it should continue to ask for one in the future.
- If the StHbtEventReader::ReturnHbtEvent() method decides that there are not going to be any more events (like, it hits the end of file), it returns a null pointer, but sets mReaderStatus!=0. Then, the HbtManager knows to stop trying to process events.
There is now a method int StHbtEventReader::WriteHbtEvent(StHbtEvent*). It should return zero if there was no problem with the write. The HbtManager now invokes this method right after getting the HbtEvent from its Reader (see below).
(Implementation of this method is optional, in contrast to the mandatory Report() and ReturnHbtEvent() methods.)
Finally, two more optional methods of StHbtEventReader are provided and invoked by the HbtManager at appropriate times.
1. int StHbtEventReader::Init(const char* ReadWrite, StHbtString Message=" ")
  Here is where one might open a file, if appropriate. The first argument will be "r" for a Reader and "w" for a Writer. (They are two aspects of the StHbtEventReader classes now.) The second argument is relevant for Writers. It passes the result of Report() from the Reader of the Manager. The purpose of this is to allow the inclusion of a Report of any pre-performed cuts at the beginning of any microDST written out. We will thank ourselves for this, if we use it well.
2. void StHbtEventReader::Finish()
  Typically, one would close files here.
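
To make the Reader-side contract above concrete, here is a minimal sketch. It assumes simplified stand-in types (ToyEvent, ToyEventReader, ToyVectorReader, and the Drain helper are all hypothetical names invented for this illustration, not the real StHbt classes), and it only mirrors the method names and the status convention described above. The toy Reader hands back a pointer to an internal event, which may well differ from how the real classes manage ownership.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the real StHbt classes.
struct ToyEvent {
    int multiplicity;
};

class ToyEventReader {
public:
    virtual ~ToyEventReader() {}
    // Mandatory methods in the scheme described above.
    virtual std::string Report() = 0;
    virtual ToyEvent* ReturnHbtEvent() = 0;
    // Optional methods: the defaults do nothing and report success.
    virtual int WriteHbtEvent(ToyEvent*) { return 0; }
    virtual int Init(const char* /*ReadWrite*/, std::string /*Message*/ = " ") { return 0; }
    virtual void Finish() {}
    int Status() const { return mReaderStatus; }
protected:
    int mReaderStatus = 0;  // 0 means "good status"
};

// Hypothetical Reader serving events from memory, with a front-loaded multiplicity cut.
class ToyVectorReader : public ToyEventReader {
public:
    ToyVectorReader(const std::vector<int>& mults, int minMult)
        : mMults(mults), mMinMult(minMult), mNext(0) {}

    std::string Report() override {
        return "ToyVectorReader: multiplicity >= " + std::to_string(mMinMult);
    }

    ToyEvent* ReturnHbtEvent() override {
        if (mNext >= mMults.size()) {
            mReaderStatus = 1;   // no more events: null pointer AND nonzero status
            return nullptr;
        }
        int mult = mMults[mNext++];
        if (mult < mMinMult) {
            return nullptr;      // event cut away: null pointer, status stays 0
        }
        mCurrent.multiplicity = mult;
        return &mCurrent;
    }

private:
    std::vector<int> mMults;
    int mMinMult;
    std::size_t mNext;
    ToyEvent mCurrent;
};

struct DrainResult { int delivered; int skipped; };

// Ask the Reader for events until it signals that no more will come.
DrainResult Drain(ToyEventReader& reader) {
    DrainResult r = {0, 0};
    reader.Init("r");
    assert(reader.Status() == 0);         // a fresh Reader is expected to start in good status
    while (true) {
        ToyEvent* ev = reader.ReturnHbtEvent();
        if (ev) { ++r.delivered; continue; }
        if (reader.Status() != 0) break;  // end of data: stop asking
        ++r.skipped;                      // event was cut: keep asking
    }
    reader.Finish();
    return r;
}
```

The point of the sketch is the two different meanings of a null pointer: with the status still zero the caller keeps asking, and with a nonzero status it stops.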

Extension of StHbtManager::ProcessEvent() activity

Several small changes were made to StHbtManager (like invoking Init() and Finish() for its Reader and Writer), but the most important one is that the Manager now has a Reader (which it has always had) and a Writer (which is new).
Note: Both the Reader and the Writer are of type StHbtEventReader.
If a Writer is not plugged in to the Manager (see below), event writing is skipped.

Here is what Manager::ProcessEvent() now does (the status check and the Writer step are new):

1. Requests the next StHbtEvent from the Reader it is pointing to, via ReturnHbtEvent().
2. If a null pointer is returned, checks the Status() of the Reader and takes the appropriate action (see above).
3. If a Writer has been plugged in, invokes its WriteHbtEvent(StHbtEvent*) method.
4. Loops over all Analyses in its AnalysisCollection, and does lots of stuff for each. For details, see http://duvall.star.bnl.gov/STAR/comp/pkg/dev/StRoot/StHbtMaker/doc/#ManagerProcess
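
The flow of ProcessEvent() described above might be sketched roughly as follows. Everything here is a toy stand-in (ToyManager, ToyReader, ToyAnalysis, CountingReader and RecordingWriter are hypothetical names, not the real StHbt classes), and treating a failed write as a reason to stop is an assumption of this sketch, not something the real Manager is known to do.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration; NOT the real StHbt classes.
struct ToyEvent { int id; };

struct ToyReader {
    virtual ~ToyReader() {}
    virtual ToyEvent* ReturnHbtEvent() { return nullptr; }
    virtual int WriteHbtEvent(ToyEvent*) { return 0; }
    int Status() const { return mReaderStatus; }
protected:
    int mReaderStatus = 0;  // 0 means "good status"
};

struct ToyAnalysis {
    int nProcessed = 0;
    void ProcessEvent(const ToyEvent&) { ++nProcessed; }
};

class ToyManager {
public:
    void SetEventReader(ToyReader* r) { mReader = r; }
    void SetEventWriter(ToyReader* w) { mWriter = w; }
    void AddAnalysis(ToyAnalysis* a) { mAnalyses.push_back(a); }

    // Returns nonzero when the caller should stop asking for events.
    int ProcessEvent() {
        assert(mReader != nullptr);
        ToyEvent* ev = mReader->ReturnHbtEvent();
        if (!ev) return mReader->Status();   // 0: event skipped; nonzero: no more events
        if (mWriter) {                       // the Writer is optional
            int rc = mWriter->WriteHbtEvent(ev);
            if (rc != 0) return rc;          // assumption: a failed write stops processing
        }
        for (ToyAnalysis* a : mAnalyses) a->ProcessEvent(*ev);
        return 0;
    }

private:
    ToyReader* mReader = nullptr;
    ToyReader* mWriter = nullptr;
    std::vector<ToyAnalysis*> mAnalyses;
};

// Hypothetical Reader: produces ids 0..n-1, cuts away odd ids, then signals end of data.
struct CountingReader : ToyReader {
    explicit CountingReader(int n) : mN(n) {}
    ToyEvent* ReturnHbtEvent() override {
        if (mNext >= mN) { mReaderStatus = 1; return nullptr; }
        mEvent.id = mNext++;
        return (mEvent.id % 2 == 0) ? &mEvent : nullptr;
    }
    int mN;
    int mNext = 0;
    ToyEvent mEvent{0};
};

// Hypothetical Writer: just records the ids it was asked to write.
struct RecordingWriter : ToyReader {
    int WriteHbtEvent(ToyEvent* ev) override { written.push_back(ev->id); return 0; }
    std::vector<int> written;
};
```

Note that the Writer only ever sees events that survived the Reader's own cuts, which is why a microDST written this way inherits those cuts.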

Plugging in a StHbtEventReader object as the Writer at run-time is very similar to plugging in the Reader. Here is a snippet from a macro using StHbtAsciiReader (see below). The new part is the three lines that create the Writer and hand it to the Manager.

    ...
    cout << "StHbtMaker::Init - setting up Reader and Analyses..." << endl;
 
    StHbtManager* TheManager = hbtMaker->HbtManager();
 
    // here, we instantiate the appropriate StHbtEventReader
    // for STAR analyses in root4star, we instantiate StStandardHbtEventReader
    StStandardHbtEventReader* Reader = new StStandardHbtEventReader;
    Reader->SetTheEventMaker(eventMaker);     // gotta tell the reader where it should read from
    // here would be the place to plug in any "front-loaded" Event or Particle Cuts...
    TheManager->SetEventReader(Reader);

 
    StHbtAsciiReader* Writer = new StHbtAsciiReader;
    Writer->SetFileName("FirstMicroDst.asc");
    TheManager->SetEventWriter(Writer);

    // 0) now define an analysis...
    StHbtAnalysis* anal = new StHbtAnalysis;
    // 1) set the Event cuts for the analysis
    mikesEventCut* evcut = new mikesEventCut;  // use "mike's" event cut object
    evcut->SetEventMult(0,10000);      // selected multiplicity range

    ...

StHbtAsciiReader - Reader/Writer for ASCII-based microDSTs

For some time, there have been two StHbtEventReader classes in the Reader/ subdirectory.

StStandardHbtEventReader - designed to read STAR DSTs, and takes its input from StEvent.
StHbtMcEventReader - a very useful tool written by Frank Laue, takes its input from StMcEvent (geant info).

It is unlikely that a WriteHbtEvent() method would be written for these classes.

A new StHbtEventReader has recently been committed, called StHbtAsciiReader. It reads and writes StHbtEvent objects to/from ASCII-based files. The objects are read/written by overloading the ">>" and "<<" operators in the file StHbtMaker/Infrastructure/StHbtIO.cc.

This makes reading/writing the StHbtEvent (and its StHbtTrackCollection and StHbtV0Collection) trivial! Aside from details of checking the health of the I/O stream, here is the meat of the WriteHbtEvent() method:

  (*mOutputStream) << (*event);

When writing a persistent microDST, StHbtAsciiReader puts the Report of the Manager's Reader at the beginning of the file. That way, if any cuts were imposed by the Reader (like, maybe it took only kaons from central collisions), you will know it. Upon reading the microDST, the Report is extracted from the file and printed to the screen.

The I/O operators "<<" and ">>" are defined to write things in space-delimited ASCII format. This makes the created microDSTs human-readable (more or less) as well as platform-independent and root-independent. It is a major step in getting the StHbt to run on a Macintosh, as Frank once said he wanted to do!
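
For readers who have not overloaded stream operators before, here is a minimal sketch of the idea. It assumes toy structures (ToyTrack, ToyEvent and the RoundTrip helper are hypothetical and much smaller than the real StHbtTrack and StHbtEvent), and the field order is invented for the example; the real format is whatever StHbtIO.cc defines. The toy charge is stored as an int, which is a deliberate simplification of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <vector>

// Toy structures for illustration; the real StHbtTrack and StHbtEvent carry many more fields.
struct ToyTrack { int charge; int nHits; double px, py, pz; };
struct ToyEvent { int runNumber; std::vector<ToyTrack> tracks; };

// One track per line, fields separated by single spaces.
std::ostream& operator<<(std::ostream& out, const ToyTrack& t) {
    return out << t.charge << ' ' << t.nHits << ' ' << t.px << ' ' << t.py << ' ' << t.pz;
}

std::istream& operator>>(std::istream& in, ToyTrack& t) {
    return in >> t.charge >> t.nHits >> t.px >> t.py >> t.pz;
}

// Event-level data first, then the number of tracks, then the tracks themselves.
std::ostream& operator<<(std::ostream& out, const ToyEvent& e) {
    out << e.runNumber << '\n' << e.tracks.size() << '\n';
    for (const ToyTrack& t : e.tracks) out << t << '\n';
    return out;
}

std::istream& operator>>(std::istream& in, ToyEvent& e) {
    std::size_t n = 0;
    in >> e.runNumber >> n;
    e.tracks.clear();
    for (std::size_t i = 0; i < n && in; ++i) {
        ToyTrack t = {0, 0, 0.0, 0.0, 0.0};
        if (in >> t) e.tracks.push_back(t);
    }
    return in;
}

// Write an event to an in-memory text buffer and read it back.
ToyEvent RoundTrip(const ToyEvent& e) {
    std::stringstream buffer;
    buffer << e;
    assert(!buffer.fail());   // check the health of the stream after writing
    ToyEvent copy;
    buffer >> copy;
    return copy;
}
```

Because the default stream precision is six significant digits, values do not in general survive a text round trip bit-for-bit; that is one of the trade-offs of a human-readable format.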

The ASCII persistent microDST - actual HBT files!

Here are the first few lines of the first microDST I made:

 This is the StStandardHbtEventReader
---> EventCuts in Reader: NONE
---> ParticleCuts in Reader: NONE

-*-*-*-* End of Input Reader Report
42280 16623 35656 2519 93 8615 8615 1.33123e-43 1.34525e-43 -0.00882977 -0.0195637 -29.7065
7845
 36 158 -2.35927 -0.704536 -0.765364 1.15801e-06 0.00999201 -0.00379657 4.13801 2.40085 -0.38842 0.777676 2.26078 0.00172437 1.20372 -2.69147 -3.3598 6.80424 -9.96443 -1
 36 205 0.989362 -3.3903 -7.98776 1.31375e-06 0.136461 -0.00901088 0.17658 3.41094 -0.0965261 0.181955 0.508187 0.00727746 1.18572 -2.70862 -3.48132 6.67857 -11.1471 -1
ÿ 39 244 -0.397498 1.03464 -0.740154 1.31471e-06 0.146101 -0.0370312 1.26072 1.76109 -0.458401 -0.343533 1.45763 0.00261672 1.19634 2.24284 -8.8974 -6.74094 -1.62357 1
 37 314 -0.362025 -2.77058 -7.14638 1.19593e-06 0.195594 -0.0112428 1.67892e-11 -24.2164 -0.201485 -0.147778 0.642532 0.00599899 1.19991 -0.981633 -6.1067 -4.05513 -11.0046 -1
 36 374 8.44525 7.45611 0.818802 2.1435e-06 0.134382 -0.0231064 3.59706 4.90645 0.238941 -0.235346 0.866927 0.00446943 1.20167 0.761155 5.11064 -5.05205 -11.3244 -1
 29 464 3.66598 -4.32868 -8.25992 1.64661e-06 0.147011 0.0281962 5.83549 3.9979 -0.0844109 -0.105444 0.344877 0.0110978 1.19752 -0.75574 -4.6307 -5.52921 -11.1167 -1
 28 581 -0.812219 0.356133 -0.989622 1.25606e-06 0.0447296 -0.0117595 2.10197 7.70046 -0.495659 0.279654 1.48808 0.00263388 1.20552 -2.10403 -6.47878 3.67679 -10.3141 -1

What do you see?

First, you see the (rather dull) report from the Manager's Reader, which was just the STAR DST Reader in this case.
In this case, there were no EventCuts or ParticleCuts attached to the Reader, but if there were, the Report() of each of them would be included in the overall Report().
Then, you see a line -*-*-*-* End of Input Reader Report (recognized by the StHbtAsciiReader) that denotes the end of the Reader's Report() and the beginning of data.
Now comes the first event. You would have to look at the StHbtIO.cc file for specifics, but the first line of the event has the reaction-plane orientation, the number of TPC hits in the event, and so on: all the stuff in our transient microDST definition.
After that stuff we read in the StHbtTrackCollection. The first line (which says 7845 in the example above) tells how many StHbtTracks are in this event. It is a lot because no cuts were done!
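Next follows the track-wise information for each of the 7845 StHbtTracks, one per line. Again, you would have to look at StHbtIO.cc for details. However, the first entry in each row is the particle's charge. It is defined as type char, and prints funny: ÿ (byte 0xFF, i.e. -1) for negative tracks, and an unprintable character (ctrl-A, i.e. +1) for positive tracks. You can check this against the cut microDST further down, where only negative particles were written and every track line starts with ÿ.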
Now it goes off the screen, but what will follow next is the StHbtV0Collection, once it is defined. Tom Humanic and Helen Caines are working on that.
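
The layout just described (Report, marker line, event-level line, track count, tracks) can be picked apart with very little code. The following is only a sketch: ReadReport and ReadTrackCount are hypothetical helpers written for this page, not functions from StHbtAsciiReader, and the sketch assumes that the event-level quantities occupy exactly one line, as they do in the sample above.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Marker text copied from the sample file above.
const std::string kEndOfReport = "-*-*-*-* End of Input Reader Report";

// Reads lines up to (but not including) the marker and returns them joined by newlines.
// foundMarker is set to false if the stream ended before the marker was seen.
std::string ReadReport(std::istream& in, bool& foundMarker) {
    std::string report;
    std::string line;
    foundMarker = false;
    while (std::getline(in, line)) {
        if (line == kEndOfReport) { foundMarker = true; break; }
        report += line;
        report += '\n';
    }
    return report;
}

// Hypothetical helper: skips the event-level line and returns the number of tracks
// announced on the following line, or -1 if the stream does not have that shape.
long ReadTrackCount(std::istream& in) {
    std::string eventLine;
    if (!std::getline(in, eventLine)) return -1;  // event-level quantities (see StHbtIO.cc)
    long nTracks = -1;
    if (!(in >> nTracks)) return -1;
    return nTracks;
}
```

A real Reader would of course go on to parse the event-level line and the tracks rather than skip them; the sketch only shows where the Report ends and the data begins.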

Savings in disk space and cpu

The HBT microDST can be read with or without the STAR root environment, and is immune to changes in the STAR software (making "us" our own worst enemy :). However, one would hope for further advantages, including reduced storage and cpu requirements.

The persistent microDST described here does indeed offer these advantages, but they are difficult to quantify in the absence of context.
For example, the StHbt framework allows for several Analyses to run simultaneously, and for several CorrelationFunctions to run simultaneously within each Analysis. If one has a large number of Analyses/CorrelationFunctions running, the cpu overhead associated with I/O is negligible, and so will be any gain from a faster I/O.
Similarly, if one writes all particles from all events into the microDST (as is done below), then the gains in reduced storage and cpu requirements will not be as great as if one wrote out only high-p_T kaons with a small DCA from mid-central collisions, assuming that this is the type of analysis you wish to do.

That said, here are some numbers for our "standard" example of running two simultaneous Analyses. (This is the example from the StHbtExample.C macro.) The first Analysis selects on negative pions, with no real event cut, and loose phase-space cuts. It constructs two simultaneous 1-D correlation functions. The second Analysis looks at correlations between positive and negative kaons with stricter cuts. It constructs one 1-D correlation function. Each Analysis mixes 5 previous events.

First, space. I wrote a persistent ASCII-based microDST with no pre-selections on events or particles (all are written out). The input (STAR DST) file is /disk00000/star/test/dev/tfs_Solaris/Wed/year_2a/psc0208_01_40evts.dst.root, which has 8 central Venus events.

STAR DST file    HBT microDST
64.0 MB          9.9 MB

So, a modest size savings of a factor of 6.5. This will get worse once we start saving V0's in our microDST, and would get better if we made some cuts (like track DCA) on particles we write out.
This is not unreasonable: it is 1.24 MB/event, with each event having on the order of 8000 particles, so about 155 bytes/particle. If we want fewer bytes, we will cut away more particles.


Now, time. On solaris (rmds03) and linux (rcas0222), using current /dev (99g), I ran 3, 5, and 8 events. As mentioned above, two Analyses were run simultaneously. In this environment, I found:
rmds03 (solaris) (cpu sec)

                         STAR DST file input    HBT microDST input    diff
"startup time"           negligible             negligible            0
time/non-mixing event    19.6                   5.3                   14.3
time/mixing event        60.0                   44.9                  15.1

So on rmds03, we save about 15 sec/event, which means cpu time falls by a factor of 3.7 for non-mixing events, and by ~25% for mixing events. This is only a small gain, since most events will be mixing. However, large gains can be made by making pre-cuts (see above).

rcas0222 (linux) (cpu sec)

                         STAR DST file input    HBT microDST input    diff
"startup time"           negligible             negligible            0
time/non-mixing event    8.9                    3.0                   5.9
time/mixing event        26.8                   21.4                  5.4

So on rcas0222, we save about 5.5 sec/event, which means cpu time falls by a factor of 3 for non-mixing events, and by ~20% for mixing events.
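
The gains quoted for the two machines are stated in two different ways: as a ratio of old to new cpu time ("a factor of 3.7") for non-mixing events, and as a fraction of the old cpu time saved ("~25%") for mixing events. For anyone wanting to redo the arithmetic with their own timings, here is a small sketch; the Timing struct and the two function names are hypothetical helpers made up for this page.

```cpp
#include <cassert>
#include <cmath>

// CPU seconds per event with STAR DST input and with HBT microDST input.
struct Timing { double dstSec; double microDstSec; };

// Ratio of old to new cpu time ("cpu time falls by a factor of ...").
double SpeedupFactor(const Timing& t) {
    return t.dstSec / t.microDstSec;
}

// Fraction of the original cpu time that is saved ("cpu time falls by ...%").
double FractionSaved(const Timing& t) {
    return (t.dstSec - t.microDstSec) / t.dstSec;
}
```

The same absolute saving per event gives a large factor when the total is small (non-mixing events) and a small percentage when the total is dominated by mixing.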

Example macros to read and write persistent HBT microDSTs

StHbtWriteDst.C is a slightly modified version of StHbtExample.C. It still runs the two "standard" Analyses, but there are two modifications (both can be found just after the instantiation of the StStandardHbtEventReader which will read the data from StEvent):

The StStandardHbtEventReader (which reads data from StEvent) has an EventCut and a ParticleCut associated with it. Note that only "reasonably central" events make it into the microDST, and that only negative particles that are "roughly" pions with a not-too-big DCA make it in.
The Manager has a Writer, of type StHbtAsciiReader, associated with it. This is what writes out the microDST.

Note that the microDST made by this macro (which has some cuts) is 10X smaller than that made without cuts (which means ~76X smaller than the STAR DST), and so is an example of the big gains possible (alluded to above) if front-loaded cuts are performed.

Note as well that the EventCuts and ParticleCuts should be looser than the cuts imposed by Analyses taking this microDST as input.

The first few lines of the microDST file generated are:

 This is the StStandardHbtEventReader
---> EventCuts in Reader: Multiplicity:  1000-100000
Vertex Z-position:       -3.500000E+01-3.500000E+01
Number of events which passed:  0  Number which failed: 0

---> ParticleCuts in Reader: Particle charge:   -1
Particle Nsigma from pion:      -1.000000E+01 - 1.000000E+01
Particle Nsigma from kaon:      -1.000000E+03 - 1.000000E+03
Particle Nsigma from proton:    -1.000000E+03 - 1.000000E+03
Particle #hits: 10 - 50
Particle pT:    5.000000E-02 - 1.000000E+00
Particle rapidity:      -5.000000E+00 - 5.000000E+00
Particle DCA:   0.000000E+00 - 5.000000E-01
Number of tracks which passed:  0  Number which failed: 0


-*-*-*-* End of Input Reader Report
42280 16623 36736 2519 93 8615 8615 1.33123e-43 1.34525e-43 -0.00882977 -0.0195637 -29.7065
468
ÿ 39 158 -0.397498 1.03464 -0.740154 1.31471e-06 0.146101 -0.0370312 1.26072 1.76109 -0.458401 -0.343533 1.45763 0.00261672 1.19634 2.24284 -8.8974 -6.74094 -1.62357 1
ÿ 36 205 0.896414 -2.58651 -7.39043 1.31001e-06 0.32631 0.126736 0.00565092 4.17321 -0.193334 -0.136038 0.571922 0.00634082 1.17884 2.25654 -8.93645 -6.67261 -1.88712 1
ÿ 17 244 -1.63562 -5.51637 -6.76244 1.01184e-06 0.449883 -0.127745 1.15457 3.25323 0.226628 0.0445647 -0.169943 0.0064899 -0.634341 -0.981514 55.7191 23.4867 -74.6302 1

So, you can see that the cuts performed in the creation of the persistent microDST are clearly labelled within the file.

ReadMicroDST.C is a macro that will read the persistent microDST written out by the previous macro, and run the "standard" Analyses.

Note that in this case no Maker other than StHbtMaker need be instantiated; our Maker is the whole chain. If you request "too many" events, the End-of-file behaviour should take care of things.

But note that Frank is out of luck with his phi (K⁺K⁻) correlation function. The ParticleCut in the Reader used when creating the microDST selected and wrote out only negative particles! Thus we see the power and the danger of the "front-loaded" cuts.

Other persistent DSTs wanted!!

Surely, improvements in terms of storage and cpu performance can be made in this persistent microDST model. I ask software experts in the HBT PWG to take a look, please!
Even besides the ASCII-based microDST, however, we need more microDST formats, and Readers! Dave Hardtke, for example, has proposed writing a root-ntuple-based persistent HBT microDST. This would be great. If there would be a universally accessible Reader for this, we could all take advantage of this fast and efficient format. The flexibility in the StHbt framework is meant to be utilized!

Michael A. Lisa

Last modified: Wed Sep 8 23:08:43 EDT 1999