STAR Computing
Using Datasets and Tables
Offline computing tutorial | Maintained by G. Van Buren |
In the current implementation of the STAR Data Model, all data are collected in tables for each event. To give the data a hierarchy, these tables are collected into datasets, each of which represents a subsystem or a particular physics analysis. Just as these tables are members of datasets, each of these datasets is in turn a member of a larger event dataset.
                     event dataset
       _________________|___________..._______
      |             |             |           |
 sub-dataset   sub-dataset      table    sub-dataset
               _____|__...__
              |     |       |
            table table   table

For STAR, each table is defined by a class (e.g. St_table_name) which is derived from a base class called St_Table. It is this St_Table class which provides the functionality common to all tables, such as finding the number of rows in the table. You should begin to recognize that STAR-specific classes generally start with St.
Similarly, each dataset is an instance of STAR's St_DataSet class. An associated class, the St_DataSetIter class, provides the functionality of dataset navigation. For instance, you would use an St_DataSetIter to find the tpc dataset within an event, and then you would use another St_DataSetIter to find the particular table in the tpc dataset you want.
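The two-step navigation described above (event, then sub-dataset, then table) can be sketched with a small stand-alone model. This is only an illustration under stated assumptions: ToyDataSet, its Add() and Find() members, and FindTable() are hypothetical stand-ins, not the real St_DataSet or St_DataSetIter classes, and the sketch needs no STAR libraries.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dataset node; NOT the real St_DataSet class.
struct ToyDataSet {
    std::string name;
    std::vector<std::unique_ptr<ToyDataSet>> children;

    explicit ToyDataSet(std::string n) : name(std::move(n)) {}

    // Attach a new child and return a pointer to it for further building.
    ToyDataSet* Add(std::string childName) {
        children.push_back(std::make_unique<ToyDataSet>(std::move(childName)));
        return children.back().get();
    }

    // Search direct children only; return nullptr when no name matches.
    ToyDataSet* Find(const std::string& childName) const {
        for (const auto& child : children) {
            if (child->name == childName) return child.get();
        }
        return nullptr;
    }
};

// Two-step lookup: first the sub-dataset inside the event, then the table
// inside that sub-dataset. Returns nullptr if either step fails.
ToyDataSet* FindTable(const ToyDataSet& event,
                      const std::string& setName,
                      const std::string& tableName) {
    ToyDataSet* sub = event.Find(setName);
    return sub ? sub->Find(tableName) : nullptr;
}
```

Building an event with a "tpc" child that holds one table and then calling FindTable(event, "tpc", table_name) returns that table, while asking the event directly for the table name returns a null pointer, because the search covers only one level at a time.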
Follow along below to learn how these datasets and tables work. To use STAR datasets, you will need to load the St_base shared library. For tables you will additionally need St_Tables:
gSystem->Load("St_base");
gSystem->Load("St_Tables");
St_XDFFile xdf_file("file_name","r"); // the "r" is for "read"
St_DataSet *event = xdf_file.NextEventGet();

The pointer event now points to the dataset for the first event. This allows you to loop over the events in a file sequentially:
while (event = xdf_file.NextEventGet()) {
  ... // Do something with the event
}
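The loop above relies on the assignment inside the condition evaluating to zero once no more events are available. A minimal stand-alone sketch of that idiom follows. ToyReader and ToyEvent are hypothetical stand-ins (not the real St_XDFFile or St_DataSet), and the sketch assumes, as the loop condition implies, that the reader returns a null pointer at the end of the input.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical event payload, for illustration only.
struct ToyEvent {
    int id;
};

// Hypothetical sequential reader; NOT the real St_XDFFile class.
class ToyReader {
public:
    explicit ToyReader(std::vector<ToyEvent> events)
        : events_(std::move(events)) {}

    // Return the next event, or nullptr once the input is exhausted.
    ToyEvent* NextEventGet() {
        if (next_ >= events_.size()) return nullptr;
        return &events_[next_++];
    }

private:
    std::vector<ToyEvent> events_;
    std::size_t next_ = 0;
};

// Count events by looping until the reader returns a null pointer.
int CountEvents(ToyReader& reader) {
    int count = 0;
    ToyEvent* event = nullptr;
    while ((event = reader.NextEventGet()) != nullptr) {
        ++count;  // "do something with the event" would go here
    }
    return count;
}
```

Writing the comparison against nullptr explicitly, with the assignment in its own parentheses, makes it clear that the assignment is intentional and avoids compiler warnings about = used where == might have been meant.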
TFile root_file("file_name");
root_file.ls();
TKey *eventKey = root_file.GetKey("event_set_name",i); // i = cycle number
St_DataSet *event = (St_DataSet *) eventKey->ReadObj();

Here, eventKey is a pointer to the key for event number i. The ReadObj() member function of TKey retrieves a pointer to the event dataset. Looping through the events on file is then simple:
for (Int_t i=1; i<root_file.GetNKeys(); i++) {
  St_DataSet *event = (St_DataSet *) root_file.GetKey("event_set_name",i)->ReadObj();
  ... // Do something with the event
}
St_DataSet *event = (St_DataSet *) root_file.Get("event_set_name;i"); // i = cycle number
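In the Get() form, the i after the semicolon stands for the digits of the cycle number, so inside a loop the "name;cycle" string has to be built for each event. A minimal sketch of one way to do that in standard C++ is given here. The helper name MakeNameCycle is hypothetical and is not part of ROOT or the STAR libraries.

```cpp
#include <string>

// Build a "name;cycle" string, e.g. ("event_set_name", 3) -> "event_set_name;3".
// Hypothetical helper for illustration; not a ROOT or STAR function.
std::string MakeNameCycle(const std::string& name, int cycle) {
    return name + ";" + std::to_string(cycle);
}
```

The result can be handed to Get() through c_str(), for example root_file.Get(MakeNameCycle("event_set_name", i).c_str()).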
St_DataSetIter iter_name(parent_set);
St_DataSet *child_set = iter_name("child_set_name");

Here, parent_set is a pointer to the parent dataset, and "child_set_name" is the actual name associated with the child dataset.
parent_set->Add(child_set);

Here, parent_set is a pointer to the dataset which you want to make the parent of the sub-dataset pointed to by child_set.
Tables store their rows as array elements (with the first row being array element zero) - notice I said rows, not columns. Each individual row of the St_Table-derived class is an instance of the StAF table class. The first row can be reached with the St_Table's GetTable() member function, and the number of rows can be found with GetNRows(). Nomenclature is as follows: for a table of type St_table_type_name which is an St_Table-derived table class, its rows are instances of the StAF table class table_type_name_st. In the examples below, I will associate aaa with an St_Table-derived table, and bbb with the associated StAF table.
St_DataSetIter iter_name(gStChain->DataSet("set_name"));
St_aaa *aaa_name = (St_aaa *) iter_name["table_name"];

Here, "set_name" is the name of the dataset where the table is located, "table_name" is the name of the table, St_aaa is the St_Table-derived class of the table, and aaa_name is the new pointer to the table. This example shows one aspect of using chains: each dataset associated with a particular maker can be located with the gStChain->DataSet() method. The first line could have been separated into two, getting a pointer to the dataset first (St_DataSet *set = gStChain->DataSet("set_name");), followed by the iterator definition. The pointer which the dataset iterator returns is an St_DataSet pointer, so it must be cast to the table class pointer.
One more point here is that you will notice I used square brackets when finding the table with the iterator. Both () and [] can be used in place of the Find() member function of St_DataSetIter. However, [] finds only tables. If you feed in a name which is actually a sub-dataset, it will return zero. The moral is to use () when using the iterator to find a sub-dataset, and [] to find a table.
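The difference between () and [] described above can be illustrated with a small stand-alone model. This is a sketch of the described behavior only: ToyNode and ToyIter are hypothetical stand-ins, and nothing here reflects how St_DataSetIter is actually implemented.

```cpp
#include <string>
#include <vector>

// Hypothetical node type; NOT the real St_DataSet or St_Table classes.
struct ToyNode {
    std::string name;
    bool isTable;  // true for a table, false for a sub-dataset
};

// Hypothetical iterator that mimics the lookup rules described in the text.
class ToyIter {
public:
    explicit ToyIter(std::vector<ToyNode>& children) : children_(children) {}

    // () finds any child by name, whether sub-dataset or table.
    ToyNode* operator()(const std::string& name) const { return Find(name); }

    // [] finds tables only; the name of a sub-dataset yields nullptr.
    ToyNode* operator[](const std::string& name) const {
        ToyNode* node = Find(name);
        return (node && node->isTable) ? node : nullptr;
    }

private:
    // Linear search over the children; nullptr when no name matches.
    ToyNode* Find(const std::string& name) const {
        for (ToyNode& child : children_) {
            if (child.name == name) return &child;
        }
        return nullptr;
    }

    std::vector<ToyNode>& children_;
};
```

With one sub-dataset and one table among the children, () returns a valid pointer for both names, while [] returns a null pointer for the sub-dataset. Checking the result of [] for zero before casting it to a table class is therefore a sensible precaution.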
bbb_st *bbb_name = aaa_name->GetTable();
for (Int_t i=0; i<aaa_name->GetNRows(); i++) {
  hist->Fill(bbb_name[i].value1);
  printf("Value 2: %d\n",bbb_name[i].value2);
}

In this example, bbb_st is the StAF table class associated with the aaa_name St_Table-derived class, hist is a pointer to a previously defined one-dimensional histogram (inserted only for example purposes), and value1 and value2 are table entries. C++ does provide an alternative method, because the pointer bbb_name can be incremented over the rows (this is standard C++ for using a pointer with an array):
bbb_st *bbb_name = aaa_name->GetTable();
for (Int_t i=0; i<aaa_name->GetNRows(); i++) {
  hist->Fill(bbb_name->value1);
  printf("Value 2: %d\n",bbb_name->value2);
  bbb_name++;
}
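The two access styles, indexing and pointer increment, visit the same rows in the same order. The stand-alone sketch below shows this on a plain array. ToyRow is a hypothetical row layout standing in for a StAF row struct such as bbb_st, and the choice of an integer value2 is an assumption made only for this example.

```cpp
#include <cstddef>

// Hypothetical row layout standing in for a StAF row struct such as bbb_st.
struct ToyRow {
    float value1;
    int value2;
};

// Sum value2 using array indexing on the pointer to the first row.
int SumByIndex(const ToyRow* rows, std::size_t nRows) {
    int sum = 0;
    for (std::size_t i = 0; i < nRows; ++i) {
        sum += rows[i].value2;
    }
    return sum;
}

// Sum value2 by advancing the pointer one row per iteration.
int SumByPointer(const ToyRow* rows, std::size_t nRows) {
    int sum = 0;
    for (std::size_t i = 0; i < nRows; ++i, ++rows) {
        sum += rows->value2;
    }
    return sum;
}
```

One practical difference: after the pointer-increment loop, the pointer no longer refers to the first row but to the position just past the last one. In the sketch this does not matter because the pointer is a by-value copy, but in a macro the pointer must be obtained again with GetTable() before the rows are traversed a second time.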
St_aaa *aaa_name = new St_aaa("name",rows);

Here, St_aaa is an St_Table-derived table class, "name" is the name the table will have, and rows is the number of rows allocated.
bbb_st bbb_name;
bbb_name.value1 = 100.;
bbb_name.value2 = 3.14159;
aaa_name->AddAt(&bbb_name,n);

Here, n is the row of the aaa_name table where these values will be inserted (zero is the first row!). If you want to add another row, you do not need to re-instantiate the StAF table class - just reassign the values and add at the next row number you want. This time, bbb_name was actually used as the name of the StAF table object. It could have been used as a pointer if it were defined with:
bbb_st *bbb_name = new bbb_st();
bbb_st *bbb_name = aaa_name->GetTable();
for (Int_t i=0;i<n;i++) {
  bbb_name->value1 = 100.;
  bbb_name->value2 = 3.14159;
  bbb_name++;
}

This time, bbb_name is a pointer and is assigned to point to the first row of the table upon its declaration. In this example, values for the first n rows of the table are entered.
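The two ways of filling a table, copying a row in with AddAt() and writing through a pointer to the rows, can be compared in a small stand-alone model. This is a sketch under stated assumptions: FillRow and ToyFillTable are hypothetical stand-ins rather than the real StAF row struct or St_Table class, and the toy table treats every allocated row as used, which is a simplification.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row layout standing in for a StAF row struct such as bbb_st.
struct FillRow {
    float value1;
    float value2;
};

// Hypothetical table; NOT the real St_Table class.
class ToyFillTable {
public:
    // Allocate the requested number of rows, all zero-initialized.
    explicit ToyFillTable(std::size_t rows) : rows_(rows) {}

    // Pointer to the first row (row zero).
    FillRow* GetTable() { return rows_.data(); }

    // Number of rows (in this toy model, allocated rows and used rows coincide).
    std::size_t GetNRows() const { return rows_.size(); }

    // Copy *row into row n; returns false if n is out of range.
    bool AddAt(const FillRow* row, std::size_t n) {
        if (n >= rows_.size()) return false;
        rows_[n] = *row;
        return true;
    }

private:
    std::vector<FillRow> rows_;
};
```

Because AddAt() in this model copies the values, the same FillRow object can be reassigned and added at the next row without disturbing rows already stored, which matches the behavior the text describes. Writing through the pointer from GetTable() skips the copy but offers no range check, so the loop bound has to stay within the number of allocated rows.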